
Day 0 Support: GPT-5.5 and GPT-5.5 Pro

Mateo Wang
AI Engineer, LiteLLM
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

LiteLLM now supports GPT-5.5 and GPT-5.5 Pro on Day 0. Route traffic to OpenAI's latest frontier model through the LiteLLM AI Gateway with no code changes.

GPT-5.5 is OpenAI's "smartest and most intuitive to use model" yet, with significant gains on agentic coding, computer use, and deep research workflows. Per OpenAI, it is a faster, sharper thinker for fewer tokens compared to GPT-5.4. GPT-5.5 Pro targets the most demanding reasoning tasks.

Note:

No Docker image upgrade needed. GPT-5.5 routes through the existing OpenAIGPT5Config in LiteLLM, so any recent version works out of the box.

For cost tracking, hit the Reload Model Cost Map button in the Admin UI (or POST /reload/model_cost_map) to pull the latest pricing from GitHub. This feature is available on v1.76.0 and above.

Usage

1. Set up config.yaml

```yaml
model_list:
  - model_name: gpt-5.5
    litellm_params:
      model: openai/gpt-5.5
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-5.5-pro
    litellm_params:
      model: openai/gpt-5.5-pro
      api_key: os.environ/OPENAI_API_KEY
```

2. Start the proxy

```shell
docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.83.7-stable \
  --config /app/config.yaml
```

3. Test it

```shell
curl -X POST "http://0.0.0.0:4000/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "user", "content": "Write a Python function to check if a number is prime."}
    ]
  }'
```
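Prefer Python? Here is a sketch of the same request using only the standard library (pointing the OpenAI SDK at the proxy's base URL also works). `LITELLM_KEY` is assumed to be a virtual key issued by your proxy:

```python
import json
import os
import urllib.request

# Same request as the curl example above. "gpt-5.5" matches the
# model_name entry in config.yaml.
payload = {
    "model": "gpt-5.5",
    "messages": [
        {"role": "user", "content": "Write a Python function to check if a number is prime."}
    ],
}
req = urllib.request.Request(
    "http://0.0.0.0:4000/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('LITELLM_KEY', '')}",
    },
)
# With the proxy from step 2 running, send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```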

Responses API

For agentic and multi-turn workflows, use /v1/responses to preserve reasoning state and output item metadata across turns.

```shell
curl -X POST "http://0.0.0.0:4000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.5",
    "input": "Plan and write a Python script that scrapes a webpage and summarizes it."
  }'
```
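To carry reasoning state into a second turn, chain requests with `previous_response_id`, the standard Responses API parameter for linking turns. A minimal sketch of the two payloads — the response id shown is a hypothetical placeholder; in practice you read it from the `id` field of the first reply:

```python
import json

# Turn one: ask the model to plan.
first_turn = {
    "model": "gpt-5.5",
    "input": "Plan a Python script that scrapes a webpage and summarizes it.",
}

# Turn two: reference the first response so reasoning state and output
# item metadata carry over. "resp_abc123" is a hypothetical id.
second_turn = {
    "model": "gpt-5.5",
    "input": "Now write the script you planned.",
    "previous_response_id": "resp_abc123",
}
body = json.dumps(second_turn)  # POST this to /v1/responses as before
```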

Reasoning Effort

reasoning_effort controls how much thinking the model applies. Supported values per model (verified against OpenAI's live API on 2026-04-24):

| Model | Default | Allowed values |
|---|---|---|
| gpt-5.5 | medium | none, low, medium, high, xhigh |
| gpt-5.5-pro | high | medium, high, xhigh |

```python
from litellm import completion

response = completion(
    model="openai/gpt-5.5",
    messages=[{"role": "user", "content": "Solve: what is the optimal strategy for..."}],
    reasoning_effort="high",
)
```

LiteLLM enforces these caps locally: passing an unsupported value (e.g. minimal) raises an UnsupportedParamsError instead of round-tripping to OpenAI for a 400.
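The check behaves roughly like the sketch below — an illustration of the behavior described above, not LiteLLM's actual implementation. The allowed values mirror the table in this post, and the exception class stands in for litellm's UnsupportedParamsError:

```python
# Allowed reasoning_effort values per model, mirroring the table above.
ALLOWED_REASONING_EFFORT = {
    "gpt-5.5": {"none", "low", "medium", "high", "xhigh"},
    "gpt-5.5-pro": {"medium", "high", "xhigh"},
}

class UnsupportedParamsError(ValueError):
    """Stand-in for litellm's UnsupportedParamsError."""

def check_reasoning_effort(model: str, effort: str) -> None:
    """Raise locally if `effort` is outside the model's cap."""
    allowed = ALLOWED_REASONING_EFFORT.get(model)
    if allowed is not None and effort not in allowed:
        raise UnsupportedParamsError(
            f"{model} does not support reasoning_effort={effort!r}; "
            f"allowed: {sorted(allowed)}"
        )

check_reasoning_effort("gpt-5.5", "high")  # within the cap, passes silently
```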

Notes

  • gpt-5.5-pro is a Responses API-only model (mode: "responses"). LiteLLM's Responses API bridge transparently translates completion() calls to /v1/responses, so the SDK example above works without code changes.
  • GPT-5.5 supports reasoning, function calling, parallel tool calls, vision (image input), PDF input, prompt caching, web search, and structured output — see the OpenAI provider docs for advanced usage.
  • Context window: 1.05M input tokens / 128K output tokens. Long-context tier pricing kicks in above 272K tokens.
  • Azure availability: not yet — this post covers OpenAI direct only.
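The long-context boundary in the notes above can be expressed as a simple tier check. This is a hypothetical sketch: the per-token rates are placeholders rather than GPT-5.5's actual pricing, and whether the whole prompt or only the marginal tokens get the long-context rate is an assumption here:

```python
LONG_CONTEXT_THRESHOLD = 272_000  # input-token boundary from the notes above

def input_cost(prompt_tokens: int, base_rate: float, long_rate: float) -> float:
    """Hypothetical cost sketch: bill the whole prompt at the long-context
    rate once it crosses the 272K threshold. Rates are placeholders, not
    GPT-5.5's actual pricing."""
    rate = long_rate if prompt_tokens > LONG_CONTEXT_THRESHOLD else base_rate
    return prompt_tokens * rate
```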
