[PRE-RELEASE] v1.75.5-stable
Deploy this version

Docker

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.75.5.rc.1
```

Pip

```shell
pip install litellm==1.75.5.post1
```
New Models / Updated Models
New Model Support
Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
---|---|---|---|---|
Bedrock | bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0 | 200K | $15.00 | $75.00 |
Bedrock | bedrock/openai.gpt-oss-20b-1:0 | 200K | $0.07 | $0.30 |
Bedrock | bedrock/openai.gpt-oss-120b-1:0 | 200K | $0.15 | $0.60 |
Fireworks AI | fireworks_ai/accounts/fireworks/models/glm-4p5 | 128K | $0.55 | $2.19 |
Fireworks AI | fireworks_ai/accounts/fireworks/models/glm-4p5-air | 128K | $0.22 | $0.88 |
Fireworks AI | fireworks_ai/accounts/fireworks/models/gpt-oss-120b | 131K | $0.15 | $0.60 |
Fireworks AI | fireworks_ai/accounts/fireworks/models/gpt-oss-20b | 131K | $0.05 | $0.20 |
Groq | groq/openai/gpt-oss-20b | 131K | $0.10 | $0.50 |
Groq | groq/openai/gpt-oss-120b | 131K | $0.15 | $0.75 |
OpenAI | openai/gpt-5 | 400K | $1.25 | $10.00 |
OpenAI | openai/gpt-5-2025-08-07 | 400K | $1.25 | $10.00 |
OpenAI | openai/gpt-5-mini | 400K | $0.25 | $2.00 |
OpenAI | openai/gpt-5-mini-2025-08-07 | 400K | $0.25 | $2.00 |
OpenAI | openai/gpt-5-nano | 400K | $0.05 | $0.40 |
OpenAI | openai/gpt-5-nano-2025-08-07 | 400K | $0.05 | $0.40 |
OpenAI | openai/gpt-5-chat | 400K | $1.25 | $10.00 |
OpenAI | openai/gpt-5-chat-latest | 400K | $1.25 | $10.00 |
Azure | azure/gpt-5 | 400K | $1.25 | $10.00 |
Azure | azure/gpt-5-2025-08-07 | 400K | $1.25 | $10.00 |
Azure | azure/gpt-5-mini | 400K | $0.25 | $2.00 |
Azure | azure/gpt-5-mini-2025-08-07 | 400K | $0.25 | $2.00 |
Azure | azure/gpt-5-nano-2025-08-07 | 400K | $0.05 | $0.40 |
Azure | azure/gpt-5-nano | 400K | $0.05 | $0.40 |
Azure | azure/gpt-5-chat | 400K | $1.25 | $10.00 |
Azure | azure/gpt-5-chat-latest | 400K | $1.25 | $10.00 |
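Any of the new models above can be called through the SDK with the usual provider-prefixed model name. A minimal sketch, assuming `OPENAI_API_KEY` is set in your environment:

```python
import litellm

# Minimal sketch: call one of the newly added models.
# Assumes OPENAI_API_KEY is exported in the environment.
response = litellm.completion(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "Say hi in one sentence."}],
)
print(response.choices[0].message.content)
```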
Features
- OCI
- New LLM provider - PR #13206
- JinaAI
- Support multimodal embedding models - PR #13181
- GPT-5 (OpenAI/Azure)
- Anthropic
- Add claude-opus-4-1 on model cost map - PR #13384
- OpenRouter
- Add gpt-oss to model cost map - PR #13442
- Cerebras
- Add gpt-oss to model cost map - PR #13442
- Azure
- Support drop_params for `temperature` on o-series models (see the sketch after this list) - PR #13353
- GradientAI
- New LLM Provider - PR #12169
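A minimal sketch of the o-series `drop_params` behavior; the deployment name below is a placeholder for your own:

```python
import litellm

# Sketch: Azure o-series models reject `temperature`, so drop_params
# strips it from the request instead of erroring.
# "azure/my-o4-mini" is a placeholder deployment name.
response = litellm.completion(
    model="azure/my-o4-mini",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.2,  # dropped for o-series when drop_params=True
    drop_params=True,
)
print(response.choices[0].message.content)
```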
Bugs
LLM API Endpoints
Features
/responses
Bugs
/chat/completions
/responses
/v1/messages
- Add Claude Code token counting (count_tokens) support - PR #13261
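A sketch of counting tokens through the proxy, assuming it exposes the Anthropic-style count_tokens route; the URL and API key are placeholders for your deployment:

```python
import requests

# Sketch: Anthropic-style token counting via the proxy.
# URL and API key are placeholders for your deployment.
resp = requests.post(
    "http://localhost:4000/v1/messages/count_tokens",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "claude-opus-4-1",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())
```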
/vector_stores
- Fix create/search vector store errors - PR #13285
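Vector store calls go through the OpenAI-compatible routes, so a sketch with the OpenAI SDK pointed at the proxy (base URL and key are placeholders) looks like:

```python
from openai import OpenAI

# Sketch: exercise the fixed create/search vector store routes.
# base_url and api_key are placeholders for your proxy deployment.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-1234")

store = client.vector_stores.create(name="release-notes")
results = client.vector_stores.search(
    vector_store_id=store.id,
    query="deployment instructions",
)
print(results)
```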
MCP Gateway
Features
Bugs
- Fix auth on UI for bearer token servers - PR #13312
- Allow access groups on MCP tool retrieval - PR #13425
Management Endpoints / UI
Features
- Teams
- Add team deletion check for teams with keys - PR #12953
- Models
- Keys
- Navbar
- Add logo customization for LiteLLM admin UI - PR #12958
- Logs
- Add token breakdowns on logs + session page - PR #13357
- Usage
- Ensure the Usage page loads when the DB contains large entries - PR #13400
- Test Key Page
- Allow uploading images for /chat/completions and /responses - PR #13445
- MCP
- Add auth tokens to local storage auth - PR #13473
Bugs
- Custom Root Path
- Fix login route when SSO is enabled - PR #13267
- Customers/End-users
- Allow calling /v1/models when an end user is over budget - lets model listing work in OpenWebUI when a customer is over budget - PR #13320
- Teams
- Remove user-team membership when a user is removed from a team - PR #13433
- Errors
- Bubble up network errors to the user on the Logging and Alerts page - PR #13427
- Model Hub
- Show pricing for Azure models when a base model is set - PR #13418
Logging / Guardrail Integrations
Features
- Bedrock Guardrails
- Redact sensitive information in Bedrock guardrails error messages - PR #13356
- Standard Logging Payload
- Fix "can't register atexit" bug - PR #13436
Bugs
- Braintrust
- Allow setting the Braintrust callback base URL - PR #13368
- OTEL
- Track pre_call hook latency - PR #13362
Performance / Loadbalancing / Reliability improvements
Features
- Team-BYOK models
- Add wildcard model support - PR #13278
- Caching
- GCP IAM auth support for caching - PR #13275
- Latency
- Reduce p99 latency by 50% when Redis is enabled - model usage is only updated when TPM/RPM limits are set (see the sketch after this list) - PR #13362
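A sketch of setting per-deployment TPM/RPM limits on the Router, which is what triggers the usage updates; the limits and Redis settings are illustrative:

```python
from litellm import Router

# Sketch: per-deployment TPM/RPM limits. With this release, Redis model
# usage counters are only updated when limits like these are configured.
router = Router(
    model_list=[
        {
            "model_name": "gpt-5",
            "litellm_params": {"model": "openai/gpt-5"},
            "tpm": 100_000,  # illustrative tokens-per-minute limit
            "rpm": 1_000,    # illustrative requests-per-minute limit
        }
    ],
    redis_host="localhost",  # placeholder Redis connection
    redis_port=6379,
)
```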
General Proxy Improvements
Features
- Models
- Support /v1/models/{model_id} retrieval (see the sketch after this list) - PR #13268
- Multi-instance
- Ensure disable_llm_api_endpoints works - PR #13278
- Logs
- Add apscheduler log suppression - PR #13299
- Helm
- Add labels to migrations job template - PR #13343 s/o @unique-jakub
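A sketch of the new single-model retrieval through the OpenAI SDK; base URL and key are placeholders for your proxy deployment:

```python
from openai import OpenAI

# Sketch: retrieve one model by ID via the new /v1/models/{model_id} route.
# base_url and api_key are placeholders for your proxy deployment.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-1234")
model = client.models.retrieve("gpt-5")
print(model)
```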
Bugs
- Non-root image
- Fix non-root image for migration - PR #13379
- Get Routes
- Load get routes when using fastapi-offline - PR #13466
- Health checks
- Generate unique trace IDs for Langfuse health checks - PR #13468
- Swagger
- Allow using Swagger for /chat/completions - PR #13469
- Auth
- Fix JWT access not working with model access groups - PR #13474
New Contributors
- @bbartels made their first contribution in https://github.com/BerriAI/litellm/pull/13244
- @breno-aumo made their first contribution in https://github.com/BerriAI/litellm/pull/13206
- @pascalwhoop made their first contribution in https://github.com/BerriAI/litellm/pull/13122
- @ZPerling made their first contribution in https://github.com/BerriAI/litellm/pull/13045
- @zjx20 made their first contribution in https://github.com/BerriAI/litellm/pull/13181
- @edwarddamato made their first contribution in https://github.com/BerriAI/litellm/pull/13368
- @msannan2 made their first contribution in https://github.com/BerriAI/litellm/pull/12169