v1.89.0 - Claude Fable 5, A2A Agent Providers & MCP Per-Server Controls
Deploy this version​
- Docker
- Pip
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:v1.89.0
pip install litellm==1.89.0
Key Highlights​
v1.89.0 builds on v1.88.0.
- Claude Fable 5 is supported across Anthropic, Bedrock, Azure AI, and Vertex at 1M-token context with adaptive thinking and computer use.
- Agent-to-agent (A2A) gains two new agent providers - watsonx Orchestrate and LangFlow (with A2A session bridging) - plus OAuth M2M for Databricks Apps agents.
- MCP gateway adds per-server environment variables with global and per-user scopes, per-server RPM rate limiting for keys and teams, OAuth passthrough with issuer-scoped JWT auth, and
oauth2_flowpersistence on server registration. - Observability lands OpenInference rendering parity for Arize/Phoenix (tool calls, cost, passthrough I/O, sessions, multimodal, cache tokens), MCP semantic conventions on the typed OTel v2 spans, and a Galileo logger that uses the ingest-traces API.
- New search and transcription providers - APISerpent, You.com, and Soniox - join the gateway, alongside the dashboard's migration to fully typed, OpenAPI-generated API clients.
New Providers and Endpoints​
New Providers (3 new providers)​
| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
APISerpent (apiserpent) | Search | Web search and deep-search API |
You.com (you_com) | Search | You.com web search API |
Soniox (soniox) | Audio Transcription | Async speech-to-text (stt-async-v4) |
New Models / Updated Models​
New Model Support (selected)​
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Anthropic | claude-fable-5 | 1,000,000 | $10.00 | $50.00 | Adaptive thinking, computer use, function calling, prompt caching, vision |
| Vertex AI | vertex_ai/claude-fable-5 | 1,000,000 | $10.00 | $50.00 | Same as Anthropic direct |
| Azure AI | azure_ai/claude-fable-5 | 1,000,000 | $10.00 | $50.00 | Same as Anthropic direct |
| Bedrock | anthropic.claude-fable-5 (+ global. / us. / eu. routes) | 1,000,000 | $10.00 | $50.00 | Same as Anthropic direct |
| Bedrock Mantle | bedrock_mantle/openai.gpt-5.5 | 272,000 | $5.50 | $33.00 | Responses API, reasoning, function calling, prompt caching |
| Bedrock Mantle | bedrock_mantle/openai.gpt-5.4 | 272,000 | $2.75 | $16.50 | Responses API, reasoning, function calling, prompt caching |
| Azure AI | azure_ai/kimi-k2.6 | 262,144 | $0.95 | $4.00 | Reasoning, vision, function calling, tool choice |
| MiniMax | minimax/MiniMax-M3 | 512,000 | $0.60 | $2.40 | Reasoning, prompt caching, function calling |
| Inception | inception/mercury-2 (+ mercury-edit-2) | 128,000 | $0.25 | $0.75 | Function calling, prompt caching, response schema |
Additional model-map additions: fal.ai Nano Banana and Gemini 2.5 Flash Image generation - PR #29798; mistral/ministral-8b-latest - PR #29453; a batch of new Snowflake Cortex model entries (Claude, GPT, Llama, embeddings); vertex_ai/google/gemma-4-26b-a4b-it-maas; APISerpent, You.com, and Soniox catalog entries; and a jp. regional route for Claude Opus 4.7.
Features​
- Anthropic
- Route future Claude models to the Anthropic provider via pattern matching - PR #29239
- Route Claude Opus 4.8 through adaptive thinking - PR #29702
- Emit a thinking block for
reasoning_content-only streaming chunks in the Anthropic adapter - PR #29600 - Inline legacy
$refdefs in tool schemas (Anthropic and Fireworks) - PR #28646
- Gemini
- Vertex AI
- Azure AI
- Strip tool-level extra fields on a 400 and retry - PR #29479
Bug Fixes​
- General
LLM API Endpoints​
Features​
- Batches
- Vector Stores
- Realtime
Agents (A2A)​
- watsonx Orchestrate agent provider - PR #29410
- LangFlow agent provider with A2A session bridging - PR #28963
- OAuth M2M for Databricks Apps A2A agents - PR #29586
- A2A bug fixes - PR #29566
Management Endpoints / UI​
Features​
- Virtual Keys & Auth
- Dashboard
- Generate dashboard API types from the proxy OpenAPI spec - PR #29816
- Centralize proxy base-URL resolution into a tested resolver - PR #29793
- Route networking calls through a shared, location-pinned
apiClient- PR #29723, PR #29806, PR #29815 - Migrate ESLint to flat config and bump
eslint-config-nextto 16 - PR #29626
Bug Fixes​
- Use the resolved DB
user_idfor spend on legacy email match (JWT) - PR #29217 - Preserve the 401 status for expired JWTs in OTel traces - PR #29510
- Stop team BYOK model-name corruption on model edit - PR #29731
- Drop a deleted team BYOK model name from
team.models- PR #29820 - Add
default=NonetoLiteLLM_TeamMembership.litellm_budget_table- PR #29684 - Require a new expiration when regenerating an expired key - PR #29838
- Render caller-supplied filter options in caller order (LIT-3151) - PR #29462
- Make A2A skill tags enterable and validated - PR #29512
- Persist the Tools-tab MCP OAuth token to the DB - PR #29809
- Route MCP playground auth by OAuth2 mode instead of
token_url- PR #29714 - Stop MCP playground tool calls from sending twice - PR #29821
AI Integrations​
Logging​
- Arize / Phoenix
- OpenInference rendering parity: tool calls, cost, passthrough I/O, session/user, multimodal, and cache tokens - PR #28800
- Datadog
- Split oversized batches on a 413 instead of re-queueing forever - PR #29444
- Galileo
- Use the ingest-traces API and the standard logging payload - PR #29651
- OpenTelemetry
- Allowlist
team_metadatasub-keys promoted to baggage - PR #29442 - Add MCP semantic conventions to OTel v2 - PR #29468
- Capture 401 error details in management-endpoint spans - PR #29535
- Emit the missing MCP span attributes - PR #29554
- Emit a guardrail span on passthrough, including when a guardrail blocks - PR #29552, PR #29470
- Allowlist
Guardrails​
- Sensitive Data Routing
- Route sensitive data to on-premise models - PR #29531
Spend Tracking, Budgets and Rate Limiting​
- Strip NUL bytes from spend-log payloads to prevent PostgreSQL
22P05errors - PR #29515 - Scope the session-token team-key budget exemption to a caller-supplied
team_id- PR #29641
MCP Gateway​
- Per-server environment variables with global and per-user scopes - PR #28917
- Per-MCP-server RPM rate limiting for keys and teams - PR #29482
- Support MCP OAuth passthrough and issuer-scoped JWT auth - PR #28356
- Persist
oauth2_flowon MCP server registration - PR #29690 - Clear
allowed_toolsand tool overrides on MCP server edit - PR #29411 - Gate
/public/mcp_hubstrictly onlitellm.public_mcp_servers- PR #27764
Performance / Loadbalancing / Reliability improvements​
- Native
/health/drainpreStop hook for graceful shutdown - PR #29439 - Disable proxy buffering on streaming SSE responses - PR #29557
- Populate
llm_provideron internal rate-limit errors - PR #27707 - Hot-reload
.envin dev when running with--reload- PR #29783 - Enable the Helm backend deployment to mount the gateway
config.yaml- PR #29605 - Convert the AWS and GCP Terraform stacks into reusable modules - PR #28103
- Terraform GCP: abandon the SQL user on destroy - PR #29855; prompt for
image_registryin the DeployStack one-click - PR #29852 - Dependency bumps - PR #29860
Documentation Updates​
- Clarify when to create new test files - PR #29472
- Remove fixed dimensions from the README hero image - PR #29496
- CLAUDE.md nits - PR #29504, PR #29749
PR roll-up by ownership area​
PRs by ownership area (visible, non-vehicle set; total: 101)
- UI / Dashboard: 22
- General Proxy (testing / CI / build): 22
- Models & Providers: 13
- Performance / Reliability: 10
- Logging: 9
- LLM API Endpoints: 8
- MCP: 6
- Auth & Management: 5
- Agents (A2A): 4
- Docs: 4
- Spend / Budgets / Rate Limits: 2
- Models & Providers (new providers): 3
- Guardrails: 1
New Contributors​
- @someswar177 made their first contribution in https://github.com/BerriAI/litellm/pull/26585
- @trexinc made their first contribution in https://github.com/BerriAI/litellm/pull/26597
- @navnitshukla made their first contribution in https://github.com/BerriAI/litellm/pull/26609
- @tanmay958 made their first contribution in https://github.com/BerriAI/litellm/pull/27580
- @samagana made their first contribution in https://github.com/BerriAI/litellm/pull/27810
- @DrishnaTrivedi made their first contribution in https://github.com/BerriAI/litellm/pull/28330
- @brainsparker made their first contribution in https://github.com/BerriAI/litellm/pull/28370
- @icep87 made their first contribution in https://github.com/BerriAI/litellm/pull/28846
- @adriangomez24 made their first contribution in https://github.com/BerriAI/litellm/pull/29097
- @zzw-math made their first contribution in https://github.com/BerriAI/litellm/pull/29325
- @BeginnerRudy made their first contribution in https://github.com/BerriAI/litellm/pull/29392
- @danisalvaa made their first contribution in https://github.com/BerriAI/litellm/pull/29394
- @kapelame made their first contribution in https://github.com/BerriAI/litellm/pull/29412
- @Zhao73 made their first contribution in https://github.com/BerriAI/litellm/pull/29419
- @suleimanelkhoury made their first contribution in https://github.com/BerriAI/litellm/pull/29420
- @aneeshsangvikar made their first contribution in https://github.com/BerriAI/litellm/pull/29427
- @Ar-maan05 made their first contribution in https://github.com/BerriAI/litellm/pull/29483
- @kingdoooo made their first contribution in https://github.com/BerriAI/litellm/pull/29490
- @dan2k3k4 made their first contribution in https://github.com/BerriAI/litellm/pull/29508
- @yanismiraoui made their first contribution in https://github.com/BerriAI/litellm/pull/29522
- @josx made their first contribution in https://github.com/BerriAI/litellm/pull/29532
- @1qh made their first contribution in https://github.com/BerriAI/litellm/pull/29561
- @tin-berri made their first contribution in https://github.com/BerriAI/litellm/pull/29605
- @mak2508 made their first contribution in https://github.com/BerriAI/litellm/pull/29606
- @VANDRANKI made their first contribution in https://github.com/BerriAI/litellm/pull/29620
- @andrey-dubnik made their first contribution in https://github.com/BerriAI/litellm/pull/29621
- @ErRickow made their first contribution in https://github.com/BerriAI/litellm/pull/29646
- @saswatds made their first contribution in https://github.com/BerriAI/litellm/pull/29650
- @Dinesh-Girbide made their first contribution in https://github.com/BerriAI/litellm/pull/29655
- @BWAAEEEK made their first contribution in https://github.com/BerriAI/litellm/pull/29660
- @hectorc98 made their first contribution in https://github.com/BerriAI/litellm/pull/29672
- @abhay23-AI made their first contribution in https://github.com/BerriAI/litellm/pull/29779
Full Changelog​
https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0
06/10/2026​
- New Models / Updated Models: 16
- LLM API Endpoints: 12
- Management Endpoints / UI: 22
- AI Integrations (Logging / Guardrails): 10
- Spend Tracking, Budgets and Rate Limiting: 2
- MCP Gateway: 6
- Performance / Loadbalancing / Reliability improvements: 10
- General Proxy Improvements (testing / CI / build): 22
- Documentation Updates: 4