v1.88.0rc3 - Claude Opus 4.8, MCP Access-Group Authorization & Typed OpenTelemetry
Deploy this version​
- Docker
- Pip
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:1.88.0-rc.3
pip install litellm==1.88.0rc3
Key Highlights​
v1.88.0rc3 is the current release candidate for 1.88.0.
- Claude Opus 4.8 is supported across Anthropic, Bedrock (including
global/us/eu/auregional routes), Azure AI, and Vertex, at 1M-token context with adaptive thinking andoutput_configgoal mode. - MCP access-group authorization was reworked end to end: key and team access groups now resolve to MCP servers, grants are additive with opt-in member assignment, and clients can route through stateful or stateless sessions by session id.
- Typed OpenTelemetry instrumentation lands a semconv-aligned span model that carries
team_metadata,http.route, and model names on inference spans. - Streaming is ~30% cheaper per chunk on the Anthropic and Bedrock hot path.
- Agent-to-agent (A2A) gains well-known agent-card discovery and a LangGraph Platform mode.
New Models / Updated Models​
New Model Support (Claude Opus 4.8 across 9 provider routes)​
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Anthropic | claude-opus-4-8 | 1,000,000 | $5.00 | $25.00 | Vision, function calling, prompt caching, reasoning (adaptive + max/xhigh effort), PDF input, computer use, response schema, tool choice, output_config |
| Vertex AI | vertex_ai/claude-opus-4-8 | 1,000,000 | $5.00 | $25.00 | Same as Anthropic direct |
| Azure AI | azure_ai/claude-opus-4-8 | 200,000 | $5.00 | $25.00 | Same as Anthropic direct |
| Bedrock | anthropic.claude-opus-4-8 (+ global. / us. / eu. / au. routes) | 1,000,000 | $5.00 | $25.00 | Same, plus native structured output |
Plus a reasoning-effort flag cleanup across existing Claude catalog entries: supports_minimal_reasoning_effort removed where unsupported, supports_max_reasoning_effort normalized, and a new bedrock_output_config_effort_ceiling (high / xhigh / max) field on Bedrock entries - PR #29238.
Features​
Bug Fixes​
- Anthropic
- Stop injecting unsupported
output_config.effort=xhighfor Claude Code on Sonnet/Opus 4.6 - PR #29304
- Stop injecting unsupported
- Vertex AI
- Strip
output_config.effortfor Vertex Claude models that reject it (Haiku 4.5) - PR #29585
- Strip
- Bedrock
- Align
toolUse/toolSpecnames and allow hyphens - PR #28874
- Align
- Azure
- Preserve AD token refresh in the v1 OpenAI client path - PR #28627
- OpenAI
- Fix the double provider-prefix bug on model names - PR #28661
- General
- Hydrate wildcard model-discovery credentials - PR #28284
LLM API Endpoints​
Features​
- Realtime API
- Tool calling for the Gemini and Vertex AI live API - PR #26590
- A2A
- Well-known agent-card discovery and LangGraph Platform mode - PR #28860
- Context Management
compact_20260112polyfill so non-Anthropic providers get context compaction - PR #28868
- Video
- Vertex Veo video edit, using DB credentials in the video handlers - PR #29098
- Pass-through
- Extend
passthrough_managed_object_idsto Azure - PR #29160
- Extend
Bugs​
- Realtime API
- Send TEXT frames and a valid guardrail
session.update- PR #28848
- Send TEXT frames and a valid guardrail
- Moderations
- Wire streaming flags through to the unified dispatcher - PR #27324
- Batches
- Vector Stores
- Restrict vector store index create/delete to proxy admins - PR #29202
- Video
- Resolve managed video model ids for auth - PR #29545
- Pass-through
- Bedrock Knowledge Base pass-through: preserve SigV4 headers and the signed request body - PR #27526
- Enforce
allowed_passthrough_routesforauth=truepass-through - PR #29256 - De-duplicate pass-through endpoint logs - PR #29598
- Match pass-through registry routes bare-to-bare when
SERVER_ROOT_PATHis set, fixing pass-through 404s - PR #29658
Management Endpoints / UI​
Features​
- Virtual Keys & Teams
- Expose
keys_counton/v2/team/listand wire the UI Resources badge - PR #28502 - Allow team members to create keys on org-scoped teams - PR #29310
- Exempt UI and CLI session tokens from team-key budget ceilings, hardened so custom
default_key_generate_paramscannot re-impose them - PR #29612, PR #29639 - Record ownership for service-account keys, plus a Prisma JSON serialization fix - PR #28990
- Expose
- Deployment
Bugs​
- Virtual Keys & Teams
- UI
- Allow clearing custom pricing on wildcard models - PR #28719
- Stop
vertex_ai-anthropic_modelsfrom leaking into the Anthropic dropdown - PR #28723 - Route API Reference back to the query-param page - PR #28726
- Show 2-decimal precision for
max_budgeton the key overview - PR #28809 - Break the logout redirect loop across dev and proxy origins - PR #29360
- Internal refactors: extract auth state into
AuthContext, remove dead App Router scaffolding - PR #28910, PR #28891
AI Integrations​
Logging​
- DataDog
- Drain the cost-management queue and add an opt-in FinOps tag allowlist - PR #28487
- Galileo
- Support the hosted v2 spans API and string output extraction - PR #28771
- OpenTelemetry
- General
Guardrails​
- General
Spend Tracking, Budgets and Rate Limiting​
- Cost Tracking — OpenAI regional-processing cost uplift for EU/US data residency - PR #28626
- Rate Limiting — Cap the no-
max_tokensTPM floor at the smallest configured limit (v3 limiter) - PR #28805 - Budgets — Enforce tag budgets for key-level tags - PR #29108
- Budgets — Enforce deployment budgets for dynamically added models - PR #29273
- Budgets —
reset_budgetwrites only{spend, budget_reset_at}and stops pre-zeroing the counter - PR #29358
MCP Gateway​
- Session Routing — Stateless and stateful clients via session-id routing - PR #26857
- Access Groups — Additive key access-group grants with opt-in member assignment - PR #29313
- Access Groups — Resolve team
access_group_idsto MCP servers - PR #28997 - Access Groups — Resolve key
access_group_idsto MCP servers (ungated) - PR #29195 - Access Groups — Extend the key access-group union to MCP servers - PR #28890
- Discovery — Allow
llm_api_routesvirtual keys to list MCP servers - PR #28442 - Server CRUD — Preserve
source_urlonGET /v1/mcp/serverlist responses - PR #29249 - Server CRUD — Preserve omitted fields on
PUT /v1/mcp/serverpartial updates - PR #29253 - Virtual Keys — Ignore stale ids on key save - PR #29128
Performance / Loadbalancing / Reliability improvements​
- Streaming hot path — ~30% lower per-chunk overhead on the Anthropic and Bedrock streaming path - PR #28720
- Docker — Use system Node in the componentized builders and retry
apk add- PR #28888 - Dependencies — Routine dependency bumps, including a Starlette bad-host fix - PR #29208, PR #29373
Documentation Updates​
- Hand-written
CLAUDE.md; removeAGENTS.mdand pointGEMINI.mdat it - PR #29252 - Agent guidance: require consent before writing new third-party names - PR #28908
- Cookbook: bump the Go directive to 1.26.3 in the gollem example - PR #29234
General Proxy Improvements​
Testing, CI & build hardening:
- UI e2e coverage across roles and flows — Team-BYOK add-model, Router fallback, MCP add-server, AI Hub make-public, Team Admin, Internal User / Viewer, logout and navbar identity - PR #29068, PR #29069, PR #29070, PR #29071, PR #29072, PR #29074, PR #29075, PR #29076, PR #29077, PR #29080, PR #29083, PR #28652
- Pass-through
SERVER_ROOT_PATHlogin-redirect trailing-slash e2e - PR #29369 - Behavior-pinning harnesses for
proxy_server.py- PR #28827, PR #29309 - Deterministic Redis cassette replay and live Google OAuth token minting for VCR - PR #28826, PR #29229
- Reasoning-effort grid test covering Claude Opus 4.8 across provider routes - PR #29327
- Bedrock CI account moves and restore - PR #28728, PR #29326, PR #29245
- Keep
litellm_internal_staginggreen - PR #29344 - Regenerate the admin-ui static export with
trailingSlash: true- PR #28112
PR roll-up by ownership area​
PRs by ownership area (total: 97)
- Other (CI / tests / build hardening): 23
- UI / Auth & Management: 18
- LLM API Endpoints: 15
- MCP: 9
- Models & Providers: 9
- Logging: 8
- Spend / Budgets / Rate Limits: 5
- Performance: 4
- Documentation: 3
- Guardrails: 3
Release candidate changelog (rc.1 → rc.2 → rc.3)​
Almost everything above shipped in rc.1. The later candidates are small, targeted patches cut by cherry-pick.
rc.2 added six fixes:
- Resolve managed video model ids for auth - PR #29545
- Allow team members to create keys on org-scoped teams - PR #29310
- Strip
output_config.effortfor Vertex Claude Haiku 4.5 - PR #29585 - De-duplicate pass-through endpoint logs - PR #29598
- Exempt UI/CLI session tokens from team-key budget ceilings - PR #29612
- Harden that exemption against custom
default_key_generate_params- PR #29639
rc.3 added one fix:
- Match pass-through registry routes bare-to-bare when
SERVER_ROOT_PATHis set, fixing pass-through 404s - PR #29658
New Contributors​
No new contributors this release; all 11 authors are returning contributors.
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.87.0-rc.1...v1.88.0-rc.3
06/04/2026 (v1.88.0rc3)​
- New Models / Updated Models: 9
- LLM API Endpoints: 15
- Management Endpoints / UI: 18
- AI Integrations (Logging / Guardrails): 11
- Spend Tracking, Budgets and Rate Limiting: 5
- MCP Gateway: 9
- Performance / Loadbalancing / Reliability improvements: 4
- General Proxy Improvements (testing / CI / build): 23
- Documentation Updates: 3
Total: 97 PRs