[Preview] v1.80.8.rc.1 - Introducing A2A Agent Gateway
Deploy this version​
- Docker
- Pip
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.80.8.rc.1
pip install litellm==1.80.8
Key Highlights​
- Agent Gateway (A2A) - Invoke agents through the AI Gateway with request/response logging and access controls
- Guardrails API v2 - Generic Guardrail API with streaming support, structured messages, and tool call checks
- Customer (End User) Usage UI - Track and visualize end-user spend directly in the dashboard
- vLLM Batch + Files API - Support for batch and files API with vLLM deployments
- Dynamic Rate Limiting on Teams - Enable dynamic rate limits and priority reservation on team-level
- Google Cloud Chirp3 HD - New text-to-speech provider with Chirp3 HD voices
Agent Gateway (A2A)​
This release introduces A2A Agent Gateway for LiteLLM, allowing you to invoke and manage A2A agents with the same controls you have for LLM APIs.
As a LiteLLM Gateway Admin, you can now do the following:
- Request/Response Logging - Every agent invocation is logged to the Logs page with full request and response tracking.
- Access Control - Control which Team/Key can access which agents.
As a developer, you can continue using the A2A SDK, all you need to do is point you A2AClient to the LiteLLM proxy URL and your API key.
Works with the A2A SDK:
from a2a.client import A2AClient
client = A2AClient(
base_url="http://localhost:4000", # Your LiteLLM proxy
api_key="sk-1234" # LiteLLM API key
)
response = client.send_message(
agent_id="my-agent",
message="What's the status of my order?"
)
Get started with Agent Gateway here: Agent Gateway Documentation
Customer (End User) Usage UI​
Users can now filter usage statistics by customers, providing the same granular filtering capabilities available for teams and organizations.
Details:
- Filter usage analytics, spend logs, and activity metrics by customer ID
- View customer-level breakdowns alongside existing team and user-level filters
- Consistent filtering experience across all usage and analytics views
New Providers and Endpoints​
New Providers (5 new providers)​
| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
| Z.AI (Zhipu AI) | /v1/chat/completions, /v1/responses, /v1/messages | Built-in support for Zhipu AI GLM models |
| RAGFlow | /v1/chat/completions, /v1/responses, /v1/messages, /v1/vector_stores | RAG-based chat completions with vector store support |
| PublicAI | /v1/chat/completions, /v1/responses, /v1/messages | OpenAI-compatible provider via JSON config |
| Google Cloud Chirp3 HD | /v1/audio/speech, /v1/audio/speech/stream | Text-to-speech with Google Cloud Chirp3 HD voices |
New LLM API Endpoints (2 new endpoints)​
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
/v1/agents/invoke | POST | Invoke A2A agents through the AI Gateway | Agent Gateway |
/cursor/chat/completions | POST | Cursor BYOK endpoint - accepts Responses API input, returns Chat Completions output | Cursor Integration |
New Models / Updated Models​
New Model Support (33 new models)​
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.1-codex-max | 400K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| Azure | azure/gpt-5.1-codex-max | 400K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| Anthropic | claude-opus-4-5 | 200K | $5.00 | $25.00 | Computer use, reasoning, vision |
| Bedrock | global.anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Computer use, reasoning, vision |
| Bedrock | amazon.nova-2-lite-v1:0 | 1M | $0.30 | $2.50 | Reasoning, vision, video, PDF input |
| Bedrock | amazon.titan-image-generator-v2:0 | - | - | $0.008/image | Image generation |
| Fireworks | fireworks_ai/deepseek-v3p2 | 164K | $1.20 | $1.20 | Function calling, response schema |
| Fireworks | fireworks_ai/kimi-k2-instruct-0905 | 262K | $0.60 | $2.50 | Function calling, response schema |
| DeepSeek | deepseek/deepseek-v3.2 | 164K | $0.28 | $0.40 | Reasoning, function calling |
| Mistral | mistral/mistral-large-3 | 256K | $0.50 | $1.50 | Function calling, vision |
| Azure AI | azure_ai/mistral-large-3 | 256K | $0.50 | $1.50 | Function calling, vision |
| Moonshot | moonshot/kimi-k2-0905-preview | 262K | $0.60 | $2.50 | Function calling, web search |
| Moonshot | moonshot/kimi-k2-turbo-preview | 262K | $1.15 | $8.00 | Function calling, web search |
| Moonshot | moonshot/kimi-k2-thinking-turbo | 262K | $1.15 | $8.00 | Function calling, web search |
| OpenRouter | openrouter/deepseek/deepseek-v3.2 | 164K | $0.28 | $0.40 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-haiku-4-5 | 200K | $1.00 | $5.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4 | 200K | $15.00 | $75.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4-1 | 200K | $15.00 | $75.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4-5 | 200K | $5.00 | $25.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-sonnet-4 | 200K | $3.00 | $15.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-sonnet-4-1 | 200K | $3.00 | $15.00 | Reasoning, function calling |
| Databricks | databricks/databricks-gemini-2-5-flash | 1M | $0.30 | $2.50 | Function calling |
| Databricks | databricks/databricks-gemini-2-5-pro | 1M | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5 | 400K | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-1 | 400K | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-mini | 400K | $0.25 | $2.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-nano | 400K | $0.05 | $0.40 | Function calling |
| Vertex AI | vertex_ai/chirp | - | $30.00/1M chars | - | Text-to-speech (Chirp3 HD) |
| Z.AI | zai/glm-4.6 | 200K | $0.60 | $2.20 | Function calling |
| Z.AI | zai/glm-4.5 | 128K | $0.60 | $2.20 | Function calling |
| Z.AI | zai/glm-4.5v | 128K | $0.60 | $1.80 | Function calling, vision |
| Z.AI | zai/glm-4.5-flash | 128K | Free | Free | Function calling |
| Vertex AI | vertex_ai/bge-large-en-v1.5 | - | - | - | BGE Embeddings |
Features​
-
- Allow reasoning_effort='none' for Azure gpt-5.1 models - PR #17311
-
- Add Nova embedding support - PR #17253
- Add support for Bedrock Qwen 2 imported model - PR #17461
- Bedrock OpenAI model support - PR #17368
- Add support for file content download for Bedrock batches - PR #17470
- Make streaming chunk size configurable in Bedrock API - PR #17357
- Add experimental latest-user filtering for Bedrock - PR #17282
- Handle Cohere v4 embed response dictionary format - PR #17220
- Remove not compatible beta header from Bedrock - PR #17301
- Add model price and details for Global Opus 4.5 Bedrock endpoint - PR #17380
-
Gemini (Google AI Studio + Vertex AI)
- Add better handling in image generation for Gemini models - PR #17292
- Fix reasoning_content showing duplicate content in streaming responses - PR #17266
- Handle partial JSON chunks after first valid chunk - PR #17496
- Fix Gemini 3 last chunk thinking block - PR #17403
- Fix Gemini image_tokens treated as text tokens in cost calculation - PR #17554
- Make sure that media resolution is only for Gemini 3 model - PR #17137
-
- Add Z.AI as built-in provider - PR #17307
-
- Update Databricks model pricing and add new models - PR #17277
-
- Add support of audio transcription for OVHcloud - PR #17305
-
- Add Mistral Large 3 model support - PR #17547
-
- Fix missing Moonshot turbo models and fix incorrect pricing - PR #17432
-
- Add context window exception mapping for Together AI - PR #17284
-
- Support Deepseek 3.2 with Reasoning - PR #17384
-
- Add Nova Lite 2 reasoning support with reasoningConfig - PR #17371
-
- Fix auth not working with ollama.com - PR #17191
-
- Fix supports_response_schema before using json_tool_call workaround - PR #17438
-
- Fix empty response + vLLM streaming - PR #17516
-
- Add support for TwelveLabs Pegasus video understanding - PR #17193
Bug Fixes​
-
- Fix extra_headers in messages API bedrock invoke - PR #17271
- Fix Bedrock models in model map - PR #17419
- Make Bedrock converse messages respect modify_params as expected - PR #17427
- Fix Anthropic beta headers for Bedrock imported Qwen models - PR #17467
- Preserve usage from JSON response for OpenAI provider in Bedrock - PR #17589
-
- Fix acompletion throws error with SambaNova models - PR #17217
-
General
LLM API Endpoints​
Features​
-
- Add passthrough cost tracking for Veo - PR #17296
-
- Add missing OCR and aOCR to CallTypes enum - PR #17435
-
General
- Support routing to only websearch supported deployments - PR #17500
Bugs​
- General
Management Endpoints / UI​
Features​
-
New Login Page
-
Customer (End User) Usage
-
Virtual Keys
-
Models + Endpoints
-
Callbacks
-
Management Routes
-
OCI Configuration
- Enable Oracle Cloud Infrastructure configuration via UI - PR #17159
Bugs​
-
UI Fixes
- Fix Request and Response Panel JSONViewer - PR #17233
- Adding Button Loading States to Edit Settings - PR #17236
- Fix Various Text, button state, and test changes - PR #17237
- Fix Fallbacks Immediately Deleting before API resolves - PR #17238
- Remove Feature Flags - PR #17240
- Fix metadata tags and model name display in UI for Azure passthrough - PR #17258
- Change labeling around Vertex Fields - PR #17383
- Remove second scrollbar when sidebar is expanded + tooltip z index - PR #17436
- Fix Select in Edit Membership Modal - PR #17524
- Change useAuthorized Hook to redirect to new Login Page - PR #17553
-
SSO
-
Auth / JWT
- JWT Auth - Allow using regular OIDC flow with user info endpoints - PR #17324
- Fix litellm user auth not passing issue - PR #17342
- Add other routes in JWT auth - PR #17345
- Fix new org team validate against org - PR #17333
- Fix litellm_enterprise ensure imported routes exist - PR #17337
- Use organization.members instead of deprecated organization field - PR #17557
-
Organizations/Teams
AI Integrations (2 new integrations)​
Logging (1 new integration)​
New Integration​
Improvements & Fixes​
-
- Fix Datadog callback regression when ddtrace is installed - PR #17393
-
- Fix clean arize-phoenix traces - PR #16611
-
- Fix MLflow streaming spans for Anthropic passthrough - PR #17288
-
- Fix Langfuse logger test mock setup - PR #17591
-
General
- Improve PII anonymization handling in logging callbacks - PR #17207
Guardrails (1 new integration)​
New Integration​
- Generic Guardrail API
- Generic Guardrail API - allows guardrail providers to add INSTANT support for LiteLLM w/out PR to repo - PR #17175
- Guardrails API V2 - user api key metadata, session id, specify input type (request/response), image support - PR #17338
- Guardrails API - add streaming support - PR #17400
- Guardrails API - support tool call checks on OpenAI
/chat/completions, OpenAI/responses, Anthropic/v1/messages- PR #17459 - Guardrails API - new
structured_messagesparam - PR #17518 - Correctly map a v1/messages call to the anthropic unified guardrail - PR #17424
- Support during_call event type for unified guardrails - PR #17514
Improvements & Fixes​
-
- Refactor Noma guardrail to use shared Responses transformation and include system instructions - PR #17315
-
- Fix AIM guardrail tests - PR #17499
-
- Fix Bedrock Guardrail indent and import - PR #17378
-
General Guardrails
Secret Managers​
-
- Allow setting SSL verify to false - PR #17433
-
General
- Make email and secret manager operations independent in key management hooks - PR #17551
Spend Tracking, Budgets and Rate Limiting​
-
Rate Limiting
-
Spend Logs
-
Enforce User Param
- Enforce support of enforce_user_param to OpenAI post endpoints - PR #17407
MCP Gateway​
-
MCP Configuration
-
MCP Tool Results
- Preserve tool metadata in CallToolResult - PR #17561
Agent Gateway (A2A)​
-
Agent Invocation
-
Agent Access Control
- Enforce Allowed agents by key, team + add agent access groups on backend - PR #17502
-
Agent Gateway UI
Performance / Loadbalancing / Reliability improvements​
-
Audio/Speech Performance
- Fix
/audio/speechperformance by usingshared_sessions- PR #16739
- Fix
-
Memory Optimization
-
Database
-
Proxy Caching
- Fix proxy caching between requests in aiohttp transport - PR #17122
-
Session Management
-
Vector Store
- Fix vector store configuration synchronization failure - PR #17525
Documentation Updates​
-
Provider Documentation
-
Guides
-
Projects
-
Cleanup
Infrastructure / CI/CD​
-
Helm Chart
- Add ingress-only labels - PR #17348
-
Docker
-
OpenAPI Schema
- Refactor add_schema_to_components to move definitions to components/schemas - PR #17389
-
Security
New Contributors​
- @weichiet made their first contribution in PR #17242
- @AndyForest made their first contribution in PR #17220
- @omkar806 made their first contribution in PR #17217
- @v0rtex20k made their first contribution in PR #17178
- @hxomer made their first contribution in PR #17207
- @orgersh92 made their first contribution in PR #17316
- @dannykopping made their first contribution in PR #17313
- @rioiart made their first contribution in PR #17333
- @codgician made their first contribution in PR #17278
- @epistoteles made their first contribution in PR #17277
- @kothamah made their first contribution in PR #17368
- @flozonn made their first contribution in PR #17371
- @richardmcsong made their first contribution in PR #17389
- @matt-greathouse made their first contribution in PR #17384
- @mossbanay made their first contribution in PR #17380
- @mhielpos-asapp made their first contribution in PR #17376
- @Joilence made their first contribution in PR #17367
- @deepaktammali made their first contribution in PR #17357
- @axiomofjoy made their first contribution in PR #16611
- @DevajMody made their first contribution in PR #17445
- @andrewtruong made their first contribution in PR #17439
- @AnasAbdelR made their first contribution in PR #17490
- @dominicfeliton made their first contribution in PR #17516
- @kristianmitk made their first contribution in PR #17504
- @rgshr made their first contribution in PR #17130
- @dominicfallows made their first contribution in PR #17489
- @irfansofyana made their first contribution in PR #17467
- @GusBricker made their first contribution in PR #17191
- @OlivverX made their first contribution in PR #17255
- @withsmilo made their first contribution in PR #17585

