[Preview] v1.80.10.rc.1 - Agent Gateway & A2A Cost Tracking
Deploy this version
- Docker

      docker run \
      -e STORE_MODEL_IN_DB=True \
      -p 4000:4000 \
      ghcr.io/berriai/litellm:v1.80.10.rc.1

- Pip

      pip install litellm==1.80.10
Key Highlights
- Agent (A2A) Gateway with Cost Tracking - Track agent costs with per-query or per-token pricing, and view agent usage in the dashboard
- 2 New Agent Providers - LangGraph Agents and Azure AI Foundry Agents for agentic workflows
- New Provider: SAP Gen AI Hub - Full support for SAP Generative AI Hub with chat completions
- New Bedrock Writer Models - Add Palmyra-X4 and Palmyra-X5 models on Bedrock
- OpenAI GPT-5.2 Models - Full support for GPT-5.2, GPT-5.2-pro, and Azure GPT-5.2 models with reasoning support
- 227 New Fireworks AI Models - Comprehensive model coverage for Fireworks AI platform
- MCP Support on /chat/completions - Use MCP servers directly via chat completions endpoint
- Performance Improvements - Reduced memory leaks by 50%
Agent (A2A) Usage UI
Users can now filter usage statistics by agents, providing the same granular filtering capabilities available for teams, organizations, and customers.
Details:
- Filter usage analytics, spend logs, and activity metrics by agent ID
- View breakdowns on a per-agent basis
- Consistent filtering experience across all usage and analytics views
New Providers and Endpoints
New Providers (5 new providers)
| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
| SAP Gen AI Hub | /chat/completions, /messages, /responses | SAP Generative AI Hub integration for enterprise AI |
| LangGraph | /chat/completions, /messages, /responses, /a2a | LangGraph agents for agentic workflows |
| Azure AI Foundry Agents | /chat/completions, /messages, /responses, /a2a | Azure AI Foundry Agents for enterprise agent deployments |
| Voyage AI Rerank | /rerank | Voyage AI rerank models support |
| Fireworks AI Rerank | /rerank | Fireworks AI rerank endpoint support |
New LLM API Endpoints (4 new endpoints)
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /containers/{id}/files | GET | List files in a container | Docs |
| /containers/{id}/files/{file_id} | GET | Retrieve container file metadata | Docs |
| /containers/{id}/files/{file_id} | DELETE | Delete a file from a container | Docs |
| /containers/{id}/files/{file_id}/content | GET | Retrieve container file content | Docs |
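As a rough illustration of how the four container file endpoints above map onto HTTP requests, the sketch below builds (without sending) a request for each route. The base URL assumes the proxy is listening on port 4000 as in the Docker command in this release. The `sk-1234` key, the `Authorization: Bearer` header, the example IDs, and the helper name `container_files_request` are illustrative assumptions, and a deployment may expose these routes under a prefix such as `/v1`.

```python
from urllib.request import Request

BASE_URL = "http://localhost:4000"  # assumption: proxy started with -p 4000:4000
API_KEY = "sk-1234"                 # hypothetical placeholder key


def container_files_request(container_id, file_id=None, content=False, method="GET"):
    """Build (but do not send) a request for one of the container file routes."""
    path = f"/containers/{container_id}/files"
    if file_id is not None:
        path += f"/{file_id}"
        if content:
            path += "/content"
    return Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {API_KEY}"},
    )


# Hypothetical IDs, one request per row of the table.
requests_to_send = [
    container_files_request("cntr_123"),                            # list files
    container_files_request("cntr_123", "file_456"),                # file metadata
    container_files_request("cntr_123", "file_456", method="DELETE"),
    container_files_request("cntr_123", "file_456", content=True),  # file content
]
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be read without a running proxy.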
New Models / Updated Models
New Model Support (270+ new models)
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.2 | 400K | $1.75 | $14.00 | Reasoning, vision, PDF, caching |
| OpenAI | gpt-5.2-pro | 400K | $21.00 | $168.00 | Reasoning, web search, vision |
| Azure | azure/gpt-5.2 | 400K | $1.75 | $14.00 | Reasoning, vision, PDF, caching |
| Azure | azure/gpt-5.2-pro | 400K | $21.00 | $168.00 | Reasoning, web search |
| Bedrock | us.writer.palmyra-x4-v1:0 | 128K | $2.50 | $10.00 | Function calling, PDF input |
| Bedrock | us.writer.palmyra-x5-v1:0 | 1M | $0.60 | $6.00 | Function calling, PDF input |
| Bedrock | eu.anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Reasoning, computer use, vision |
| Bedrock | google.gemma-3-12b-it | 128K | $0.10 | $0.30 | Audio input |
| Bedrock | moonshot.kimi-k2-thinking | 128K | $0.60 | $2.50 | Reasoning |
| Bedrock | nvidia.nemotron-nano-12b-v2 | 128K | $0.20 | $0.60 | Vision |
| Bedrock | qwen.qwen3-next-80b-a3b | 128K | $0.15 | $1.20 | Function calling |
| Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.2-maas | 164K | $0.56 | $1.68 | Reasoning, caching |
| Mistral | mistral/codestral-2508 | 256K | $0.30 | $0.90 | Function calling |
| Mistral | mistral/devstral-2512 | 256K | $0.40 | $2.00 | Function calling |
| Mistral | mistral/labs-devstral-small-2512 | 256K | $0.10 | $0.30 | Function calling |
| Cerebras | cerebras/zai-glm-4.6 | 128K | - | - | Chat completions |
| NVIDIA NIM | nvidia_nim/ranking/nvidia/llama-3.2-nv-rerankqa-1b-v2 | - | Free | Free | Rerank |
| Voyage | voyage/rerank-2.5 | 32K | $0.05/1K tokens | - | Rerank |
| Fireworks AI | 227 new models | Various | Various | Various | Full model catalog |
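To make the per-1M-token prices in the table concrete, here is a minimal sketch of the arithmetic for a single request. The prices are copied from three rows above; the function name `estimate_cost` and the token counts are illustrative, and the calculation deliberately ignores caching discounts, service tiers, and any other pricing rules that may apply.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "gpt-5.2": (1.75, 14.00),
    "gpt-5.2-pro": (21.00, 168.00),
    "us.writer.palmyra-x5-v1:0": (0.60, 6.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request at list price."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 10,000 input tokens and 2,000 output tokens on gpt-5.2:
# (10,000 * 1.75 + 2,000 * 14.00) / 1,000,000 = 0.0455
print(f"${estimate_cost('gpt-5.2', 10_000, 2_000):.4f}")  # prints $0.0455
```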
Features
- OpenAI
- Azure
  - Add Azure GPT-5.2 models support - PR #17866
- Azure AI
- Anthropic
  - Prevent duplicate tool_result blocks with same tool - PR #17632
  - Handle partial JSON chunks in streaming responses - PR #17493
  - Preserve server_tool_use and web_search_tool_result in multi-turn conversations - PR #17746
  - Capture web_search_tool_result in streaming for multi-turn conversations - PR #17798
  - Add retrieve batches and retrieve file content support - PR #17700
- Bedrock
- Gemini
- Vertex AI
- Mistral
  - Add Codestral 2508, Devstral 2512 models - PR #17801
- Cerebras
- DeepSeek
  - Add native support for thinking and reasoning_effort params - PR #17712
- NVIDIA NIM Rerank
  - Add llama-3.2-nv-rerankqa-1b-v2 rerank model - PR #17670
- Fireworks AI
  - Add 227 new Fireworks AI models - PR #17692
- Dashscope
  - Fix default base_url error - PR #17584
Bug Fixes
- Anthropic
- Azure
  - Fix error about encoding video id for Azure - PR #17708
- Azure AI
  - Fix LLM provider for azure_ai in model map - PR #17805
- Watsonx
  - Fix Watsonx Audio Transcription to only send supported params to API - PR #17840
- Router
LLM API Endpoints
Features
- Responses API
  - Add usage details in responses usage object - PR #17641
  - Fix error for Responses API polling - PR #17654
  - Fix streaming tool_calls being dropped when a response contains both text and tool_calls - PR #17652
  - Transform image content in tool results for Responses API - PR #17799
  - Fix Responses API not applying TPM rate limits on API keys - PR #17707
- Containers API
- Rerank API
  - Add support for forwarding client headers in /rerank endpoint - PR #17873
- Files API
  - Add support for expires_after param in Files endpoint - PR #17860
- Video API
- Embeddings API
  - Fix decoding of token array inputs for embeddings - PR #17468
- Chat Completions API
  - Add v0 target storage support - store files in Azure AI storage and use with chat completions API - PR #17758
- generateContent API
  - Support model names with slashes on Gemini generateContent endpoints - PR #17743
- General
Bugs
- General
  - Fix handling of string content in is_cached_message - PR #17853
Management Endpoints / UI
Features
- UI Settings
- Agent & Usage UI
  - Daily Agent Usage Backend - PR #17781
  - Agent Usage UI - PR #17797
  - Add agent cost tracking on UI - PR #17899
  - New Badge for Agent Usage - PR #17883
  - Usage Entity labels for filtering - PR #17896
  - Agent Usage Page minor fixes - PR #17901
  - Usage Page View Select component - PR #17854
  - Usage Page Components refactor - PR #17848
- Logs & Spend
- Virtual Keys
  - Fix x-litellm-key-spend header update - PR #17864
- Models & Endpoints
- SSO & Auth
- Teams
- MCP Server Management
  - Add extra_headers and allowed_tools to UpdateMCPServerRequest - PR #17940
- Notifications
  - Show progress and pause on hover for Notifications - PR #17942
- General
Bugs
- UI Fixes
  - Fix links + old login page deprecation message - PR #17624
  - Filtering for Chat UI Endpoint Selector - PR #17567
  - Race Condition Handling in SCIM v2 - PR #17513
  - Make /litellm_model_cost_map public - PR #16795
  - Custom Callback on UI - PR #17522
  - Add User Writable Directory to Non Root Docker for Logo - PR #17180
  - Swap URL Input and Display Name inputs - PR #17682
  - Change deprecation banner to only show on /sso/key/generate - PR #17681
  - Change credential encryption to only affect db credentials - PR #17741
- Auth & Routes
AI Integrations
New Integrations (4 new integrations)
| Integration | Type | Description |
|---|---|---|
| SumoLogic | Logging | Native webhook integration for SumoLogic - PR #17630 |
| Arize Phoenix | Prompt Management | Arize Phoenix OSS prompt management integration - PR #17750 |
| Sendgrid | | Sendgrid email notifications integration - PR #17775 |
| Onyx | Guardrails | Onyx guardrail hooks integration - PR #16591 |
Logging
- Langfuse
- Prometheus
  - Add 'exception_status' to prometheus logger - PR #17847
- OpenTelemetry
  - Add latency metrics (TTFT, TPOT, Total Generation Time) to OTEL payload - PR #17888
- General
  - Add polling via cache feature for async logging - PR #16862
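For readers unfamiliar with the latency metrics named in the OpenTelemetry item, the sketch below shows one common way they are derived from request timestamps: TTFT as time to first token and TPOT as average time per output token after the first. These definitions, the function name `latency_metrics`, and the choice to measure total generation time from request start are assumptions for illustration; the release notes do not state the exact formulas used in the OTEL payload.

```python
def latency_metrics(start, first_token, end, output_tokens):
    """Derive latency metrics (in seconds) from three timestamps.

    start:        when the request was sent
    first_token:  when the first output token arrived
    end:          when the last output token arrived
    """
    ttft = first_token - start
    total = end - start
    # Average gap between tokens after the first one; undefined for <= 1 token.
    tpot = (end - first_token) / (output_tokens - 1) if output_tokens > 1 else 0.0
    return {"ttft": ttft, "tpot": tpot, "total_generation_time": total}


# Hypothetical request: first token after 0.5 s, 101 tokens finished at 2.5 s.
metrics = latency_metrics(start=0.0, first_token=0.5, end=2.5, output_tokens=101)
print(metrics)
```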
Guardrails
- HiddenLayer
  - Add HiddenLayer Guardrail Hooks - PR #17728
- Pillar Security
  - Add opt-in evidence results for Pillar Security guardrail during monitoring - PR #17812
- PANW Prisma AIRS
  - Add configurable fail-open, timeout, and app_user tracking - PR #17785
- Presidio
  - Add support for configurable confidence score thresholds and scope in Presidio PII masking - PR #17817
- LiteLLM Content Filter
  - Mask all regex pattern matches, not just first - PR #17727
- Regex Guardrails
  - Add enhanced regex pattern matching for guardrails - PR #17915
- Gray Swan Guardrail
  - Add passthrough mode for model response - PR #17102
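The LiteLLM Content Filter item describes the difference between masking only the first regex match and masking every match. The sketch below reproduces that difference with Python's standard `re` module; it is not LiteLLM's implementation, and the SSN-like pattern, the `[REDACTED]` mask, and the function names are hypothetical.

```python
import re

# Hypothetical pattern for illustration: digits shaped like 123-45-6789.
PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_first(text, mask="[REDACTED]"):
    """Mask only the first match (the behavior the fix moves away from)."""
    return PATTERN.sub(mask, text, count=1)


def mask_all(text, mask="[REDACTED]"):
    """Mask every match in the text."""
    return PATTERN.sub(mask, text)


text = "a 123-45-6789 and b 987-65-4321"
print(mask_first(text))  # a [REDACTED] and b 987-65-4321
print(mask_all(text))    # a [REDACTED] and b [REDACTED]
```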
Prompt Management
- General
  - New API for integrating prompt management providers - PR #17829
Spend Tracking, Budgets and Rate Limiting
- Service Tier Pricing - Extract service_tier from response/usage for OpenAI flex pricing - PR #17748
- Agent Cost Tracking - Track agent_id in SpendLogs - PR #17795
- Tag Activity - Deduplicate /tag/daily/activity metadata - PR #16764
- Rate Limiting - Dynamic Rate Limiter - allow specifying ttl for in memory cache - PR #17679
MCP Gateway
- Chat Completions Integration - Add support for using MCPs on /chat/completions - PR #17747
- UI Session Permissions - Fix UI session MCP permissions across real teams - PR #17620
- OAuth Callback - Fix MCP OAuth callback routing and URL handling - PR #17789
- Tool Name Prefix - Fix MCP tool name prefix - PR #17908
Agent Gateway (A2A)
- Cost Per Query - Add cost per query for agent invocations - PR #17774
- Token Counting - Add token counting for non-streaming and streaming responses - PR #17779
- Cost Per Token - Add cost per token pricing for A2A - PR #17780
- LangGraph Provider - Add LangGraph provider for Agent Gateway - PR #17783
- Bedrock & LangGraph Agents - Allow using Bedrock AgentCore, LangGraph agents with A2A Gateway - PR #17786
- Agent Management - Allow adding LangGraph, Bedrock Agent Core agents - PR #17802
- Azure Foundry Agents - Add Azure AI Foundry Agents support - PR #17845
- Azure Foundry UI - Allow adding Azure Foundry Agents on UI - PR #17909
- Azure Foundry Fixes - Ensure Azure Foundry agents work correctly - PR #17943
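The two agent pricing modes listed above (cost per query and cost per token) can be pictured with a small sketch. Everything here is hypothetical: the function name, the parameter names, and the rule that a flat per-query price takes precedence over per-token prices are assumptions for illustration, not LiteLLM's actual configuration keys or logic.

```python
def agent_invocation_cost(
    input_tokens=0,
    output_tokens=0,
    cost_per_query=None,         # hypothetical: flat USD price per invocation
    input_cost_per_token=None,   # hypothetical: USD per input token
    output_cost_per_token=None,  # hypothetical: USD per output token
):
    """Return the USD cost of one agent invocation under either pricing mode."""
    if cost_per_query is not None:
        # Assumed precedence: a flat per-query price ignores token counts.
        return cost_per_query
    return (
        input_tokens * (input_cost_per_token or 0.0)
        + output_tokens * (output_cost_per_token or 0.0)
    )


# Flat pricing: every invocation costs the same.
print(agent_invocation_cost(cost_per_query=0.05))

# Token pricing: cost scales with the counted tokens.
print(agent_invocation_cost(
    input_tokens=1_000,
    output_tokens=500,
    input_cost_per_token=1e-6,
    output_cost_per_token=2e-6,
))
```

Per-token pricing depends on the token counting added for streaming and non-streaming responses, since the cost cannot be computed without those counts.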
Performance / Loadbalancing / Reliability improvements
- Memory Leak Fix - Cut memory leak in half - PR #17784
- Spend Logs Memory - Reduce memory accumulation of spend_logs - PR #17742
- Router Optimization - Replace time.perf_counter() with time.time() - PR #17881
- Filter Internal Params - Filter internal params in fallback code - PR #17941
- Gunicorn Suggestion - Suggest Gunicorn instead of uvicorn when using max_requests_before_restart - PR #17788
- Pydantic Warnings - Mitigate PydanticDeprecatedSince20 warnings - PR #17657
- Python 3.14 Support - Add Python 3.14 support via grpcio version constraints - PR #17666
- OpenAI Package - Bump openai package to 2.9.0 - PR #17818
Documentation Updates
- Contributing - Update clone instructions to recommend forking first - PR #17637
- Getting Started - Improve Getting Started page and SDK documentation structure - PR #17614
- JSON Mode - Make it clearer how to get Pydantic model output - PR #17671
- drop_params - Update litellm docs for drop_params - PR #17658
- Environment Variables - Document missing environment variables and fix incorrect types - PR #17649
- SumoLogic - Add SumoLogic integration documentation - PR #17647
- SAP Gen AI - Add SAP Gen AI provider documentation - PR #17667
- Authentication - Add Note for Authentication - PR #17733
- Known Issues - Add known issues to 1.80.5-stable docs - PR #17738
- Supported Endpoints - Fix Supported Endpoints page - PR #17710
- Token Count - Document token count endpoint - PR #17772
- Overview - Clarify the difference between the LiteLLM proxy and SDK in the overview with a table - PR #17790
- Containers API - Add docs for containers files API + code interpreter on LiteLLM - PR #17749
- Target Storage - Add documentation for target storage - PR #17882
- Agent Usage - Agent Usage documentation - PR #17931, PR #17932, PR #17934
- Cursor Integration - Cursor Integration documentation - PR #17855, PR #17939
- A2A Cost Tracking - A2A cost tracking docs - PR #17913
- Azure Search - Update azure search docs - PR #17726
- Milvus Client - Fix milvus client docs - PR #17736
- Streaming Logging - Remove streaming logging doc - PR #17739
- Integration Docs - Update integration docs location - PR #17644
- Links - Updated docs links for mistral and anthropic - PR #17852
- Community - Add community doc link - PR #17734
- Pricing - Update pricing for global.anthropic.claude-haiku-4-5-20251001-v1:0 - PR #17703
- gpt-image-1-mini - Correct model type for gpt-image-1-mini - PR #17635
Infrastructure / Deployment
- Docker - Use python instead of wget for healthcheck in docker-compose.yml - PR #17646
- Helm Chart - Add extraResources support for Helm chart deployments - PR #17627
- Helm Versioning - Add semver prerelease suffix to helm chart versions - PR #17678
- Database Schema - Add storage_backend and storage_url columns to schema.prisma for target storage feature - PR #17936
New Contributors
- @xianzongxie-stripe made their first contribution in PR #16862
- @krisxia0506 made their first contribution in PR #17637
- @chetanchoudhary-sumo made their first contribution in PR #17630
- @kevinmarx made their first contribution in PR #17632
- @expruc made their first contribution in PR #17627
- @rcII made their first contribution in PR #17626
- @tamirkiviti13 made their first contribution in PR #16591
- @Eric84626 made their first contribution in PR #17629
- @vasilisazayka made their first contribution in PR #16053
- @juliettech13 made their first contribution in PR #17663
- @jason-nance made their first contribution in PR #17660
- @yisding made their first contribution in PR #17671
- @emilsvennesson made their first contribution in PR #17656
- @kumekay made their first contribution in PR #17646
- @chenzhaofei01 made their first contribution in PR #17584
- @shivamrawat1 made their first contribution in PR #17733
- @ephrimstanley made their first contribution in PR #17723
- @hwittenborn made their first contribution in PR #17743
- @peterkc made their first contribution in PR #17727
- @saisurya237 made their first contribution in PR #17725
- @Ashton-Sidhu made their first contribution in PR #17728
- @CyrusTC made their first contribution in PR #17810
- @jichmi made their first contribution in PR #17703
- @ryan-crabbe made their first contribution in PR #17852
- @nlineback made their first contribution in PR #17851
- @butnarurazvan made their first contribution in PR #17468
- @yoshi-p27 made their first contribution in PR #17915

