Skip to main content

[Preview] v1.80.8.rc.1 - Introducing A2A Agent Gateway

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version​

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.80.8.rc.1

Key Highlights​


Agent Gateway (A2A)​


This release introduces A2A Agent Gateway for LiteLLM, allowing you to invoke and manage A2A agents with the same controls you have for LLM APIs.

As a LiteLLM Gateway Admin, you can now do the following:

  • Request/Response Logging - Every agent invocation is logged to the Logs page with full request and response tracking.
  • Access Control - Control which Team/Key can access which agents.

As a developer, you can continue using the A2A SDK, all you need to do is point you A2AClient to the LiteLLM proxy URL and your API key.

Works with the A2A SDK:

from a2a.client import A2AClient

client = A2AClient(
base_url="http://localhost:4000", # Your LiteLLM proxy
api_key="sk-1234" # LiteLLM API key
)

response = client.send_message(
agent_id="my-agent",
message="What's the status of my order?"
)

Get started with Agent Gateway here: Agent Gateway Documentation


Customer (End User) Usage UI​

Users can now filter usage statistics by customers, providing the same granular filtering capabilities available for teams and organizations.

Details:

  • Filter usage analytics, spend logs, and activity metrics by customer ID
  • View customer-level breakdowns alongside existing team and user-level filters
  • Consistent filtering experience across all usage and analytics views

New Providers and Endpoints​

New Providers (5 new providers)​

ProviderSupported LiteLLM EndpointsDescription
Z.AI (Zhipu AI)/v1/chat/completions, /v1/responses, /v1/messagesBuilt-in support for Zhipu AI GLM models
RAGFlow/v1/chat/completions, /v1/responses, /v1/messages, /v1/vector_storesRAG-based chat completions with vector store support
PublicAI/v1/chat/completions, /v1/responses, /v1/messagesOpenAI-compatible provider via JSON config
Google Cloud Chirp3 HD/v1/audio/speech, /v1/audio/speech/streamText-to-speech with Google Cloud Chirp3 HD voices

New LLM API Endpoints (2 new endpoints)​

EndpointMethodDescriptionDocumentation
/v1/agents/invokePOSTInvoke A2A agents through the AI GatewayAgent Gateway
/cursor/chat/completionsPOSTCursor BYOK endpoint - accepts Responses API input, returns Chat Completions outputCursor Integration

New Models / Updated Models​

New Model Support (33 new models)​

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)Features
OpenAIgpt-5.1-codex-max400K$1.25$10.00Reasoning, vision, PDF input, responses API
Azureazure/gpt-5.1-codex-max400K$1.25$10.00Reasoning, vision, PDF input, responses API
Anthropicclaude-opus-4-5200K$5.00$25.00Computer use, reasoning, vision
Bedrockglobal.anthropic.claude-opus-4-5-20251101-v1:0200K$5.00$25.00Computer use, reasoning, vision
Bedrockamazon.nova-2-lite-v1:01M$0.30$2.50Reasoning, vision, video, PDF input
Bedrockamazon.titan-image-generator-v2:0--$0.008/imageImage generation
Fireworksfireworks_ai/deepseek-v3p2164K$1.20$1.20Function calling, response schema
Fireworksfireworks_ai/kimi-k2-instruct-0905262K$0.60$2.50Function calling, response schema
DeepSeekdeepseek/deepseek-v3.2164K$0.28$0.40Reasoning, function calling
Mistralmistral/mistral-large-3256K$0.50$1.50Function calling, vision
Azure AIazure_ai/mistral-large-3256K$0.50$1.50Function calling, vision
Moonshotmoonshot/kimi-k2-0905-preview262K$0.60$2.50Function calling, web search
Moonshotmoonshot/kimi-k2-turbo-preview262K$1.15$8.00Function calling, web search
Moonshotmoonshot/kimi-k2-thinking-turbo262K$1.15$8.00Function calling, web search
OpenRouteropenrouter/deepseek/deepseek-v3.2164K$0.28$0.40Reasoning, function calling
Databricksdatabricks/databricks-claude-haiku-4-5200K$1.00$5.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4200K$15.00$75.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4-1200K$15.00$75.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4-5200K$5.00$25.00Reasoning, function calling
Databricksdatabricks/databricks-claude-sonnet-4200K$3.00$15.00Reasoning, function calling
Databricksdatabricks/databricks-claude-sonnet-4-1200K$3.00$15.00Reasoning, function calling
Databricksdatabricks/databricks-gemini-2-5-flash1M$0.30$2.50Function calling
Databricksdatabricks/databricks-gemini-2-5-pro1M$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5400K$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5-1400K$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5-mini400K$0.25$2.00Function calling
Databricksdatabricks/databricks-gpt-5-nano400K$0.05$0.40Function calling
Vertex AIvertex_ai/chirp-$30.00/1M chars-Text-to-speech (Chirp3 HD)
Z.AIzai/glm-4.6200K$0.60$2.20Function calling
Z.AIzai/glm-4.5128K$0.60$2.20Function calling
Z.AIzai/glm-4.5v128K$0.60$1.80Function calling, vision
Z.AIzai/glm-4.5-flash128KFreeFreeFunction calling
Vertex AIvertex_ai/bge-large-en-v1.5---BGE Embeddings

Features​

Bug Fixes​

  • Bedrock

    • Fix extra_headers in messages API bedrock invoke - PR #17271
    • Fix Bedrock models in model map - PR #17419
    • Make Bedrock converse messages respect modify_params as expected - PR #17427
    • Fix Anthropic beta headers for Bedrock imported Qwen models - PR #17467
    • Preserve usage from JSON response for OpenAI provider in Bedrock - PR #17589
  • SambaNova

    • Fix acompletion throws error with SambaNova models - PR #17217
  • General

    • Fix AttributeError when metadata is null in request body - PR #17306
    • Fix 500 error for malformed request - PR #17291
    • Respect custom LLM provider in header - PR #17290
    • Replace deprecated .dict() with .model_dump() in streaming_handler - PR #17359

LLM API Endpoints​

Features​

Bugs​

  • General
    • Fix streaming error validation - PR #17242
    • Add length validation for empty tool_calls in delta - PR #17523

Management Endpoints / UI​

Features​

  • New Login Page

  • Customer (End User) Usage

  • Virtual Keys

    • Standardize API Key vs Virtual Key in UI - PR #17325
    • Add User Alias Column to Internal User Table - PR #17321
    • Delete Credential Enhancements - PR #17317
  • Models + Endpoints

    • Show all credential values on Edit Credential Modal - PR #17397
    • Change Edit Team Models Shown to Match Create Team - PR #17394
    • Support Images in Compare UI - PR #17562
  • Callbacks

  • Management Routes

    • Allow admin viewer to access global tag usage - PR #17501
    • Allow wildcard routes for nonproxy admin (SCIM) - PR #17178
    • Return 404 when a user is not found on /user/info - PR #16850
  • OCI Configuration

    • Enable Oracle Cloud Infrastructure configuration via UI - PR #17159

Bugs​

  • UI Fixes

    • Fix Request and Response Panel JSONViewer - PR #17233
    • Adding Button Loading States to Edit Settings - PR #17236
    • Fix Various Text, button state, and test changes - PR #17237
    • Fix Fallbacks Immediately Deleting before API resolves - PR #17238
    • Remove Feature Flags - PR #17240
    • Fix metadata tags and model name display in UI for Azure passthrough - PR #17258
    • Change labeling around Vertex Fields - PR #17383
    • Remove second scrollbar when sidebar is expanded + tooltip z index - PR #17436
    • Fix Select in Edit Membership Modal - PR #17524
    • Change useAuthorized Hook to redirect to new Login Page - PR #17553
  • SSO

    • Fix the generic SSO provider - PR #17227
    • Clear SSO integration for all users - PR #17287
    • Fix SSO users not added to Entra synced team - PR #17331
  • Auth / JWT

    • JWT Auth - Allow using regular OIDC flow with user info endpoints - PR #17324
    • Fix litellm user auth not passing issue - PR #17342
    • Add other routes in JWT auth - PR #17345
    • Fix new org team validate against org - PR #17333
    • Fix litellm_enterprise ensure imported routes exist - PR #17337
    • Use organization.members instead of deprecated organization field - PR #17557
  • Organizations/Teams

    • Fix organization max budget not enforced - PR #17334
    • Fix budget update to allow null max_budget - PR #17545

AI Integrations (2 new integrations)​

Logging (1 new integration)​

New Integration​

Improvements & Fixes​

Guardrails (1 new integration)​

New Integration​

  • Generic Guardrail API
    • Generic Guardrail API - allows guardrail providers to add INSTANT support for LiteLLM w/out PR to repo - PR #17175
    • Guardrails API V2 - user api key metadata, session id, specify input type (request/response), image support - PR #17338
    • Guardrails API - add streaming support - PR #17400
    • Guardrails API - support tool call checks on OpenAI /chat/completions, OpenAI /responses, Anthropic /v1/messages - PR #17459
    • Guardrails API - new structured_messages param - PR #17518
    • Correctly map a v1/messages call to the anthropic unified guardrail - PR #17424
    • Support during_call event type for unified guardrails - PR #17514

Improvements & Fixes​

Secret Managers​

  • CyberArk

    • Allow setting SSL verify to false - PR #17433
  • General

    • Make email and secret manager operations independent in key management hooks - PR #17551

Spend Tracking, Budgets and Rate Limiting​

  • Rate Limiting

    • Parallel Request Limiter with /messages - PR #17426
    • Allow using dynamic rate limit/priority reservation on teams - PR #17061
    • Dynamic Rate Limiter - Fix token count increases/decreases by 1 instead of actual count + Redis TTL - PR #17558
  • Spend Logs

    • Deprecate spend/logs & add spend/logs/v2 - PR #17167
    • Optimize SpendLogs queries to use timestamp filtering for index usage - PR #17504
  • Enforce User Param

    • Enforce support of enforce_user_param to OpenAI post endpoints - PR #17407

MCP Gateway​

  • MCP Configuration

    • Remove URL format validation for MCP server endpoints - PR #17270
    • Add stack trace to MCP error message - PR #17269
  • MCP Tool Results

    • Preserve tool metadata in CallToolResult - PR #17561

Agent Gateway (A2A)​

  • Agent Invocation

    • Allow invoking agents through AI Gateway - PR #17440
    • Allow tracking request/response in "Logs" Page - PR #17449
  • Agent Access Control

    • Enforce Allowed agents by key, team + add agent access groups on backend - PR #17502
  • Agent Gateway UI


Performance / Loadbalancing / Reliability improvements​

  • Audio/Speech Performance

    • Fix /audio/speech performance by using shared_sessions - PR #16739
  • Memory Optimization

    • Prevent memory leak in aiohttp connection pooling - PR #17388
    • Lazy-load utils to reduce memory + import time - PR #17171
  • Database

    • Update default database connection number - PR #17353
    • Update default proxy_batch_write_at number - PR #17355
    • Add background health checks to db - PR #17528
  • Proxy Caching

    • Fix proxy caching between requests in aiohttp transport - PR #17122
  • Session Management

    • Fix session consistency, move Lasso API version away from source code - PR #17316
    • Conditionally pass enable_cleanup_closed to aiohttp TCPConnector - PR #17367
  • Vector Store

    • Fix vector store configuration synchronization failure - PR #17525

Documentation Updates​

  • Provider Documentation

    • Add Azure AI Foundry documentation for Claude models - PR #17104
    • Document responses and embedding API for GitHub Copilot - PR #17456
    • Add gpt-5.1-codex-max to OpenAI provider documentation - PR #17602
    • Update Instructions For Phoenix Integration - PR #17373
  • Guides

    • Add guide on how to debug gateway error vs provider error - PR #17387
    • Agent Gateway documentation - PR #17454
    • A2A Permission management documentation - PR #17515
    • Update docs to link agent hub - PR #17462
  • Projects

    • Add Google ADK and Harbor to projects - PR #17352
    • Add Microsoft Agent Lightning to projects - PR #17422
  • Cleanup

    • Cleanup: Remove orphan docs pages and Docusaurus template files - PR #17356
    • Remove source .env from docs - PR #17466

Infrastructure / CI/CD​

  • Helm Chart

  • Docker

    • Add retry logic to apk package installation in Dockerfile.non_root - PR #17596
    • Chainguard fixes - PR #17406
  • OpenAPI Schema

    • Refactor add_schema_to_components to move definitions to components/schemas - PR #17389
  • Security

    • Fix security vulnerability: update mdast-util-to-hast to 13.2.1 - PR #17601
    • Bump jws from 3.2.2 to 3.2.3 - PR #17494

New Contributors​

  • @weichiet made their first contribution in PR #17242
  • @AndyForest made their first contribution in PR #17220
  • @omkar806 made their first contribution in PR #17217
  • @v0rtex20k made their first contribution in PR #17178
  • @hxomer made their first contribution in PR #17207
  • @orgersh92 made their first contribution in PR #17316
  • @dannykopping made their first contribution in PR #17313
  • @rioiart made their first contribution in PR #17333
  • @codgician made their first contribution in PR #17278
  • @epistoteles made their first contribution in PR #17277
  • @kothamah made their first contribution in PR #17368
  • @flozonn made their first contribution in PR #17371
  • @richardmcsong made their first contribution in PR #17389
  • @matt-greathouse made their first contribution in PR #17384
  • @mossbanay made their first contribution in PR #17380
  • @mhielpos-asapp made their first contribution in PR #17376
  • @Joilence made their first contribution in PR #17367
  • @deepaktammali made their first contribution in PR #17357
  • @axiomofjoy made their first contribution in PR #16611
  • @DevajMody made their first contribution in PR #17445
  • @andrewtruong made their first contribution in PR #17439
  • @AnasAbdelR made their first contribution in PR #17490
  • @dominicfeliton made their first contribution in PR #17516
  • @kristianmitk made their first contribution in PR #17504
  • @rgshr made their first contribution in PR #17130
  • @dominicfallows made their first contribution in PR #17489
  • @irfansofyana made their first contribution in PR #17467
  • @GusBricker made their first contribution in PR #17191
  • @OlivverX made their first contribution in PR #17255
  • @withsmilo made their first contribution in PR #17585

Full Changelog​

View complete changelog on GitHub