[Preview] v1.80.0-stable - Agent Hub Support
Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.80.0.rc.2
```

Pip:

```shell
pip install litellm==1.80.0
```
Key Highlights
- 🆕 Agent Hub Support - Register and make agents public for your organization
- RunwayML Provider - Complete video generation, image generation, and text-to-speech support
- GPT-5.1 Family Support - Day-0 support for OpenAI's latest GPT-5.1 and GPT-5.1-Codex models
- Prometheus OSS - Prometheus metrics now available in open-source version
- Vector Store Files API - Complete OpenAI-compatible Vector Store Files API with full CRUD operations
- Embeddings Performance - O(1) lookup optimization for router embeddings with shared sessions
Agent Hub
This release adds support for registering agents and making them public within your organization. This is great for Proxy Admins who want a central place to make agents built in their organization discoverable to their users.
Here's the flow:
- Add agent to litellm.
- Make it public.
- Allow anyone to discover it on the public AI Hub page.
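As an illustrative sketch of the flow above (the endpoint name and payload fields here are assumptions, not the confirmed API; see the Agent Hub docs for the real contract), registration might look like:

```shell
# Hypothetical endpoint and fields - illustrative sketch only
curl --location 'http://localhost:4000/v1/agents' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "agent_name": "internal-support-agent",
    "is_public": true
  }'
```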
Performance – /embeddings 13× Lower p95 Latency
This update significantly improves /embeddings latency by routing it through the same optimized pipeline as /chat/completions, benefiting from all previously applied networking optimizations.
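Because the request shape is unchanged, existing clients benefit automatically; a standard OpenAI-compatible /embeddings call through the proxy (the model name and key below are placeholders) looks like:

```shell
curl --location 'http://localhost:4000/v1/embeddings' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "model": "text-embedding-3-small",
    "input": ["hello world"]
  }'
```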
Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| p95 latency | 5,700 ms | 430 ms | −92% (~13× faster) |
| p99 latency | 7,200 ms | 780 ms | −89% |
| Average latency | 844 ms | 262 ms | −69% |
| Median latency | 290 ms | 230 ms | −21% |
| RPS | 1,216.7 | 1,219.7 | +0.25% |
Test Setup
| Category | Specification |
|---|---|
| Load Testing | Locust: 1,000 concurrent users, 500 ramp-up |
| System | 4 vCPUs, 8 GB RAM, 4 workers, 4 instances |
| Database | PostgreSQL (Redis unused) |
| Configuration | config.yaml |
| Load Script | no_cache_hits.py |
🆕 RunwayML
Complete integration for RunwayML's Gen-4 family of models, supporting video generation, image generation, and text-to-speech.
Supported Endpoints:
- `/v1/videos` - Video generation (Gen-4 Turbo, Gen-4 Aleph, Gen-3A Turbo)
- `/v1/images/generations` - Image generation (Gen-4 Image, Gen-4 Image Turbo)
- `/v1/audio/speech` - Text-to-speech (ElevenLabs Multilingual v2)
Quick Start:
```shell
curl --location 'http://localhost:4000/v1/videos' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "model": "runwayml/gen4_turbo",
    "prompt": "A high quality demo video of litellm ai gateway",
    "input_reference": "https://example.com/image.jpg",
    "seconds": 5,
    "size": "1280x720"
  }'
```
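Image generation follows the same OpenAI-compatible shape; a minimal sketch against the Gen-4 Image model (the prompt is a placeholder; the resolution comes from the supported sizes listed in the model tables below):

```shell
curl --location 'http://localhost:4000/v1/images/generations' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "model": "runwayml/gen4_image",
    "prompt": "A watercolor painting of a lighthouse at dawn",
    "size": "1280x720"
  }'
```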
Prometheus Metrics - Open Source
Prometheus metrics are now available in the open-source version of LiteLLM, providing comprehensive observability for your AI Gateway without requiring an enterprise license.
Quick Start:
```yaml
litellm_settings:
  success_callback: ["prometheus"]
  failure_callback: ["prometheus"]
```
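Once the callbacks are enabled, the proxy serves metrics in Prometheus exposition format; assuming the default /metrics path on the proxy, you can verify locally with:

```shell
# Inspect the exposed LiteLLM metrics on a locally running proxy
curl http://localhost:4000/metrics
```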
Vector Store Files API
The complete OpenAI-compatible Vector Store Files API is now stable, enabling full file lifecycle management within vector stores.
Supported Endpoints:
- `POST /v1/vector_stores/{vector_store_id}/files` - Create vector store file
- `GET /v1/vector_stores/{vector_store_id}/files` - List vector store files
- `GET /v1/vector_stores/{vector_store_id}/files/{file_id}` - Retrieve vector store file
- `GET /v1/vector_stores/{vector_store_id}/files/{file_id}/content` - Retrieve file content
- `DELETE /v1/vector_stores/{vector_store_id}/files/{file_id}` - Delete vector store file
- `DELETE /v1/vector_stores/{vector_store_id}` - Delete vector store
Quick Start:
```shell
curl --location 'http://localhost:4000/v1/vector_stores/vs_123/files' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "file_id": "file_abc"
  }'
```
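The remaining CRUD routes follow the same pattern; for example, listing a store's files and fetching one file's content (IDs reused from the example above):

```shell
# List files in a vector store
curl --location 'http://localhost:4000/v1/vector_stores/vs_123/files' \
  --header 'Authorization: Bearer sk-1234'

# Retrieve the content of a single file
curl --location 'http://localhost:4000/v1/vector_stores/vs_123/files/file_abc/content' \
  --header 'Authorization: Bearer sk-1234'
```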
Get Started with Vector Stores
New Providers and Endpoints
New Providers
| Provider | Supported Endpoints | Description |
|---|---|---|
| RunwayML | /v1/videos, /v1/images/generations, /v1/audio/speech | Gen-4 video generation, image generation, and text-to-speech |
New LLM API Endpoints
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /v1/vector_stores/{vector_store_id}/files | POST | Create vector store file | Docs |
| /v1/vector_stores/{vector_store_id}/files | GET | List vector store files | Docs |
| /v1/vector_stores/{vector_store_id}/files/{file_id} | GET | Retrieve vector store file | Docs |
| /v1/vector_stores/{vector_store_id}/files/{file_id}/content | GET | Retrieve file content | Docs |
| /v1/vector_stores/{vector_store_id}/files/{file_id} | DELETE | Delete vector store file | Docs |
| /v1/vector_stores/{vector_store_id} | DELETE | Delete vector store | Docs |
New Models / Updated Models
New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.1 | 272K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| OpenAI | gpt-5.1-2025-11-13 | 272K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| OpenAI | gpt-5.1-chat-latest | 128K | $1.25 | $10.00 | Reasoning, vision, PDF input |
| OpenAI | gpt-5.1-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
| OpenAI | gpt-5.1-codex-mini | 272K | $0.25 | $2.00 | Responses API, reasoning, vision |
| Moonshot | moonshot/kimi-k2-thinking | 262K | $0.60 | $2.50 | Function calling, web search, reasoning |
| Mistral | mistral/magistral-medium-2509 | 40K | $2.00 | $5.00 | Reasoning, function calling |
| Vertex AI | vertex_ai/moonshotai/kimi-k2-thinking-maas | 256K | $0.60 | $2.50 | Function calling, web search |
| OpenRouter | openrouter/deepseek/deepseek-v3.2-exp | 164K | $0.20 | $0.40 | Function calling, prompt caching |
| OpenRouter | openrouter/minimax/minimax-m2 | 205K | $0.26 | $1.02 | Function calling, reasoning |
| OpenRouter | openrouter/z-ai/glm-4.6 | 203K | $0.40 | $1.75 | Function calling, reasoning |
| OpenRouter | openrouter/z-ai/glm-4.6:exacto | 203K | $0.45 | $1.90 | Function calling, reasoning |
| Voyage | voyage/voyage-3.5 | 32K | $0.06 | - | Embeddings |
| Voyage | voyage/voyage-3.5-lite | 32K | $0.02 | - | Embeddings |
Video Generation Models
| Provider | Model | Cost Per Second | Resolutions | Features |
|---|---|---|---|---|
| RunwayML | runwayml/gen4_turbo | $0.05 | 1280x720, 720x1280 | Text + image to video |
| RunwayML | runwayml/gen4_aleph | $0.15 | 1280x720, 720x1280 | Text + image to video |
| RunwayML | runwayml/gen3a_turbo | $0.05 | 1280x720, 720x1280 | Text + image to video |
Image Generation Models
| Provider | Model | Cost Per Image | Resolutions | Features |
|---|---|---|---|---|
| RunwayML | runwayml/gen4_image | $0.05 | 1280x720, 1920x1080 | Text + image to image |
| RunwayML | runwayml/gen4_image_turbo | $0.02 | 1280x720, 1920x1080 | Text + image to image |
| Fal.ai | fal_ai/fal-ai/flux-pro/v1.1 | $0.04 | - | Image generation |
| Fal.ai | fal_ai/fal-ai/flux/schnell | $0.003 | - | Fast image generation |
| Fal.ai | fal_ai/fal-ai/bytedance/seedream/v3/text-to-image | $0.03 | - | Image generation |
| Fal.ai | fal_ai/fal-ai/bytedance/dreamina/v3.1/text-to-image | $0.03 | - | Image generation |
| Fal.ai | fal_ai/fal-ai/ideogram/v3 | $0.06 | - | Image generation |
| Fal.ai | fal_ai/fal-ai/imagen4/preview/fast | $0.02 | - | Fast image generation |
| Fal.ai | fal_ai/fal-ai/imagen4/preview/ultra | $0.06 | - | High-quality image generation |
Audio Models
| Provider | Model | Cost | Features |
|---|---|---|---|
| RunwayML | runwayml/eleven_multilingual_v2 | $0.0003/char | Text-to-speech |
Features
- Gemini (Google AI Studio + Vertex AI)
  - Add support for `reasoning_effort='none'` for Gemini models - PR #16548
  - Add all Gemini image models support in image generation - PR #16526
  - Add Gemini image edit support - PR #16430
  - Fix preserve non-ASCII characters in function call arguments - PR #16550
  - Fix Gemini conversation format issue with MCP auto-execution - PR #16592
- Bedrock
  - Add support for filtering knowledge base queries - PR #16543
  - Ensure correct `aws_region` is used when provided dynamically for embeddings - PR #16547
  - Add support for custom KMS encryption keys in Bedrock Batch operations - PR #16662
  - Add bearer token authentication support for AgentCore - PR #16556
  - Fix AgentCore SSE stream iterator to async for proper streaming support - PR #16293
- Mistral
  - Fix Magistral streaming to emit reasoning chunks - PR #16434
- Moonshot
  - Add Kimi K2 thinking model support - PR #16445
- SambaNova
  - Fix SambaNova API rejecting requests when message content is passed as a list format - PR #16612
- Azure
  - Improve Azure auth parameter handling for None values - PR #14436
- Groq
  - Fix parse failed chunks for Groq - PR #16595
- Voyage
  - Add Voyage 3.5 and 3.5-lite embeddings pricing and doc update - PR #16641
Bug Fixes
- General
- Fix sanitize null token usage in OpenAI-compatible responses - PR #16493
- Fix apply provided timeout value to ClientTimeout.total - PR #16395
- Fix raising wrong 429 error on wrong exception - PR #16482
- Add new models, delete repeat models, update pricing - PR #16491
- Update model logging format for custom LLM provider - PR #16485
LLM API Endpoints
New Endpoints
- GET /providers
- Add GET list of providers endpoint - PR #16432
Features
- Allow internal users to access video generation routes - PR #16472
- Vector store files stable release with complete CRUD operations - PR #16643
  - `POST /v1/vector_stores/{vector_store_id}/files` - Create vector store file
  - `GET /v1/vector_stores/{vector_store_id}/files` - List vector store files
  - `GET /v1/vector_stores/{vector_store_id}/files/{file_id}` - Retrieve vector store file
  - `GET /v1/vector_stores/{vector_store_id}/files/{file_id}/content` - Retrieve file content
  - `DELETE /v1/vector_stores/{vector_store_id}/files/{file_id}` - Delete vector store file
  - `DELETE /v1/vector_stores/{vector_store_id}` - Delete vector store
- Ensure users can access `search_results` for both stream + non-stream responses - PR #16459
Bugs
- Fix use GET for `/v1/videos/{video_id}/content` - PR #16672
- General
  - Fix remove generic exception handling - PR #16599
Management Endpoints / UI
Features
- Proxy CLI Auth
  - Fix remove strict master_key check in add_deployment - PR #16453
- Models + Endpoints
  - UI - Add LiteLLM Params to Edit Model - PR #16496
  - UI - Add Model use backend data - PR #16664
  - UI - Remove Description Field from LLM Credentials - PR #16608
  - UI - Add RunwayML on Admin UI supported models/providers - PR #16606
  - Infra - Migrate Add Model Fields to Backend - PR #16620
  - Add API Endpoint for creating model access group - PR #16663
- Budgets
  - UI - Move Budgets out of Experimental - PR #16544
- Health Check
  - Add Langfuse OTEL and SQS to Health Check - PR #16514
- General UI
  - UI - Normalize table action columns appearance - PR #16657
  - UI - Button Styles and Sizing in Settings Pages - PR #16600
  - UI - SSO Modal Cosmetic Changes - PR #16554
  - Fix UI logos loading with SERVER_ROOT_PATH - PR #16618
  - Fix remove misleading 'Custom' option mention from OpenAI endpoint tooltips - PR #16622
Bugs
- Management Endpoints
- Fix inconsistent error responses in customer management endpoints - PR #16450
- Fix correct date range filtering in /spend/logs endpoint - PR #16443
- Fix /spend/logs/ui Access Control - PR #16446
- Add pagination for /spend/logs/session/ui endpoint - PR #16603
- Fix LiteLLM Usage shows key_hash - PR #16471
- Fix app_roles missing from jwt payload - PR #16448
Logging / Guardrail / Prompt Management Integrations
New Integration
- 🆕 Zscaler AI Guard
- Add Zscaler AI Guard hook for security policy enforcement - PR #15691
Logging
- Fix handle null usage values to prevent validation errors - PR #16396
- Fix updated spend would not be sent to CloudZero - PR #16201
Guardrails
- IBM Detector
- Ensure detector-id is passed as header to IBM detector server - PR #16649
Prompt Management
- Custom Prompt Management
- Add SDK focused examples for custom prompt management - PR #16441
Spend Tracking, Budgets and Rate Limiting
- End User Budgets
- Allow pointing max_end_user budget to an id, so the default ID applies to all end users - PR #16456
MCP Gateway
- Configuration
- Add dynamic OAuth2 metadata discovery for MCP servers - PR #16676
- Fix allow tool call even when server name prefix is missing - PR #16425
- Fix exclude unauthorized MCP servers from allowed server list - PR #16551
- Fix unable to delete MCP server from permission settings - PR #16407
- Fix avoid crashing when MCP server record lacks credentials - PR #16601
Agents
- Agent Registration (A2A Spec)
- Support agent registration + discovery following Agent-to-Agent specification - PR #16615
Performance / Loadbalancing / Reliability improvements
- Embeddings Performance
  - Use router's O(1) lookup and shared sessions for embeddings - PR #16344
- Router Reliability
  - Support default fallbacks for unknown models - PR #16419
- Callback Management
  - Add atexit handlers to flush callbacks for async completions - PR #16487
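For the default-fallbacks item, a config sketch (the model alias below is a placeholder; `default_fallbacks` is the existing router setting that PR #16419 extends to cover unknown model names):

```yaml
router_settings:
  # Fallbacks applied when no model-specific fallback rule matches;
  # with PR #16419 this also covers models the router does not recognize.
  default_fallbacks: ["my-fallback-model"]
```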
General Proxy Improvements
- Configuration Management
- Fix update model_cost_map_url to use environment variable - PR #16429
Documentation Updates
- Provider Documentation
- API Documentation
- General Documentation
New Contributors
- @artplan1 made their first contribution in PR #16423
- @JehandadK made their first contribution in PR #16472
- @vmiscenko made their first contribution in PR #16453
- @mcowger made their first contribution in PR #16429
- @yellowsubmarine372 made their first contribution in PR #16395
- @Hebruwu made their first contribution in PR #16201
- @jwang-gif made their first contribution in PR #15691
- @AnthonyMonaco made their first contribution in PR #16502
- @andrewm4894 made their first contribution in PR #16487
- @f14-bertolotti made their first contribution in PR #16485
- @busla made their first contribution in PR #16293
- @MightyGoldenOctopus made their first contribution in PR #16537
- @ultmaster made their first contribution in PR #14436
- @bchrobot made their first contribution in PR #16542
- @sep-grindr made their first contribution in PR #16622
- @pnookala-godaddy made their first contribution in PR #16607
- @dtunikov made their first contribution in PR #16592
- @lukapecnik made their first contribution in PR #16648
- @jyeros made their first contribution in PR #16618
Full Changelog
View complete changelog on GitHub

