[Preview] v1.79.3-stable - Built-in Guardrails on AI Gateway
Deploy this version
- Docker
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.79.3.rc.1
- Pip
pip install litellm==1.79.3.rc.1
Key Highlights
- LiteLLM Custom Guardrail - Built-in guardrail with UI configuration support
- Performance Improvements - /responses API 19× Lower Median Latency
- Veo3 Video Generation (Vertex AI + Google AI Studio) - Use OpenAI Video API to generate videos with Vertex AI and Google AI Studio Veo3 models
Built-in Guardrails on AI Gateway
This release introduces built-in guardrails for LiteLLM AI Gateway, allowing you to enforce protections without depending on an external guardrail API.
- Blocking Keywords - Block known sensitive keywords like "litellm", "python", etc.
- Pattern Detection - Block known sensitive patterns like emails, Social Security Numbers, API keys, etc.
- Custom Regex Patterns - Define custom regex patterns for your specific use case.
Get started with the built-in guardrails on AI Gateway here.
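
As a rough illustration of what the three rule types above check, the following standalone sketch applies a keyword blocklist, two prebuilt-style patterns, and one custom regex to a prompt. It is not LiteLLM's implementation or its configuration format: every name here (`BLOCKED_KEYWORDS`, `PATTERNS`, `check_prompt`) and the simplified regexes are assumptions chosen for the example.

```python
import re

# Hypothetical rule set for illustration only; not LiteLLM's config schema.
BLOCKED_KEYWORDS = {"litellm", "python"}

# Simplified patterns; production-grade detectors are usually stricter.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Custom regex: a made-up internal ticket ID format.
    "ticket_id": re.compile(r"\bTICKET-\d{6}\b"),
}


def check_prompt(text: str) -> list[str]:
    """Return the names of all rules the text violates (empty list = allowed)."""
    violations = []
    # Keyword matching is case-insensitive and works on whole words.
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    for keyword in sorted(BLOCKED_KEYWORDS & words):
        violations.append(f"keyword:{keyword}")
    # Pattern rules fire if the regex matches anywhere in the text.
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            violations.append(f"pattern:{name}")
    return violations


if __name__ == "__main__":
    print(check_prompt("What is the capital of France?"))
    print(check_prompt("Mail me at jane@example.com about TICKET-123456"))
```

A gateway would typically reject or mask a request when the returned list is non-empty. Running the check in-process is what removes the dependency on an external guardrail API.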
Performance - /responses 19× Lower Median Latency
This update cuts median /responses latency from 3,600 ms to 190 ms by routing connection handling through our internal network management, which eliminates per-request setup overhead.
Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| Median latency | 3,600 ms | 190 ms | -95% (~19× faster) |
| p95 latency | 4,300 ms | 280 ms | -93% |
| p99 latency | 4,600 ms | 590 ms | -87% |
| Average latency | 3,571 ms | 208 ms | -94% |
| RPS | 231 | 1,059 | +358% |
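
The Improvement column can be reproduced from the Before and After values. The short check below recomputes each percentage and the speedup factor; the helper names `percent_change` and `speedup` are introduced only for this example.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100


def speedup(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


# (before, after) latency pairs in milliseconds, copied from the results table.
latencies_ms = {
    "median": (3600, 190),
    "p95": (4300, 280),
    "p99": (4600, 590),
    "average": (3571, 208),
}

for name, (before, after) in latencies_ms.items():
    print(f"{name}: {percent_change(before, after):.0f}% "
          f"({speedup(before, after):.1f}x faster)")

# Throughput is a "higher is better" metric, so the change is positive.
print(f"RPS: {percent_change(231, 1059):+.0f}%")
```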
Test Setup
| Category | Specification |
|---|---|
| Load Testing | Locust: 1,000 concurrent users, 500 ramp-up |
| System | 4 vCPUs, 8 GB RAM, 4 workers, 4 instances |
| Database | PostgreSQL (Redis unused) |
| Configuration | config.yaml |
| Load Script | no_cache_hits.py |
New Models / Updated Models
New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Azure | azure/gpt-5-pro | 272K | $15.00 | $120.00 | Responses API, reasoning, vision, PDF input |
| Azure | azure/gpt-image-1-mini | - | - | - | Image generation - per pixel pricing |
| Azure | azure/container | - | - | - | Container API - $0.03/session |
| OpenAI | openai/container | - | - | - | Container API - $0.03/session |
| Cohere | cohere/embed-v4.0 | 128K | $0.12 | - | Embeddings with image input support |
| Gemini | gemini/gemini-live-2.5-flash-preview-native-audio-09-2025 | 1M | $0.30 | $2.00 | Native audio, vision, web search |
| Vertex AI | vertex_ai/minimaxai/minimax-m2-maas | 196K | $0.30 | $1.20 | Function calling, tool choice |
| NVIDIA | nvidia/nemotron-nano-9b-v2 | - | - | - | Chat completions |
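
To show how the per-1M-token prices in the table translate into a per-request cost, here is a minimal estimate using flat rates. It is a sketch, not LiteLLM's cost calculator: the `PRICES_PER_MILLION` dictionary and the `estimate_cost` helper are hypothetical, and the calculation ignores caching discounts, tiered pricing, and any provider-specific billing rules.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "azure/gpt-5-pro": {"input": 15.00, "output": 120.00},
    "vertex_ai/minimaxai/minimax-m2-maas": {"input": 0.30, "output": 1.20},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from flat per-token prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# 10,000 input tokens and 2,000 output tokens on azure/gpt-5-pro.
print(f"${estimate_cost('azure/gpt-5-pro', 10_000, 2_000):.4f}")
```

For this request the output tokens dominate: 2,000 output tokens cost $0.24, against $0.15 for 10,000 input tokens.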
OCR Models
| Provider | Model | Cost Per Page | Features |
|---|---|---|---|
| Azure AI | azure_ai/doc-intelligence/prebuilt-read | $0.0015 | Document reading |
| Azure AI | azure_ai/doc-intelligence/prebuilt-layout | $0.01 | Layout analysis |
| Azure AI | azure_ai/doc-intelligence/prebuilt-document | $0.01 | Document processing |
| Vertex AI | vertex_ai/mistral-ocr-2505 | $0.0005 | OCR processing |
Search Models
| Provider | Model | Pricing | Features |
|---|---|---|---|
| Firecrawl | firecrawl/search | Tiered: $0.00166-$0.0166/query | 10-100 results per query |
| SearXNG | searxng/search | Free | Open-source metasearch |
Features
-
- Add Azure GPT-5-Pro Responses API support with reasoning capabilities - PR #16235
- Add gpt-image-1-mini pricing for Azure with quality tiers (low/medium/high) - PR #16182
- Add support for returning Azure Content Policy error information when exceptions from Azure OpenAI occur - PR #16231
- Fix Azure GPT-5 incorrectly routed to O-series config (temperature parameter unsupported) - PR #16246
- Fix Azure not accepting the extra body param - PR #16116
- Fix Azure DALL-E-3 health check content policy violation by using safe default prompt - PR #16329
-
- Fix empty assistant message handling in AWS Bedrock Converse API to prevent 400 Bad Request errors - PR #15850
- Fix: Filter AWS authentication params from Bedrock InvokeModel request body - PR #16315
- Fix Bedrock proxy adding name to file content, which breaks when cache_control is in use - PR #16275
- Fix global.anthropic.claude-haiku-4-5-20251001-v1:0 supports_reasoning flag and update pricing - PR #16263
-
Gemini (Google AI Studio + Vertex AI)
- Add gemini live audio model cost in model map - PR #16183
- Fix translation problem with Gemini parallel tool calls - PR #16194
- Fix: Send Gemini API key via x-goog-api-key header with custom api_base - PR #16085
- Fix image_config.aspect_ratio not working for gemini-2.5-flash-image - PR #15999
- Fix Gemini minimal reasoning env overrides disabling thoughts - PR #16347
- Fix cache_read_input_token_cost for gemini-2.5-flash - PR #16354
-
- Fix Anthropic token counting for VertexAI - PR #16171
- Fix anthropic-adapter: properly translate Anthropic image format to OpenAI - PR #16202
- Enable automated prompt caching message format for Claude on Databricks - PR #16200
- Add support for Anthropic Memory Tool - PR #16115
- Propagate cache creation/read token costs for model info to fix Anthropic long context cost calculations - PR #16376
-
- Fix databricks streaming - PR #16368
-
- Return the diarized transcript when it's required in the request - PR #16133
-
- Update Fireworks audio endpoints to new api.fireworks.ai domains - PR #16346
-
- Add cohere embed-v4.0 model support - PR #16358
-
- Support reasoning_effort for watsonx chat models - PR #16261
-
- Remove automatic summary from reasoning_effort transformation - PR #16210
-
- Remove Grok 4 Models Reasoning Effort Parameter - PR #16265
-
- Fix HostedVLLMRerankConfig not being used - PR #16352
New Provider Support
- Bedrock Agentcore
- Add Bedrock Agentcore as a provider on LiteLLM Python SDK and LiteLLM AI Gateway - PR #16252
LLM API Endpoints
Features
-
- Add gpt-4o-transcribe cost tracking - PR #16412
-
- Milvus - search vector store support + support multi-part form data on passthrough - PR #16035
- Azure AI Vector Stores - support "virtual" indexes + create vector store on passthrough API - PR #16160
- Milvus - Passthrough API support - adds create + read vector store support via passthrough APIs - PR #16170
-
- Use valid CallTypes enum value in embeddings endpoint - PR #16328
-
- Generalize tiered pricing in generic cost calculator - PR #16150
Bugs
- General
Management Endpoints / UI
Features
-
Virtual Keys
-
Models + Endpoints
- UI - Add Model Existing Credentials Improvement - PR #16166
- UI - Add Azure AD Token field and Azure API Key optional - PR #16331
- UI - Fixed Label for vLLM in Model Create Flow - PR #16285
- UI - Include Model Access Group Models on Team Models Table - PR #16298
- Fix /model_group/info Returning Entire Model List for SSO Users - PR #16296
- LiteLLM non-root Docker - Model Hub table fix - PR #16282
-
Guardrails
- UI - Fix regression where guardrail entity could not be selected and was not displayed - PR #16165
- UI - Guardrail Info Page Show PII Config - PR #16164
- Change guardrail_information to list type - PR #16127
- UI - LiteLLM Guardrail - ensure you can see UI Friendly name for PII Patterns - PR #16382
- UI - Guardrails - LiteLLM Content Filter, Allow Viewing/Editing Content Filter Settings - PR #16383
- UI - Guardrails - allow updating guardrails through UI. Ensure litellm_params actually get updated in memory - PR #16384
-
SSO Settings
-
Usage & Analytics
-
Cache Settings
- UI - Cache Settings Redis Add Semantic Cache Settings - PR #16398
Bugs
- General
AI Integrations
Logging
-
- Fix langfuse input tokens logic for cached tokens - PR #16203
-
- Fix incorrect attachment to an existing trace & refactor - PR #15529
-
- OTEL - Log Cost Breakdown on OTEL Logger - PR #16334
-
- Add DD Agent Host support for datadog callback - PR #16379
Guardrails
-
- PANW prisma airs guardrail deduplication and enhanced session tracking - PR #16273
Secret Managers
-
- Add tags and descriptions support to aws secrets manager - PR #16224
-
- Add Custom Secret Manager - Allow users to define and write a custom secret manager - PR #16297
-
General
Spend Tracking, Budgets and Rate Limiting
- Cost Tracking
- Fix OpenAI Responses API streaming tests usage field names and cost calculation - PR #16236
MCP Gateway
Performance / Loadbalancing / Reliability improvements
-
Memory Leak Fixes
- Resolve memory accumulation caused by Pydantic 2.11+ deprecation warnings - PR #16110
-
Session Management
- Add shared_session support to responses API - PR #16260
-
Error Handling
-
Configuration
-
Redis
- Handle float redis_version from AWS ElastiCache Valkey - PR #16207
-
Hooks
- Add parallel execution handling in during_call_hook - PR #16279
-
Infrastructure
- Install runtime node for prisma - PR #16410
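
The Redis item above concerns a server version that arrives as a float rather than a string. The sketch below shows one defensive way to normalize such a value before comparing major versions. It is an illustration under assumptions, not the code from PR #16207, and `major_version` is a hypothetical helper.

```python
from typing import Union


def major_version(redis_version: Union[str, int, float]) -> int:
    """Extract the major version from values such as "7.2.4", "8", 7.0, or 7."""
    # Converting to str first makes float and int inputs behave like "7.0" / "7",
    # so the same split works for every input type.
    return int(str(redis_version).split(".")[0])


for raw in ("7.2.4", 7.0, 8):
    print(repr(raw), "->", major_version(raw))
```

Calling a string method such as `.split(".")` directly on a float raises `AttributeError`, which is why the value is converted to a string before it is split.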
Documentation Updates
-
Provider Documentation
-
General Documentation
-
Security
- Remove tornado test files (including test.key), fixes Python 3.13 security issues - PR #16342
New Contributors
- @steve-gore-snapdocs made their first contribution in PR #16149
- @timbmg made their first contribution in PR #16120
- @Nivg made their first contribution in PR #16202
- @pablobgar made their first contribution in PR #16194
- @AlanPonnachan made their first contribution in PR #16150
- @Chesars made their first contribution in PR #16236
- @bowenliang123 made their first contribution in PR #16255
- @dean-zavad made their first contribution in PR #16199
- @alexkuzmik made their first contribution in PR #15529
- @Granine made their first contribution in PR #16281
- @Oodapow made their first contribution in PR #16279
- @jgoodyear made their first contribution in PR #16275
- @Qanpi made their first contribution in PR #16321
- @ShimonMimoun made their first contribution in PR #16313
- @andriykislitsyn made their first contribution in PR #16288
- @reckless-huang made their first contribution in PR #16263
- @chenmoneygithub made their first contribution in PR #16368
- @stembe-digitalex made their first contribution in PR #16354
- @jfcherng made their first contribution in PR #16352
- @xingyaoww made their first contribution in PR #16246
- @emerzon made their first contribution in PR #16373
- @wwwillchen made their first contribution in PR #16376
- @fabriciojoc made their first contribution in PR #16203
- @jroberts2600 made their first contribution in PR #16273

