[Pre-Release] v1.74.0

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this version

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.0.rc

Key Highlights

Azure Content Safety Guardrails

MCP Gateway: Segregate MCP tools

MCP Server Segregation is now supported on LiteLLM. You can set the x-mcp-servers header to control which servers tools are listed from. This is useful when you want to request tools from only a subset of configured servers, enabling curated toolsets and cleaner control.

Usage

cURL Example with Server Segregation
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "mcp",
            "server_label": "litellm",
            "server_url": "<your-litellm-proxy-base-url>/mcp",
            "require_approval": "never",
            "headers": {
                "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
                "x-mcp-servers": "Zapier_Gmail"
            }
        }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
}'

In this example, the request will only have access to tools from the "Zapier_Gmail" MCP server.

Team / Key Based Logging on UI


This release brings support for Proxy Admins to configure Team/Key Based Logging Settings on the UI. This allows routing LLM request/response logs to different Langfuse/Arize projects based on the team or key.

For developers using LiteLLM, their logs are automatically routed to their specific Arize/Langfuse projects. In this release, we support the following integrations for key/team based logging (a sketch of configuring this via the API follows the list):

  • langfuse
  • arize
  • langsmith
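
Key-based logging can also be configured via the management API when generating a key. The snippet below is a rough sketch rather than an exact recipe: it assumes the metadata-based logging format described in the LiteLLM key-based logging docs, and the Langfuse credentials and host shown are placeholders.

cURL Example: Key-Based Logging to Langfuse
curl --location '<your-litellm-proxy-base-url>/key/generate' \
--header 'Authorization: Bearer <your-admin-key>' \
--header 'Content-Type: application/json' \
--data '{
    "metadata": {
        "logging": [
            {
                "callback_name": "langfuse",
                "callback_type": "success",
                "callback_vars": {
                    "langfuse_public_key": "pk-...",
                    "langfuse_secret_key": "sk-...",
                    "langfuse_host": "https://cloud.langfuse.com"
                }
            }
        ]
    }
}'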

Python SDK: 2.3 Seconds Faster Import Times

This release brings significant performance improvements to the Python SDK, with import times 2.3 seconds faster. We've refactored the initialization process to reduce startup overhead, making LiteLLM more efficient for applications that need quick initialization.
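
To verify the improvement in your own environment, you can time the import directly (results vary by machine and installed extras):

python3 -c "import time; start = time.perf_counter(); import litellm; print(f'import litellm: {time.perf_counter() - start:.2f}s')"

Python's built-in -X importtime flag (python3 -X importtime -c "import litellm") prints a per-module breakdown if you want to dig deeper.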


New Models / Updated Models

Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
|----------|-------|----------------|---------------------|----------------------|------|
| Watsonx | watsonx/mistralai/mistral-large | 131k | $3.00 | $10.00 | New |
| Azure AI | azure_ai/cohere-rerank-v3.5 | 4k | $2.00/1k queries | - | New (Rerank) |
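
For example, assuming the new Watsonx model has been added to your proxy's model list under the same name, it can be called through the proxy's OpenAI-compatible endpoint:

curl --location '<your-litellm-proxy-base-url>/chat/completions' \
--header 'Authorization: Bearer YOUR_LITELLM_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "watsonx/mistralai/mistral-large",
    "messages": [{"role": "user", "content": "Hello"}]
}'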

Features

Bugs

  • Mistral
    • Fix transform_response handling for empty string content - PR
    • Turn Mistral to use llm_http_handler - PR
  • Gemini
    • Fix tool call sequence - PR
    • Fix custom api_base path preservation - PR
  • Anthropic
    • Fix user_id validation logic - PR
  • Bedrock
    • Support optional args for bedrock - PR
  • Ollama
    • Fix default parameters for ollama-chat - PR
  • VLLM
    • Add 'audio_url' message type support - PR

LLM API Endpoints

Features

  • /batches
    • Support batch retrieve with target model Query Param - PR
    • Anthropic completion bridge improvements - PR
  • /responses
    • Azure responses api bridge improvements - PR
    • Fix responses api error handling - PR
  • /mcp (MCP Gateway)
    • Add MCP url masking on frontend - PR
    • Add MCP servers header to scope - PR
    • Litellm mcp tool prefix - PR
    • Segregate MCP tools on connections using headers - PR
    • Added changes to mcp url wrapping - PR

Bugs

  • /v1/messages
    • Remove hardcoded model name on streaming - PR
    • Support lowest latency routing - PR
    • Non-anthropic models token usage returned - PR
  • /chat/completions
    • Support Cursor IDE tool_choice format {"type": "auto"} - PR (see the example after this list)
  • /generateContent
    • Allow passing litellm_params - PR
    • Only pass supported params when using OpenAI models - PR
    • Fix using gemini-cli with Vertex Anthropic Models - PR
  • Streaming
    • Fix Error code: 307 for LlamaAPI Streaming Chat - PR
    • Store finish reason even if is_finished - PR
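
As a sketch of the Cursor IDE tool_choice fix referenced above, /chat/completions now accepts the object form of tool_choice. The get_weather tool below is a hypothetical placeholder:

curl --location '<your-litellm-proxy-base-url>/chat/completions' \
--header 'Authorization: Bearer YOUR_LITELLM_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in SF?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}}
            }
        }
    }],
    "tool_choice": {"type": "auto"}
}'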

Spend Tracking / Budget Improvements

Bugs

  • Fix allow strings in calculate cost - PR
  • VertexAI Anthropic streaming cost tracking with prompt caching fixes - PR

Management Endpoints / UI

Bugs

  • Team Management
    • Prevent team model reset on model add - PR
    • Return team-only models on /v2/model/info - PR
    • Render team member budget correctly - PR
  • UI Rendering
    • Fix rendering ui on non-root images - PR
    • Correctly display 'Internal Viewer' user role - PR
  • Configuration
    • Handle empty config.yaml - PR
    • Fix gemini /models - replace models/ as expected - PR

Features

  • Team Management
    • Allow adding team specific logging callbacks - PR
    • Add Arize Team Based Logging - PR
    • Allow Viewing/Editing Team Based Callbacks - PR
  • UI Improvements
    • Comma separated spend and budget display - PR
    • Add logos to callback list - PR
  • CLI
    • Add litellm-proxy cli login for starting to use litellm proxy - PR
  • Email Templates
    • Customizable Email template - Subject and Signature - PR

Logging / Guardrail Integrations

Features

Bugs

  • Security
    • Ensure only LLM API route failures get logged on Langfuse - PR
  • OpenMeter
    • Integration error handling fix - PR
  • Message Redaction
    • Ensure message redaction works for responses API logging - PR
  • Bedrock Guardrails
    • Fix bedrock guardrails post_call for streaming responses - PR

Performance / Loadbalancing / Reliability improvements

Features

  • Python SDK
    • 2 second faster import times - PR
    • Reduce python sdk import time by .3s - PR
  • Error Handling
    • Add error handling for MCP tools not found or invalid server - PR
  • SSL/TLS
    • Fix SSL certificate error - PR
    • Fix custom ca bundle support in aiohttp transport - PR

General Proxy Improvements

  • Startup
    • Add new banner on startup - PR
  • Dependencies
    • Update pydantic version - PR

New Contributors

Git Diff