
[Preview] v1.78.0-stable - MCP Gateway: Control Tool Access by Team/Key

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM
Alexsander Hamir
Backend Performance Engineer
Achintya Rajan
Fullstack Engineer
Sameer Kankute
Backend Engineer (LLM Translation)

Deploy this version
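A minimal sketch of running this release with the LiteLLM Docker image; the tag shown is representative and should be confirmed against the published release, and the port/database settings are placeholder assumptions.

```bash
# Run the LiteLLM proxy for this release
# (tag is representative; confirm the exact tag on the GitHub release page)
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.78.0-stable
```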


Key Highlights

  • MCP Gateway Enhancements - Fine-grained tool control at team/key level, OpenAPI to MCP server conversion, and per-tool parameter allowlists
  • GPT-5 Pro & GPT-Image-1-Mini - Day 0 support for OpenAI's GPT-5 Pro (400K context) and gpt-image-1-mini image generation
  • UI Performance Boost - Replaced bloated key list calls with a lean key aliases endpoint, added Turbopack for faster development builds, and shipped major UI refactors
  • EnkryptAI Guardrails - New guardrail integration for content moderation
  • Tag-Based Budgets - Support for setting budgets based on request tags
  • Azure AD & SSO - Enhanced Azure AD default credentials selection and EntraID app roles support

New Models / Updated Models

New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5-pro | 400K | $15.00 | $120.00 | Responses API, reasoning, vision, function calling, prompt caching, web search |
| OpenAI | gpt-5-pro-2025-10-06 | 400K | $15.00 | $120.00 | Responses API, reasoning, vision, function calling, prompt caching, web search |
| OpenAI | gpt-image-1-mini | - | $2.00/img | - | Image generation and editing |
| OpenAI | gpt-realtime-mini | 128K | $0.60 | $2.40 | Realtime audio, function calling |
| Azure AI | azure_ai/Phi-4-mini-reasoning | 131K | $0.08 | $0.32 | Function calling |
| Azure AI | azure_ai/Phi-4-reasoning | 32K | $0.125 | $0.50 | Function calling, reasoning |
| Azure AI | azure_ai/MAI-DS-R1 | 128K | $1.35 | $5.40 | Reasoning, function calling |
| Bedrock | au.anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.30 | $16.50 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | global.anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | global.anthropic.claude-sonnet-4-20250514-v1:0 | 1M | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | cohere.embed-v4:0 | 128K | $0.12 | - | Embeddings, image input support |
| OCI | oci/cohere.command-latest | 128K | $1.56 | $1.56 | Function calling |
| OCI | oci/cohere.command-a-03-2025 | 256K | $1.56 | $1.56 | Function calling |
| OCI | oci/cohere.command-plus-latest | 128K | $1.56 | $1.56 | Function calling |
| Together AI | together_ai/moonshotai/Kimi-K2-Instruct-0905 | 262K | $1.00 | $3.00 | Function calling |
| Together AI | together_ai/Qwen/Qwen3-Next-80B-A3B-Instruct | 262K | $0.15 | $1.50 | Function calling |
| Together AI | together_ai/Qwen/Qwen3-Next-80B-A3B-Thinking | 262K | $0.15 | $1.50 | Function calling |
| Vertex AI | MedGemma models | Varies | Varies | Varies | Medical-focused Gemma models on custom endpoints |
| Watson X | 27 new foundation models | Varies | Varies | Varies | Granite, Llama, Mistral families |
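Once these model entries are configured on the proxy, they are reachable through the usual OpenAI-compatible endpoints. A minimal sketch, assuming the models are set up under the aliases shown and using placeholder base URL and virtual key; GPT-5 Pro is listed above with Responses API support, so it is shown against /v1/responses.

```bash
# GPT-5 Pro via the proxy's Responses API endpoint
curl http://localhost:4000/v1/responses \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-pro", "input": "Summarize the v1.78.0 release highlights."}'

# gpt-image-1-mini via the OpenAI-compatible image generation endpoint
curl http://localhost:4000/v1/images/generations \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-image-1-mini", "prompt": "a line drawing of a lighthouse"}'

# Bedrock global cross-region Claude Sonnet 4.5, addressed with the bedrock/ prefix
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model": "bedrock/global.anthropic.claude-sonnet-4-5-20250929-v1:0", "messages": [{"role": "user", "content": "Hello"}]}'
```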

Features

  • OpenAI

    • Add GPT-5 Pro model configuration and documentation - PR #15258
    • Add stop parameter to non-supported params for GPT-5 - PR #15244
    • Day 0 Support, Add gpt-image-1-mini - PR #15259
    • Add gpt-realtime-mini support - PR #15283
    • Add gpt-5-pro-2025-10-06 to model costs - PR #15344
    • Minimal fix: gpt5 models should not go on cooldown when called with temperature!=1 - PR #15330
  • Snowflake Cortex

    • Add function calling support for Snowflake Cortex REST API - PR #15221
  • Gemini

    • Fix header forwarding for Gemini/Vertex AI providers in proxy mode - PR #15231
  • Azure

    • Removed stop param from unsupported azure models - PR #15229
    • Fix(azure/responses): remove invalid status param from azure call - PR #15253
    • Add new Azure AI models with pricing details - PR #15387
    • AzureAD Default credentials - select credential type based on environment - PR #14470
  • Bedrock

    • Add Global Cross-Region Inference - PR #15210
    • Add Cohere Embed v4 support for AWS Bedrock - PR #15298
    • Fix(bedrock): include cacheWriteInputTokens in prompt_tokens calculation - PR #15292
    • Add Bedrock AU Cross-Region Inference for Claude Sonnet 4.5 - PR #15402
    • Converse → /v1/messages streaming doesn't handle parallel tool calls with Claude models - PR #15315
  • Vertex AI

    • Implement Context Caching for Vertex AI provider - PR #15226
    • Support for Vertex AI Gemma Models on Custom Endpoints - PR #15397
    • VertexAI - gemma model family support (custom endpoints) - PR #15419
    • VertexAI Gemma model family streaming support + Added MedGemma - PR #15427
  • OCI

    • Add OCI Cohere support with tool calling and streaming capabilities - PR #15365
  • Watson X

    • Add Watson X foundation model definitions to model_prices_and_context_window.json - PR #15219
    • Watsonx - Apply correct prompt templates for openai/gpt-oss model family - PR #15341
  • OpenRouter

    • Fix - (openrouter): move cache_control to content blocks for claude/gemini - PR #15345
    • Fix - OpenRouter cache_control to only apply to last content block - PR #15395
  • Together AI

Bug Fixes

  • General
    • Bug fix: gpt-5-chat-latest has incorrect max_input_tokens value - PR #15116
    • Fix reasoning response ID - PR #15265
    • Fix issue with parsing assistant messages - PR #15320
    • Fix litellm_param based costing - PR #15336
    • Fix lint errors - PR #15406

LLM API Endpoints

Features

Bugs

  • General
    • Fix x-litellm-cache-key header not being returned on cache hit - PR #15348
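A quick way to verify this fix is to send the same request twice with caching enabled and inspect the response headers; a sketch, assuming a cache is configured on the proxy and using placeholder base URL, key, and model.

```bash
# Send an identical request twice; on the second (cached) response the
# x-litellm-cache-key header should now be present
for i in 1 2; do
  curl -si http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer sk-1234" \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "ping"}]}' \
    | grep -i "x-litellm-cache-key"
done
```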

Management Endpoints / UI

Features

  • Proxy CLI Auth

    • Proxy CLI - don't store existing key in the URL, store it in the state param - PR #15290
  • Models + Endpoints

    • Make PATCH /model/{model_id}/update handle team_id consistently with POST /model/new - PR #15297
    • Feature: adds Infinity as a provider in the UI - PR #15285
    • Fix: model + endpoints page crash when config file contains router_settings.model_group_alias - PR #15308
    • Models & Endpoints Initial Refactor - PR #15435
    • Litellm UI API Reference page updates - PR #15438
  • Teams

    • Teams page: new column "Your Role" on the teams table - PR #15384
    • LiteLLM Dashboard Teams UI refactor - PR #15418
  • UI Infrastructure

    • Added prettier to autoformat frontend - PR #15215
    • Adds turbopack to the npm run dev command in UI to build faster during development - PR #15250
    • (perf) fix: Replaces bloated key list calls with lean key aliases endpoint - PR #15252
    • Potentially fixes a UI spasm issue with an expired cookie - PR #15309
    • LiteLLM UI Refactor Infrastructure - PR #15236
    • Enforces removal of unused imports from UI - PR #15416
    • Fix: usage page >> Model Activity >> spend per day graph: y-axis clipping on large spend values - PR #15389
    • Updates guardrail provider logos - PR #15421
  • Admin Settings

    • Fix: Router settings do not update despite success message - PR #15249
    • Fix: Prevents DB from accidentally overriding config file values if they are empty in DB - PR #15340
  • SSO

    • SSO - support EntraID app roles - PR #15351

Logging / Guardrail / Prompt Management Integrations

Features

Guardrails


Spend Tracking, Budgets and Rate Limiting

  • Tag Management

    • Tag Management - Add support for setting tag based budgets - PR #15433 (see the sketch after this list)
  • Dynamic Rate Limiter v3

    • QA/Fixes - Dynamic Rate Limiter v3 - final QA - PR #15311
    • Fix dynamic Rate limiter v3 - inserting litellm_model_saturation - PR #15394
  • Shared Health Check

    • Implement Shared Health Check State Across Pods - PR #15380
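For the tag-based budgets item above, a rough sketch of creating a tag with a spend budget through the proxy's tag management API; the /tag/new route mirrors LiteLLM's existing tag management endpoints, but the budget field names below are assumptions (modeled on key budgets) and may differ from the final schema.

```bash
# Hypothetical sketch: create a tag with a spend budget
# (max_budget / budget_duration are assumed field names, mirroring key budgets)
curl -X POST http://localhost:4000/tag/new \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-a",
    "description": "Requests tagged by product A",
    "max_budget": 100.0,
    "budget_duration": "30d"
  }'
```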

MCP Gateway

  • Tool Control

    • MCP Gateway - UI - Select allowed tools for Key, Teams - PR #15241
    • MCP Gateway - Backend - Allow storing allowed tools by team/key - PR #15243 (see the sketch after this section)
    • MCP Gateway - Fine-grained Database Object Storage Control - PR #15255
    • MCP Gateway - Litellm mcp fixes team control - PR #15304
    • MCP Gateway - QA/Fixes - Ensure Team/Key level enforcement works for MCPs - PR #15305
    • Feature: Include server_name in /v1/mcp/server/health endpoint response - PR #15431
  • OpenAPI Integration

    • MCP - support converting OpenAPI specs to MCP servers - PR #15343
    • MCP - specify allowed params per tool - PR #15346
  • Configuration

    • MCP - support setting CA_BUNDLE_PATH - PR #15253
    • Fix: Ensure MCP client stays open during tool call - PR #15391
    • Remove hardcoded "public" schema in migration.sql - PR #15363
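To go with the tool-control and health-check items above, a sketch of exercising the new surface from the management API. The /v1/mcp/server/health path comes from the changelog entry; the allowed-tools field on /key/generate is a purely hypothetical illustration of key-level tool restriction (the documented flow is the key/team UI), so field names may differ.

```bash
# Check MCP server health; the response now includes server_name (PR #15431)
curl http://localhost:4000/v1/mcp/server/health \
  -H "Authorization: Bearer sk-1234"

# Hypothetical sketch: issue a virtual key restricted to specific MCP tools
# ("allowed_mcp_tools" is an illustrative field name, not a confirmed schema)
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"allowed_mcp_tools": ["github/create_issue", "github/list_issues"]}'
```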

Performance / Loadbalancing / Reliability improvements

  • Router Optimizations

    • Fix - Router: add model_name index for O(1) deployment lookups - PR #15113
    • Refactor Utils: extract inner function from client - PR #15234
    • Fix Networking: remove limitations - PR #15302
  • Session Management

    • Fix - Sessions not being shared - PR #15388
    • Fix: remove panic from hot path - PR #15396
    • Fix - shared session parsing and usage issue - PR #15440
    • Fix: handle closed aiohttp sessions - PR #15442
    • Fix: prevent session leaks when recreating aiohttp sessions - PR #15443
  • SSL/TLS Performance

    • Perf: optimize SSL/TLS handshake performance with prioritized cipher - PR #15398
  • Dependencies

    • Upgrades tenacity version to 8.5.0 - PR #15303
  • Data Masking

    • Fix - SensitiveDataMasker converts lists to string - PR #15420

General AI Gateway Improvements

Security

  • General
    • Fix: redact AWS credentials when redact_user_api_key_info enabled - PR #15321

Documentation Updates

  • Provider Documentation

  • Deployment

    • Remove buggy docker-compose comment that caused config.yaml-based startup to fail - PR #15425

New Contributors

  • @Gal-bloch made their first contribution in PR #15219
  • @lcfyi made their first contribution in PR #15315
  • @ashengstd made their first contribution in PR #15362
  • @vkolehmainen made their first contribution in PR #15363
  • @jlan-nl made their first contribution in PR #15330
  • @BCook98 made their first contribution in PR #15402
  • @PabloGmz96 made their first contribution in PR #15425

Full Changelog