[Pre-Release] v1.73.0-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM
info

This is a pre-release version.

The production version will be released on Wednesday.

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.73.0.rc.1
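
Once the container is up, you can sanity-check the proxy with a standard chat completion call. A minimal sketch - the model name and master key below are illustrative; use whatever you have configured:

curl http://localhost:4000/v1/chat/completions \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'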

TLDR

  • Why Upgrade
    • Passthrough Endpoints v2: Enhanced support for subroutes and custom cost tracking for passthrough endpoints.
    • Health Check Dashboard: New frontend UI for monitoring model health and status.
    • User Management: Set a default team for new users - e.g., give every new user an API key with a $10 budget for exploration.
  • Who Should Read
    • Teams using Passthrough Endpoints
    • Teams using User Management on LiteLLM
    • Teams using Health Check Dashboard for models
    • Teams using Claude Code with LiteLLM
  • Risk of Upgrade
    • Low
      • No major breaking changes to existing functionality.

Key Highlights

Passthrough Endpoints v2


This release brings support for adding billing and full URL forwarding for passthrough endpoints.

Previously, you could only map a single, exact endpoint; now you can register just /bria and all subroutes are forwarded automatically - for example, /bria/v1/text-to-image/base/model and /bria/v1/enhance_image are both forwarded to the target URL with the same path structure.

This means you, as Proxy Admin, can onboard third-party endpoints like the Bria API and Mistral OCR, set a cost per request, and give your developers access to the complete API functionality.
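
As a sketch of the new subroute behavior: once /bria is registered as a passthrough endpoint pointing at the Bria API base URL, a call like the one below is forwarded to the target with the same path appended (the request body and key are illustrative):

curl http://localhost:4000/bria/v1/text-to-image/base/model \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{"prompt": "a red bicycle"}'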

Learn more about Passthrough Endpoints

v2 Health Checks


This release brings support for Proxy Admins to select which specific models to health check, and to see each model's health status as soon as its individual check completes, along with last check times.

This allows Proxy Admins to immediately identify which specific models are in a bad state and view the full error stack trace for faster troubleshooting.
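
The dashboard builds on the proxy's /health endpoint, which you can also query directly. A minimal sketch, assuming the default port and an illustrative master key:

curl http://localhost:4000/health \
-H "Authorization: Bearer sk-1234"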

Set Default Team for New Users


v1.73.0 introduces the ability to assign new users to Default Teams. This makes it much easier to enable experimentation with LLMs within your company, while also ensuring spend for exploration is tracked correctly.

What this means for Proxy Admins:

  • Set a max budget per team member: This sets a max amount an individual can spend within a team.
  • Set a default team for new users: When a new user signs in via SSO / invitation link, they will be automatically added to this team.

What this means for Developers:

  • View models across teams: You can now go to Models + Endpoints and view the models you have access to, across all teams you're a member of.
  • Safe create key modal: If you have no model access outside of a team (default behaviour), you are now nudged to select a team on the Create Key modal. This resolves a common confusion point for new users onboarding to the proxy.
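
For example, here is a minimal sketch of issuing a $10, team-scoped key via the /key/generate management endpoint (the team_id and master key are illustrative):

curl -X POST http://localhost:4000/key/generate \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{"team_id": "team-default", "max_budget": 10}'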

Get Started


New / Updated Models

Pricing / Context Window Updates

Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type
Google VertexAI | vertex_ai/imagen-4 | N/A | Image Generation | Image Generation | New
Google VertexAI | vertex_ai/imagen-4-preview | N/A | Image Generation | Image Generation | New
Gemini | gemini-2.5-pro | 2M | $1.25 | $5.00 | New
Gemini | gemini-2.5-flash-lite | 1M | $0.075 | $0.30 | New
OpenRouter | Various models | Updated | Updated | Updated | Updated
Azure | azure/o3 | 200k | $2.00 | $8.00 | Updated
Azure | azure/o3-pro | 200k | $20.00 | $80.00 | Updated
Azure OpenAI | Azure Codex Models | Various | Various | Various | New

Updated Models

Features

  • Azure
    • Support for new /v1 preview Azure OpenAI API - PR, Get Started
    • Add Azure Codex Models support - PR, Get Started
    • Make Azure AD scope configurable - PR
    • Handle more GPT custom naming patterns - PR
    • Update o3 pricing to match OpenAI pricing - PR
  • VertexAI
    • Add Vertex Imagen-4 models - PR, Get Started
    • Anthropic streaming passthrough cost tracking - PR
  • Gemini
    • Working Gemini TTS support via /v1/speech endpoint - PR
    • Fix gemini 2.5 flash config - PR
    • Add missing gemini-2.5-flash-lite model and fix pricing - PR
    • Mark all gemini-2.5 models as supporting PDF input - PR
    • Add gemini-2.5-pro with reasoning support - PR
  • AWS Bedrock
    • AWS credentials no longer mandatory - PR
    • Add AWS Bedrock profiles for APAC region - PR
    • Fix AWS Bedrock Claude tool call index - PR
    • Handle base64 file data with qs:.. prefix - PR
    • Add Mistral Small to BEDROCK_CONVERSE_MODELS - PR
  • Mistral
    • Enhance Mistral API with parallel tool calls support - PR
  • Meta Llama API
    • Enable tool calling for meta_llama models - PR
  • Volcengine
    • Add thinking parameter support - PR

Bugs

  • VertexAI
    • Handle missing tokenCount in promptTokensDetails - PR
    • Fix vertex AI claude thinking params - PR
  • Custom LLM
    • Set anthropic custom LLM provider property - PR
  • Anthropic
    • Bump anthropic package version - PR
  • Ollama
    • Update ollama_embeddings to work on sync API - PR
    • Fix response_format not working - PR

LLM API Endpoints

Features

  • Responses API
    • Day-0 support for OpenAI re-usable prompts in the Responses API - PR, Get Started (see the sketch after this list)
    • Support passing image URLs in Completion-to-Responses bridge - PR
  • MCP Gateway
    • Add Allowed MCPs to Creating/Editing Organizations - PR, Get Started
    • Allow connecting to MCP with authentication headers - PR, Get Started
  • Speech API
    • Working Gemini TTS support via OpenAI's /v1/speech endpoint - PR
  • Passthrough Endpoints
    • Add support for subroutes for passthrough endpoints - PR
    • Support for setting custom cost per passthrough request - PR
    • Ensure "Request" is tracked for passthrough requests on LiteLLM Proxy - PR
    • Add V2 Passthrough endpoints on UI - PR
    • Move passthrough endpoints under Models + Endpoints in UI - PR
    • QA improvements for adding passthrough endpoints - PR, PR
  • Models API
    • Allow /models to return correct models for custom wildcard prefixes - PR
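
A minimal sketch of calling a re-usable prompt through the proxy, following OpenAI's Responses API request shape (the prompt ID, version, and variables are illustrative):

curl http://localhost:4000/v1/responses \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4.1", "prompt": {"id": "pmpt_abc123", "version": "2", "variables": {"customer_name": "Jane"}}}'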

Bugs

  • Messages API
    • Fix /v1/messages endpoint always using us-central1 with vertex_ai-anthropic models - PR
    • Fix model_group tracking for /v1/messages and /moderations - PR
    • Fix cost tracking and logging via /v1/messages API when using Claude Code - PR
  • MCP Gateway
    • Fix using MCPs defined on config.yaml - PR
  • Chat Completion API
    • Allow dict for tool_choice argument in acompletion - PR
  • Passthrough Endpoints
    • Don't log request to Langfuse passthrough on Langfuse - PR

Spend Tracking

Features

  • User Agent Tracking
    • Automatically track spend by user agent (allows cost tracking for Claude Code) - PR (see the sketch after this list)
    • Add user agent tags in spend logs payload - PR
  • Tag Management
    • Support adding public model names in tag management - PR
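
Nothing changes in the request itself - spend is attributed from the standard User-Agent header the client already sends. A sketch (the user-agent string and key are illustrative):

curl http://localhost:4000/v1/chat/completions \
-H "Authorization: Bearer sk-1234" \
-H "User-Agent: claude-cli/1.0.0" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'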

Management Endpoints / UI

Features

  • Test Key Page
    • Allow testing /v1/messages on the Test Key Page - PR
  • SSO
    • Allow passing additional headers - PR
  • JWT Auth
    • Correctly return user email - PR
  • Model Management
    • Allow editing model access group for existing model - PR
  • Team Management
    • Allow setting default team for new users - PR, PR
    • Fix default team settings - PR
  • SCIM
    • Add error handling for existing user on SCIM - PR
    • Add SCIM PATCH and PUT operations for users - PR
  • Health Check Dashboard
    • Implement health check backend API and storage functionality - PR
    • Add LiteLLM_HealthCheckTable to database schema - PR
    • Implement health check frontend UI components and dashboard integration - PR
    • Add success modal for health check responses - PR
    • Fix clickable model ID in health check table - PR
    • Fix health check UI table design - PR

Logging / Guardrails Integrations

Bugs

  • Prometheus
    • Fix bug for using prometheus metrics config - PR

Security & Reliability

Security Fixes

  • Documentation Security
    • Security fixes for docs - PR
    • Add Trivy Security Scan for UI + Docs folder - remove all vulnerabilities - PR

Reliability Improvements

  • Dependencies
    • Fix aiohttp version requirement - PR
    • Bump next from 14.2.26 to 14.2.30 in UI dashboard - PR
  • Networking
    • Allow using CA Bundles - PR
    • Add workload identity federation between GCP and AWS - PR

General Proxy Improvements

Features

  • Deployment
    • Add deployment annotations for Kubernetes - PR
    • Add ciphers in command and pass to hypercorn for proxy - PR
  • Custom Root Path
    • Fix loading UI on custom root path - PR
  • SDK Improvements
    • LiteLLM SDK / Proxy improvement (don't transform message client-side) - PR

New Contributors

  • @kjoth made their first contribution in PR
  • @shagunb-acn made their first contribution in PR
  • @MadsRC made their first contribution in PR
  • @Abiji-2020 made their first contribution in PR
  • @salzubi401 made their first contribution in PR
  • @orolega made their first contribution in PR
  • @X4tar made their first contribution in PR
  • @karen-veigas made their first contribution in PR
  • @Shankyg made their first contribution in PR
  • @pascallim made their first contribution in PR
  • @lgruen-vcgs made their first contribution in PR
  • @rinormaloku made their first contribution in PR
  • @InvisibleMan1306 made their first contribution in PR
  • @ervwalter made their first contribution in PR
  • @ThakeeNathees made their first contribution in PR
  • @jnhyperion made their first contribution in PR
  • @Jannchie made their first contribution in PR

Demo Instance

Here's a Demo Instance to test changes:

Git Diff