
v1.69.0-stable - Loadbalance Batch API Models

Krrish Dholakia
Ishaan Jaffer

Deploy this version

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.69.0-stable

Key Highlights

LiteLLM v1.69.0-stable brings the following key improvements:

  • Loadbalance Batch API Models: Easily load balance across multiple Azure batch deployments using LiteLLM Managed Files.
  • Email Invites 2.0: Send an email invite to new users onboarded to LiteLLM.
  • Nscale: New provider offering an LLM API built for compliance with European regulations.
  • Bedrock /v1/messages: Use Bedrock Anthropic models with Anthropic's /v1/messages format.

Batch API Load Balancing

This release brings LiteLLM Managed File support to Batches. This is great for:

  • Proxy Admins: You can now control which Batch models users can call.
  • Developers: You no longer need to know the Azure deployment name when creating your batch .jsonl files - just specify the model your LiteLLM key has access to.

Over time, we expect LiteLLM Managed Files to become the way most teams use Files across the /chat/completions, /batch, and /fine_tuning endpoints.
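
To make this concrete, here is a minimal sketch (the model name gpt-4o-batch, the Azure deployment names, and the endpoints below are illustrative placeholders, not values from this release): a Proxy Admin groups multiple Azure batch deployments behind a single model name in the proxy config, and LiteLLM load balances across them:

  model_list:
    - model_name: gpt-4o-batch                  # name developers reference
      litellm_params:
        model: azure/gpt-4o-batch-eu            # placeholder Azure deployment
        api_base: https://eu-instance.openai.azure.com
        api_key: os.environ/AZURE_API_KEY_EU
    - model_name: gpt-4o-batch                  # same name -> requests are load balanced
      litellm_params:
        model: azure/gpt-4o-batch-us            # placeholder Azure deployment
        api_base: https://us-instance.openai.azure.com
        api_key: os.environ/AZURE_API_KEY_US

Each line of the batch .jsonl file then references only that model name, never an Azure deployment name:

  {"custom_id": "task-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-batch", "messages": [{"role": "user", "content": "Summarize this document."}]}}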

Read more here

Email Invites

This release brings the following improvements to our email invite integration:

  • New templates for user invited and key created events.
  • Fixes for using SMTP email providers.
  • Native support for Resend API.
  • Ability for Proxy Admins to control email events.

For LiteLLM Cloud Users, please reach out to us if you want this enabled for your instance.

Read more here

New Models / Updated Models

  • Gemini (VertexAI + Google AI Studio)
    • Added gemini-2.5-pro-preview-05-06 models with pricing and context window info - PR
    • Set correct context window length for all Gemini 2.5 variants - PR
  • Perplexity:
    • Added new Perplexity models - PR
    • Added sonar-deep-research model pricing - PR
  • Azure OpenAI:
    • Fixed passing through of azure_ad_token_provider parameter - PR
  • OpenAI:
    • Added support for PDF URLs in the 'file' parameter - PR
  • Sagemaker:
    • Fixed content length handling for the sagemaker_chat provider - PR
  • Azure AI Foundry:
    • Added cost tracking for the following models - PR
      • DeepSeek V3 0324
      • Llama 4 Scout
      • Llama 4 Maverick
  • Bedrock:
    • Added cost tracking for Bedrock Llama 4 models - PR
    • Fixed template conversion for Llama 4 models in Bedrock - PR
    • Added support for using Bedrock Anthropic models with /v1/messages format - PR
    • Added streaming support for Bedrock Anthropic models with /v1/messages format - PR
  • OpenAI: Added reasoning_effort support for o3 models - PR (see the sketch after this list)
  • Databricks:
    • Fixed issue when Databricks uses external model and delta could be empty - PR
  • Cerebras: Fixed Llama-3.1-70b model pricing and context window - PR
  • Ollama:
    • Fixed custom price cost tracking and added 'max_completion_token' support - PR
    • Fixed KeyError when using JSON response format - PR
  • 🆕 Nscale:
    • Added support for chat and image generation endpoints - PR
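
As a quick illustration of the reasoning_effort support noted in the OpenAI item above, here is a minimal sketch using the LiteLLM SDK (the model name and prompt are placeholders):

  from litellm import completion

  # reasoning_effort is forwarded to OpenAI's reasoning models;
  # OpenAI accepts "low", "medium", or "high".
  response = completion(
      model="o3",  # placeholder reasoning model
      messages=[{"role": "user", "content": "Prove that 17 is prime."}],
      reasoning_effort="high",
  )
  print(response.choices[0].message.content)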

LLM API Endpoints

  • Messages API:
    • 🆕 Added support for using Bedrock Anthropic models with the /v1/messages format - PR, and streaming support - PR (see the sketch after this list)
  • Moderations API:
    • Fixed bug to allow using LiteLLM UI credentials for /moderations API - PR
  • Realtime API:
    • Fixed setting 'headers' in scope for websocket auth requests and infinite loop issues - PR
  • Files API:
    • Unified File ID output support - PR
    • Support for writing files to all deployments - PR
    • Added target model name validation - PR
  • Batches API:
    • Complete unified batch ID support - the model in the batch .jsonl file is now replaced with the deployment's model name - PR
    • Beta support for unified file ID (managed files) for batches - PR
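
For the Messages API item above, here is a minimal sketch of calling a Bedrock Anthropic model through the proxy's /v1/messages route (the proxy URL, virtual key, and Bedrock model id below are placeholders, and this assumes the Anthropic SDK is pointed at a running LiteLLM proxy):

  import anthropic

  # Point the Anthropic SDK at the LiteLLM proxy, which serves the /v1/messages route.
  client = anthropic.Anthropic(
      base_url="http://localhost:4000",  # placeholder LiteLLM proxy URL
      api_key="sk-1234",                 # placeholder LiteLLM virtual key
  )

  response = client.messages.create(
      model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # placeholder Bedrock model
      max_tokens=256,
      messages=[{"role": "user", "content": "Hello from Bedrock via /v1/messages"}],
  )
  print(response.content[0].text)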

Spend Tracking / Budget Improvements

  • Bug Fix - PostgreSQL Integer Overflow Error in DB Spend Tracking - PR

Management Endpoints / UI

  • Models
    • Fixed model info overwriting when editing a model on UI - PR
    • Fixed team admin model updates and organization creation with specific models - PR
  • Logs:
    • Bug Fix - copying Request/Response on Logs Page - PR
    • Bug Fix - log did not remain in focus on the QA Logs page + text overflow on error logs - PR
    • Added index for session_id on LiteLLM_SpendLogs for better query performance - PR
  • User Management:
    • Added user management functionality to Python client library & CLI - PR
    • Bug Fix - Fixed SCIM token creation on Admin UI - PR
    • Bug Fix - Added 404 response when trying to delete verification tokens that don't exist - PR

Logging / Guardrail Integrations

  • Custom Logger API: v2 Custom Callback API (send LLM logs to a custom API) - PR, Get Started
  • OpenTelemetry:
    • Fixed OpenTelemetry to follow GenAI semantic conventions + added support for the 'instructions' param for TTS - PR
  • Bedrock PII:
    • Added support for PII masking with Bedrock Guardrails - Get Started, PR
  • Documentation:
    • Added documentation for StandardLoggingVectorStoreRequest - PR

Performance / Reliability Improvements

  • Python Compatibility:
    • Added support for Python 3.11 and earlier (fixed datetime UTC handling) - PR
    • Fixed UnicodeDecodeError: 'charmap' on Windows during litellm import - PR
  • Caching:
    • Fixed embedding string caching result - PR
    • Fixed cache miss for Gemini models with response_format - PR

General Proxy Improvements

  • Proxy CLI:
    • Added --version flag to litellm-proxy CLI - PR
    • Added dedicated litellm-proxy CLI - PR
  • Alerting:
    • Fixed Slack alerting not working when using a DB - PR
  • Email Invites:
    • Added V2 emails, with fixes for sending emails when keys are created + Resend API support - PR
    • Added user invitation emails - PR
    • Added endpoints to manage email settings - PR
  • General:
    • Fixed bug where duplicate JSON logs were getting emitted - PR

New Contributors