
v1.76.1-stable - Gemini 2.5 Flash Image

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this version

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.76.1
```
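
The same deployment expressed as a docker-compose sketch; the image tag and environment variable come from the command above, while the service name is an arbitrary choice:

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:v1.76.1
    environment:
      - STORE_MODEL_IN_DB=True
    ports:
      - "4000:4000"
```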

Key Highlights

  • Major Performance Improvements - 6.5x faster LiteLLM Python SDK completions, plus fastuuid integration for faster UUID generation
  • New Model Support - Gemini 2.5 Flash Image Preview, Grok Code Fast, and GPT Realtime models
  • Enhanced Provider Support - DeepSeek-v3.1 pricing on Fireworks AI, Vercel AI Gateway support, and improved Anthropic/GitHub Copilot integrations
  • MCP Improvements - more reliable connection testing and SSE MCP tool bug fixes

Major Changes

  • Added support for Gemini 2.5 Flash Image Preview via /chat/completions. 🚨 Warning: if you were using gemini-2.0-flash-exp-image-generation, please follow the Gemini Image Generation Migration Guide.
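
A minimal sketch of what an OpenAI-format request body to the new model could look like. The modalities field is an assumption drawn from the OpenAI-compatible multimodal pattern, not confirmed here; check the migration guide for the exact parameters.

```python
# Hedged sketch of a /chat/completions request body for the new image model.
# "modalities" usage is an assumption; only "model" and "messages" are the
# standard chat-completions fields.
def build_image_request(prompt: str) -> dict:
    return {
        "model": "gemini-2.5-flash-image-preview",
        "messages": [{"role": "user", "content": prompt}],
        "modalities": ["image", "text"],  # ask for image + text output parts
    }

req = build_image_request("A watercolor fox in a misty forest")
print(req["model"])  # gemini-2.5-flash-image-preview
```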

Performance Improvements

This release includes significant performance optimizations:

  • 6.5x faster LiteLLM Python SDK Completion - Major performance boost for completion operations - PR #13990
  • fastuuid Integration - 2.1x faster UUID generation with +80 RPS improvement for /chat/completions and other LLM endpoints - PR #13992, PR #14016
    • Optimized Request Logging - request params are no longer printed by default, a +50 RPS improvement - PR #14015
  • Cache Performance - 21% speedup in InMemoryCache.evict_cache and 45% speedup in _is_debugging_on function - PR #14012, PR #13988
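
The fastuuid swap above can be illustrated with a fall-back import: prefer the Rust-backed fastuuid package when available, otherwise use the stdlib uuid module. The try/except pattern is an assumption for illustration, not LiteLLM's exact code.

```python
# Illustrative sketch of the fastuuid integration: same uuid4() API in both
# modules, so the faster implementation can be swapped in transparently.
try:
    import fastuuid as _uuid  # Rust-backed, ~2.1x faster uuid4()
except ImportError:
    import uuid as _uuid      # stdlib fallback

def new_request_id() -> str:
    # uuid4() exists in both modules and returns a random UUID
    return str(_uuid.uuid4())

rid = new_request_id()
print(len(rid))  # 36: 32 hex digits plus 4 hyphens
```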

New Models / Updated Models

New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Google | gemini-2.5-flash-image-preview | 1M | $0.30 | $2.50 | Chat completions + image generation ($0.039/image) |
| X.AI | xai/grok-code-fast | 256K | $0.20 | $1.50 | Code generation |
| OpenAI | gpt-realtime | 32K | $4.00 | $16.00 | Real-time conversation + audio |
| Vercel AI Gateway | vercel_ai_gateway/openai/o3 | 200K | $2.00 | $8.00 | Advanced reasoning |
| Vercel AI Gateway | vercel_ai_gateway/openai/o3-mini | 200K | $1.10 | $4.40 | Efficient reasoning |
| Vercel AI Gateway | vercel_ai_gateway/openai/o4-mini | 200K | $1.10 | $4.40 | Latest mini model |
| DeepInfra | deepinfra/zai-org/GLM-4.5 | 131K | $0.55 | $2.00 | Chat completions |
| Perplexity | perplexity/codellama-34b-instruct | 16K | $0.35 | $1.40 | Code generation |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/deepseek-v3p1 | 128K | $0.56 | $1.68 | Chat completions |

Additional Models Added: Various other Vercel AI Gateway models were added too. See models.litellm.ai for the full list.
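
As a worked example of the per-1M-token prices in the table, here is a small cost estimator. The prices are copied from the table above; the helper itself is purely illustrative, not a LiteLLM API.

```python
# Estimate request cost from per-1M-token prices (from the table above).
PRICES = {"xai/grok-code-fast": (0.20, 1.50)}  # ($/1M input, $/1M output)

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_price, out_price = PRICES[model]
    return (input_tokens / 1_000_000 * in_price
            + output_tokens / 1_000_000 * out_price)

# 50K prompt tokens + 2K completion tokens:
print(round(estimate_cost("xai/grok-code-fast", 50_000, 2_000), 4))  # 0.013
```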

Features

New Provider Support


LLM API Endpoints

Features

Bugs

  • General
    • Fixed handling of None metadata in batch requests - PR #13996
    • Fixed token_counter with special token input - PR #13374
    • Removed incorrect web search support for azure/gpt-4.1 family - PR #13566

MCP Gateway

Features

  • SSE MCP Tools
    • Bug fix for adding SSE MCP tools - improved connection testing when adding MCPs - PR #14048
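
For context, a minimal proxy config sketch for registering an SSE MCP server is shown below. The mcp_servers block and its field names follow my recollection of LiteLLM's MCP docs and should be treated as assumptions; the URL is a placeholder.

```yaml
mcp_servers:
  my_sse_server:
    url: "https://example.com/mcp/sse"  # hypothetical endpoint
    transport: "sse"
```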



Management Endpoints / UI

Features

  • Team Management
    • Allow setting Team Member RPM/TPM limits when creating a team - PR #13943
  • UI Improvements
    • Fixed Next.js Security Vulnerabilities in UI Dashboard - PR #14084
    • Fixed collapsible navbar design - PR #14075
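
The team-member limit feature above could be exercised with a request body like the following; the team_member_rpm_limit / team_member_tpm_limit field names are my assumption and should be checked against PR #13943.

```python
# Hypothetical request body for POST /team/new with per-member rate limits.
# Field names below are assumptions, not confirmed against the LiteLLM API.
def build_team_request(alias: str, member_rpm: int, member_tpm: int) -> dict:
    return {
        "team_alias": alias,
        "team_member_rpm_limit": member_rpm,  # requests/minute per member
        "team_member_tpm_limit": member_tpm,  # tokens/minute per member
    }

body = build_team_request("research-team", 100, 100_000)
print(body["team_member_rpm_limit"])  # 100
```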

Bugs

  • Authentication
    • Fixed Virtual keys with llm_api type causing Internal Server Error for /anthropic/* and other LLM passthrough routes - PR #14046

Logging / Guardrail Integrations

Features

New Guardrail Support


Performance / Load Balancing / Reliability Improvements

Features

  • Caching
    • Verify if cache entry has expired prior to serving it to client - PR #13933
    • Fixed error saving latency as timedelta on Redis - PR #14040
  • Router
    • Refactored the router so simple_shuffle selects deployment weights from 'weight', 'rpm', or 'tpm' in a single loop - PR #13562
  • Logging
    • Fixed LoggingWorker graceful shutdown to prevent CancelledError warnings - PR #14050
    • Enhanced container logging to write log files in both plain-text and JSON formats - PR #13394
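
The read-time expiry check from PR #13933 can be illustrated with a minimal TTL cache that re-verifies an entry's deadline on every get, instead of relying solely on a background eviction loop. This is a sketch, not LiteLLM's actual InMemoryCache code.

```python
import time

class TTLCache:
    """Minimal cache that checks expiry before serving an entry."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds: float):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # expired: evict and report a miss
            del self._store[key]
            return None
        return value

cache = TTLCache()
cache.set("k", "v", ttl_seconds=10)
print(cache.get("k"))  # v
```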

Bugs

  • Dependencies
    • Bumped orjson version to "3.11.2" - PR #13969

General Proxy Improvements

Features

  • AWS
    • Add support for AWS assume_role with a session token - PR #13919
  • OCI Provider
    • Added oci_key_file as an optional_parameter - PR #14036
  • Configuration
    • Allow configuring the threshold at which request entries in the spend log get truncated - PR #14042
    • Enhanced proxy_config configuration: add support for existing configmap in Helm charts - PR #14041
  • Docker
    • Added back supervisor to non-root image - PR #13922
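
A hedged config sketch for the AWS assume-role feature: the Bedrock model name and role ARN are placeholders, and the parameter names are assumptions based on LiteLLM's Bedrock provider conventions, with the session token being what PR #13919 adds support for.

```yaml
model_list:
  - model_name: bedrock-claude
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0  # placeholder model
      aws_role_name: arn:aws:iam::123456789012:role/litellm-role  # placeholder ARN
      aws_session_name: litellm-session
      aws_session_token: os.environ/AWS_SESSION_TOKEN
```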

New Contributors

  • @ArthurRenault made their first contribution in PR #13922
  • @stevenmanton made their first contribution in PR #13919
  • @uc4w6c made their first contribution in PR #13914
  • @nielsbosma made their first contribution in PR #13573
  • @Yuki-Imajuku made their first contribution in PR #13567
  • @codeflash-ai[bot] made their first contribution in PR #13988
  • @ColeFrench made their first contribution in PR #13978
  • @dttran-glo made their first contribution in PR #13969
  • @manascb1344 made their first contribution in PR #13965
  • @DorZion made their first contribution in PR #13572
  • @edwardsamuel made their first contribution in PR #13536
  • @blahgeek made their first contribution in PR #13374
  • @Deviad made their first contribution in PR #13394
  • @XSAM made their first contribution in PR #13775
  • @KRRT7 made their first contribution in PR #14012
  • @ikaadil made their first contribution in PR #13991
  • @timelfrink made their first contribution in PR #13691
  • @qidu made their first contribution in PR #13562
  • @nagyv made their first contribution in PR #13243
  • @xywei made their first contribution in PR #12885
  • @ericgtkb made their first contribution in PR #12797
  • @NoWall57 made their first contribution in PR #13945
  • @lmwang9527 made their first contribution in PR #14050
  • @WilsonSunBritten made their first contribution in PR #14042
  • @Const-antine made their first contribution in PR #14041
  • @dmvieira made their first contribution in PR #14040
  • @gotsysdba made their first contribution in PR #14036
  • @moshemorad made their first contribution in PR #14005
  • @joshualipman123 made their first contribution in PR #13144

Full Changelog