[Preview] v1.78.5-stable - Native OCR Support
Deploy this version
Docker
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.78.5.rc.1
Pip
pip install litellm==1.78.5
Key Highlights
- Native OCR Endpoints - Native /v1/ocr endpoint support with cost tracking for Mistral OCR and Azure AI OCR
- Global Vendor Discounts - Specify global vendor discount percentages for accurate cost tracking and reporting
- Team Spending Reports - Team admins can now export detailed spending reports for their teams
- Claude Haiku 4.5 - Day 0 support for Claude Haiku 4.5 across Bedrock, Vertex AI, and OpenRouter with 200K context window
- GPT-5-Codex - Support for GPT-5-Codex via Responses API on OpenAI and Azure
- Performance Improvements - Major router optimizations: O(1) model lookups, 10-100x faster shallow copy, 30-40% faster timing calls, and hash generation improved from O(n²) to O(n)
New Models / Updated Models
New Model Support
Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
---|---|---|---|---|---|
Anthropic | claude-haiku-4-5 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching, computer use |
Anthropic | claude-haiku-4-5-20251001 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching, computer use |
Bedrock | anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching |
Bedrock | global.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching |
Bedrock | jp.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (JP Cross-Region) |
Bedrock | us.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (US region) |
Bedrock | eu.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (EU region) |
Bedrock | apac.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (APAC region) |
Bedrock | au.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (AU region) |
Vertex AI | vertex_ai/claude-haiku-4-5@20251001 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching |
OpenAI | gpt-5 | 272K | $1.25 | $10.00 | Chat, responses API, reasoning, vision, function calling, prompt caching |
OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API mode |
Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API mode |
Gemini | gemini-2.5-flash-image | 32K | $0.30 | $2.50 | Image generation (GA - Nano Banana) - $0.039/image |
ZhipuAI | glm-4.6 | - | - | - | Chat completions |
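The per-token prices in the table translate into request costs by simple arithmetic. The following is a minimal sketch, not LiteLLM's own cost-tracking code: the `PRICES_PER_1M` dictionary and the `estimate_cost` helper are hypothetical names introduced for illustration, and the calculation ignores prompt caching, per-image pricing, and any vendor discounts.

```python
# Prices copied from the table above: (input $/1M tokens, output $/1M tokens).
# PRICES_PER_1M and estimate_cost are illustrative names, not part of LiteLLM.
PRICES_PER_1M = {
    "claude-haiku-4-5": (1.00, 5.00),
    "us.anthropic.claude-haiku-4-5-20251001-v1:0": (1.10, 5.50),
    "gpt-5-codex": (1.25, 10.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    input_price, output_price = PRICES_PER_1M[model]
    # Prices are quoted per one million tokens, so scale the total down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


cost = estimate_cost("claude-haiku-4-5", input_tokens=10_000, output_tokens=2_000)
print(f"${cost:.4f}")  # prints "$0.0200"
```

With the same token counts, the regional Bedrock entries priced at $1.10 / $5.50 come out 10% higher than the base price.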
Features
- GPT-5 returns reasoning content via /chat/completions; GPT-5-Codex works with Claude Code - PR #15441
- Add anthropic.claude-haiku-4-5-20251001-v1:0 on Bedrock, VertexAI - PR #15581
- Add Claude Haiku 4.5 support for Bedrock global and US regions - PR #15650
- Add Claude Haiku 4.5 support for other Bedrock regions - PR #15653
- Add JP Cross-Region Inference jp.anthropic.claude-haiku-4-5-20251001 - PR #15598
- Fix: bedrock-pricing-geo-inregion-cross-region / add Global Cross-Region Inference - PR #15685
- Fix: Support us-gov prefix for AWS GovCloud Bedrock models - PR #15626
- Fix: GPT-OSS on Bedrock now supports streaming; revert fake streaming - PR #15668
- Fix(ollama/chat): correctly map reasoning_effort to think in requests - PR #15465
- Fix(cometapi): improve CometAPI provider support (embeddings, image generation, docs) - PR #15591
- Add new models to the lemonade provider - PR #15554
- Fix(pricing): correct pricing for various models in the watsonx model family - PR #15670
- Add glm-4.6 model to pricing configuration - PR #15679
- Add Vertex AI Discovery Engine Rerank Support - PR #15532
Bug Fixes
- Fix: pricing for Claude Sonnet 4.5 in US regions was 10x too high - PR #15374
- Change gpt-5-codex support in model_price json - PR #15540
- Fix filtering headers for signature calculations - PR #15590
- General
  - Add native reasoning and streaming support flag for gpt-5-codex - PR #15569
LLM API Endpoints
Features
- Feat: Add native litellm.ocr() functions - PR #15567
- Feat: Add /ocr route on LiteLLM AI Gateway - Adds support for native Mistral OCR calling - PR #15571
- Feat: Add Azure AI Mistral OCR Integration - PR #15572
- Feat: Native /ocr endpoint support - PR #15573
- Feat: Add Cost Tracking for /ocr endpoints - PR #15678
- Fix: Dall-e-2 for Image Edits API - PR #15604
- Feat: Allow calling /invoke, /converse routes through AI Gateway + models on config.yaml - PR #15618
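As a rough illustration of calling the new /v1/ocr route on a gateway started with the Docker command above (port 4000), the sketch below only builds the HTTP request and does not send it. The payload shape is an assumption modeled on Mistral's OCR API (`model` plus a `document` object with a `document_url`); the model name `mistral/mistral-ocr-latest`, the key `sk-1234`, and the helper `build_ocr_request` are placeholders, so check the LiteLLM OCR documentation for the exact fields your version accepts.

```python
import json
import urllib.request


def build_ocr_request(base_url: str, api_key: str, model: str, document_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the gateway's /v1/ocr route."""
    # Assumed payload shape, modeled on Mistral's OCR API; not confirmed by these notes.
    payload = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/ocr",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: replace with your gateway URL, virtual key, and model name.
req = build_ocr_request(
    base_url="http://localhost:4000",
    api_key="sk-1234",
    model="mistral/mistral-ocr-latest",
    document_url="https://example.com/sample.pdf",
)
print(req.get_method(), req.full_url)
```

To actually send the request against a running gateway, pass `req` to `urllib.request.urlopen` and decode the JSON response body.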
Bugs
- General
Management Endpoints / UI
Features
- Virtual Keys
- Teams
  - Feat: Allow Team Admins to export a report of the team spending - PR #15542
- Passthrough
  - Feat: Passthrough - allow admin to give access to specific passthrough endpoints - PR #15401
- SCIM v2
  - Feat(scim_v2.py): if group.id doesn't exist, use external id + Passthrough - ensure updates and deletions persist across instances - PR #15276
- SSO
Logging / Guardrail / Prompt Management Integrations
Guardrails
- General
- Feature: update Pillar Security integration to support no-persistence mode in LiteLLM proxy - PR #15599
Prompt Management
- General
  - Small fix to the code snippet in custom_prompt_management.md - PR #15544
Spend Tracking, Budgets and Rate Limiting
- Cost Tracking
- Budgets
  - Fix: improve budget clarity - PR #15682
Performance / Loadbalancing / Reliability improvements
- Router Optimizations
  - Perf(router): use shallow copy instead of deepcopy for model aliases - 10-100x faster than deepcopy on nested dict structures - PR #15576
  - Perf(router): optimize string concatenation in hash generation - Improves time complexity from O(n²) to O(n) - PR #15575
  - Perf(router): optimize model lookups with O(1) data structures - Replace O(n) scans with index map lookups - PR #15578
  - Perf(router): optimize model lookups with O(1) index maps - Use model_id_to_deployment_index_map and model_name_to_deployment_indices for instant lookups - PR #15574
  - Perf(router): optimize timing functions in completion hot path - Use time.perf_counter() for duration measurements and time.monotonic() for timeout calculations, providing 30-40% faster timing calls - PR #15617
- SSL/TLS Performance
  - Feat(ssl): add configurable ECDH curve for TLS performance - Configure via ssl_ecdh_curve setting to disable PQC on OpenSSL 3.x for better performance - PR #15617
- Token Counter
  - Fix(token-counter): extract model_info from deployment for custom_tokenizer - PR #15680
- Performance Metrics
  - Add: perf summary - PR #15458
- CI/CD
  - Fix: CI/CD - Missing env key & Linter type error - PR #15606
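The index-map idea behind the router lookup optimizations can be sketched as follows. This is an illustrative reconstruction, not the router's actual code: only the two map names come from the notes above, while the `DeploymentIndex` class, its methods, and the deployment dictionary shape (`model_name`, `model_info["id"]`) are assumptions made for the example.

```python
from collections import defaultdict


class DeploymentIndex:
    """Keeps index maps beside the deployment list so lookups avoid O(n) scans."""

    def __init__(self, model_list):
        self.model_list = list(model_list)
        # Map names taken from the release notes; the surrounding class is hypothetical.
        self.model_id_to_deployment_index_map = {}
        self.model_name_to_deployment_indices = defaultdict(list)
        for idx, deployment in enumerate(self.model_list):
            self.model_id_to_deployment_index_map[deployment["model_info"]["id"]] = idx
            self.model_name_to_deployment_indices[deployment["model_name"]].append(idx)

    def get_by_id(self, model_id):
        # One dict lookup plus one list index: O(1) on average.
        idx = self.model_id_to_deployment_index_map.get(model_id)
        return None if idx is None else self.model_list[idx]

    def get_by_name(self, model_name):
        # Cost is proportional to the number of matches, not to the whole list.
        indices = self.model_name_to_deployment_indices.get(model_name, [])
        return [self.model_list[i] for i in indices]


# Assumed deployment shape for the example.
index = DeploymentIndex([
    {"model_name": "haiku", "model_info": {"id": "a1"}},
    {"model_name": "haiku", "model_info": {"id": "b2"}},
    {"model_name": "codex", "model_info": {"id": "c3"}},
])
print(index.get_by_id("b2")["model_name"])   # prints "haiku"
print(len(index.get_by_name("haiku")))       # prints 2
```

The trade-off is that the maps must be rebuilt or updated whenever deployments are added or removed, which is cheap compared with scanning the list on every request.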
Documentation Updates
- Provider Documentation
- General
  - Fixed a few typos - PR #15267
New Contributors
- @jlan-nl made their first contribution in PR #15374
- @ImadSaddik made their first contribution in PR #15267
- @huangyafei made their first contribution in PR #15472
- @mubashir1osmani made their first contribution in PR #15468
- @kowyo made their first contribution in PR #15465
- @dhruvyad made their first contribution in PR #15448
- @davizucon made their first contribution in PR #15544
- @FelipeRodriguesGare made their first contribution in PR #15540
- @ndrsfel made their first contribution in PR #15557
- @shinharaguchi made their first contribution in PR #15598
- @TensorNull made their first contribution in PR #15591
- @TeddyAmkie made their first contribution in PR #15583
- @aniketmaurya made their first contribution in PR #15580
- @eddierichter-amd made their first contribution in PR #15554
- @konekohana made their first contribution in PR #15535
- @Classic298 made their first contribution in PR #15495
- @afogel made their first contribution in PR #15599
- @orolega made their first contribution in PR #15633
- @LucasSugi made their first contribution in PR #15634
- @uc4w6c made their first contribution in PR #15619
- @Sameerlite made their first contribution in PR #15658
- @yuneng-jiang made their first contribution in PR #15672
- @Nikro made their first contribution in PR #15680