# [PRE-RELEASE] v1.75.8

## Deploy this version

**Docker**

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.75.8
```

**Pip**

```shell
pip install litellm==1.75.8
```
## Key Highlights

- Team Member Rate Limits - Individual rate limiting for team members, with JWT authentication support.
- Performance Improvements - New experimental HTTP handler flag for a 100+ RPS improvement on OpenAI calls.
- GPT-5 Model Family Support - Full support for OpenAI's GPT-5 models, with the `reasoning_effort` parameter and Azure OpenAI integration.
- Azure AI Flux Image Generation - Support for Azure AI's Flux image generation models.
## New Models / Updated Models

### New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Azure AI | `azure_ai/FLUX-1.1-pro` | - | - | $40/image | Image generation |
| Azure AI | `azure_ai/FLUX.1-Kontext-pro` | - | - | $40/image | Image generation |
| Vertex AI | `vertex_ai/deepseek-ai/deepseek-r1-0528-maas` | 65k | $1.35 | $5.40 | Chat completions + reasoning |
| OpenRouter | `openrouter/deepseek/deepseek-chat-v3-0324` | 65k | $0.14 | $0.28 | Chat completions |
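Prices in the table are per million tokens. As a quick sanity check on how they translate to a request cost, a hypothetical helper (not part of the LiteLLM API; LiteLLM computes cost internally from its model map):

```python
# Hypothetical cost helper illustrating the table's $/1M-token pricing.
def cost_usd(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost for a request given per-1M-token prices."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# vertex_ai/deepseek-ai/deepseek-r1-0528-maas: $1.35 in / $5.40 out per 1M tokens
# 10k input + 2k output tokens ~= $0.0243
print(cost_usd(10_000, 2_000, 1.35, 5.40))
```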
### Features

- OpenAI
  - Added `reasoning_effort` parameter support for GPT-5 model family - PR #13475, Get Started
  - Support for `reasoning` parameter in Responses API - PR #13475, Get Started
- Azure OpenAI
  - GPT-5 support with `max_tokens` and `reasoning` parameter - PR #13510, Get Started
- AWS Bedrock
  - Streaming support for bedrock gpt-oss model family - PR #13346, Get Started
  - `/messages` endpoint compatibility with `bedrock/converse/<model>` - PR #13627
  - Cache point support for assistant and tool messages - PR #13640
- Azure AI
  - New Azure AI Flux Image Generation provider - PR #13592, Get Started
  - Fixed Content-Type header for image generation - PR #13584
- CometAPI
  - New provider support with chat completions and streaming - PR #13458
- SambaNova
  - Added embedding model support - PR #13308, Get Started
- Vertex AI
- hosted_vllm
  - Added `reasoning_effort` parameter support - PR #13620, Get Started
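The `reasoning_effort` parameter added above rides along in the chat-completion request. A minimal sketch of where it sits in an OpenAI-style payload (model name and prompt are placeholders; this builds the dict only and makes no API call):

```python
# Sketch only: shows where reasoning_effort sits in an OpenAI-style
# chat-completion payload. Model name and prompt are placeholders.
payload = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Summarize this diff."}],
    "reasoning_effort": "minimal",  # e.g. "minimal" / "low" / "medium" / "high"
}

# With LiteLLM the same field is passed as a keyword argument, e.g.:
#   litellm.completion(model="gpt-5", messages=[...], reasoning_effort="minimal")
print(payload["reasoning_effort"])
```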
### Bugs

- OCI
  - Fixed streaming issues - PR #13437
- Ollama
  - Fixed GPT-OSS streaming with 'thinking' field - PR #13375
- VolcEngine
  - Fixed thinking-disabled parameter handling - PR #13598
- Streaming
  - Consistent 'finish_reason' chunk indexing - PR #13560
## LLM API Endpoints

### Features

### Bugs

- Real-time API
  - Fixed endpoint for no-intent scenarios - PR #13476
- Responses API
  - Fixed `stream=True` + `background=True` with Responses API - PR #13654
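For context on the `stream=True` + `background=True` fix above, a sketch of the parameter combination as it appears in a Responses API request body (placeholder model and input; no network call is made):

```python
# Sketch: the parameter combination fixed in PR #13654.
# Builds the request body only; nothing is sent.
request = {
    "model": "gpt-5",           # placeholder model name
    "input": "Write a haiku.",
    "stream": True,             # stream events as they arrive
    "background": True,         # run the response in the background
}
print(request["stream"], request["background"])
```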
## MCP Gateway

### Features

- Access Control & Configuration
  - Enhanced MCPServerManager with access groups and description support - PR #13549

### Bugs

- Authentication
  - Fixed MCP gateway key authentication - PR #13630
## Management Endpoints / UI

### Features

- Team Management
  - Team Member Rate Limits implementation - PR #13601
  - JWT authentication support for team member rate limits - PR #13601
  - Show team member TPM/RPM limits in UI - PR #13662
  - Allow editing team member RPM/TPM limits - PR #13669
  - Allow unsetting TPM and RPM in Teams Settings - PR #13430
  - Team Member Permissions Page access column changes - PR #13145
- Key Management
  - UI Improvements
- Credentials
  - Added CredentialDeleteModal component and integration with CredentialsPanel - PR #13550
- Admin & Permissions
  - Allow routes for admin viewer - PR #13588
### Bugs

- SCIM Integration
  - Fixed SCIM Team Memberships metadata handling - PR #13553
- Authentication
  - Fixed incorrect key info endpoint - PR #13633
## Logging / Guardrail Integrations

### Features

- Langfuse OTEL
- MLflow
  - Updated MLflow logger usage span attributes - PR #13561

### Bugs

- Security
## Performance / Load Balancing / Reliability Improvements

### Features

- HTTP Performance
  - New `EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER` flag for a 100+ RPS improvement on OpenAI calls - PR #13625
- Database Monitoring
  - Added DB metrics to Prometheus - PR #13626
- Error Handling
  - Added safe divide-by-zero protection to prevent crashes - PR #13624
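The divide-by-zero guard above is a common pattern for metrics code. A minimal sketch of the idea (hypothetical helper, not LiteLLM's exact code):

```python
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator/denominator, or `default` when the denominator is 0.

    Guards derived metrics (e.g. tokens per second) against
    ZeroDivisionError when the elapsed time or count is zero.
    """
    return numerator / denominator if denominator else default

print(safe_divide(100, 4))   # 25.0
print(safe_divide(100, 0))   # 0.0: falls back instead of raising
```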
### Bugs

- Dependencies
  - Updated boto3 to 1.36.0 and aioboto3 to 13.4.0 - PR #13665
## General Proxy Improvements

### Features

- Database
  - Removed redundant `use_prisma_migrate` flag - now the default - PR #13555
- LLM Translation
## New Contributors
- @TensorNull made their first contribution in PR #13458
- @MajorD00m made their first contribution in PR #13577
- @VerunicaM made their first contribution in PR #13584
- @huangyafei made their first contribution in PR #13607
- @TomeHirata made their first contribution in PR #13561
- @willfinnigan made their first contribution in PR #13659
- @dcbark01 made their first contribution in PR #13633
- @javacruft made their first contribution in PR #13631