# [Pre Release] v1.74.7
## Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.74.7
```

Pip:

```shell
pip install litellm==1.74.7
```
## Key Highlights

- Vector Stores - Support for VertexAI RAG Engine, PG Vector, OpenAI & Azure OpenAI Vector Stores.
- Health Check Improvements - Separate health check app on a dedicated port for better Kubernetes liveness probes.
- New LLM Providers - Added Moonshot API (`moonshot`) and `v0` provider support.
### Vector Stores API
This release introduces support for using VertexAI RAG Engine, PG Vector, Bedrock Knowledge Bases, and OpenAI Vector Stores with LiteLLM.
This is ideal for use cases requiring external knowledge sources with LLMs.
This brings the following benefits for LiteLLM users:
Proxy Admin Benefits:
- Fine-grained access control: determine which Keys and Teams can access specific Vector Stores
- Complete usage tracking and monitoring across all vector store operations
Developer Benefits:
- Simple, unified interface for querying vector stores and using them with LLM API requests
- Consistent API experience across all supported vector store providers
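As a sketch of how a proxy admin might register one of these stores in the LiteLLM config (the registry key names and the `pg_vector` provider string below are illustrative; check the Vector Stores docs for the exact schema):

```yaml
vector_store_registry:
  - vector_store_name: "support-docs"         # name keys/teams are granted access to
    litellm_params:
      vector_store_id: "my-pg-collection"     # ID of the store in the backing provider
      custom_llm_provider: "pg_vector"        # or vertex_ai, bedrock, openai, azure
```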
## New Models / Updated Models

### Pricing / Context Window Updates
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
Azure AI | azure_ai/grok-3 | 131k | $3.30 | $16.50 |
Azure AI | azure_ai/global/grok-3 | 131k | $3.00 | $15.00 |
Azure AI | azure_ai/global/grok-3-mini | 131k | $0.25 | $1.27 |
Azure AI | azure_ai/grok-3-mini | 131k | $0.275 | $1.38 |
Azure AI | azure_ai/jais-30b-chat | 8k | $3200 | $9710 |
Groq | groq/moonshotai-kimi-k2-instruct | 131k | $1.00 | $3.00 |
AI21 | jamba-large-1.7 | 256k | $2.00 | $8.00 |
AI21 | jamba-mini-1.7 | 256k | $0.20 | $0.40 |
Together.ai | together_ai/moonshotai/Kimi-K2-Instruct | 131k | $1.00 | $3.00 |
v0 | v0/v0-1.0-md | 128k | $3.00 | $15.00 |
v0 | v0/v0-1.5-md | 128k | $3.00 | $15.00 |
v0 | v0/v0-1.5-lg | 512k | $15.00 | $75.00 |
Moonshot | moonshot/moonshot-v1-8k | 8k | $0.20 | $2.00 |
Moonshot | moonshot/moonshot-v1-32k | 32k | $1.00 | $3.00 |
Moonshot | moonshot/moonshot-v1-128k | 131k | $2.00 | $5.00 |
Moonshot | moonshot/moonshot-v1-auto | 131k | $2.00 | $5.00 |
Moonshot | moonshot/kimi-k2-0711-preview | 131k | $0.60 | $2.50 |
Moonshot | moonshot/moonshot-v1-32k-0430 | 32k | $1.00 | $3.00 |
Moonshot | moonshot/moonshot-v1-128k-0430 | 131k | $2.00 | $5.00 |
Moonshot | moonshot/moonshot-v1-8k-0430 | 8k | $0.20 | $2.00 |
Moonshot | moonshot/kimi-latest | 131k | $2.00 | $5.00 |
Moonshot | moonshot/kimi-latest-8k | 8k | $0.20 | $2.00 |
Moonshot | moonshot/kimi-latest-32k | 32k | $1.00 | $3.00 |
Moonshot | moonshot/kimi-latest-128k | 131k | $2.00 | $5.00 |
Moonshot | moonshot/kimi-thinking-preview | 131k | $30.00 | $30.00 |
Moonshot | moonshot/moonshot-v1-8k-vision-preview | 8k | $0.20 | $2.00 |
Moonshot | moonshot/moonshot-v1-32k-vision-preview | 32k | $1.00 | $3.00 |
Moonshot | moonshot/moonshot-v1-128k-vision-preview | 131k | $2.00 | $5.00 |
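All prices above are dollars per 1M tokens, so the cost of a single request is a simple weighted sum; a small illustrative helper (not part of LiteLLM):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of one request, given per-1M-token rates from the table above."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# e.g. moonshot/kimi-k2-0711-preview at $0.60 input / $2.50 output per 1M tokens:
print(request_cost(12_000, 800, 0.60, 2.50))  # 0.0092
```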
### Features
- 🆕 Moonshot API (Kimi)
  - New LLM API integration for accessing Kimi models - PR #12592, Get Started
- 🆕 v0 Provider
  - New provider integration for v0.dev - PR #12751, Get Started
- OpenAI
  - Use OpenAI DeepResearch models with `litellm.completion` (`/chat/completions`) - PR #12627
  - Add `input_fidelity` parameter for OpenAI image generation - PR #12662, Get Started
- Azure OpenAI
- Anthropic
  - Tool cache control support - PR #12668
- Bedrock
  - Claude 4 `/invoke` route support - PR #12599, Get Started
  - Application inference profile tool choice support - PR #12599
- Gemini
- VertexAI
  - Added Vertex AI RAG Engine support (use with the OpenAI-compatible `/vector_stores` API) - PR #12752, Get Started
- vLLM
  - Added support for using rerank endpoints with vLLM - PR #12738, Get Started
- AI21
  - Added `ai21/jamba-1.7` model family pricing - PR #12593, Get Started
- Together.ai
  - [New Model] Add `together_ai/moonshotai/Kimi-K2-Instruct` - PR #12645, Get Started
- Groq
  - Add `groq/moonshotai-kimi-k2-instruct` model configuration - PR #12648, Get Started
- GitHub Copilot
  - Change system prompts to assistant prompts for GitHub Copilot - PR #12742, Get Started
### Bugs
- Anthropic
  - Fix streaming + response_format + tools bug - PR #12463
- XAI
  - grok-4 does not support the `stop` param - PR #12646
- AWS
  - Role chaining with web authentication for AWS Bedrock - PR #12607
- VertexAI
  - Add project_id to cached credentials - PR #12661
- Bedrock
  - Fix Bedrock Nova Micro and Nova Lite context window info - PR #12619
## LLM API Endpoints

### Features
- `/chat/completions`
  - Include tool calls in the output of `trim_messages` - PR #11517
- `/v1/vector_stores`
  - New OpenAI-compatible vector store endpoints - PR #12699, Get Started
  - Vector store search endpoint - PR #12749, Get Started
  - Support for using PG Vector as a vector store - PR #12667, Get Started
- `/streamGenerateContent`
  - Non-Gemini model support - PR #12647
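For the new search endpoint, the request shape should mirror OpenAI's `POST /v1/vector_stores/{id}/search`; a sketch of building such a request against a local proxy (the store ID, query, and `max_num_results` value are made up, and no request is actually sent here):

```python
import json

vector_store_id = "vs_123"  # hypothetical store ID
url = f"http://localhost:4000/v1/vector_stores/{vector_store_id}/search"
payload = {"query": "What is our refund policy?", "max_num_results": 3}

# Send with any HTTP client, plus an "Authorization: Bearer <key>" header.
print(url)
print(json.dumps(payload))
```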
### Bugs
- `/vector_stores`
  - Knowledge Base call returning an error when passed as `tools` - PR #12628
## MCP Gateway

### Features
- Access Groups
- Namespacing
- Gateway Features
  - Allow using MCPs with all LLM APIs (VertexAI, Gemini, Groq, etc.) when using `/responses` - PR #12546
### Bugs
- Update object permissions on key/team update and delete - PR #12701
- Include `/mcp` in the list of available routes on the proxy - PR #12612
## Management Endpoints / UI

### Features
- Keys
  - Regenerate Key state management improvements - PR #12729
- Models
- Usage Page
  - Fix Y-axis label overlap on the Spend per Tag chart - PR #12754
- Teams
- Users
  - New `/user/bulk_update` endpoint - PR #12720
- Logs Page
  - Add `end_user` filter on the UI Logs Page - PR #12663
- MCP Servers
  - Copy MCP Server name functionality - PR #12760
- Vector Stores
- General
  - Add copy-on-click for all IDs (Key, Team, Organization, MCP Server) - PR #12615
- SCIM
  - Add `GET /ServiceProviderConfig` endpoint - PR #12664
### Bugs
- Teams
## Logging / Guardrail Integrations

### Features
- Google Cloud Model Armor
  - New guardrails integration - PR #12492
- Bedrock Guardrails
  - Allow disabling the exception on the 'BLOCKED' action - PR #12693
- Guardrails AI
  - Support `llmOutput`-based guardrails as pre-call hooks - PR #12674
- DataDog LLM Observability
  - Track the correct span type based on the LLM endpoint used - PR #12652
- Custom Logging
  - Allow reading custom logger Python scripts from an S3 or GCS bucket - PR #12623
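Guardrails like the new Model Armor integration are typically wired up in the proxy config; a sketch using LiteLLM's generic guardrails schema (the `guardrail` value and parameter names here are illustrative; see the guardrails docs for the exact keys):

```yaml
guardrails:
  - guardrail_name: "google-model-armor"
    litellm_params:
      guardrail: model_armor    # illustrative provider key
      mode: pre_call            # run before the LLM call
```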
### Bugs
- General Logging
  - Track the custom LLM provider in StandardLoggingPayload on cache hits - PR #12652
- S3 Buckets
  - Fix S3 v2 log uploader crash when used with guardrails - PR #12733
## Performance / Loadbalancing / Reliability Improvements

### Features
- Health Checks
- Caching
  - Add Azure Blob cache support - PR #12587
- Router
  - Handle ZeroDivisionError with zero completion tokens in the `lowest_latency` strategy - PR #12734
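Conceptually, the `lowest_latency` fix guards the latency-per-token division when a deployment reports zero completion tokens (a sketch, not the actual router code):

```python
def latency_per_token(latency_s: float, completion_tokens: int) -> float:
    # Guard against ZeroDivisionError when no completion tokens were reported.
    if completion_tokens <= 0:
        return 0.0
    return latency_s / completion_tokens

print(latency_per_token(2.0, 0))    # 0.0
print(latency_per_token(2.0, 100))  # 0.02
```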
### Bugs
- Database
- Cache
  - Fix Redis caching for embedding response models - PR #12750
## Helm Chart
- DB Migration Hook: refactor to support `use_prisma_migrate` for the Helm hook - PR
- Add `envVars` and `extraEnvVars` support to the Helm migrations job - PR #12591
## General Proxy Improvements

### Features
- Control Plane + Data Plane Architecture
  - Control Plane + Data Plane support - PR #12601
- Proxy CLI
  - Add `keys import` command to the CLI - PR #12620
- Swagger Documentation
  - Add Swagger docs for LiteLLM `/chat/completions`, `/embeddings`, `/responses` - PR #12618
- Dependencies
  - Loosen the `rich` version pin from `==13.7.1` to `>=13.7.1` - PR #12704
### Bugs
- Fix verbose logging being enabled by default - PR #12596
- Add support for disabling callbacks in the request body - PR #12762
- Handle circular references in spend tracking metadata JSON serialization - PR #12643
## New Contributors
- @AntonioKL made their first contribution in https://github.com/BerriAI/litellm/pull/12591
- @marcelodiaz558 made their first contribution in https://github.com/BerriAI/litellm/pull/12541
- @dmcaulay made their first contribution in https://github.com/BerriAI/litellm/pull/12463
- @demoray made their first contribution in https://github.com/BerriAI/litellm/pull/12587
- @staeiou made their first contribution in https://github.com/BerriAI/litellm/pull/12631
- @stefanc-ai2 made their first contribution in https://github.com/BerriAI/litellm/pull/12622
- @RichardoC made their first contribution in https://github.com/BerriAI/litellm/pull/12607
- @yeahyung made their first contribution in https://github.com/BerriAI/litellm/pull/11795
- @mnguyen96 made their first contribution in https://github.com/BerriAI/litellm/pull/12619
- @rgambee made their first contribution in https://github.com/BerriAI/litellm/pull/11517
- @jvanmelckebeke made their first contribution in https://github.com/BerriAI/litellm/pull/12725
- @jlaurendi made their first contribution in https://github.com/BerriAI/litellm/pull/12704
- @doublerr made their first contribution in https://github.com/BerriAI/litellm/pull/12661