Skip to main content

[Pre Release] v1.74.7

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this versionโ€‹

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.7

Key Highlightsโ€‹

  • Vector Stores - Support for Vertex RAG Engine, PG Vector, OpenAI & Azure OpenAI Vector Stores.
  • Health Check Improvements - Separate health check app on dedicated port for better Kubernetes liveness probes.
  • New LLM Providers - Added Moonshot API moonshot and v0 provider support.

Vector Stores APIโ€‹

This release introduces support for using VertexAI RAG Engine, PG Vector, Bedrock Knowledge Bases, and OpenAI Vector Stores with LiteLLM.

This is ideal for use cases requiring external knowledge sources with LLMs.

This brings the following benefits for LiteLLM users:

Proxy Admin Benefits:

  • Fine-grained access control: determine which Keys and Teams can access specific Vector Stores
  • Complete usage tracking and monitoring across all vector store operations

Developer Benefits:

  • Simple, unified interface for querying vector stores and using them with LLM API requests
  • Consistent API experience across all supported vector store providers

Get started


New Models / Updated Modelsโ€‹

Pricing / Context Window Updatesโ€‹

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)
Azure AIazure_ai/grok-3131k$3.30$16.50
Azure AIazure_ai/global/grok-3131k$3.00$15.00
Azure AIazure_ai/global/grok-3-mini131k$0.25$1.27
Azure AIazure_ai/grok-3-mini131k$0.275$1.38
Azure AIazure_ai/jais-30b-chat8k$3200$9710
Groqgroq/moonshotai-kimi-k2-instruct131k$1.00$3.00
AI21jamba-large-1.7256k$2.00$8.00
AI21jamba-mini-1.7256k$0.20$0.40
Together.aitogether_ai/moonshotai/Kimi-K2-Instruct131k$1.00$3.00
v0v0/v0-1.0-md128k$3.00$15.00
v0v0/v0-1.5-md128k$3.00$15.00
v0v0/v0-1.5-lg512k$15.00$75.00
Moonshotmoonshot/moonshot-v1-8k8k$0.20$2.00
Moonshotmoonshot/moonshot-v1-32k32k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k131k$2.00$5.00
Moonshotmoonshot/moonshot-v1-auto131k$2.00$5.00
Moonshotmoonshot/kimi-k2-0711-preview131k$0.60$2.50
Moonshotmoonshot/moonshot-v1-32k-043032k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k-0430131k$2.00$5.00
Moonshotmoonshot/moonshot-v1-8k-04308k$0.20$2.00
Moonshotmoonshot/kimi-latest131k$2.00$5.00
Moonshotmoonshot/kimi-latest-8k8k$0.20$2.00
Moonshotmoonshot/kimi-latest-32k32k$1.00$3.00
Moonshotmoonshot/kimi-latest-128k131k$2.00$5.00
Moonshotmoonshot/kimi-thinking-preview131k$30.00$30.00
Moonshotmoonshot/moonshot-v1-8k-vision-preview8k$0.20$2.00
Moonshotmoonshot/moonshot-v1-32k-vision-preview32k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k-vision-preview131k$2.00$5.00

Featuresโ€‹

Bugsโ€‹


LLM API Endpointsโ€‹

Featuresโ€‹

Bugsโ€‹


MCP Gatewayโ€‹

Featuresโ€‹

Bugsโ€‹

  • Fix to update object permission on update/delete key/team - PR #12701
  • Include /mcp in list of available routes on proxy - PR #12612

Management Endpoints / UIโ€‹

Featuresโ€‹

  • Keys
    • Regenerate Key State Management improvements - PR #12729
  • Models
    • Wildcard model filter support - PR #12597
    • Fixes for handling team only models on UI - PR #12632
  • Usage Page
    • Fix Y-axis labels overlap on Spend per Tag chart - PR #12754
  • Teams
    • Allow setting custom key duration + show key creation stats - PR #12722
    • Enable team admins to update member roles - PR #12629
  • Users
  • Logs Page
    • Add end_user filter on UI Logs Page - PR #12663
  • MCP Servers
    • Copy MCP Server name functionality - PR #12760
  • Vector Stores
    • UI support for clicking into Vector Stores - PR #12741
    • Allow adding Vertex RAG Engine, OpenAI, Azure through UI - PR #12752
  • General
    • Add Copy-on-Click for all IDs (Key, Team, Organization, MCP Server) - PR #12615
  • SCIM
    • Add GET /ServiceProviderConfig endpoint - PR #12664

Bugsโ€‹

  • Teams
    • Ensure user id correctly added when creating new teams - PR #12719
    • Fixes for handling team-only models on UI - PR #12632

Logging / Guardrail Integrationsโ€‹

Featuresโ€‹

Bugsโ€‹


Performance / Loadbalancing / Reliability improvementsโ€‹

Featuresโ€‹

  • Health Checks
    • Separate health app for liveness probes - PR #12669
    • Health check app on separate port - PR #12718
  • Caching
  • Router
    • Handle ZeroDivisionError with zero completion tokens in lowest_latency strategy - PR #12734

Bugsโ€‹

  • Database
    • Use upsert for managed object table to avoid UniqueViolationError - PR #11795
    • Refactor to support use_prisma_migrate for helm hook - PR #12600
  • Cache
    • Fix: redis caching for embedding response models - PR #12750

Helm Chartโ€‹

  • DB Migration Hook: refactor to support use_prisma_migrate - for helm hook PR
  • Add envVars and extraEnvVars support to Helm migrations job - PR #12591

General Proxy Improvementsโ€‹

Featuresโ€‹

  • Control Plane + Data Plane Architecture
    • Control Plane + Data Plane support - PR #12601
  • Proxy CLI
    • Add "keys import" command to CLI - PR #12620
  • Swagger Documentation
    • Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses - PR #12618
  • Dependencies
    • Loosen rich version from ==13.7.1 to >=13.7.1 - PR #12704

Bugsโ€‹

  • Verbose log is enabled by default fix - PR #12596

  • Add support for disabling callbacks in request body - PR #12762

  • Handle circular references in spend tracking metadata JSON serialization - PR #12643


New Contributorsโ€‹

Full Changelogโ€‹