v1.63.14-stable

March 22, 2025

Krrish Dholakia

CEO, LiteLLM

Ishaan Jaffer

CTO, LiteLLM

These are the changes since v1.63.11-stable.

This release brings:

LLM Translation Improvements (MCP Support and Bedrock Application Profiles)
Perf improvements for Usage-based Routing
Streaming guardrail support via websockets
Azure OpenAI client perf fix (from previous release)

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.14-stable.patch1

Demo Instance

Here's a Demo Instance to test changes:

Instance: https://demo.litellm.ai/
Login Credentials:
- Username: admin
- Password: sk-1234

New Models / Updated Models

Azure gpt-4o - fixed pricing to latest global pricing - PR
O1-Pro - add pricing + model information - PR
Azure AI - mistral 3.1 small pricing added - PR
Azure - gpt-4.5-preview pricing added - PR

LLM Translation

New LLM Features

Bedrock: Support bedrock application inference profiles Docs
- Infer aws region from bedrock application profile id - (arn:aws:bedrock:us-east-1:...)
Ollama - support calling via /v1/completions Get Started
Bedrock - support us.deepseek.r1-v1:0 model name Docs
OpenRouter - OPENROUTER_API_BASE env var support Docs
Azure - add audio model parameter support - Docs
OpenAI - PDF File support Docs
OpenAI - o1-pro Responses API streaming support Docs
[BETA] MCP - Use MCP Tools with LiteLLM SDK Docs
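
The region inference mentioned for Bedrock application inference profiles can be pictured as reading the region field out of the profile ARN. The following is a minimal sketch, assuming the standard ARN layout (`arn:partition:service:region:account-id:resource`); the function name `region_from_bedrock_arn` is hypothetical and is not LiteLLM's API, and the account ID and profile ID in the usage lines are placeholders.

```python
from typing import Optional


def region_from_bedrock_arn(model_id: str) -> Optional[str]:
    """Return the region field of a Bedrock ARN, or None if model_id is not one.

    Hypothetical helper for illustration only; not part of LiteLLM.
    """
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = model_id.split(":")
    if len(parts) >= 6 and parts[0] == "arn" and parts[2] == "bedrock":
        return parts[3] or None
    return None


# Placeholder account ID and profile ID, used only to exercise the helper.
arn = "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/example"
print(region_from_bedrock_arn(arn))
print(region_from_bedrock_arn("us.deepseek.r1-v1:0"))
```

A plain model name such as `us.deepseek.r1-v1:0` is not an ARN, so the helper returns `None` and the caller would fall back to its configured region.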

Bug Fixes

Voyage: prompt token on embedding tracking fix - PR
Sagemaker - Fix ‘Too little data for declared Content-Length’ error - PR
OpenAI-compatible models - fix issue when calling openai-compatible models w/ custom_llm_provider set - PR
VertexAI - Embedding ‘outputDimensionality’ support - PR
Anthropic - return consistent json response format on streaming/non-streaming - PR

Spend Tracking Improvements

litellm_proxy/ - support reading litellm response cost header from proxy, when using client sdk
Reset Budget Job - fix budget reset error on keys/teams/users PR
Streaming - Prevents final chunk w/ usage from being ignored (impacted bedrock streaming + cost tracking) PR

UI

Users Page
- Feature: Control default internal user settings PR
Icons:
- Feature: Replace external "artificialanalysis.ai" icons by local svg PR
Sign In/Sign Out
- Fix: Default login when default_user_id user does not exist in DB PR

Logging Integrations

Support post-call guardrails for streaming responses Get Started
Arize Get Started
- fix invalid package import PR
- migrate to using StandardLoggingPayload for metadata, ensures spans land successfully PR
- fix logging to just log the LLM I/O PR
- Dynamic API Key/Space param support Get Started
StandardLoggingPayload - Log litellm_model_name in payload, so you can see which model name was sent to the API provider Get Started
Prompt Management - Allow building custom prompt management integration Get Started

Performance / Reliability improvements

Redis Caching - add 5s default timeout, prevents hanging redis connection from impacting llm calls PR
Allow disabling all spend updates / writes to DB - patch adds a flag to turn them off PR
Azure OpenAI - correctly re-use azure openai client, fixes perf issue from previous Stable release PR
Azure OpenAI - uses litellm.ssl_verify on Azure/OpenAI clients PR
Usage-based routing - Wildcard model support Get Started
Usage-based routing - Support batch writing increments to redis - reduces latency to match ‘simple-shuffle’ PR
Router - show reason for model cooldown on ‘no healthy deployments available’ error PR
Caching - add max value limit to an item in in-memory cache (1MB) - prevents OOM errors on large image URLs being sent through proxy PR
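
The per-item limit on the in-memory cache can be illustrated with a small sketch. It assumes the limit is enforced at write time by skipping oversized values; the class name `SizeCappedCache` and its methods are hypothetical and do not reflect LiteLLM's actual cache implementation. `sys.getsizeof` is used as a rough, shallow size estimate, which is adequate for a single large string such as a base64 image URL.

```python
import sys
from typing import Any, Dict, Optional

MAX_ITEM_BYTES = 1024 * 1024  # 1MB, the limit named in the release note


class SizeCappedCache:
    """Toy in-memory cache that refuses to store oversized values.

    Hypothetical illustration only; not LiteLLM's cache class.
    """

    def __init__(self, max_item_bytes: int = MAX_ITEM_BYTES) -> None:
        self._store: Dict[str, Any] = {}
        self.max_item_bytes = max_item_bytes

    def set(self, key: str, value: Any) -> bool:
        # Shallow size check: skip the write instead of growing memory.
        if sys.getsizeof(value) > self.max_item_bytes:
            return False
        self._store[key] = value
        return True

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)


cache = SizeCappedCache()
stored_small = cache.set("small", "hello")
stored_big = cache.set("big", "x" * (2 * 1024 * 1024))  # about 2MB
print(stored_small, stored_big)
```

Rejecting the write, rather than evicting other entries to make room, keeps one very large payload from dominating process memory.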

General Improvements

Passthrough Endpoints - support returning api-base on pass-through endpoints Response Headers Docs
SSL - support reading ssl security level from env var - Allows user to specify lower security settings Get Started
Credentials - only poll Credentials table when STORE_MODEL_IN_DB is True PR
Image URL Handling - new architecture doc on image url handling Docs
OpenAI - bump to pip install "openai==1.68.2" PR
Gunicorn - security fix - bump gunicorn==23.0.0 PR

Complete Git Diff

Here's the complete git diff

v1.63.11-stable

March 15, 2025

Krrish Dholakia

CEO, LiteLLM

Ishaan Jaffer

CTO, LiteLLM

These are the changes since v1.63.2-stable.

This release is primarily focused on:

[Beta] Responses API Support
Snowflake Cortex Support, Amazon Nova Image Generation
UI - Credential Management, re-use credentials when adding new models
UI - Test Connection to LLM Provider before adding a model

Known Issues

🚨 Known issue on Azure OpenAI - We don't recommend upgrading if you use Azure OpenAI. This version failed our Azure OpenAI load test.

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.11-stable

Demo Instance

Here's a Demo Instance to test changes:

Instance: https://demo.litellm.ai/
Login Credentials:
- Username: admin
- Password: sk-1234

New Models / Updated Models

Image Generation support for Amazon Nova Canvas Getting Started
Add pricing for Jamba new models PR
Add pricing for Amazon EU models PR
Add Bedrock Deepseek R1 model pricing PR
Update Gemini pricing: Gemma 3, Flash 2 thinking update, LearnLM PR
Mark Cohere Embedding 3 models as Multimodal PR
Add Azure Data Zone pricing PR
- LiteLLM Tracks cost for azure/eu and azure/us models

LLM Translation

New Endpoints

[Beta] POST /responses API. Getting Started

New LLM Providers

Snowflake Cortex Getting Started

New LLM Features

Support OpenRouter reasoning_content on streaming Getting Started

Bug Fixes

OpenAI: Return code, param and type on bad request error More information on litellm exceptions
Bedrock: Fix converse chunk parsing to only return empty dict on tool use PR
Bedrock: Support extra_headers PR
Azure: Fix Function Calling Bug & Update Default API Version to 2025-02-01-preview PR
Azure: Fix AI services URL PR
Vertex AI: Handle HTTP 201 status code in response PR
Perplexity: Fix incorrect streaming response PR
Triton: Fix streaming completions bug PR
Deepgram: Support io.BytesIO when handling audio files for transcription PR
Ollama: Fix "system" role has become unacceptable PR
All Providers (Streaming): Fix the string "data:" being stripped from the entire content of streamed responses PR
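
The streaming fix in the last item concerns the `data:` marker that prefixes server-sent event lines. The sketch below shows the general idea of removing that marker only from the start of a line, so that a literal "data:" inside the model's output survives. It is an illustration of this class of bug, not the code from the PR, and `strip_sse_prefix` is a hypothetical name.

```python
def strip_sse_prefix(line: str) -> str:
    """Remove a leading SSE 'data:' marker, leaving the payload untouched.

    Hypothetical helper for illustration only; not LiteLLM's implementation.
    """
    prefix = "data:"
    if line.startswith(prefix):
        # Drop only the leading marker and the whitespace that follows it.
        return line[len(prefix):].lstrip()
    return line


raw = 'data: {"content": "data: hello"}'
print(strip_sse_prefix(raw))       # marker removed, payload text preserved
print(raw.replace("data:", ""))    # naive approach also mangles the payload
```

The second print shows why a global replace is unsafe: it removes every occurrence of the marker, including the one that is part of the content.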

Spend Tracking Improvements

Support Bedrock converse cache token tracking Getting Started
Cost Tracking for Responses API Getting Started
Fix Azure Whisper cost tracking Getting Started

UI

Re-Use Credentials on UI

You can now onboard LLM provider credentials on the LiteLLM UI. Once these credentials are added, you can re-use them when adding new models. Getting Started

Test Connections before adding models

Before adding a model, you can test the connection to the LLM provider to verify that you have set up your API Base + API Key correctly.

General UI Improvements

Add Models Page
- Allow adding Cerebras, Sambanova, Perplexity, Fireworks, Openrouter, TogetherAI Models, Text-Completion OpenAI on Admin UI
- Allow adding EU OpenAI models
- Fix: Instantly show edit + deletes to models
Keys Page
- Fix: Instantly show newly created keys on Admin UI (don't require refresh)
- Fix: Allow clicking into Top Keys when showing users Top API Key
- Fix: Allow Filter Keys by Team Alias, Key Alias and Org
- UI Improvements: Show 100 Keys Per Page, Use full height, increase width of key alias
Users Page
- Fix: Show correct count of internal user keys on Users Page
- Fix: Metadata not updating in Team UI
Logs Page
- UI Improvements: Keep expanded log in focus on LiteLLM UI
- UI Improvements: Minor improvements to logs page
- Fix: Allow internal user to query their own logs
- Allow switching off storing Error Logs in DB Getting Started
Sign In/Sign Out
- Fix: Correctly use PROXY_LOGOUT_URL when set Getting Started

Security

Support for Rotating Master Keys Getting Started
Fix: Internal User Viewer permissions - don't allow the internal_user_viewer role to see the Test Key Page or Create Key Button More information on role based access controls
Emit audit logs on All user + model Create/Update/Delete endpoints Getting Started
JWT
- Support multiple JWT OIDC providers Getting Started
- Fix JWT access with Groups not working when team is assigned All Proxy Models access
Using K/V pairs in 1 AWS Secret Getting Started
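
Storing several key/value pairs in one AWS secret works because a key/value secret is held as a single JSON string. A minimal sketch of looking up one key from such a string follows; it assumes the secret string has already been fetched, and the function name `get_value_from_secret` and the example key names and values are hypothetical rather than LiteLLM's API.

```python
import json
from typing import Optional


def get_value_from_secret(secret_string: str, key: str) -> Optional[str]:
    """Return the value stored under `key` in a JSON key/value secret string.

    Hypothetical helper for illustration only; not part of LiteLLM.
    """
    try:
        data = json.loads(secret_string)
    except json.JSONDecodeError:
        # The secret is a plain string, not a set of key/value pairs.
        return None
    if not isinstance(data, dict):
        return None
    value = data.get(key)
    return None if value is None else str(value)


# Example secret body with made-up key names and values.
secret = '{"OPENAI_API_KEY": "sk-example", "AZURE_API_KEY": "az-example"}'
print(get_value_from_secret(secret, "AZURE_API_KEY"))
```

Returning `None` for a missing key or a non-JSON secret lets the caller decide whether to fall back to another source such as an environment variable.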

Logging Integrations

Prometheus: Track Azure LLM API latency metric Getting Started
Athina: Added tags, user_feedback and model_options to additional_keys which can be sent to Athina Getting Started

Performance / Reliability improvements

Redis + litellm router - Fix Redis cluster mode for litellm router PR

General Improvements

OpenWebUI Integration - display thinking tokens

Guide on getting started with LiteLLM x OpenWebUI. Getting Started
Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) Getting Started

Complete Git Diff

Here's the complete git diff
