Deploy this version

Docker:

    docker run \
      -e STORE_MODEL_IN_DB=True \
      -p 4000:4000 \
      ghcr.io/berriai/litellm:main-v1.70.1-stable

Pip:

    pip install litellm==1.70.1
Key Highlights
LiteLLM v1.70.1-stable is live now. Here are the key highlights of this release:
- Gemini Realtime API: You can now call Gemini's Live API via the OpenAI /v1/realtime API
- Spend Logs Retention Period: Enable deleting spend logs older than a certain period.
- PII Masking 2.0: Easily configure masking or blocking specific PII/PHI entities on the UI
Gemini Realtime API
This release brings support for calling Gemini's realtime models (e.g. gemini-2.0-flash-live) via OpenAI's /v1/realtime API. This is great for developers as it lets them easily switch from OpenAI to Gemini by just changing the model name.
Key Highlights:
- Support for text + audio input/output
- Support for setting session configurations (modality, instructions, activity detection) in the OpenAI format
- Support for logging + usage tracking for realtime sessions
This is currently supported via Google AI Studio. We plan to release VertexAI support over the coming week.
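As a sketch, a Gemini Live model could be exposed through the proxy with a `config.yaml` entry like the one below. The alias `gemini-realtime` and the model id `gemini-2.0-flash-live-001` are assumptions for illustration; check Google AI Studio for the current Live model name.

```yaml
model_list:
  - model_name: gemini-realtime                   # alias clients send to /v1/realtime
    litellm_params:
      model: gemini/gemini-2.0-flash-live-001     # assumed Live model id
      api_key: os.environ/GEMINI_API_KEY          # read key from environment
```

Clients can then point an OpenAI `/v1/realtime` connection at the proxy and pass `gemini-realtime` as the model name.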
Spend Logs Retention Period
This release enables deleting LiteLLM Spend Logs older than a certain period. Since we now enable storing the raw request/response in the logs, deleting old logs ensures the database remains performant in production.
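A minimal sketch of the proxy config for this, assuming the `maximum_spend_logs_retention_period` setting name from the LiteLLM docs (verify against your version):

```yaml
general_settings:
  maximum_spend_logs_retention_period: "30d"   # delete spend logs older than 30 days
```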
PII Masking 2.0
This release brings improvements to our Presidio PII Integration. As a Proxy Admin, you now have the ability to:
- Mask or block specific entities (e.g., block medical licenses while masking other entities like emails).
- Monitor guardrails in production. LiteLLM Logs will now show you the guardrail run, the entities it detected, and its confidence score for each entity.
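The same entity-level control is available in config. A hedged sketch, assuming the `pii_entities_config` field from the LiteLLM Presidio guardrail docs (the guardrail name is illustrative):

```yaml
guardrails:
  - guardrail_name: "presidio-pii"
    litellm_params:
      guardrail: presidio
      mode: "pre_call"
      pii_entities_config:
        EMAIL_ADDRESS: "MASK"      # replace emails with a placeholder
        MEDICAL_LICENSE: "BLOCK"   # reject requests containing medical licenses
```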
New Models / Updated Models
- Gemini (VertexAI + Google AI Studio)
  - Google AI Studio
    - `/realtime`
      - Gemini Multimodal Live API support
      - Audio input/output support, optional param mapping, accurate usage calculation - PR
  - VertexAI
    - `/chat/completion`
      - Fix llama streaming error, where the model response was nested in the returned streaming chunk - PR
- Ollama
  - `/chat/completion`
    - Structured responses fix - PR
- Bedrock
- Nvidia NIM
  - `/chat/completion`
    - Add tools, tool_choice, parallel_tool_calls support - PR
- Novita AI
  - New provider added for `/chat/completion` routes - PR
- Azure
  - `/image/generation`
    - Fix Azure DALL-E 3 call with custom model name - PR
- Cohere
  - `/embeddings`
    - Migrate embedding to use `/v2/embed` - adds support for `output_dimensions` param - PR
- Anthropic
  - `/chat/completion`
    - Web search tool support - native + OpenAI format - Get Started
- VLLM
  - `/embeddings`
    - Support embedding input as a list of integers
- OpenAI
  - `/chat/completion`
    - Fix b64 file data input handling - Get Started
    - Add `supports_pdf_input` to all vision models - PR
LLM API Endpoints
- Responses API
  - Fix delete API support - PR
- Rerank API
  - `/v2/rerank` now registered as `llm_api_route` - enabling non-admins to call it - PR
Spend Tracking Improvements​
/chat/completion
,/messages
/audio/transcription
/embeddings
- Azure AI - Add cohere embed v4 pricing - PR
Management Endpoints / UI
- Models
  - Ollama - adds api base param to UI
- Logs
  - Add team id, key alias, key hash filter on logs - https://github.com/BerriAI/litellm/pull/10831
  - Guardrail tracing now in Logs UI - https://github.com/BerriAI/litellm/pull/10893
- Teams
  - Patch for updating team info when the team is in an org and members are not - https://github.com/BerriAI/litellm/pull/10835
- Guardrails
  - Add Bedrock, Presidio, Lakera guardrails on UI - https://github.com/BerriAI/litellm/pull/10874
  - See guardrail info page - https://github.com/BerriAI/litellm/pull/10904
  - Allow editing guardrails on UI - https://github.com/BerriAI/litellm/pull/10907
- Test Key
  - Select guardrails to test on UI
Logging / Alerting Integrations
- StandardLoggingPayload
  - Log any `x-` headers in requester metadata - Get Started
  - Guardrail tracing now in standard logging payload - Get Started
- Generic API Logger
  - Support passing application/json header
- Arize Phoenix
- PagerDuty
  - PagerDuty is now a free feature - PR
- Alerting
  - Sending Slack alerts on virtual key/user/team updates is now free - PR
Guardrails
- Guardrails
  - New `/apply_guardrail` endpoint for directly testing a guardrail - PR
- Lakera
  - `/v2` endpoints support - PR
- Presidio
- Aim Security
  - Support for anonymization in Aim guardrails - PR
Performance / Loadbalancing / Reliability improvements
- Allow overriding all constants using a .env variable - PR
- Maximum retention period for spend logs
General Proxy Improvements
- Authentication
  - Handle `Bearer $LITELLM_API_KEY` in the `x-litellm-api-key` custom header - PR
- New Enterprise pip package - `litellm-enterprise` - fixes issue where the `enterprise` folder was not found when using the pip package
- Proxy CLI
  - Add `models import` command - PR
- OpenWebUI
  - Configure LiteLLM to parse user headers from Open WebUI
- LiteLLM Proxy w/ LiteLLM SDK
  - Option to force/always use the LiteLLM proxy when calling via the LiteLLM SDK
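For the SDK option above, one hedged sketch: per the LiteLLM proxy-provider docs, setting `USE_LITELLM_PROXY` routes SDK calls through a running proxy. The variable names and values below are taken from the docs and are worth verifying against your installed version.

```shell
# Route all litellm SDK calls through the LiteLLM proxy
export USE_LITELLM_PROXY="True"
export LITELLM_PROXY_API_BASE="http://localhost:4000"   # your proxy URL
export LITELLM_PROXY_API_KEY="sk-1234"                  # your proxy virtual key
```

With these set, existing `litellm.completion(...)` calls go to the proxy instead of directly to the provider.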
New Contributors
- @imdigitalashish made their first contribution in PR #10617
- @LouisShark made their first contribution in PR #10688
- @OscarSavNS made their first contribution in PR #10764
- @arizedatngo made their first contribution in PR #10654
- @jugaldb made their first contribution in PR #10805
- @daikeren made their first contribution in PR #10781
- @naliotopier made their first contribution in PR #10077
- @damienpontifex made their first contribution in PR #10813
- @Dima-Mediator made their first contribution in PR #10789
- @igtm made their first contribution in PR #10814
- @shibaboy made their first contribution in PR #10752
- @camfarineau made their first contribution in PR #10629
- @ajac-zero made their first contribution in PR #10439
- @damgem made their first contribution in PR #9802
- @hxdror made their first contribution in PR #10757
- @wwwillchen made their first contribution in PR #10894
Demo Instance
Here's a Demo Instance to test changes:
- Instance: https://demo.litellm.ai/
- Login Credentials:
- Username: admin
- Password: sk-1234