v1.67.4-stable - Improved User Management
Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-v1.67.4-stable
```

Pip:

```shell
pip install litellm==1.67.4.post1
```
Key Highlights
- Improved User Management: This release enables search and filtering across users, keys, teams, and models.
- Responses API Load Balancing: Route requests across provider regions and ensure session continuity.
- UI Session Logs: Group several requests to LiteLLM into a session.
Improved User Management
This release makes it easier to manage users and keys on LiteLLM. You can now search and filter across users, keys, teams, and models, and control user settings more easily.
New features include:
- Search for users by email, ID, role, or team.
- See all of a user's models, teams, and keys in one place.
- Change user roles and model access right from the Users Tab.
These changes help you spend less time on user setup and management on LiteLLM.
Responses API Load Balancing
This release introduces load balancing for the Responses API, allowing you to route requests across provider regions and ensure session continuity. It works as follows:
- If a `previous_response_id` is provided, LiteLLM will route the request to the original deployment that generated the prior response, ensuring session continuity.
- If no `previous_response_id` is provided, LiteLLM will load-balance requests across your available deployments.
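The routing behavior described above can be sketched as follows. This is an illustrative model of the technique, not LiteLLM's internal implementation; the class and deployment names are hypothetical:

```python
import itertools


class ResponsesRouter:
    """Toy sketch of routing affinity for the Responses API.

    Requests carrying a previous_response_id are pinned to the deployment
    that produced that response; all other requests are load-balanced
    round-robin across the available deployments.
    """

    def __init__(self, deployments):
        self.deployments = deployments
        self._round_robin = itertools.cycle(deployments)
        # Maps response_id -> deployment that generated it.
        self._affinity = {}

    def route(self, previous_response_id=None):
        if previous_response_id in self._affinity:
            # Session continuity: reuse the original deployment.
            return self._affinity[previous_response_id]
        # No prior session: load-balance across deployments.
        return next(self._round_robin)

    def record(self, response_id, deployment):
        # Remember which deployment produced this response.
        self._affinity[response_id] = deployment


router = ResponsesRouter(["azure-eu", "azure-us"])
first = router.route()              # load-balanced pick
router.record("resp_123", first)
pinned = router.route("resp_123")   # routed back to the same deployment
```

The key design point is that affinity only applies when a `previous_response_id` is present; fresh sessions still spread evenly across regions.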
UI Session Logs
This release allows you to group requests to the LiteLLM proxy into a session. If you specify a `litellm_session_id` in your request, LiteLLM will automatically group all logs with the same session ID. This makes it easy to track usage and request content per session.
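As a sketch of what this grouping amounts to, assume each proxy log record carries the `litellm_session_id` sent with the request. The record shapes and field values below are illustrative, not LiteLLM's actual log schema:

```python
from collections import defaultdict

# Illustrative log records; real LiteLLM logs carry many more fields.
logs = [
    {"litellm_session_id": "sess-1", "route": "/chat/completions", "tokens": 120},
    {"litellm_session_id": "sess-2", "route": "/chat/completions", "tokens": 80},
    {"litellm_session_id": "sess-1", "route": "/responses", "tokens": 45},
]

# Group log entries by session, mirroring what the UI Session Logs view shows.
sessions = defaultdict(list)
for entry in logs:
    sessions[entry["litellm_session_id"]].append(entry)

# Aggregate per-session token usage.
usage = {sid: sum(e["tokens"] for e in entries) for sid, entries in sessions.items()}
```

Grouping on a caller-supplied session ID means multi-request workflows (e.g. a chat thread) show up as one unit in the logs rather than as unrelated requests.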
New Models / Updated Models
- OpenAI
  - Added `gpt-image-1` cost tracking Get Started
  - Bug fix: added cost tracking for `gpt-image-1` when quality is unspecified PR
- Azure
  - Fixed timestamp granularities passing to Whisper in Azure Get Started
  - Added `azure/gpt-image-1` pricing Get Started, PR
  - Added cost tracking for `azure/computer-use-preview`, `azure/gpt-4o-audio-preview-2024-12-17`, `azure/gpt-4o-mini-audio-preview-2024-12-17` PR
- Bedrock
- Added support for all compatible Bedrock parameters when model="arn:.." (Bedrock application inference profile models) Get Started, PR
- Fixed incorrect system prompt transformation PR
- VertexAI / Google AI Studio
  - Allow setting `budget_tokens=0` for `gemini-2.5-flash` Get Started, PR
  - Ensure returned `usage` includes thinking token usage PR
  - Added cost tracking for `gemini-2.5-pro-preview-03-25` PR
- Cohere
- Added support for Cohere `command-a-03-2025` Get Started, PR
- SageMaker
- Added support for the `max_completion_tokens` parameter Get Started, PR
- Responses API
- Added support for GET and DELETE operations - `/v1/responses/{response_id}` Get Started
- Added session management support for all supported models PR
- Added routing affinity to maintain model consistency within sessions Get Started, PR
