[PRE-RELEASE] v1.76.0-stable - RPS Improvements

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM
Deploy this version

This release is not live yet.


New Models / Updated Models

Bugs

Features

LLM API Endpoints

Bugs

MCP Gateway

Bugs

  • Fix StreamableHTTPSessionManager .run() error - PR #13666

Vector Stores

Bugs

Management Endpoints / UI

Bugs

Features

  • Models
    • Add Search Functionality for Public Model Names in Model Dashboard - PR #13687
    • Auto-add azure/ to the deployment name in the UI - PR #13685
    • Models page row UI restructure - PR #13771
  • Notifications
    • Add new notifications toast UI everywhere - PR #13813
  • Keys
    • Fix key edit settings after regenerating a key - PR #13815
    • Require team_id when creating service account keys - PR #13873
    • Filter - show all options on filter option click - PR #13858
  • Usage
    • Fix ‘Cannot read properties of undefined’ exception on user agent activity tab - PR #13892
  • SSO
    • Free SSO usage for up to 5 users - PR #13843
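
The auto-added azure/ prefix listed under Models above can be pictured as an idempotent normalization step. This is a minimal sketch in Python, not the UI's actual code: the function name with_azure_prefix and the whitespace trimming are assumptions made for illustration.

```python
AZURE_PREFIX = "azure/"


def with_azure_prefix(deployment_name: str) -> str:
    """Return the deployment name with a single leading 'azure/' prefix.

    Hypothetical helper: trims surrounding whitespace and leaves names that
    already carry the prefix unchanged, so applying it twice is harmless.
    """
    name = deployment_name.strip()
    if name.startswith(AZURE_PREFIX):
        return name
    return AZURE_PREFIX + name
```

Calling with_azure_prefix("my-deployment") and with_azure_prefix("azure/my-deployment") both give "azure/my-deployment", which is the property a form field needs if users sometimes type the prefix themselves.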

Logging / Guardrail Integrations

Bugs

Features

Performance / Loadbalancing / Reliability improvements

Bugs

  • Cooldowns
    • Don't return raw Azure exceptions to the client (they can leak prompt content) - PR #13529
  • Auto-router
    • Ensure the relevant dependencies for auto-router exist in the LiteLLM Docker image - PR #13788
  • Model Alias
    • Fix calls made with a key that has access to a model alias - PR #13830
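
To illustrate the Cooldowns fix above, here is a minimal sketch of keeping raw provider errors out of client responses. It is not the code from the PR: UpstreamError, to_client_error, and the generic message are hypothetical names chosen for this example. The idea is that the full exception text is logged server-side only, and the client gets a generic payload.

```python
import logging

logger = logging.getLogger("proxy.sketch")

# Assumed wording; the real proxy's client-facing message may differ.
GENERIC_MESSAGE = "Upstream provider error. See server logs for details."


class UpstreamError(Exception):
    """Hypothetical stand-in for a raw provider exception."""

    def __init__(self, message: str, status_code: int = 500):
        super().__init__(message)
        self.status_code = status_code


def to_client_error(exc: Exception) -> dict:
    """Build a client-safe error payload without the raw exception text."""
    # The full message (which may echo the prompt) goes to server logs only.
    logger.error("upstream failure: %s", exc)
    status = getattr(exc, "status_code", 500)
    return {
        "error": {
            "message": GENERIC_MESSAGE,
            "type": type(exc).__name__,
            "code": status,
        }
    }
```

The status code and exception type are kept because they help clients decide whether to retry, while neither carries prompt content.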

Features

  • S3 Caching
  • Model Group header forwarding
    • Reuse the same logic as global header forwarding - PR #13741
    • Add support for hosted_vllm in the UI - PR #13885
  • Performance
    • Improve LiteLLM Python SDK throughput by +200 RPS (braintrust import + aiohttp transport fixes) - PR #13839
    • Use O(1) Set lookups for model routing - PR #13879
    • Reduce significant CPU overhead in litellm_logging.py - PR #13895
    • Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS - PR #13905
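
The O(1) set-lookup item above rests on a general Python property: membership tests on a set take constant time on average, while scanning a list is linear in its length. The sketch below shows the pattern under assumed data shapes. The deployment dictionaries, build_model_index, and is_routable are hypothetical and do not mirror the router's internals.

```python
from typing import Dict, List, Set


def build_model_index(deployments: List[Dict[str, str]]) -> Set[str]:
    """Collect model names into a set once, so later checks avoid list scans."""
    return {d["model_name"] for d in deployments}


def is_routable(model: str, index: Set[str]) -> bool:
    """Average O(1) membership test against the prebuilt set."""
    return model in index


# Hypothetical deployment list; two deployments share one public model name.
deployments = [
    {"model_name": "model-a", "deployment": "azure/model-a-eu"},
    {"model_name": "model-a", "deployment": "azure/model-a-us"},
    {"model_name": "model-b", "deployment": "hosted_vllm/model-b"},
]
index = build_model_index(deployments)
print(is_routable("model-a", index))        # True
print(is_routable("unknown-model", index))  # False
```

The index is built once and reused per request, so the cost of deduplicating names is paid up front and not on the hot path.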

General Proxy Improvements

Bugs

  • SDK
    • Fix LiteLLM compatibility with the newest release of OpenAI (>v1.100.0) - PR #13728
  • Helm
    • Allow configuring resources for the migrations-job - PR #13617
    • Ensure Helm chart auto-generated master keys follow the sk-xxxx format - PR #13871
    • Enhance database configuration: add support for optional endpointKey - PR #13763
  • Rate Limits
  • Non-root
    • Fix permission access on prisma migrate in non-root image - PR #13848, s/o @Ithanil
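
For the Helm item above about master keys following the sk-xxxx format, the sketch below shows one way to generate and check such a key in Python. The Helm chart itself uses its own templating and is not reproduced here. The key length, the character set, and the names generate_master_key and looks_like_master_key are assumptions for illustration only.

```python
import re
import secrets

# Assumed shape: "sk-" followed by URL-safe characters. The chart's exact
# character set and length are not stated in these notes.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]+")


def generate_master_key(num_bytes: int = 24) -> str:
    """Return a random key with the 'sk-' prefix."""
    return "sk-" + secrets.token_urlsafe(num_bytes)


def looks_like_master_key(key: str) -> bool:
    """Check that the whole string matches the assumed sk-xxxx shape."""
    return KEY_PATTERN.fullmatch(key) is not None
```

A check like looks_like_master_key can be run on any configured key at startup to catch values that were generated without the prefix.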