Skip to main content

v1.83.10 - Claude Opus 4.7, Prompt Compression & Multi-Window Budgets

Deploy this version​

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:main-v1.83.10-stable

Key Highlights​


New Models / Updated Models​

New Model Support (10 new models)​

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)Features
Anthropicclaude-opus-4-7, claude-opus-4-7-202604161M$5.00$25.00Chat, reasoning, vision, computer use, prompt caching, PDF input, xhigh reasoning effort
AWS Bedrockanthropic.claude-opus-4-7, us.anthropic.claude-opus-4-7, eu.anthropic.claude-opus-4-7, au.anthropic.claude-opus-4-7, global.anthropic.claude-opus-4-71M$5.50$27.50Chat, reasoning, vision, computer use, prompt caching, PDF input, native structured output
Vertex AIvertex_ai/claude-opus-4-7, vertex_ai/claude-opus-4-7@default1M$5.00$25.00Chat, reasoning, vision, computer use, prompt caching, PDF input
Azure AIazure_ai/claude-opus-4-7200K$5.00$25.00Chat, reasoning, vision, computer use, prompt caching, PDF input
Perplexityperplexity/anthropic/claude-opus-4-7---Web search, function calling (Responses mode)
Google Geminigemini/veo-3.1-lite-generate-preview1024-$0.05 / secVideo generation preview
OpenRouteropenrouter/google/gemini-3.1-flash-lite-preview1.05M$0.25$1.50Chat, code execution, file search, function calling, prompt caching, reasoning, web search, vision, video/audio/PDF input
xAIxai/grok-4.20-0309-reasoning2M$2.00$6.00Function calling, reasoning, tool choice, vision, web search
W&B Inferencewandb/MiniMaxAI/MiniMax-M2.5197K$0.30$1.20Function calling, reasoning, response schema
W&B Inferencewandb/moonshotai/Kimi-K2.5262K$0.60$3.00Function calling, reasoning, response schema, vision

Features​

  • Anthropic

    • Day-0 support for Claude Opus 4.7 across Anthropic native, Bedrock, Vertex AI, Azure AI, and Perplexity - PR #25867
    • Hotfix follow-ups for Opus 4.7 routing/version-string handling - PR #25875, PR #25876
    • Retry /v1/messages after invalid thinking signature errors - PR #25674
  • AWS Bedrock

    • Normalize custom tool JSON schema for both Invoke and Converse APIs - PR #25396
    • Bedrock API response null-type handling - PR #25810, PR #24147
    • Prevent negative streaming costs for start-only cache usage - PR #25846
    • Accurate cache token cost breakdown in UI and SpendLogs - PR #25735
    • Remove unresolved merge conflict markers in Bedrock test file - PR #25995
    • Replace flaky Bedrock gpt-oss tool-call live test with request-body mock - PR #25739
    • Mock Bedrock Moonshot tests + fix TogetherAIConfig recursion - PR #25920
    • Remove dead Bedrock clear_thinking interleaved-thinking-beta assertion - PR #25913
  • Google Vertex AI

    • Normalize Gemini finish_reason enum through map_finish_reason - PR #25337
    • Add us-south1 region for vertex_ai/qwen3-235b-a22b-instruct-2507-maas - PR #25382
    • Add vertex_ai/claude-opus-4-7 and vertex_ai/claude-opus-4-7@default cost map entries - cost map
  • Google Gemini

    • Veo 3.1 Lite pricing, video resolution usage, and tiered cost tracking - PR #25348
  • Azure AI

    • Add azure_ai/claude-opus-4-7 cost map entry - cost map
    • Populate standard_logging_object for Azure passthrough via logging hook - PR #25679
  • OpenAI

    • Omit null encoding_format for OpenAI embedding requests - PR #25395 (later reverted in PR #25698 — see Bug Fixes)
  • xAI

    • Add xai/grok-4.20-0309-reasoning cost map entry - PR #25930
  • Together AI

    • Expose reasoning effort fields in get_model_info and add together_ai/gpt-oss-120b - PR #25263
    • Replace deprecated Mixtral with serverless Qwen3.5-9B in tests - PR #25728
  • DashScope

    • Preserve cache_control for explicit prompt caching - PR #25331
  • GitHub Copilot

    • Allow overriding the default GitHub Copilot authentication endpoint - PR #25915
  • W&B Inference

    • Add Kimi-K2.5 and MiniMax-M2.5 cost map entries - PR #25409

Bug Fixes​

LLM API Endpoints​

Features​

Bugs​

  • General
    • Tighten api_key value check in credential validation - PR #25917
    • Tighten environment-reference handling in request parameters - PR #25592
    • Harden request parameter handling - PR #25827
    • Add shared path utilities and prevent directory traversal - PR #25834
    • Add URL validation for user-supplied URLs - PR #25906
    • Read guardrail config from admin metadata; fix tag-routing consistency - PR #25905
    • Enforce organization boundaries in admin operations - PR #25904
    • Resolve prometheus_helpers file/package shadow breaking /global/spend/logs - PR #26026
    • Harden CORS credentials, create_views exception handling, and spend-log cleanup loop - PR #25559
    • Prevent API key leaks in error tracebacks, logs, and alerts - PR #25117
    • Remove leading space from license public_key.pem - PR #25339
    • Cache invalidation: stop double-hashing token in bulk update and key rotation - PR #25552
    • model_max_budget silently broken for routed models - PR #25549
    • Bump 22 of 25 vulnerable dependabot-reported dependencies - PR #25442
    • Fix multiple values TypeError in get_cache_key - PR #20261
    • S3v2: use prepared URL for SigV4-signed S3 requests - PR #25074
    • Health-check reasoning-token max-token precedence - PR #25936
    • BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var - PR #25344
    • Batch-limit stale managed object cleanup to prevent 300K-row UPDATE - PR #25227
    • Preserve provider response headers in StandardLoggingPayload - PR #25807
    • Optimize DB query to prevent OOM during health checks - PR #25732
    • PodLockManager.release_lock atomic compare-and-delete (re-land #21226) - PR #24466
    • routing_strategy_args returns None when strategy is not latency-based - PR #25882
    • is_tool_name_prefixed validates against known MCP server prefixes - PR #25085
    • Persist default router end-budget across restarts - PR #25991
    • Enforce team membership in team-scoped key management checks - PR #25686
    • Agent endpoint and routing permission checks - PR #25922
    • JWT-auth key_alias=user_id for Prometheus metrics — initial fix and revert - PR #25340, PR #25438
    • Gate post-custom-auth DB lookups behind opt-in flag - PR #25634
    • Align field-level checks in user and key update endpoints - PR #25541
    • /spend/logs filter handling aligned with user scoping - PR #25594
    • Replace custom_code guardrail sandbox with RestrictedPython - PR #25818
    • Presidio: use correct text positions in anonymize_text - PR #24998

Management Endpoints / UI​

Features​

  • Virtual Keys

    • Configurable multi-threshold budget alerts (e.g. 50% / 80% / 95%) - PR #25989
    • Multiple concurrent budget windows per API key and team (#24883) - PR #25109
    • Per-member model scope + team default_team_member_models - PR #24950
    • Migrate regenerate key modal to AntD - PR #25406
    • Strip empty premium fields from key update payload - PR #26023
    • Default invite-user modal global role to least privilege - PR #25721
  • Teams

    • Allow editing router settings after team creation - PR #25398
    • Per-team opt-out for specific global guardrails - PR #25575
    • Enterprise notice banner on deleted Keys/Teams - PR #25814
    • Invalidate org queries after team mutations - PR #25812
    • E2E test for editing team model TPM/RPM limits - PR #25658
  • Models + Endpoints

    • Claude Code BYOK support in UI Settings - PR #25998
    • E2E tests for Add Model flow - PR #25590
    • Pre-select backend default for boolean guardrail provider fields - PR #25700
    • Render guardrail optional_params bool defaults in Select - PR #25806
    • Use AntD Select for MCP ToolTestPanel boolean inputs - PR #25809
    • Persist extra_headers on MCP server edit - PR #26003
    • Migrate Guardrail Test Playground from @tremor/react to AntD - PR #25749
    • Migrate router_settings page from Tremor to AntD - PR #25879
    • Reduce Tremor usage in Guardrails Monitor layout - PR #25803
    • Remove Chat UI link from Swagger docs message - PR #25727
    • Delete policy attachments via controlled modal - PR #25324
  • Auth / SSO

    • Resolve login redirect loop when reverse proxy adds HttpOnly to cookies - PR #23532
    • Gate post-custom-auth DB lookups behind opt-in flag - PR #25634
  • Logs / Activity

    • Isolate logs team-filter dropdown from root teams state bleed - PR #25716
    • Align /spend/logs filter handling with user scoping - PR #25594
  • Helm

    • Add tpl support to extraContainers and extraInitContainers - PR #25494

Bugs​

  • Strip empty premium fields from key update payload - PR #26023
  • Tighten api_key value check in credential validation - PR #25917
  • extra_headers not persisting on MCP server edit - PR #26003
  • Logs team-filter dropdown leakage - PR #25716
  • Add getCookie to cookieUtils mock in user_dashboard test - PR #25719
  • Remove deprecated tests/ui_e2e_tests/ suite - PR #25657
  • Restrict x-pass- header forwarding - PR #25916
  • Blog dark-mode text invisible on dark background - PR #25620
  • Default invite-user role least-privilege - PR #25721

AI Integrations​

Logging​

  • Prometheus

    • Add 7m and 10m latency histogram buckets - PR #25071
    • Performance improvements for Prometheus exporter - PR #25934
    • Resolve prometheus_helpers file/package shadow breaking /global/spend/logs - PR #26026
  • Azure Pass-Through

    • Populate standard_logging_object via logging hook - PR #25679
  • General

    • Preserve provider response headers in StandardLoggingPayload - PR #25807

Guardrails​

  • PromptGuard

    • New PromptGuard guardrail integration for prompt-injection detection - PR #24268
  • Custom Code Guardrails

    • Replace custom_code sandbox with RestrictedPython - PR #25818
  • Presidio

    • Use correct text positions in anonymize_text - PR #24998
  • General

    • Per-team opt-out for specific global guardrails - PR #25575
    • UI: pre-select backend default for boolean guardrail provider fields - PR #25700
    • UI: render guardrail optional_params boolean defaults in Select - PR #25806
    • Read guardrail config from admin metadata and fix tag-routing consistency - PR #25905

Caching​

  • Add Responses API params to cache key allow-list - PR #25673
  • Prevent multiple values TypeError in get_cache_key - PR #20261
  • S3v2: use prepared URL for SigV4-signed S3 requests - PR #25074

Prompt Management / Compression​

Secret Managers​

  • No new secret manager provider additions in this release.

Spend Tracking, Budgets and Rate Limiting​

  • Configurable multi-threshold budget alerts for virtual keys (e.g. 50% / 80% / 95%) - PR #25989
  • Multiple concurrent budget windows per API key and team (#24883) - PR #25109
  • Bedrock/Anthropic accurate cache token cost breakdown in UI and SpendLogs - PR #25735
  • Bedrock: prevent negative streaming costs for start-only cache usage - PR #25846
  • Fix virtual-key projected-spend soft budget alerts - PR #25838
  • Enforce project-level model-specific rate limits in parallel-request limiter - PR #25994
  • Persist default router end-budget across restarts - PR #25991
  • Align reset times for legacy entities (Team Members, End Users) with the standardized calendar - PR #25440
  • Batch-limit stale managed-object cleanup to prevent 300K-row UPDATE - PR #25227
  • Cache invalidation: stop double-hashing token in bulk update and key rotation - PR #25552
  • model_max_budget silently broken for routed models - PR #25549
  • Expose reasoning-effort fields in get_model_info (and add together_ai/gpt-oss-120b to cost map) - PR #25263
  • Veo 3.1 Lite resolution-aware tiered cost tracking - PR #25348
  • Add us-south1 region for Vertex qwen3-235b-a22b-instruct-2507-maas cost map - PR #25382

MCP Gateway​

  • Validate is_tool_name_prefixed against the set of known MCP server prefixes - PR #25085
  • Restore PKCE-triggering 401 when no stored per-user token exists - PR #26032
  • Expose per-server InitializeResult.instructions from the MCP gateway - PR #25694
  • Extract shared PKCE helpers into utils/pkce.ts - PR #25878
  • UI: AntD Select for MCP ToolTestPanel boolean inputs - PR #25809
  • UI: persist extra_headers on MCP server edit - PR #26003

Performance / Loadbalancing / Reliability improvements​

  • Prometheus exporter performance improvements - PR #25934
  • Optimize DB query to prevent OOM during health checks - PR #25732
  • PodLockManager.release_lock atomic compare-and-delete (re-land of #21226) - PR #24466
  • Health-check reasoning-token max-token precedence - PR #25936
  • New BACKGROUND_HEALTH_CHECK_MAX_TOKENS environment variable - PR #25344
  • Return None for routing_strategy_args when strategy is not latency-based - PR #25882
  • Bump proxy dependencies; raise minimum Python to 3.10 - PR #26022
  • Bump 22 of 25 vulnerable dependabot-reported dependencies - PR #25442
  • Migrate packaging, CI, and Docker from Poetry to uv - PR #25007
  • [Infra] Bump llm_translation_testing resource class to xlarge and tolerate worker restarts - PR #25887, PR #25898
  • [Infra] Expand CI branch filters for non-main PR targets - PR #25819
  • [Infra] Guard main to only accept PRs from staging and hotfix branches - PR #25733
  • [Infra] Remove unused publish_proxy_extras and prisma_schema_sync jobs from CircleCI config - PR #25821
  • fix(ci): increase test-server-root-path timeout to 30m - PR #25741
  • Remove non-existent litellm_mcps_tests_coverage from coverage combine - PR #25737
  • Helm: add tpl support to extraContainers / extraInitContainers - PR #25494
  • Advisor tool orchestration loop for non-Anthropic providers - PR #25579

Documentation Updates​

  • Cost discrepancy debugging guide - PR #25622
  • Week 2 onboarding checklist - PR #25452
  • Add "Copy Page as Markdown" + llms.txt to docs site - PR #25975
  • Docs announcement bar for Trivy compromise resolution - PR #25870
  • Restyle docs.litellm.ai/blog to engineering blog aesthetic - PR #25580
  • Ramp-style engineering blog restyle + Redis circuit breaker post - PR #25583
  • Add back arrow to blog post pages - PR #25587
  • Fallbacks image - PR #25731
  • General docs update - PR #25736
  • Backfill release notes for v1.83.3-stable and v1.83.7.rc.1 - PR #25723, PR #25726
  • Fix version shown in docs - PR #25875

New Contributors​

Full Changelog: https://github.com/BerriAI/litellm/compare/v1.83.7-stable...v1.83.10-stable


04/27/2026​

  • New Models / Updated Models: 23
  • LLM API Endpoints: 18
  • Management Endpoints / UI: 22
  • AI Integrations (Logging / Guardrails / Caching / Prompt): 16
  • Spend Tracking, Budgets and Rate Limiting: 13
  • MCP Gateway: 6
  • Performance / Loadbalancing / Reliability improvements: 17
  • Documentation Updates: 11