
3 posts tagged with "claude-3-7-sonnet"


Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.61.20-stable.

This release is primarily focused on:

  • LLM Translation improvements (further 'thinking' content support)
  • UI improvements (Error logs now shown on UI)

Info: This release will be live on 03/09/2025.

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Add supports_pdf_input for specific Bedrock Claude models (see the sketch after this list) PR
  2. Add pricing for Amazon EU models PR
  3. Fix Azure o1-mini pricing PR
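
As a quick way to check the new supports_pdf_input flag, here is a minimal sketch using litellm.get_model_info; the Bedrock model ID below is illustrative, so substitute the deployment you actually use.

import litellm

# Look up the model's entry in LiteLLM's model cost map and check the PDF input flag.
# The model ID is a placeholder - use the Bedrock Claude model you actually call.
info = litellm.get_model_info(model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0")
print(info.get("supports_pdf_input"))  # True if the model is marked as accepting PDF input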

LLM Translation

  1. Support /openai/ passthrough for Assistant endpoints. Get Started
  2. Bedrock Claude - fix tool calling transformation on the invoke route. Get Started
  3. Bedrock Claude - response_format support for Claude on the invoke route (see the sketch after this list). Get Started
  4. Bedrock - pass description if set in response_format. Get Started
  5. Bedrock - Fix passing response_format: {"type": "text"}. PR
  6. OpenAI - Handle sending image_url as a str to OpenAI. Get Started
  7. Deepseek - fix missing 'reasoning_content' on streaming. Get Started
  8. Caching - Support caching on reasoning content. Get Started
  9. Bedrock - handle thinking blocks in assistant messages. Get Started
  10. Anthropic - Return signature on streaming. Get Started
  • Note: We've also migrated from signature_delta to signature. Read more
  11. Support format param for specifying image type. Get Started
  12. Anthropic - /v1/messages endpoint - thinking param support. Get Started
  • Note: this refactors the [BETA] unified /v1/messages endpoint to work with the Anthropic API.
  13. Vertex AI - handle $id in response schema when calling Vertex AI. Get Started
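
As a rough illustration of the invoke-route response_format support mentioned in item 3, here is a minimal sketch using the litellm SDK; the model ID and prompt are placeholders, and AWS credentials are assumed to be configured in the environment.

import litellm

# The "bedrock/invoke/" prefix routes the request through Bedrock's invoke API
# rather than the converse API. The model ID is a placeholder.
response = litellm.completion(
    model="bedrock/invoke/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "List three EU capitals as a JSON object."}],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)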

Spend Tracking Improvements

  1. Batches API - Fix cost calculation to run on retrieve_batch (see the sketch after this list). Get Started
  2. Batches API - Log batch models in spend logs / standard logging payload. Get Started
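
For context, a batch retrieved through the proxy (which is what now triggers the cost calculation) can be fetched with the OpenAI SDK pointed at your LiteLLM deployment; the base URL, virtual key, and batch ID below are placeholders.

from openai import OpenAI

# Placeholders: point the client at your LiteLLM proxy and use a real virtual key / batch ID.
client = OpenAI(base_url="http://0.0.0.0:4000", api_key="sk-1234")

batch = client.batches.retrieve("batch_abc123")
print(batch.status)  # retrieving the batch now also triggers the cost calculation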

Management Endpoints / UI

  1. Virtual Keys Page
    • Allow team/org filters to be searchable on the Create Key Page
    • Add created_by and updated_by fields to Keys table
    • Show 'user_email' on key table
    • Show 100 keys per page, use full height, increase the width of the key alias column
  2. Logs Page
    • Show Error Logs on LiteLLM UI
    • Allow Internal Users to View their own logs
  3. Internal Users Page
    • Allow admin to control default model access for internal users
  4. Fix session handling with cookies

Logging / Guardrail Integrations

  1. Fix Prometheus metrics with custom metrics when keys containing a team_id make requests. PR

Performance / Loadbalancing / Reliability improvements

  1. Cooldowns - Support cooldowns on models called with client-side credentials. Get Started
  2. Tag-based Routing - ensure tag-based routing works across all endpoints (/embeddings, /image_generation, etc.). Get Started

General Proxy Improvements

  1. Raise BadRequestError when an unknown model is passed in a request
  2. Enforce model access restrictions on the Azure OpenAI proxy route
  3. Reliability fix - Handle emojis in text - fix orjson error
  4. Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
  5. Enable setting timezone information in the docker image

Complete Git Diff

Here's the complete git diff

Krrish Dholakia
Ishaan Jaffer

v1.63.0 fixes Anthropic 'thinking' response on streaming to return the signature block. Github Issue

It also renames the field in the response from signature_delta to signature, matching Anthropic's own format. Anthropic Docs

Diff

"message": {
    ...
    "reasoning_content": "The capital of France is Paris.",
    "thinking_blocks": [
        {
            "type": "thinking",
            "thinking": "The capital of France is Paris.",
-           "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+           "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
        }
    ]
}
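
If you read the signature off a streamed response via the litellm SDK, the field now appears under the new name. The sketch below is defensive about attribute access since not every chunk carries a thinking block, and the model ID and thinking budget are illustrative.

import litellm

# Stream a Claude 3.7 Sonnet response with extended thinking enabled and print the
# signature carried on thinking blocks.
stream = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    for block in getattr(delta, "thinking_blocks", None) or []:
        signature = block.get("signature") if isinstance(block, dict) else getattr(block, "signature", None)
        if signature:
            print("signature:", signature)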

Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.61.13-stable.

This release is primarily focused on:

  • LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
  • UI improvements (add model flow, user management, etc.)

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Anthropic Claude 3.7 Sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter); see the cost tracking sketch after this list
    1. Anthropic API Start here
    2. Bedrock API Start here
    3. Vertex AI API See here
    4. OpenRouter See here
  2. GPT-4.5-preview support + cost tracking See here
  3. Azure AI - Phi-4 cost tracking See here
  4. Claude-3.5-sonnet - vision support updated on Anthropic API See here
  5. Bedrock Llama vision support See here
  6. Cerebras llama3.3-70b pricing See here
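
As a minimal sketch of the cost tracking side: the model ID below follows Anthropic's API naming; use the Bedrock / Vertex AI / OpenRouter equivalents if you route through those providers instead.

import litellm

# Call claude-3-7-sonnet and compute the cost LiteLLM tracked for the response.
response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "Hello"}],
)

cost = litellm.completion_cost(completion_response=response)
print(f"request cost: ${cost:.6f}")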

LLM Translation

  1. Infinity Rerank - support returning documents when return_documents=True Start here
  2. Amazon Deepseek - <think> param extraction into 'reasoning_content' Start here
  3. Amazon Titan Embeddings - filter out 'aws_' params from the request body Start here
  4. Anthropic 'thinking' + 'reasoning_content' translation support (Anthropic API, Bedrock, Vertex AI) Start here
  5. VLLM - support 'video_url' Start here
  6. Call the proxy via the litellm SDK: Support litellm_proxy/ for embedding, image_generation, transcription, speech, and rerank (see the sketch after this list) Start here
  7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass-through routes Start here
  8. Message Translation - fix OpenAI message construction when the assistant message role is missing (OpenAI allows this)
  9. O1/O3 - support 'drop_params' for the o3-mini and o1 parallel_tool_calls param (not currently supported) See here
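
As a rough sketch of the litellm_proxy/ usage in item 6, the SDK functions can be pointed at a running proxy; the base URL, virtual key, and model names below are placeholders for whatever your proxy actually serves.

import litellm

# Placeholders: point these at your running LiteLLM proxy and models it serves.
PROXY_BASE = "http://0.0.0.0:4000"
PROXY_KEY = "sk-1234"

# Embedding routed through the proxy via the litellm_proxy/ prefix.
embedding_response = litellm.embedding(
    model="litellm_proxy/text-embedding-3-small",
    input=["hello world"],
    api_base=PROXY_BASE,
    api_key=PROXY_KEY,
)

# The same prefix now also works for image_generation, transcription, speech, and rerank.
image_response = litellm.image_generation(
    model="litellm_proxy/dall-e-3",
    prompt="a lighthouse at dusk",
    api_base=PROXY_BASE,
    api_key=PROXY_KEY,
)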

Spend Tracking Improvements

  1. Cost tracking for rerank via Bedrock See PR
  2. Anthropic pass-through - fix race condition causing cost to not be tracked See PR
  3. Anthropic pass-through: Ensure accurate token counting See PR

Management Endpoints / UI

  1. Models Page - Allow sorting models by ‘created at’
  2. Models Page - Edit Model Flow Improvements
  3. Models Page - Fix Adding Azure, Azure AI Studio models on UI
  4. Internal Users Page - Allow Bulk Adding Internal Users on UI
  5. Internal Users Page - Allow sorting users by ‘created at’
  6. Virtual Keys Page - Allow searching for UserIDs on the dropdown when assigning a user to a team See PR
  7. Virtual Keys Page - allow creating a user when assigning keys to users See PR
  8. Model Hub Page - fix text overflow issue See PR
  9. Admin Settings Page - Allow adding MSFT SSO on UI
  10. Backend - don't allow creating duplicate internal users in DB

Helm

  1. Support ttlSecondsAfterFinished on the migration job - See PR
  2. Enhance the migrations job with additional configurable properties - See PR

Logging / Guardrail Integrations

  1. Arize Phoenix support
  2. ‘No-log’ - fix ‘no-log’ param support on embedding calls

Performance / Loadbalancing / Reliability improvements

  1. Single Deployment Cooldown logic - Use allowed_fails or allowed_fail_policy if set (see the sketch below) Start here
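
A minimal sketch of allowed_fails behaviour on the Python Router (model IDs and credentials are placeholders): a deployment that exceeds the allowed failures is cooled down and traffic shifts to the remaining deployment in the group.

import os
from litellm import Router

# Two deployments in the same model group. A deployment that exceeds allowed_fails
# failures is placed in cooldown for cooldown_time seconds.
router = Router(
    model_list=[
        {
            "model_name": "claude-3-7-sonnet",
            "litellm_params": {
                "model": "anthropic/claude-3-7-sonnet-20250219",
                "api_key": os.environ["ANTHROPIC_API_KEY"],
            },
        },
        {
            "model_name": "claude-3-7-sonnet",
            "litellm_params": {
                # Placeholder Bedrock model ID - use the one enabled in your AWS account.
                "model": "bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
            },
        },
    ],
    allowed_fails=3,    # failures tolerated per deployment before cooldown
    cooldown_time=60,   # seconds a cooled-down deployment is skipped
)

response = router.completion(
    model="claude-3-7-sonnet",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)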

General Proxy Improvements

  1. Hypercorn - fix reading / parsing the request body
  2. Windows - fix running the proxy on Windows
  3. DD-Trace - fix dd-trace enablement on the proxy

Complete Git Diff

View the complete git diff here.