June Townhall Updates: 94 Bug Fixes, OCR + Realtime are in Rust, and a Zero-Regression Commitment

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Thank you to everyone who joined our June town hall.

Three numbers capture the month: 24 security fixes, 94 bug fixes, and 78 feature commits. The sections below break each one down, alongside our public commitment to zero reported regressions and the gradual migration of the LiteLLM gateway to Rust.

Security updates

Last 4 weeks: by the numbers

Metric                    Count
Vulnerabilities patched   24

Bug bounty: now live

We pay for security reports.

Automated review on every PR

Every PR gets a security pass. Look for the Veria scan: it's a required check on every PR, built on Veria AI + zizmor + semgrep. False positives are flagged, never blocking.

What's next for security

  • Invest more in the bug bounty program.
  • Improve code patterns during the stability sprint.

Stability updates

The commitment: zero reported regressions by August 29th

The goal:

  • Close 20 reported bugs in core functionality.
  • Fix root causes in 3 high-impact components.
  • Ship a public progress report alongside the August 29 release.

94 bug fixes done

Fixes shipped across five areas:

  • Proxy core & resilience: 22 fixes
  • UI + Auth / SSO: 22 fixes
  • Cost, Budgets & Observability: 21 fixes
  • MCP Gateway: 15 fixes
  • Streaming / Realtime APIs: 14 fixes

What kinds of fixes shipped:

  • Billing accuracy. Closed the gaps where spend slipped through: virtual-key limits are now enforced, and cached and tiered usage on Anthropic and Bedrock is priced correctly.
  • Identity & access. Caller identity now resolves once into a single record, so team IDs and spend attribution stay correct and auth no longer fails open on DB errors.
  • MCP reliability. Tools now list and call consistently across every auth method, with per-user credentials and proper OAuth token refresh.
  • Resource leaks. Guardrails no longer re-initialize on every request, eliminating the runner leaks, latency spikes, and OOMs they caused.
  • Resilience. Streaming requests recover cost on interruption, the proxy self-heals on dropped DB connections, and OTEL metrics no longer overload Splunk.
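
The identity and fail-closed behavior described in the list above can be illustrated with a minimal sketch. This is not LiteLLM's implementation: `CallerIdentity`, `AuthError`, `resolve_identity`, and the `lookup` callable are hypothetical names used only for illustration. The idea is that key, user, and team are resolved in one step into one record, and any failure to reach the identity store denies the request instead of letting it through.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CallerIdentity:
    """Hypothetical single record holding everything auth needs for one request."""
    key_id: str
    user_id: Optional[str]
    team_id: Optional[str]


class AuthError(Exception):
    """Raised whenever the caller cannot be positively identified."""


def resolve_identity(
    api_key: str, lookup: Callable[[str], Optional[dict]]
) -> CallerIdentity:
    """Resolve key, user, and team once; deny on any failure (fail closed)."""
    try:
        row = lookup(api_key)  # hypothetical single lookup against the identity store
    except Exception as exc:
        # A store error must never be treated as "allowed".
        raise AuthError("identity store unavailable") from exc
    if row is None:
        raise AuthError("unknown key")
    return CallerIdentity(row["key_id"], row.get("user_id"), row.get("team_id"))
```

Downstream code (spend attribution, budget checks) would then read `team_id` from the same record rather than issuing its own lookups, which is one way the attribution stays consistent.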

Root causes, not just symptoms:

  • MCP authentication: 5 separate code paths, one per auth method, caused inconsistent tool listing and calling. Fix: a single unified code path resolves credentials across all auth methods.
  • AI gateway auth: 5+ DB lookups per request to resolve key/user/team identity. Fix: caller identity resolves once into a single record, cutting lookups roughly in half.
  • UI forms: saving a form could overwrite unrelated fields. Fix: frontend and backend types are kept 100% in sync from a shared source, so only edited fields change on save.
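
To make the UI forms fix concrete, here is a minimal sketch of saving only the fields a user edited. It is an illustration under assumptions, not the LiteLLM UI code: `changed_fields`, `apply_patch`, and the example field names are hypothetical.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel so a newly added field always counts as changed


def changed_fields(loaded: Dict[str, Any], edited: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the fields whose value differs from what the form loaded."""
    return {k: v for k, v in edited.items() if loaded.get(k, _MISSING) != v}


def apply_patch(stored: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the patch into the stored record, leaving all other fields untouched."""
    merged = dict(stored)
    merged.update(patch)
    return merged


# The form loaded this record, and the user changed only the alias.
loaded = {"alias": "team-a", "budget": 10}
edited = {"alias": "team-b", "budget": 10}

# Meanwhile, another process raised the budget in the stored record.
stored_now = {"alias": "team-a", "budget": 25}

patch = changed_fields(loaded, edited)
print(patch)                           # {'alias': 'team-b'}
print(apply_patch(stored_now, patch))  # {'alias': 'team-b', 'budget': 25}
```

Sending the whole form back would have reset `budget` to 10. Sending only the diff leaves the unrelated field alone.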

Public timeline

Bug triage is open and active on GitHub issue #30484.

  • NOW: 20 bugs open in core. Triage active.
  • JULY: MCP auth unified to a single code path. AI gateway identity lookups cut in half.
  • AUGUST: UI form types synced end-to-end. No more silent field overwrites on save.
  • AUG 29: Public progress report ships with the release. Zero-regression target date.

Product updates

78 feature commits in June

Rust

  • Rust workspace · Mistral OCR bridge
  • OpenAI Realtime translation layer

Sandbox API

  • E2B + OpenSandbox
  • Unified code execution API

New models/providers

  • TinyFish · Fal.ai · Fireworks AI
  • Cloudflare Workers AI · MAI-Image-2.5

Performance: moving LiteLLM to Rust

We're migrating the LiteLLM gateway to Rust, and the early numbers make the case:

Metric                   Rust gateway   LiteLLM (Python)   Improvement
Per-request overhead     0.05 ms        7.5 ms             ~150x lower
Throughput under load    6,782 req/s    453 req/s          15x
Peak memory under load   32 MB          359 MB             11x lighter

Per-request overhead was measured at 10 concurrent clients against a local mock upstream; throughput and memory were measured under sustained load at 50 concurrent clients. A reproducible harness is checked in.
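
As a quick sanity check, the improvement factors can be recomputed from the raw figures in the table. The snippet below is plain arithmetic on the published numbers; the helper name `factor` is ours.

```python
# Figures copied from the benchmark table: (Rust gateway, LiteLLM Python).
overhead_ms = (0.05, 7.5)
throughput_rps = (6782, 453)
peak_memory_mb = (32, 359)


def factor(a: float, b: float) -> float:
    """Return how many times larger the bigger value is, rounded to one decimal."""
    return round(max(a, b) / min(a, b), 1)


print(factor(*overhead_ms))     # 150.0
print(factor(*throughput_rps))  # 15.0
print(factor(*peak_memory_mb))  # 11.2
```

The rounded results line up with the ~150x, 15x, and 11x figures quoted above.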

How the migration works: a staged rollout, moving piece by piece through four stages:

  • A pure Python SDK + FastAPI proxy.
  • Python driving Rust transforms via PyO3.
  • A FastAPI shell with pure Rust on the hot path.
  • An all-Rust async server (axum).

The rollout is gradual: one route at a time, each proven in production before the next begins. The config, database, and API stay the same, so there is nothing for you to change.

  • Aug 15: OCR routes (Mistral first, then all OCR).
  • Sep 1: /messages, then /chat/completions.
  • Sep 15: The router (load balancing, fallbacks, retries, cooldowns).
  • Dec 1: The full server, first as a FastAPI thin shell, then pure Rust (axum).
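
One way to picture a route-by-route rollout is a dispatcher that sends each route to either the existing handler or the new one, and can switch a single route without touching the others. The sketch below is an assumption-laden illustration, not the LiteLLM gateway: `StagedDispatcher` and its methods are hypothetical, and the "Rust" handlers are ordinary Python callables standing in for Rust-backed code.

```python
from typing import Callable, Dict, Set

Handler = Callable[[dict], dict]


class StagedDispatcher:
    """Hypothetical per-route switch between old and new implementations."""

    def __init__(
        self, python_handlers: Dict[str, Handler], rust_handlers: Dict[str, Handler]
    ) -> None:
        self._python = python_handlers
        self._rust = rust_handlers
        self._migrated: Set[str] = set()

    def migrate(self, route: str) -> None:
        """Start serving one route from the new implementation."""
        if route not in self._rust:
            raise KeyError(f"no new handler registered for {route}")
        self._migrated.add(route)

    def rollback(self, route: str) -> None:
        """Return one route to the old implementation."""
        self._migrated.discard(route)

    def handle(self, route: str, request: dict) -> dict:
        """Dispatch to whichever implementation currently owns the route."""
        handlers = self._rust if route in self._migrated else self._python
        return handlers[route](request)


dispatcher = StagedDispatcher(
    python_handlers={
        "/messages": lambda req: {"served_by": "python"},
        "/chat/completions": lambda req: {"served_by": "python"},
    },
    rust_handlers={"/messages": lambda req: {"served_by": "rust"}},
)
dispatcher.migrate("/messages")
print(dispatcher.handle("/messages", {}))          # {'served_by': 'rust'}
print(dispatcher.handle("/chat/completions", {}))  # {'served_by': 'python'}
```

Because callers only ever see the route, the request and response shapes stay the same whichever side serves them, which matches the "nothing for you to change" goal.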

Announcing our version policy

Going forward, we'll maintain only the four most recent stable minor releases. This takes effect next Monday, June 29th. Our focus is ensuring stability on the most up-to-date product offerings, so bookmark our Release Notes to stay current.
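
To check whether a deployment falls inside the support window, the policy can be expressed as a small helper. This is a sketch under assumptions: it treats versions as `major.minor.patch` strings, and the version numbers in the example are made up for illustration, not real LiteLLM releases.

```python
from typing import Iterable, List


def supported_minors(stable_versions: Iterable[str], keep: int = 4) -> List[str]:
    """Return the `keep` most recent minor lines, newest first."""
    # Collapse patch releases into their (major, minor) line.
    minors = {tuple(int(part) for part in v.split(".")[:2]) for v in stable_versions}
    newest = sorted(minors, reverse=True)[:keep]
    return [f"{major}.{minor}" for major, minor in newest]


# Illustrative version numbers only.
releases = ["9.1.0", "9.2.0", "9.2.3", "9.3.1", "9.4.0", "9.10.2"]
print(supported_minors(releases))  # ['9.10', '9.4', '9.3', '9.2']
```

Parsing the parts as integers matters: a plain string sort would rank `9.10` below `9.2`.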

What's next

Thank you again for all the questions and feedback. We'll keep sharing concrete progress updates as these efforts ship, especially as we approach the August 29 zero-regression milestone.

Hiring

We are actively hiring across several roles. Apply here if you're interested!

Thank you for using LiteLLM - Krrish & Ishaan