Blog
Skip to main content

Launching LiteLLM-Rust: A Minimal Rust AI Gateway for Coding Agents

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Last Updated: June 2026

Today we're launching LiteLLM-Rust โ€” a minimal Rust-based AI Gateway for coding agents.

Repo: github.com/LiteLLM-Labs/litellm-rust

It does three things:

  1. LiteLLM-compatible โ€” your existing config.yaml and database work out of the box.
  2. Fast โ€” targeting <1ms overhead latency on Claude Code calls.
  3. Built for autonomous agents โ€” sandboxing via E2B and Daytona today, with durable sessions, memory, artifacts, and vault on the roadmap.

LiteLLM-compatible AI Gatewayโ€‹

LiteLLM-Rust reads the same config.yaml format and the same database schema as the Python LiteLLM AI Gateway. Keys, virtual keys, teams, budgets, routing rules, and fallbacks carry over without changes. Client SDKs and admin workflows stay the same.

Drop-in migration โ€” same config, same DB

Before

litellm (Python)
config.yaml
Postgres DB
Client SDKs
swap binary

After

litellm-rust
config.yaml (unchanged)
Postgres DB (unchanged)
Client SDKs (unchanged)
Only the runtime changes โ€” config, DB schema, and client contract are identical
litellm-rust --config /etc/litellm/config.yaml --port 4000

Same interface contract โ€” only the runtime changes.

Fast: <1ms overhead on Claude Code callsโ€‹

Coding agents like Claude Code fan out many LLM calls per task. Every millisecond of gateway overhead compounds across tool calls. Our goal with LiteLLM-Rust is sub-millisecond overhead on the hot path, by removing Python from request forwarding entirely.

Gateway overhead per Claude Code call

LiteLLM-Rust (target)
<1ms
Python (typical)
ms-scale
Per-call overhead compounds across dozens of tool calls in a single agent run
Sub-millisecond target on the hot path โ€” Python removed from request forwarding

Built for autonomous agentsโ€‹

A reliable AI Gateway for coding agents needs more than fast forwarding. It needs to schedule, sandbox, and persist state for long-running agent runs.

LiteLLM-Rust ships today with:

  • Sandboxing via E2B and Daytona โ€” agents run in isolated environments, no host access
  • Claude Code scheduling โ€” kick off agent runs on cron, webhook, or API trigger

On the roadmap:

  • Durable sessions โ€” resume long-running agents across restarts
  • Memory โ€” persistent context across runs
  • Artifacts โ€” store and retrieve agent outputs
  • Vault โ€” secrets management for agent execution

Coding-agent runtime in the gateway

Trigger (cron ยท webhook ยท API)
LiteLLM-Rust Gateway
Sandbox
E2BDaytona
Claude Code runs isolated
Roadmap
durable sessionsmemoryartifactsvault
Results streamed back to caller
One runtime: gateway, scheduler, sandbox โ€” proxying LLM calls and running the agents that make them

Key Takeawaysโ€‹

  • LiteLLM-Rust is a minimal, MIT-licensed Rust AI Gateway built for coding agents
  • Drop-in compatible with your existing LiteLLM config.yaml and database
  • Sub-millisecond overhead is the performance target on Claude Code calls
  • Sandboxing (E2B + Daytona) ships today; durable sessions, memory, artifacts, and vault on the roadmap
  • Early and experimental โ€” feedback welcome on Discord

Frequently Asked Questionsโ€‹

Is it open-source?โ€‹

Yes. 100% open-source under the MIT license. Repo: github.com/LiteLLM-Labs/litellm-rust.

Is it part of my existing LiteLLM deployment?โ€‹

No. LiteLLM-Rust is a separate repo. Goal is to explore the design space safely and bring the learnings back to the core LiteLLM project over time.

How mature is it?โ€‹

Early and experimental. We're shipping it to gather feedback from coding-agent teams running it against real workloads. Please join the Discord and tell us what's working and what's missing.

How is this different from the existing Python LiteLLM AI Gateway?โ€‹

The Python LiteLLM AI Gateway is the production-grade, feature-complete AI Gateway used by enterprise deployments today โ€” and remains the recommended choice. LiteLLM-Rust is a minimal, performance-focused exploration aimed at coding-agent workloads. For teams with strict uptime and compliance requirements, LiteLLM Enterprise on the Python AI Gateway provides SSO/SCIM, air-gapped deployment, 24/7 SLA support, and advanced guardrails.


Newsletter

Get new posts in your inbox