✨ Enterprise Quickstart
Use this guide if you are on an Enterprise trial to evaluate LiteLLM as a unified LLM, MCP, and Agent gateway with enterprise controls and budget enforcement.
- Free trial: 7-day enterprise license
- Talk to us: Book a demo
- SSO is free for up to 5 users. Beyond that, an enterprise license is required.
- Full feature catalog: Enterprise
Deploy + Shared Setup
All gateway and budget tests share one deployment and one org/team/key. Do this section first.
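Two deployment options are covered: Self-Hosted (the Docker Compose steps that follow) and LiteLLM Cloud (fully managed; see the LiteLLM Cloud notes after the self-hosted steps).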
Prerequisites
- Docker + Docker Compose
- Postgres: required for Admin UI, virtual keys, MCP/Agent registries, and budget tracking
- An LLM provider API key (OpenAI, Azure, Anthropic, etc.)
- Your Enterprise license key from the 7-day trial
Deploy with Docker Compose
Follow the Docker Compose tab in the Getting Started Tutorial. Condensed steps:
docker pull ghcr.io/berriai/litellm-database:main-latest
curl -O https://raw.githubusercontent.com/BerriAI/litellm/main/docker-compose.yml
Create .env:
LITELLM_MASTER_KEY="sk-1234"
LITELLM_SALT_KEY="sk-salt-change-me"
LITELLM_LICENSE="eyJ..."
OPENAI_API_KEY="your-api-key"
Create config.yaml:
model_list:
  - model_name: gpt-5.5
    litellm_params:
      model: openai/gpt-5.5
      api_key: os.environ/OPENAI_API_KEY
litellm_settings:
  callbacks: ["prometheus"]
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: "postgresql://llmproxy:dbpassword9090@db:5432/litellm"
  store_model_in_db: true
docker compose up
Verify Enterprise Edition
Open http://localhost:4000/ in a browser; the Swagger page should show "Enterprise Edition" in the description. See the Enterprise license FAQ.
Open the Admin UI at http://localhost:4000/ui and sign in with your master key.
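The Swagger check can also be scripted. The following is a minimal sketch, assuming the proxy serves its OpenAPI schema at /openapi.json (the FastAPI default path, which this guide does not confirm) and that the "Enterprise Edition" text appears in the schema's info.description field. The function names are illustrative.

```python
import json
import urllib.request

PROXY_BASE_URL = "http://localhost:4000"  # same address used throughout this guide


def is_enterprise_edition(openapi_doc: dict) -> bool:
    """Return True if the schema description mentions "Enterprise Edition"."""
    description = openapi_doc.get("info", {}).get("description") or ""
    return "Enterprise Edition" in description


def fetch_openapi(base_url: str = PROXY_BASE_URL) -> dict:
    """Download the OpenAPI schema. The /openapi.json path is an assumption."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)
```

Calling is_enterprise_edition(fetch_openapi()) against a running proxy should return True once the license is picked up; if it returns False, check the page in a browser before concluding the license is missing.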
LiteLLM Cloud
LiteLLM Cloud is fully managed: we run the proxy and Postgres.
- Request access: Book a demo or contact your LiteLLM account team
- Compliance: SOC 2 Type 2 and ISO 27001 (see Data Security)
- Regions: Supported data regions
Enterprise features are pre-enabled. Use the Admin UI URL your account team provides; no LITELLM_LICENSE is needed.
Shared tenant setup
Complete these steps in the Admin UI before starting the gateway tracks.
| Step | Action | Why |
|---|---|---|
| 1 | Create an Organization and a Team | RBAC baseline for the PoC |
| 2 | Set team max_budget (e.g. $10, duration 30d) | Validates team-level spend envelope |
| 3 | Create a team-scoped virtual key with model access | All gateway tests use this key, not the master key |
| 4 | Note the team ID and virtual key in a scratchpad | Reused in Sections 1-4 |
→ Multi-tenant Architecture · Virtual Keys
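The guide performs these steps in the Admin UI. If you prefer to script steps 2 and 3, the sketch below builds the request bodies and posts them with the master key. It assumes the proxy's /team/new and /key/generate management endpoints accept the field names shown (team_alias, max_budget, budget_duration, team_id, models); treat those names and the helper functions as assumptions to check against your proxy version.

```python
import json
import urllib.request


def build_team_payload(alias: str, max_budget: float, budget_duration: str) -> dict:
    """Body for creating a team with a spend envelope (field names assumed)."""
    return {
        "team_alias": alias,
        "max_budget": max_budget,
        "budget_duration": budget_duration,
    }


def build_key_payload(team_id: str, models: list[str]) -> dict:
    """Body for a team-scoped virtual key limited to the given models."""
    return {"team_id": team_id, "models": models}


def post_json(base_url: str, path: str, master_key: str, payload: dict) -> dict:
    """POST a JSON body to the proxy and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

For example, post_json("http://localhost:4000", "/team/new", "sk-1234", build_team_payload("poc-team", 10, "30d")) would create the team; pass the team ID from that response to build_key_payload and post it to /key/generate. Creating the Organization is left to the Admin UI here.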
1. Validate LLM Gateway
Prove LiteLLM routes LLM requests through your virtual key, tracks spend, and enforces RBAC.
Steps
- Confirm model gpt-5.5 (or your model) appears in model_list (config or Admin UI → Models).
- Test with your master key:
curl -X POST 'http://localhost:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "Hello from LiteLLM Enterprise Gateway"}]
}'
- Use your team virtual key: repeat the same request with the key from shared setup.
- Verify response: expect 200 OK; the assistant text is in choices[0].message.content.
- Verify logs: open the Logs tab; confirm key, team, model, latency, and spend appear.
- Verify team spend: open the Teams tab → select your team; confirm spend incremented toward max_budget.
→ Virtual Keys · Gateway Quickstart · Role-Based Access Control
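The request and response checks above can be reproduced from Python. This is a minimal sketch using only the standard library; it mirrors the curl call from the steps, and the helper names (build_chat_request, extract_reply, chat) are illustrative rather than part of LiteLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST /chat/completions request as the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of choices[0].message.content."""
    return response_body["choices"][0]["message"]["content"]


def chat(base_url: str, api_key: str, model: str, prompt: str) -> str:
    """Send the request; urlopen raises HTTPError on any non-2xx status."""
    request = build_chat_request(base_url, api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return extract_reply(json.load(resp))
```

Run chat("http://localhost:4000", "sk-team-key", "gpt-5.5", "Hello from LiteLLM Enterprise Gateway") once with the master key and once with the team virtual key, replacing the placeholder key with your own.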
2. MCP Gateway
Prove LiteLLM registers MCP servers, enforces per-key access, routes tool calls, and tracks MCP cost.
Steps
- Register MCP server: Admin UI → MCP Servers → Add New MCP Server:
  - Name: deepwiki
  - URL: https://mcp.deepwiki.com/mcp
  - Transport: HTTP
  Or add to config.yaml:
mcp_servers:
  - server_name: deepwiki
    url: https://mcp.deepwiki.com/mcp
    transport: http
    available_on_public_internet: true
- Assign to team/key: under MCP Settings on the virtual key or team, allow the deepwiki server. See MCP Permission Management.
- List tools: confirm tools appear in the Admin UI under MCP Servers → MCP Tools.
- Invoke via /v1/chat/completions:
curl -X POST 'http://localhost:4000/v1/chat/completions' \
-H 'Authorization: Bearer sk-team-key' \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "TLDR of BerriAI/litellm repo"}],
"tools": [{
"type": "mcp",
"server_url": "litellm_proxy/deepwiki",
"server_label": "deepwiki",
"require_approval": "never"
}]
}'
- Verify response: it contains tool output and an assistant summary.
- Verify logs: the Logs tab shows the MCP tool call with namespaced tool name and cost.
→ MCP Overview · MCP Permission Management · Using your MCP
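When testing several MCP servers, it can help to generate the tools array instead of hand-editing JSON. The sketch below reproduces the body of the curl example above; it assumes the litellm_proxy/<server name> pattern shown for deepwiki applies to any server registered under that name, and the helper names are illustrative.

```python
import json


def build_mcp_tool(server_name: str, require_approval: str = "never") -> dict:
    """One MCP tool entry pointing at a server registered on the proxy."""
    return {
        "type": "mcp",
        # Assumption: the proxy resolves registered servers under this prefix.
        "server_url": f"litellm_proxy/{server_name}",
        "server_label": server_name,
        "require_approval": require_approval,
    }


def build_mcp_chat_body(model: str, prompt: str, server_names: list[str]) -> dict:
    """Chat completion body that exposes the named MCP servers as tools."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [build_mcp_tool(name) for name in server_names],
    }


def to_json(body: dict) -> str:
    """Serialize the body for use with curl -d or an HTTP client."""
    return json.dumps(body, indent=2)
```

to_json(build_mcp_chat_body("gpt-5.5", "TLDR of BerriAI/litellm repo", ["deepwiki"])) yields the same payload as the curl example, ready to send to /v1/chat/completions with the team virtual key.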
3. Agent Gateway
Prove LiteLLM registers A2A agents, enforces per-key access, invokes agents, and tracks agent-attributed spend.
Steps
- Deploy a sample agent: use Multi-agent collaboration using A2A (a simple deployable A2A agent with streaming support).
- Register in Admin UI: Agents tab → Add Agent → enter name and URL.
- Assign to team/key: under Agent Settings on the virtual key, allow the agent. See Agent Permission Management.
- List agents:
curl -H 'Authorization: Bearer sk-team-key' \
'http://localhost:4000/v1/agents'
- Invoke via the A2A SDK:
import asyncio
from uuid import uuid4

import httpx
from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest

LITELLM_BASE_URL = "http://localhost:4000"
LITELLM_VIRTUAL_KEY = "sk-team-key"  # replace with your team virtual key


async def main():
    headers = {"Authorization": f"Bearer {LITELLM_VIRTUAL_KEY}"}
    async with httpx.AsyncClient(headers=headers) as client:
        # List the agents this key may access and pick the first one.
        agents = (await client.get(f"{LITELLM_BASE_URL}/v1/agents")).json()
        agent_id = agents[0]["agent_id"]
        base_url = f"{LITELLM_BASE_URL}/a2a/{agent_id}"

        # Resolve the agent card through the proxy, then build the A2A client.
        resolver = A2ACardResolver(httpx_client=client, base_url=base_url)
        a2a_client = A2AClient(
            httpx_client=client,
            agent_card=await resolver.get_agent_card(),
        )

        response = await a2a_client.send_message(
            SendMessageRequest(
                id=str(uuid4()),
                params=MessageSendParams(
                    message={
                        "role": "user",
                        "parts": [{"kind": "text", "text": "Hello, what can you do?"}],
                        "messageId": uuid4().hex,
                    }
                ),
            )
        )
        # model_dump() has no indent argument; model_dump_json() does.
        print(response.model_dump_json(exclude_none=True, indent=2))


asyncio.run(main())
- Verify logs: the Logs tab shows key, team, latency, and agent-attributed cost. Cost counts toward the team/key spend configured in the shared tenant setup (Section 0).
→ Agent Gateway Overview · Invoking A2A Agents · Agent Cost Tracking
4. Budgets & Spend
Budget enforcement runs on all three gateways through the same virtual key: one control plane governs LLM, MCP, and Agent spend.
4a. Key budget + rate limits
- Create a test key with a tight budget and RPM limit:
curl -X POST 'http://localhost:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"max_budget": 0.01,
"rpm_limit": 1,
"team_id": "<your-team-id>"
}'
- First request with the new key: 200 OK.
- Second request within the same minute: rate limit error (RPM exceeded).
- Confirm key spend in the Admin UI under Virtual Keys.
→ Virtual Keys · Docker Quick Start (RPM test)
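The two-request rate limit check can be scripted as follows. This is a sketch under the assumption that the proxy reports an exceeded RPM limit as HTTP 429 (the guide only says "rate limit error"); the helper names are illustrative.

```python
import json
import urllib.error
import urllib.request


def build_test_key_payload(team_id: str, max_budget: float = 0.01, rpm_limit: int = 1) -> dict:
    """Body for /key/generate, matching the curl example above."""
    return {"max_budget": max_budget, "rpm_limit": rpm_limit, "team_id": team_id}


def classify_status(status: int) -> str:
    """Map an HTTP status to the outcome the test expects."""
    if status == 200:
        return "ok"
    if status == 429:  # assumption: RPM exceeded is returned as 429
        return "rate_limited"
    return "other"


def request_status(base_url: str, api_key: str, model: str) -> int:
    """Send one chat completion and return the HTTP status code."""
    body = {"model": model, "messages": [{"role": "user", "content": "Hello"}]}
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still what we want.
        return err.code
```

Calling classify_status(request_status(...)) twice within one minute with the new key should give "ok" followed by "rate_limited". An "other" result on the second call is worth inspecting, since a budget error can surface before the rate limit does.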
4b. Team budget
Team max_budget was set in the shared tenant setup (Section 0). After completing Sections 1-3:
- Open the Teams tab → select your PoC team.
- Confirm spend accumulated across LLM, MCP, and Agent calls.
- Optional negative test: set team max_budget very low (e.g. $0.0001), make one LLM call, and confirm a budget-exceeded error.
4c. Tag budget
- Add tag_budget_config to config.yaml and restart the proxy:
litellm_settings:
  tag_budget_config:
    poc:chat-app:
      max_budget: 0.000000000001
      budget_duration: 1d
- Make a tagged request:
curl -X POST 'http://localhost:4000/chat/completions' \
-H 'Authorization: Bearer sk-team-key' \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "Hello"}],
"metadata": {"tags": ["poc:chat-app"]}
}'
- The first call succeeds; a second call with the same tag fails with a budget-exceeded error.
- Query tag spend:
curl -X GET 'http://localhost:4000/spend/tags' \
-H 'Authorization: Bearer sk-1234'
Verify: response lists poc:chat-app with total_spend and log_count.
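To check the tag programmatically, the response can be filtered for the PoC tag. The sketch below assumes /spend/tags returns a JSON list of rows and that each row names its tag in an individual_request_tag field; only total_spend and log_count are confirmed by this guide, so verify the tag field name against your own response.

```python
import json
import urllib.request
from typing import Optional


def find_tag_spend(rows: list[dict], tag: str) -> Optional[dict]:
    """Return the spend row for a tag, or None if the tag is absent."""
    for row in rows:
        # Assumption: the tag name is stored under "individual_request_tag".
        if row.get("individual_request_tag") == tag:
            return row
    return None


def fetch_tag_spend(base_url: str, master_key: str) -> list[dict]:
    """GET /spend/tags with the master key, as in the curl example."""
    request = urllib.request.Request(
        f"{base_url}/spend/tags",
        headers={"Authorization": f"Bearer {master_key}"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

find_tag_spend(fetch_tag_spend("http://localhost:4000", "sk-1234"), "poc:chat-app") should return a row whose total_spend and log_count reflect the tagged requests, or None if the tag was never recorded.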
Explore next: Projects · Temporary budget increases · Soft budget alerts · Spend reports · Budget Routing · Enterprise Spend Tracking
5. Enterprise Controls
Layer security and compliance on top of working gateways and budgets.
Audit logs
Enable via store_audit_logs: true under litellm_settings in your config.yaml. Delete a virtual key via API or UI, then check the Audit Logs tab.
→ Audit Logs
Team/key guardrails
- Guardrails: create a guardrail (secret detection or content moderation)
- Policies: attach the guardrail to a team or key
- Send a request that should be blocked; confirm the guardrail fires
→ Guardrail Policies · Guardrails Quick Start
SSO for Admin UI
SSO controls Admin UI login and is separate from API auth (virtual keys or JWT). Register this redirect URI in your IdP:
https://<your-proxy-base-url>/sso/callback
- Google
GOOGLE_CLIENT_ID="<your-client-id>"
GOOGLE_CLIENT_SECRET="<your-client-secret>"
PROXY_BASE_URL="https://<your-proxy-base-url>"
- Microsoft
MICROSOFT_CLIENT_ID="<your-client-id>"
MICROSOFT_CLIENT_SECRET="<your-client-secret>"
MICROSOFT_TENANT="<your-tenant-id>"
PROXY_BASE_URL="https://<your-proxy-base-url>"
- Okta / Generic OIDC
GENERIC_CLIENT_ID="<your-client-id>"
GENERIC_CLIENT_SECRET="<your-client-secret>"
GENERIC_AUTHORIZATION_ENDPOINT="https://<your-idp>/oauth2/v1/authorize"
GENERIC_TOKEN_ENDPOINT="https://<your-idp>/oauth2/v1/token"
GENERIC_USERINFO_ENDPOINT="https://<your-idp>/oauth2/v1/userinfo"
PROXY_BASE_URL="https://<your-proxy-base-url>"
Verify: sign in to the Admin UI through your identity provider.
Also available: Custom SSO · CLI SSO · SCIM provisioning
→ SSO for Admin UI
JWT/OIDC Auth
Authenticate application requests with your identity provider's JWT tokens instead of static virtual keys.
Secret manager
Point LiteLLM at your secret manager so provider keys are read from the vault instead of config files.
7. Additional Enterprise Value
8. Need Help?
Every Enterprise license includes a dedicated Slack or Teams channel with our engineering team. Reach out to us at support@berri.ai and we'll be happy to help.
See Professional Support.