Skip to main content

🐳 Docker, Deploying LiteLLM Proxy

You can find the Dockerfile to build litellm proxy here

Quick Start

See the latest available ghcr docker image here:

docker pull
docker run

Deploy with Database

We maintain a seperate Dockerfile for reducing build time when running LiteLLM proxy with a connected Postgres Database

docker pull docker pull
docker run --name litellm-proxy \
-e DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> \
-p 4000:4000 \

Your OpenAI proxy server is now running on

Advanced Deployment Settings

Customization of the server root path


In a Kubernetes deployment, it's possible to utilize a shared DNS to host multiple applications by modifying the virtual service

Customize the root path to eliminate the need for employing multiple DNS configurations during deployment.

👉 Set SERVER_ROOT_PATH in your .env and this will be set as your server root path

Setting SSL Certification

Use this, If you need to set ssl certificates for your on prem litellm proxy

Pass ssl_keyfile_path (Path to the SSL keyfile) and ssl_certfile_path (Path to the SSL certfile) when starting litellm proxy

docker run \
--ssl_keyfile_path ssl_test/keyfile.key \
--ssl_certfile_path ssl_test/certfile.crt

Provide an ssl certificate when starting litellm proxy server

Platform-specific Guide

Deploy on Google Cloud Run

Click the button to deploy to Google Cloud Run


Testing your deployed proxy

Assuming the required keys are set as Environment Variables is our example proxy, substitute it with your deployed cloud run app

curl \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "Say this is a test!"}],
"temperature": 0.7


Run with docker compose

Step 1

Here's an example docker-compose.yml file

version: "3.9"
context: .
target: runtime
- "8000:8000" # Map the container port to the host, change the host port if necessary
- ./litellm-config.yaml:/app/config.yaml # Mount the local configuration file
# You can change the port or number of workers as per your requirements or pass any new supported CLI augument. Make sure the port passed here matches with the container port defined above in `ports` value
command: [ "--config", "/app/config.yaml", "--port", "8000", "--num_workers", "8" ]

# of your docker-compose config if any

Step 2

Create a litellm-config.yaml file with your LiteLLM config relative to your docker-compose.yml file.

Check the config doc here

Step 3

Run the command docker-compose up or docker compose up as per your docker installation.

Use -d flag to run the container in detached mode (background) e.g. docker compose up -d

Your LiteLLM container should be running now on the defined port e.g. 8000.

LiteLLM Proxy Performance

LiteLLM proxy has been load tested to handle 1500 req/s.

Throughput - 30% Increase

LiteLLM proxy + Load Balancer gives 30% increase in throughput compared to Raw OpenAI API

Latency Added - 0.00325 seconds

LiteLLM proxy adds 0.00325 seconds latency as compared to using the Raw OpenAI API