

# Locust Load Test LiteLLM Proxy

1. Add `fake-openai-endpoint` to your proxy `config.yaml` and start your LiteLLM proxy (a sample start command follows the config). LiteLLM provides a free hosted fake-openai-endpoint you can load test against:

```yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
```
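
To start the proxy with this config, you can point the `litellm` CLI at it; this assumes the config above is saved as `config.yaml`:

```shell
litellm --config config.yaml
```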
2. Install Locust: `pip install locust`

3. Create a file called `locustfile.py` on your local machine. Copy the contents from the LiteLLM load test located here.
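
If you want a starting point before pulling in the full file, a minimal `locustfile.py` might look like the sketch below. The `/chat/completions` route, the model name, and the `sk-1234` key are assumptions here; adapt them to your deployment.

```python
# A minimal sketch of a Locust user for a LiteLLM proxy.
# Assumptions (adapt to your deployment): the proxy exposes the
# OpenAI-compatible /chat/completions route, /health/readiness is
# reachable without auth, and "sk-1234" is a valid proxy key.
from locust import HttpUser, task, between


class LiteLLMProxyUser(HttpUser):
    wait_time = between(0.5, 1)  # pause 0.5-1s between tasks per user

    @task
    def chat_completion(self):
        # Send a chat completion to the fake-openai-endpoint model
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai-endpoint",
                "messages": [{"role": "user", "content": "hello"}],
            },
            headers={"Authorization": "Bearer sk-1234"},
        )

    @task
    def readiness(self):
        # The /health/readiness timings in the expected results below
        # come from an endpoint like this one
        self.client.get("/health/readiness")
```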

4. Start Locust. Run `locust` in the same directory as your `locustfile.py` from step 3:

```shell
locust
```

Output on terminal:

```
[2024-03-15 07:19:58,893] Starting web interface at http://0.0.0.0:8089
[2024-03-15 07:19:58,898] Starting Locust 2.24.0
```
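
If your `locustfile.py` lives elsewhere or has a different name, Locust's `-f` flag lets you point at it explicitly (the path below is a placeholder):

```shell
locust -f path/to/locustfile.py
```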
5. Run load test on Locust

Head to the Locust UI at http://0.0.0.0:8089.

Set Users=100, Ramp Up Users=10, and Host=the base URL of your LiteLLM proxy.
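
Alternatively, you can launch the same test headless from the CLI. This sketch assumes the proxy is listening on port 4000 (LiteLLM's default; adjust if yours differs):

```shell
# 100 users, spawned at 10 per second, no web UI
locust --headless -u 100 -r 10 --host http://0.0.0.0:4000
```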

6. Expected Results

Expect to see the following response times for `/health/readiness`:

  * Median → 150ms
  * Avg → 219ms
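
To spot-check the endpoint these timings refer to, you can hit it directly (again assuming the proxy is on port 4000):

```shell
curl http://0.0.0.0:4000/health/readiness
```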