EmpirioLabs AI

Overview

Property | Details
Description | EmpirioLabs AI hosts open, proprietary, and custom models behind one OpenAI-compatible API with pay-as-you-go pricing across text, image, video, audio, search, and 3D endpoints.
Provider Route on LiteLLM | empiriolabs/
Link to Provider Doc | EmpirioLabs AI Documentation
Base URL | https://api.empiriolabs.ai/v1
Supported Operations | /chat/completions, /responses


LiteLLM supports all EmpirioLabs chat models. Set empiriolabs/ as the model-name prefix when sending completion requests.
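
As a small illustration of the prefix rule, the helper below turns a bare EmpirioLabs model name into the string passed as `model`. `to_litellm_model` is a hypothetical name and not part of LiteLLM. This is a sketch, assuming the bare name carries no other provider prefix.

```python
PROVIDER_PREFIX = "empiriolabs/"  # provider route from the overview table


def to_litellm_model(model_name: str) -> str:
    """Return the model name with the empiriolabs/ prefix, adding it only if missing.

    Hypothetical helper for illustration; LiteLLM itself only needs the final string.
    """
    name = model_name.strip()
    if not name:
        raise ValueError("model name must not be empty")
    if name.startswith(PROVIDER_PREFIX):
        return name  # already routed to EmpirioLabs
    return PROVIDER_PREFIX + name
```

The result can be passed directly as the `model` argument of `completion`.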

Available Models (selection)

The full live catalog with pricing is at empiriolabs.ai/models. Popular chat models:

Model | Description | Context Window
empiriolabs/qwen3-7-max | Qwen3.7 Max flagship text model for coding, agents, and deep thinking | 1M tokens
empiriolabs/qwen3-7-plus | Cost-effective Qwen3.7 vision-language model (text, image, video input) | 1M tokens
empiriolabs/deepseek-v4-pro | DeepSeek V4 flagship MoE (1.6T total / 49B active parameters) | 1M tokens
empiriolabs/deepseek-v4-flash | Lightweight DeepSeek V4 MoE (284B total / 13B active parameters) | 1M tokens
empiriolabs/glm-5-1 | Zhipu AI long-context reasoning model with tool use | 202K tokens
empiriolabs/kimi-k2-6 | Moonshot Kimi K2.6 multimodal reasoning model | 256K tokens
empiriolabs/minimax-m3 | MiniMax M3 multimodal reasoning for coding and agents | 524K tokens
empiriolabs/gemma-4-26b-a4b | Google Gemma 4 26B A4B open multimodal model | 256K tokens

Required Variables

Environment Variables
import os
os.environ["EMPIRIOLABS_API_KEY"] = ""  # your EmpirioLabs API key

Get an API key from the EmpirioLabs dashboard.
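
Because the snippet above ships with an empty placeholder, it can help to fail early when the key is missing rather than on the first request. The following is a minimal sketch. `require_api_key` is a hypothetical helper, and the error message wording is an assumption.

```python
import os


def require_api_key(env=os.environ) -> str:
    """Return the EmpirioLabs API key, or raise if it is unset or blank.

    Hypothetical helper; `env` is any mapping, which keeps the check easy to test.
    """
    key = env.get("EMPIRIOLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "EMPIRIOLABS_API_KEY is not set; create a key in the EmpirioLabs dashboard"
        )
    return key
```

Calling `require_api_key()` once at startup surfaces a missing key before any completion request is sent.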

Usage - LiteLLM Python SDK

Non-streaming

EmpirioLabs Non-streaming Completion
import os
import litellm
from litellm import completion

os.environ["EMPIRIOLABS_API_KEY"] = "" # your EmpirioLabs API key

messages = [{"content": "Hello, how are you?", "role": "user"}]

# EmpirioLabs call
response = completion(model="empiriolabs/qwen3-7-plus", messages=messages)

print(response)

Streaming

EmpirioLabs Streaming Completion
import os
import litellm
from litellm import completion

os.environ["EMPIRIOLABS_API_KEY"] = "" # your EmpirioLabs API key

messages = [{"content": "Hello, how are you?", "role": "user"}]

# EmpirioLabs call with streaming
response = completion(
    model="empiriolabs/qwen3-7-plus",
    messages=messages,
    stream=True,
)

for chunk in response:
    print(chunk)
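
Printing raw chunks is useful for debugging, but most callers want the assembled text. The sketch below joins the text deltas of a streamed response, assuming the chunks follow the OpenAI-style shape that LiteLLM returns (`chunk.choices[0].delta.content`). `collect_stream` is a hypothetical helper name.

```python
def collect_stream(chunks) -> str:
    """Join the text deltas of a streamed completion into one string.

    Hypothetical helper. Chunks without choices, or with an empty or None
    delta (for example the final chunk), are skipped.
    """
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        text = getattr(chunk.choices[0].delta, "content", None)
        if text:
            parts.append(text)
    return "".join(parts)
```

Passing the `response` object from the streaming call above to `collect_stream(response)` returns the full reply once the stream ends. Note that a stream can only be consumed once.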

Usage - LiteLLM Proxy

Add the following to your LiteLLM Proxy configuration file:

config.yaml
model_list:
  - model_name: qwen3-7-plus
    litellm_params:
      model: empiriolabs/qwen3-7-plus
      api_key: os.environ/EMPIRIOLABS_API_KEY

  - model_name: deepseek-v4-flash
    litellm_params:
      model: empiriolabs/deepseek-v4-flash
      api_key: os.environ/EMPIRIOLABS_API_KEY
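
Each entry in the configuration repeats the same pattern: a public `model_name`, the prefixed `model`, and a reference to the API key variable. When exposing many models, the entries can be generated instead of written by hand. This is a sketch under that assumption. `proxy_model_entry` and `proxy_config` are hypothetical helpers, not part of LiteLLM.

```python
API_KEY_REF = "os.environ/EMPIRIOLABS_API_KEY"  # same reference used in config.yaml


def proxy_model_entry(model_name: str) -> dict:
    """Build one model_list entry for a bare EmpirioLabs model name (hypothetical helper)."""
    return {
        "model_name": model_name,
        "litellm_params": {
            "model": f"empiriolabs/{model_name}",
            "api_key": API_KEY_REF,
        },
    }


def proxy_config(model_names) -> dict:
    """Build the top-level config mapping with one entry per model name."""
    return {"model_list": [proxy_model_entry(name) for name in model_names]}
```

The resulting dictionary mirrors the structure of config.yaml and can be written out with any YAML serializer, for example PyYAML's `yaml.safe_dump`.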

Start your LiteLLM Proxy server:

Start LiteLLM Proxy
litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

Send requests through the proxy with the OpenAI Python SDK:

EmpirioLabs via Proxy
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key",  # Your proxy API key
)

# Non-streaming response
response = client.chat.completions.create(
    model="qwen3-7-plus",
    messages=[{"role": "user", "content": "hello from litellm"}],
)

print(response.choices[0].message.content)

Additional Notes

  • Thinking-capable models accept reasoning_effort (none, low, medium, high, max); the gateway maps it onto each model's native thinking controls.
  • Per-model parameters, limits, and live pricing are listed at docs.empiriolabs.ai and on each model page at empiriolabs.ai/models.
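
To illustrate the reasoning_effort note, the sketch below builds the keyword arguments for a completion call and validates the effort level against the values listed above. `completion_kwargs` is a hypothetical helper, and the default of "medium" is an assumption chosen for illustration, not a documented default.

```python
# Effort levels as listed in the notes above
ALLOWED_EFFORTS = ("none", "low", "medium", "high", "max")


def completion_kwargs(model: str, prompt: str, reasoning_effort: str = "medium") -> dict:
    """Build keyword arguments for a completion call with a validated reasoning_effort.

    Hypothetical helper; the "medium" default is an assumption.
    """
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_EFFORTS}, got {reasoning_effort!r}"
        )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    }
```

With a thinking-capable model, the result could be used as `completion(**completion_kwargs("empiriolabs/qwen3-7-max", "Plan a refactor", "high"))`. Whether a given model honors every level depends on its model page.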