Hyperbolic

Overview

| Property | Details |
|---|---|
| Description | Hyperbolic provides access to the latest models at a fraction of legacy cloud costs, with OpenAI-compatible APIs for LLMs, image generation, and more. |
| Provider Route on LiteLLM | `hyperbolic/` |
| Link to Provider Doc | [Hyperbolic Documentation](https://docs.hyperbolic.xyz) |
| Base URL | `https://api.hyperbolic.xyz/v1` |
| Supported Operations | `/chat/completions` |

We support ALL Hyperbolic models; just add the `hyperbolic/` prefix to the model name when sending completion requests.

Available Models

Language Models

| Model | Description | Context Window | Pricing per 1M tokens |
|---|---|---|---|
| `hyperbolic/deepseek-ai/DeepSeek-V3` | DeepSeek V3 - Fast and efficient | 131,072 tokens | $0.25 |
| `hyperbolic/deepseek-ai/DeepSeek-V3-0324` | DeepSeek V3 March 2025 version | 131,072 tokens | $0.25 |
| `hyperbolic/deepseek-ai/DeepSeek-R1` | DeepSeek R1 - Reasoning model | 131,072 tokens | $2.00 |
| `hyperbolic/deepseek-ai/DeepSeek-R1-0528` | DeepSeek R1 May 2025 version | 131,072 tokens | $0.25 |
| `hyperbolic/Qwen/Qwen2.5-72B-Instruct` | Qwen 2.5 72B Instruct | 131,072 tokens | $0.40 |
| `hyperbolic/Qwen/Qwen2.5-Coder-32B-Instruct` | Qwen 2.5 Coder 32B for code generation | 131,072 tokens | $0.20 |
| `hyperbolic/Qwen/Qwen3-235B-A22B` | Qwen 3 235B A22B variant | 131,072 tokens | $2.00 |
| `hyperbolic/Qwen/QwQ-32B` | Qwen QwQ 32B | 131,072 tokens | $0.20 |
| `hyperbolic/meta-llama/Llama-3.3-70B-Instruct` | Llama 3.3 70B Instruct | 131,072 tokens | $0.80 |
| `hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct` | Llama 3.1 405B Instruct | 131,072 tokens | $5.00 |
| `hyperbolic/moonshotai/Kimi-K2-Instruct` | Kimi K2 Instruct | 131,072 tokens | $2.00 |

Required Variables

Environment Variables
import os

os.environ["HYPERBOLIC_API_KEY"] = ""  # your Hyperbolic API key

Get your API key from the Hyperbolic dashboard.

Usage - LiteLLM Python SDK

Non-streaming

Hyperbolic Non-streaming Completion
import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = "" # your Hyperbolic API key

messages = [{"content": "What is the capital of France?", "role": "user"}]

# Hyperbolic call
response = completion(
    model="hyperbolic/Qwen/Qwen2.5-72B-Instruct",
    messages=messages
)

print(response)

Streaming

Hyperbolic Streaming Completion
import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = "" # your Hyperbolic API key

messages = [{"content": "Write a short poem about AI", "role": "user"}]

# Hyperbolic call with streaming
response = completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3",
    messages=messages,
    stream=True
)

for chunk in response:
    print(chunk)
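
Printing raw chunks shows the full object structure; to assemble the reply text instead, accumulate each chunk's delta content (streamed chunks follow the OpenAI layout). A minimal sketch, using stand-in chunk objects so it runs without an API key:

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Concatenate the delta content of streamed chunks into the full reply."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # the final chunk's delta content is typically None
            parts.append(delta)
    return "".join(parts)

# With a real call you would pass the streaming response directly:
#   collect_stream(completion(model="hyperbolic/deepseek-ai/DeepSeek-V3",
#                             messages=messages, stream=True))
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hello", ", ", "world", None]
]
print(collect_stream(fake_chunks))  # Hello, world
```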

Function Calling

Hyperbolic Function Calling
import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = "" # your Hyperbolic API key

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "What's the weather like in New York?"}],
    tools=tools,
    tool_choice="auto"
)

print(response)
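
When the model decides to call a tool, the response carries the function name and JSON-encoded arguments, which your code must parse and dispatch. A minimal sketch; `get_weather` here is a stand-in implementation, not part of LiteLLM:

```python
import json

def get_weather(location, unit="fahrenheit"):
    # Stand-in implementation; replace with a real weather lookup.
    return {"location": location, "temperature": 72, "unit": unit}

AVAILABLE_TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(name, arguments_json):
    """Look up the requested tool and invoke it with the model-supplied arguments."""
    func = AVAILABLE_TOOLS[name]
    kwargs = json.loads(arguments_json)  # the model returns arguments as a JSON string
    return func(**kwargs)

# With a real response (OpenAI-style layout), you would read:
#   tool_call = response.choices[0].message.tool_calls[0]
#   result = dispatch_tool_call(tool_call.function.name, tool_call.function.arguments)
result = dispatch_tool_call("get_weather", '{"location": "New York, NY", "unit": "celsius"}')
print(result)
```

The result is normally appended back to `messages` as a `"tool"` role message so the model can produce a final answer.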

Usage - LiteLLM Proxy

Add the following to your LiteLLM Proxy configuration file:

config.yaml
model_list:
  - model_name: deepseek-fast
    litellm_params:
      model: hyperbolic/deepseek-ai/DeepSeek-V3
      api_key: os.environ/HYPERBOLIC_API_KEY

  - model_name: qwen-coder
    litellm_params:
      model: hyperbolic/Qwen/Qwen2.5-Coder-32B-Instruct
      api_key: os.environ/HYPERBOLIC_API_KEY

  - model_name: deepseek-reasoning
    litellm_params:
      model: hyperbolic/deepseek-ai/DeepSeek-R1
      api_key: os.environ/HYPERBOLIC_API_KEY

Start your LiteLLM Proxy server:

Start LiteLLM Proxy
litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

Hyperbolic via Proxy - Non-streaming
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Non-streaming response
response = client.chat.completions.create(
    model="deepseek-fast",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)

print(response.choices[0].message.content)

Hyperbolic via Proxy - Streaming
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Streaming response
response = client.chat.completions.create(
    model="qwen-coder",
    messages=[{"role": "user", "content": "Write a Python function to sort a list"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")

For more detailed information on using the LiteLLM Proxy, see the LiteLLM Proxy documentation.

Supported OpenAI Parameters

Hyperbolic supports the following OpenAI-compatible parameters:

| Parameter | Type | Description |
|---|---|---|
| `messages` | array | Required. Array of message objects with `role` and `content` |
| `model` | string | Required. Model ID (e.g., `deepseek-ai/DeepSeek-V3`, `Qwen/Qwen2.5-72B-Instruct`) |
| `stream` | boolean | Optional. Enable streaming responses |
| `temperature` | float | Optional. Sampling temperature (0.0 to 2.0) |
| `top_p` | float | Optional. Nucleus sampling parameter |
| `max_tokens` | integer | Optional. Maximum tokens to generate |
| `frequency_penalty` | float | Optional. Penalize frequent tokens |
| `presence_penalty` | float | Optional. Penalize tokens based on presence |
| `stop` | string/array | Optional. Stop sequences |
| `n` | integer | Optional. Number of completions to generate |
| `tools` | array | Optional. List of available tools/functions |
| `tool_choice` | string/object | Optional. Control tool/function calling |
| `response_format` | object | Optional. Response format specification |
| `seed` | integer | Optional. Random seed for reproducibility |
| `user` | string | Optional. User identifier |
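
These optional parameters map onto `litellm.completion` keyword arguments, so a shared set of generation options can be built once and reused across calls. A sketch with illustrative values (the actual call, shown commented out, needs a valid `HYPERBOLIC_API_KEY`):

```python
# Generation options shared across calls; the specific values are illustrative.
generation_params = {
    "temperature": 0.7,        # moderate randomness (range 0.0-2.0)
    "top_p": 0.9,              # nucleus sampling cutoff
    "max_tokens": 256,         # cap on generated tokens
    "frequency_penalty": 0.2,  # discourage repeating the same tokens
    "stop": ["\n\n"],          # stop at the first blank line
    "seed": 42,                # best-effort reproducibility
}

# Passed straight through to the provider, e.g.:
# from litellm import completion
# response = completion(
#     model="hyperbolic/Qwen/Qwen2.5-72B-Instruct",
#     messages=[{"role": "user", "content": "Summarize nucleus sampling in one line."}],
#     **generation_params,
# )
print(sorted(generation_params))
```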

Advanced Usage

Custom API Base

If you're using a custom Hyperbolic deployment:

Custom API Base
import litellm

response = litellm.completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="https://your-custom-hyperbolic-endpoint.com/v1",
    api_key="your-api-key"
)

Rate Limits

Hyperbolic offers different tiers:

  • Basic: 60 requests per minute (RPM)
  • Pro: 600 RPM
  • Enterprise: Custom limits
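
When a request exceeds your tier's limit, the API rejects it, so production code usually retries with exponential backoff. A generic sketch (the `with_retries` helper is our own illustration, not a LiteLLM API; the commented line shows how it would wrap a completion call):

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry `call` with exponential backoff; re-raise after the final attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage with litellm (sketch):
# result = with_retries(lambda: completion(
#     model="hyperbolic/deepseek-ai/DeepSeek-V3", messages=messages))

# Local demonstration: a call that fails once, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 2:
        raise RuntimeError("rate limited")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```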

Pricing

Hyperbolic offers competitive pay-as-you-go pricing with no hidden fees or long-term commitments. See the model table above for specific pricing per million tokens.

Precision Options

  • BF16: Best precision and performance, suitable for tasks where accuracy is critical
  • FP8: Optimized for efficiency and speed, ideal for high-throughput applications at lower cost

Additional Resourcesโ€‹