
Moonshot AI

Overview

Description: Moonshot AI provides large language models, including the moonshot-v1 series and Kimi models.
Provider Route on LiteLLM: moonshot/
Link to Provider Doc: Moonshot AI ↗ (https://platform.moonshot.ai/)
Base URL: https://api.moonshot.cn/
Supported Operations: /chat/completions

We support ALL Moonshot AI models; just set moonshot/ as the prefix when sending completion requests.

Required Variables

Environment Variables
os.environ["MOONSHOT_API_KEY"] = ""  # your Moonshot AI API key

Usage - LiteLLM Python SDK

Non-streaming

Moonshot Non-streaming Completion
import os
import litellm
from litellm import completion

os.environ["MOONSHOT_API_KEY"] = "" # your Moonshot AI API key

messages = [{"content": "Hello, how are you?", "role": "user"}]

# Moonshot call
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=messages
)

print(response)

Streaming

Moonshot Streaming Completion
import os
import litellm
from litellm import completion

os.environ["MOONSHOT_API_KEY"] = "" # your Moonshot AI API key

messages = [{"content": "Hello, how are you?", "role": "user"}]

# Moonshot call with streaming
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=messages,
    stream=True
)

for chunk in response:
    print(chunk)

Usage - LiteLLM Proxy

Add the following to your LiteLLM Proxy configuration file:

config.yaml
model_list:
  - model_name: moonshot-v1-8k
    litellm_params:
      model: moonshot/moonshot-v1-8k
      api_key: os.environ/MOONSHOT_API_KEY

  - model_name: moonshot-v1-32k
    litellm_params:
      model: moonshot/moonshot-v1-32k
      api_key: os.environ/MOONSHOT_API_KEY

  - model_name: moonshot-v1-128k
    litellm_params:
      model: moonshot/moonshot-v1-128k
      api_key: os.environ/MOONSHOT_API_KEY

Start your LiteLLM Proxy server:

Start LiteLLM Proxy
litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000
Moonshot via Proxy - Non-streaming
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Non-streaming response
response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}]
)

print(response.choices[0].message.content)
Moonshot via Proxy - Streaming
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Streaming response
response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")

For more detailed information on using the LiteLLM Proxy, see the LiteLLM Proxy documentation.

Moonshot AI Limitations & LiteLLM Handling

LiteLLM automatically handles the following Moonshot AI limitations to provide seamless OpenAI compatibility:

Temperature Range Limitation

Limitation: Moonshot AI only supports temperature range [0, 1] (vs OpenAI's [0, 2])
LiteLLM Handling: Automatically clamps any temperature > 1 to 1
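
Because the clamping happens inside LiteLLM, an OpenAI-style request with a temperature above 1 still succeeds. A minimal sketch (model name and prompt are placeholders):

Temperature Clamping Example
from litellm import completion

# temperature=1.7 is valid for OpenAI's [0, 2] range but not for Moonshot AI;
# LiteLLM clamps it to 1 before forwarding the request.
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=[{"role": "user", "content": "Write a short poem about the moon."}],
    temperature=1.7
)

print(response.choices[0].message.content)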

Temperature + Multiple Outputs Limitation

Limitation: If temperature < 0.3 and n > 1, Moonshot AI raises an exception
LiteLLM Handling: Automatically sets temperature to 0.3 when this condition is detected
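
A minimal sketch of a request that triggers this adjustment (model name and prompt are placeholders); LiteLLM raises the temperature to 0.3 so the multi-output request is accepted:

Temperature + Multiple Outputs Example
from litellm import completion

# temperature=0.1 with n=2 would be rejected by Moonshot AI;
# LiteLLM sets temperature to 0.3 before forwarding the request.
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=[{"role": "user", "content": "Suggest a name for a coffee shop."}],
    temperature=0.1,
    n=2
)

for choice in response.choices:
    print(choice.message.content)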

Tool Choice "Required" Not Supported

Limitation: Moonshot AI doesn't support tool_choice="required"
LiteLLM Handling: Converts this by:

  • Adding message: "Please select a tool to handle the current issue."
  • Removing the tool_choice parameter from the request
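
This means you can keep passing tool_choice="required" in your code. A minimal sketch (the get_weather tool and its schema are made up for illustration); LiteLLM drops tool_choice and appends the prompt above before calling Moonshot AI:

Tool Choice "required" Example
from litellm import completion

# Hypothetical tool definition for this sketch
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
            }
        }
    }
]

# tool_choice="required" is not supported by Moonshot AI; LiteLLM removes it and
# appends "Please select a tool to handle the current issue." to the messages.
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
    tool_choice="required"
)

print(response.choices[0].message.tool_calls)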