Moonshot AI
Overview
| Property | Details |
|---|---|
| Description | Moonshot AI provides large language models including the moonshot-v1 series and kimi models. |
| Provider Route on LiteLLM | moonshot/ |
| Link to Provider Doc | Moonshot AI |
| Base URL | https://api.moonshot.cn/ |
| Supported Operations | /chat/completions |
LiteLLM supports all Moonshot AI models. Prefix the model name with moonshot/ when sending completion requests.
Required Variables
os.environ["MOONSHOT_API_KEY"] = "" # your Moonshot AI API key
Usage - LiteLLM Python SDK
Non-streaming
import os
import litellm
from litellm import completion
os.environ["MOONSHOT_API_KEY"] = "" # your Moonshot AI API key
messages = [{"content": "Hello, how are you?", "role": "user"}]
# Moonshot call
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=messages
)
print(response)
Streaming
import os
import litellm
from litellm import completion
os.environ["MOONSHOT_API_KEY"] = "" # your Moonshot AI API key
messages = [{"content": "Hello, how are you?", "role": "user"}]
# Moonshot call with streaming
response = completion(
    model="moonshot/moonshot-v1-8k",
    messages=messages,
    stream=True
)
# Each chunk is a partial response; print them as they arrive
for chunk in response:
    print(chunk)
Usage - LiteLLM Proxy
Add the following to your LiteLLM Proxy configuration file:
model_list:
  - model_name: moonshot-v1-8k
    litellm_params:
      model: moonshot/moonshot-v1-8k
      api_key: os.environ/MOONSHOT_API_KEY
  - model_name: moonshot-v1-32k
    litellm_params:
      model: moonshot/moonshot-v1-32k
      api_key: os.environ/MOONSHOT_API_KEY
  - model_name: moonshot-v1-128k
    litellm_params:
      model: moonshot/moonshot-v1-128k
      api_key: os.environ/MOONSHOT_API_KEY
Start your LiteLLM Proxy server:
litellm --config config.yaml
# RUNNING on http://0.0.0.0:4000
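You can call the proxy with the OpenAI SDK, the LiteLLM SDK, or cURL. The examples follow in that order, each with a non-streaming and a streaming variant.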
from openai import OpenAI
# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"  # Your proxy API key
)
# Non-streaming response
response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}]
)
print(response.choices[0].message.content)
from openai import OpenAI
# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"  # Your proxy API key
)
# Streaming response
response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}],
    stream=True
)
for chunk in response:
    # Skip chunks that carry no text
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
import litellm
# Configure LiteLLM to use your proxy
response = litellm.completion(
    model="litellm_proxy/moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}],
    api_base="http://localhost:4000",
    api_key="your-proxy-api-key"
)
print(response.choices[0].message.content)
import litellm
# Configure LiteLLM to use your proxy with streaming
response = litellm.completion(
    model="litellm_proxy/moonshot-v1-8k",
    messages=[{"role": "user", "content": "hello from litellm"}],
    api_base="http://localhost:4000",
    api_key="your-proxy-api-key",
    stream=True
)
for chunk in response:
    # Skip chunks that carry no text
    if hasattr(chunk.choices[0], 'delta') and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
curl http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-proxy-api-key" \
-d '{
"model": "moonshot-v1-8k",
"messages": [{"role": "user", "content": "hello from litellm"}]
}'
curl http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-proxy-api-key" \
-d '{
"model": "moonshot-v1-8k",
"messages": [{"role": "user", "content": "hello from litellm"}],
"stream": true
}'
For more detailed information on using the LiteLLM Proxy, see the LiteLLM Proxy documentation.
Moonshot AI Limitations & LiteLLM Handling
LiteLLM automatically handles the following Moonshot AI limitations to provide seamless OpenAI compatibility:
Temperature Range Limitation
Limitation: Moonshot AI only supports temperature range [0, 1] (vs OpenAI's [0, 2])
LiteLLM Handling: Automatically clamps any temperature > 1 to 1
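The clamping rule above can be sketched in a few lines. This only illustrates the described behavior and is not LiteLLM's actual implementation; `clamp_temperature` and `MOONSHOT_MAX_TEMPERATURE` are hypothetical names introduced for the example.

```python
from typing import Optional

# Upper bound of the temperature range stated above
MOONSHOT_MAX_TEMPERATURE = 1.0


def clamp_temperature(temperature: Optional[float]) -> Optional[float]:
    """Hypothetical helper: cap temperature at 1 and leave None untouched."""
    if temperature is None:
        return None
    return min(temperature, MOONSHOT_MAX_TEMPERATURE)
```

Under this sketch, a request sent with temperature=1.7 would reach Moonshot AI with temperature=1, while values already inside [0, 1] pass through unchanged.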
Temperature + Multiple Outputs Limitation
Limitation: If temperature < 0.3 and n > 1, Moonshot AI raises an exception
LiteLLM Handling: Automatically sets temperature to 0.3 when this condition is detected
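As a rough illustration of the condition described above, the adjustment might look like the following. The function name `adjust_for_multiple_outputs` is hypothetical, and the real LiteLLM logic may differ in detail.

```python
from typing import Optional

# Lowest temperature described above as accepted when n > 1
MIN_TEMPERATURE_FOR_MULTIPLE_OUTPUTS = 0.3


def adjust_for_multiple_outputs(
    temperature: Optional[float], n: Optional[int]
) -> Optional[float]:
    """Hypothetical helper: raise temperature to 0.3 when temperature < 0.3 and n > 1."""
    if temperature is None or n is None:
        return temperature
    if temperature < MIN_TEMPERATURE_FOR_MULTIPLE_OUTPUTS and n > 1:
        return MIN_TEMPERATURE_FOR_MULTIPLE_OUTPUTS
    return temperature
```

A request with temperature=0.1 and n=2 would be sent with temperature=0.3 under this sketch; the same temperature with n=1 is left alone.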
Tool Choice "Required" Not Supported
Limitation: Moonshot AI doesn't support tool_choice="required"
LiteLLM Handling: Converts this by:
- Adding message: "Please select a tool to handle the current issue."
- Removing the tool_choice parameter from the request
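The two conversion steps above can be sketched on a plain request dictionary. This is a simplified illustration, not LiteLLM's code: `convert_required_tool_choice` is a hypothetical name, and the source does not say which role the added message uses or where it is placed, so appending it as a user message is an assumption.

```python
from typing import Any, Dict

# Message text quoted from the handling description above
TOOL_PROMPT = "Please select a tool to handle the current issue."


def convert_required_tool_choice(request: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: replace tool_choice="required" with a prompt message."""
    if request.get("tool_choice") != "required":
        return request
    # Copy every parameter except tool_choice so the caller's dict is not mutated
    converted = {key: value for key, value in request.items() if key != "tool_choice"}
    # Assumption: the prompt is appended as a user message
    converted["messages"] = list(request.get("messages", [])) + [
        {"role": "user", "content": TOOL_PROMPT}
    ]
    return converted
```

Other tool_choice values, such as "auto", are returned unchanged by this sketch.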