
# Z.AI (Zhipu AI)

https://z.ai/

We support Z.AI GLM text/chat models. Set `zai/` as a prefix on the model name when sending completion requests.

## API Key

```python
import os

# env variable
os.environ['ZAI_API_KEY']
```

## Sample Usage

```python
from litellm import completion
import os

os.environ['ZAI_API_KEY'] = ""
response = completion(
    model="zai/glm-4.6",
    messages=[
        {"role": "user", "content": "hello from litellm"}
    ],
)
print(response)
```

## Sample Usage - Streaming

```python
from litellm import completion
import os

os.environ['ZAI_API_KEY'] = ""
response = completion(
    model="zai/glm-4.6",
    messages=[
        {"role": "user", "content": "hello from litellm"}
    ],
    stream=True
)

for chunk in response:
    print(chunk)
```
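When streaming, each chunk carries an incremental delta rather than the full message, so printing chunks directly shows fragments. A minimal sketch of joining the deltas into one string — the `fake_chunk` objects below are hypothetical stand-ins for illustration only; in real use you would iterate over the response returned by `completion(..., stream=True)` and read `chunk.choices[0].delta.content` the same way:

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Join the delta content of each streaming chunk into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # a chunk's delta content may be None (e.g. the final chunk)
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    # Stand-in object with the same attribute path as a real streaming chunk
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

full_text = collect_stream(
    [fake_chunk("hello "), fake_chunk("from "), fake_chunk("litellm"), fake_chunk(None)]
)
print(full_text)  # hello from litellm
```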

## Supported Models

We support ALL Z.AI GLM models. Set `zai/` as a prefix on the model name when sending completion requests.

| Model Name | Function Call | Notes |
| --- | --- | --- |
| glm-4.6 | `completion(model="zai/glm-4.6", messages)` | Latest flagship model, 200K context |
| glm-4.5 | `completion(model="zai/glm-4.5", messages)` | 128K context |
| glm-4.5v | `completion(model="zai/glm-4.5v", messages)` | Vision model |
| glm-4.5-x | `completion(model="zai/glm-4.5-x", messages)` | Premium tier |
| glm-4.5-air | `completion(model="zai/glm-4.5-air", messages)` | Lightweight |
| glm-4.5-airx | `completion(model="zai/glm-4.5-airx", messages)` | Fast lightweight |
| glm-4-32b-0414-128k | `completion(model="zai/glm-4-32b-0414-128k", messages)` | 32B parameter model |
| glm-4.5-flash | `completion(model="zai/glm-4.5-flash", messages)` | FREE tier |
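The vision model (glm-4.5v) accepts image inputs via the OpenAI-style multimodal message format, where `content` is a list of text and `image_url` parts. A sketch of building such a request — the image URL is a placeholder, and the actual call is shown commented out since it needs a valid `ZAI_API_KEY`:

```python
# Build an OpenAI-style multimodal message for glm-4.5v.
image_url = "https://example.com/photo.jpg"  # placeholder URL

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
]

# from litellm import completion
# response = completion(model="zai/glm-4.5v", messages=messages)
# print(response.choices[0].message.content)
```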

## Model Pricing

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context Window |
| --- | --- | --- | --- |
| glm-4.6 | $0.60 | $2.20 | 200K |
| glm-4.5 | $0.60 | $2.20 | 128K |
| glm-4.5v | $0.60 | $1.80 | 128K |
| glm-4.5-x | $2.20 | $8.90 | 128K |
| glm-4.5-air | $0.20 | $1.10 | 128K |
| glm-4.5-airx | $1.10 | $4.50 | 128K |
| glm-4-32b-0414-128k | $0.10 | $0.10 | 128K |
| glm-4.5-flash | FREE | FREE | 128K |
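The cost of a request follows directly from the table: (input tokens / 1M) × input rate + (output tokens / 1M) × output rate. A small sketch of that arithmetic using a few of the rates above — LiteLLM also tracks spend for you, so this is only to illustrate how the table maps to a dollar figure:

```python
# Per-1M-token rates (USD) taken from the pricing table above
PRICING = {
    "glm-4.6": {"input": 0.60, "output": 2.20},
    "glm-4.5-air": {"input": 0.20, "output": 1.10},
    "glm-4.5-flash": {"input": 0.0, "output": 0.0},
}

def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from per-1M-token rates."""
    rates = PRICING[model]
    return (
        (input_tokens / 1_000_000) * rates["input"]
        + (output_tokens / 1_000_000) * rates["output"]
    )

# e.g. a glm-4.6 call with 1,000 prompt tokens and 500 completion tokens
cost = request_cost("glm-4.6", input_tokens=1_000, output_tokens=500)
print(f"${cost:.4f}")  # $0.0017
```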

## Using with LiteLLM Proxy

1. Add the model to your proxy `config.yaml`:

```yaml
model_list:
  - model_name: glm-4.6
    litellm_params:
      model: zai/glm-4.6
      api_key: os.environ/ZAI_API_KEY
```

2. Start the proxy:

```shell
litellm --config /path/to/config.yaml
```

3. Send a request to the proxy using any OpenAI-compatible client:

```python
import openai

client = openai.OpenAI(
    api_key="anything",             # or your proxy virtual key
    base_url="http://0.0.0.0:4000"  # proxy base URL
)

response = client.chat.completions.create(
    model="glm-4.6",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

print(response.choices[0].message.content)