
# Azure Responses API

| Property | Details |
|---|---|
| Description | Azure OpenAI Responses API |
| `custom_llm_provider` on LiteLLM | `azure/` |
| Supported Operations | `/v1/responses` |
| Azure OpenAI Responses API | Azure OpenAI Responses API ↗ |
| Cost Tracking, Logging Support | ✅ LiteLLM will log and track cost for Responses API requests |
| Supported OpenAI Params | ✅ All OpenAI params are supported. See here |

## Usage

### Create a model response

#### Non-streaming

```python
import os

import litellm

# Non-streaming response
response = litellm.responses(
    model="azure/o1-pro",
    input="Tell me a three sentence bedtime story about a unicorn.",
    max_output_tokens=100,
    api_key=os.getenv("AZURE_RESPONSES_OPENAI_API_KEY"),
    api_base="https://litellm8397336933.openai.azure.com/",
    api_version="2023-03-15-preview",
)

print(response)
```
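The returned object mirrors the OpenAI Responses shape: generated text lives in a list of output items rather than a single string field. A minimal sketch of pulling the text out, assuming `output` items carry `content` parts with a `text` attribute (verify against the response object in your litellm version):

```python
def response_text(response):
    """Concatenate the text parts of a Responses API result."""
    parts = []
    # Assumption: response.output is a list of items, each with a
    # content list whose parts may carry a `text` attribute.
    for item in getattr(response, "output", []) or []:
        for part in getattr(item, "content", []) or []:
            text = getattr(part, "text", None)
            if text:
                parts.append(text)
    return "".join(parts)
```

The defensive `getattr(..., []) or []` calls keep the helper from raising on non-text items (e.g. reasoning or tool-call entries) that may appear alongside the message output.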

#### Streaming

```python
import os

import litellm

# Streaming response
response = litellm.responses(
    model="azure/o1-pro",
    input="Tell me a three sentence bedtime story about a unicorn.",
    stream=True,
    api_key=os.getenv("AZURE_RESPONSES_OPENAI_API_KEY"),
    api_base="https://litellm8397336933.openai.azure.com/",
    api_version="2023-03-15-preview",
)

for event in response:
    print(event)
```
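Each streamed item is a typed event rather than a raw text chunk. A sketch of collecting the text deltas into a full reply, assuming events follow the OpenAI Responses streaming shape, where text arrives as `response.output_text.delta` events carrying a `delta` string (check the event objects in your litellm version):

```python
def collect_text(events):
    """Join the text deltas from a Responses API event stream."""
    chunks = []
    for event in events:
        # Assumption: text fragments arrive as events whose `type` is
        # "response.output_text.delta" with the fragment in `delta`.
        if getattr(event, "type", None) == "response.output_text.delta":
            chunks.append(event.delta)
    return "".join(chunks)
```

Other event types (response lifecycle events, tool calls, completed markers) pass through the filter untouched, so the helper is safe to run over the whole stream.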

## Azure Codex Models

Codex models use Azure's new `/v1/preview` API, which provides ongoing access to the latest features with no need to update `api-version` each month.

LiteLLM will send your requests to the `/v1/preview` endpoint when you set `api_version="preview"`.
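If one codebase calls both model families, the version switch can live in a small helper. This is a sketch: the model-name prefix check is an assumption about your own deployment naming, and the dated version string is simply the one used in the examples on this page.

```python
def azure_api_version(model: str) -> str:
    """Pick the api_version for an azure/ model string."""
    # Assumption: only codex-family deployments need the /v1/preview route.
    if model.startswith("azure/codex"):
        return "preview"
    return "2023-03-15-preview"  # dated version, as in the examples above
```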

### Non-streaming

```python
import os

import litellm

# Non-streaming response with Codex models
response = litellm.responses(
    model="azure/codex-mini",
    input="Tell me a three sentence bedtime story about a unicorn.",
    max_output_tokens=100,
    api_key=os.getenv("AZURE_RESPONSES_OPENAI_API_KEY"),
    api_base="https://litellm8397336933.openai.azure.com",
    api_version="preview",  # 👈 key difference
)

print(response)
```

### Streaming

```python
import os

import litellm

# Streaming response with Codex models
response = litellm.responses(
    model="azure/codex-mini",
    input="Tell me a three sentence bedtime story about a unicorn.",
    stream=True,
    api_key=os.getenv("AZURE_RESPONSES_OPENAI_API_KEY"),
    api_base="https://litellm8397336933.openai.azure.com",
    api_version="preview",  # 👈 key difference
)

for event in response:
    print(event)
```