
Azure AI OCR

Overview

Property | Details
Description | Azure AI OCR provides document intelligence capabilities powered by Mistral, enabling text extraction from PDFs and images
Provider Route on LiteLLM | azure_ai/
Supported Operations | /ocr
Link to Provider Doc | Azure AI ↗

Extract text from documents and images using Azure AI's OCR models, powered by Mistral.

Quick Start

LiteLLM SDK

SDK Usage
import litellm
import os

# Set environment variables (fill in your own values)
os.environ["AZURE_AI_API_KEY"] = ""   # your Azure AI API key
os.environ["AZURE_AI_API_BASE"] = ""  # your Azure AI endpoint URL

# OCR with PDF URL
response = litellm.ocr(
    model="azure_ai/mistral-document-ai-2505",
    document={
        "type": "document_url",
        "document_url": "https://example.com/document.pdf"
    }
)

# Access extracted text
for page in response.pages:
    print(page.text)

LiteLLM Proxy

proxy_config.yaml
model_list:
  - model_name: azure-ocr
    litellm_params:
      model: azure_ai/mistral-document-ai-2505
      api_key: "os.environ/AZURE_AI_API_KEY"
      api_base: "os.environ/AZURE_AI_API_BASE"
    model_info:
      mode: ocr
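
With this config loaded, clients address the model by its `model_name` alias rather than the `azure_ai/` name. The sketch below only builds the HTTP request, so it can be inspected without a running proxy. The base URL `http://localhost:4000`, the key `sk-1234`, and the helper name `build_ocr_request` are assumptions for illustration; the `/ocr` path is taken from the Supported Operations row above, and your deployment may expose it under a different prefix.

```python
import json
import urllib.request


def build_ocr_request(base_url, api_key, model_name, document_url):
    """Build (but do not send) a POST request for the proxy's OCR route."""
    payload = {
        "model": model_name,  # the alias from model_list, e.g. "azure-ocr"
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",  # assumed route, see lead-in
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical proxy address and key, for illustration only
request = build_ocr_request(
    "http://localhost:4000",
    "sk-1234",
    "azure-ocr",
    "https://example.com/document.pdf",
)
print(request.get_method(), request.full_url)
```

To actually send it, pass `request` to `urllib.request.urlopen` and parse the JSON body of the response.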

Document Types

Azure AI OCR supports both PDFs and images.

PDF Documents

PDF OCR
response = litellm.ocr(
    model="azure_ai/mistral-document-ai-2505",
    document={
        "type": "document_url",
        "document_url": "https://example.com/document.pdf"
    }
)

Image Documents

Image OCR
response = litellm.ocr(
    model="azure_ai/mistral-document-ai-2505",
    document={
        "type": "image_url",
        "image_url": "https://example.com/image.png"
    }
)

Base64 Encoded Documents

Base64 PDF
import base64
import litellm

# Read and encode PDF
with open("document.pdf", "rb") as f:
    pdf_base64 = base64.b64encode(f.read()).decode()

# Pass the PDF as a base64 data URI instead of a public URL
response = litellm.ocr(
    model="azure_ai/mistral-document-ai-2505",
    document={
        "type": "document_url",
        "document_url": f"data:application/pdf;base64,{pdf_base64}"
    }
)
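
The same encoding step can be wrapped in a small helper that picks the MIME type from the file extension. This is a minimal sketch: `file_to_document` is a hypothetical name, and it assumes that image files may be sent as base64 data URIs under the `image_url` document type in the same way PDFs are sent under `document_url`.

```python
import base64
import mimetypes


def file_to_document(path):
    """Return a `document` dict for litellm.ocr() built from a local file."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Cannot determine MIME type for {path!r}")

    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    data_uri = f"data:{mime_type};base64,{encoded}"

    # Assumption: images use "image_url", everything else uses "document_url"
    if mime_type.startswith("image/"):
        return {"type": "image_url", "image_url": data_uri}
    return {"type": "document_url", "document_url": data_uri}
```

The returned dict can be passed directly as the `document` argument, for example `litellm.ocr(model="azure_ai/mistral-document-ai-2505", document=file_to_document("document.pdf"))`.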

Supported Parameters

All Parameters
response = litellm.ocr(
    model="azure_ai/mistral-document-ai-2505",
    document={                      # Required: Document to process
        "type": "document_url",
        "document_url": "https://..."
    },
    include_image_base64=True,      # Optional: Include base64 images
    pages=[0, 1, 2],                # Optional: Specific pages to process
    image_limit=10                  # Optional: Limit number of images
)
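
The `pages=[0, 1, 2]` value above suggests that page indices are zero-based. Assuming that is the case, a small helper (the name `zero_based_pages` is hypothetical) can translate the 1-based page numbers people usually read off a PDF viewer:

```python
def zero_based_pages(first_page, last_page):
    """Convert an inclusive 1-based page range into zero-based indices."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return list(range(first_page - 1, last_page))


# Pages 1 through 3 of the document, as shown in a PDF viewer
print(zero_based_pages(1, 3))  # [0, 1, 2]
```

The result can be passed as `pages=zero_based_pages(1, 3)`.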

Response Format

Response Structure
# Response has the following structure
response.pages # List of pages with extracted text
response.model # Model used
response.object # "ocr"
response.usage_info # Usage information reported by the provider

# Access page content
for page in response.pages:
    print(f"Page {page.page_number}:")
    print(page.text)
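
When the whole document is needed as a single string, the per-page text can be joined in page order. The sketch below relies only on the `page_number` and `text` attributes shown above; `pages_to_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real response pages so the example runs without an API call.

```python
from types import SimpleNamespace


def pages_to_text(pages, separator="\n\n"):
    """Concatenate per-page text in page order, skipping empty pages."""
    ordered = sorted(pages, key=lambda p: p.page_number)
    return separator.join(p.text for p in ordered if p.text)


# Stand-in objects with the same attributes as the pages shown above
fake_pages = [
    SimpleNamespace(page_number=2, text="Second page"),
    SimpleNamespace(page_number=1, text="First page"),
]
print(pages_to_text(fake_pages))
```

With a real response, call `pages_to_text(response.pages)`.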

Async Support

Async Usage
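import asyncio
import litellm

async def main():
    # `await` is only valid inside a coroutine
    return await litellm.aocr(
        model="azure_ai/mistral-document-ai-2505",
        document={
            "type": "document_url",
            "document_url": "https://example.com/document.pdf"
        }
    )

response = asyncio.run(main())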
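
The async entry point is most useful when several documents are processed at once. The following is a sketch of one way to do that with `asyncio.gather` and a semaphore that caps the number of requests in flight. `ocr_many` and `max_concurrency` are hypothetical names, and the OCR coroutine is passed in as `ocr_fn` (for example `litellm.aocr`) so the helper itself has no dependency on LiteLLM.

```python
import asyncio


async def ocr_many(ocr_fn, model, urls, max_concurrency=4):
    """Run OCR over several URLs concurrently; results keep the input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def ocr_one(url):
        # At most `max_concurrency` calls run at the same time
        async with semaphore:
            return await ocr_fn(
                model=model,
                document={"type": "document_url", "document_url": url},
            )

    return await asyncio.gather(*(ocr_one(url) for url in urls))
```

A call might look like `asyncio.run(ocr_many(litellm.aocr, "azure_ai/mistral-document-ai-2505", urls))`. Because `asyncio.gather` is used without `return_exceptions=True`, the first failing request raises out of `ocr_many`.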

Important Notes

URL Conversion

Azure AI OCR endpoints cannot fetch documents from the public internet. LiteLLM therefore converts any public URL you pass into a base64 data URI automatically before sending the request to Azure AI.
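
Conceptually, this conversion amounts to downloading the document and re-encoding it as a data URI. The sketch below illustrates that idea only; it is not LiteLLM's implementation, and `url_to_data_uri` and its `fetch` parameter are hypothetical names introduced for the example.

```python
import base64
import urllib.request


def url_to_data_uri(url, fetch=None):
    """Download `url` and return its content as a base64 data URI."""
    if fetch is None:
        def fetch(u):
            # Default fetcher: plain HTTP GET, MIME type from Content-Type
            with urllib.request.urlopen(u) as resp:
                return resp.read(), resp.headers.get_content_type()

    content, mime_type = fetch(url)
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The `fetch` hook makes it possible to swap in a different HTTP client or to test the encoding without network access.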

Supported Models

  • mistral-document-ai-2505 - Latest Mistral OCR model on Azure AI

Use the Azure AI provider prefix: azure_ai/<model-name>