Gemini Embedding 2 Preview: Multimodal Embeddings on LiteLLM
LiteLLM now supports multimodal embeddings with gemini-embedding-2-preview, which generates a single embedding from a mix of text, images, audio, video, and PDF content. The model is available through both the Gemini API (API key) and Vertex AI (GCP credentials).
Supported Input Types
| Modality | Supported Formats |
|---|---|
| Text | Plain text |
| Image | PNG, JPEG |
| Audio | MP3, WAV |
| Video | MP4, MOV |
| Documents | PDF |
Input Formats
LiteLLM accepts three input formats for multimodal content:
- Data URIs: Base64-encoded inline data, e.g. data:image/png;base64,<encoded_data>
- GCS URLs: Cloud Storage paths (Vertex AI), e.g. gs://bucket/path/to/file.png
- Gemini File References: pre-uploaded files (Gemini API), e.g. files/abc123
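As a minimal sketch of the data URI format described above, the helper below wraps raw bytes in a data:<mime>;base64,<data> string. The function name to_data_uri is hypothetical and not part of LiteLLM; the MIME type is passed explicitly rather than guessed.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a Base64 data URI (hypothetical helper, not a LiteLLM API)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

To embed a local file, read its bytes first, for example to_data_uri(open("photo.png", "rb").read(), "image/png"), and pass the resulting string as one item of the input list.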
Quick Start
The examples below cover three setups, in this order:
- Gemini API
- Vertex AI
- LiteLLM Proxy
# Gemini API: authenticate with an API key
import os

from litellm import embedding

os.environ["GEMINI_API_KEY"] = "your-api-key"

# Text + Image (base64)
response = embedding(
    model="gemini/gemini-embedding-2-preview",
    input=[
        "The food was delicious and the waiter...",
        "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ],
)
print(response)
# Vertex AI: authenticate with GCP credentials (project and location)
import litellm
from litellm import embedding

litellm.vertex_project = "your-project-id"
litellm.vertex_location = "us-central1"

# Text + Image (GCS URL)
response = embedding(
    model="vertex_ai/gemini-embedding-2-preview",
    input=[
        "Describe this image",
        "gs://my-bucket/images/photo.png"
    ],
)
print(response)
1. Configure the LiteLLM Proxy (config.yaml)
model_list:
  - model_name: gemini-embedding-2-preview
    litellm_params:
      model: gemini/gemini-embedding-2-preview
      api_key: os.environ/GEMINI_API_KEY
  - model_name: vertex-gemini-embedding-2-preview
    litellm_params:
      model: vertex_ai/gemini-embedding-2-preview
      vertex_project: os.environ/VERTEXAI_PROJECT
      vertex_location: os.environ/VERTEXAI_LOCATION
general_settings:
  master_key: sk-1234
2. Start the proxy
litellm --config config.yaml
3. Call the embeddings endpoint
curl -X POST http://localhost:4000/embeddings \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-embedding-2-preview",
    "input": [
      "The food was delicious and the waiter...",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ]
  }'
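The same proxy call can be sketched in Python using only the standard library. This is an illustrative equivalent of the curl command, assuming the proxy from step 2 is listening on http://localhost:4000 with the master key sk-1234; the helper names build_embedding_request and send_request are hypothetical.

```python
import json
import urllib.request


def build_embedding_request(base_url: str, api_key: str, model: str, inputs: list) -> urllib.request.Request:
    """Build a POST request for the proxy's /embeddings route (hypothetical helper)."""
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

With the proxy running, send_request(build_embedding_request("http://localhost:4000", "sk-1234", "gemini-embedding-2-preview", ["The food was delicious and the waiter..."])) would return the parsed response.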
Input Format Examples
| Format | Example | Provider |
|---|---|---|
| Data URI | data:image/png;base64,... | Gemini, Vertex AI |
| GCS URL | gs://bucket/path/image.png | Vertex AI |
| File reference | files/abc123 | Gemini API only |
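The table can be read as a simple prefix rule. The sketch below classifies an input string by its prefix; it is an illustration of the three formats only, not LiteLLM's actual detection logic, and classify_input is a hypothetical name.

```python
def classify_input(item: str) -> str:
    """Classify an embedding input by prefix (illustrative, not LiteLLM's logic)."""
    if item.startswith("data:") and ";base64," in item:
        return "data_uri"        # inline Base64 content
    if item.startswith("gs://"):
        return "gcs_url"         # Cloud Storage path (Vertex AI)
    if item.startswith("files/"):
        return "file_reference"  # pre-uploaded file (Gemini API)
    return "text"                # anything else is treated as plain text
```

A check like this can catch provider mismatches early, for example a gs:// URL sent to a gemini/ model.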
Supported MIME Types for Data URIs
- Images: image/png, image/jpeg
- Audio: audio/mpeg, audio/wav
- Video: video/mp4, video/quicktime
- Documents: application/pdf
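A data URI can be checked against this list before sending a request. The following is a minimal sketch, assuming the set of MIME types listed above is complete; the names SUPPORTED_MIME_TYPES, data_uri_mime_type, and is_supported_data_uri are hypothetical and not part of LiteLLM.

```python
# MIME types copied from the list above (assumed complete for this sketch).
SUPPORTED_MIME_TYPES = frozenset({
    "image/png", "image/jpeg",
    "audio/mpeg", "audio/wav",
    "video/mp4", "video/quicktime",
    "application/pdf",
})


def data_uri_mime_type(uri: str) -> str:
    """Return the MIME type declared in a data URI, or raise ValueError."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, _ = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no payload separator")
    # The header looks like "image/png;base64"; keep the part before ';'.
    return header.split(";", 1)[0].strip().lower()


def is_supported_data_uri(uri: str) -> bool:
    """True if the URI is a data URI whose MIME type is in the supported set."""
    try:
        return data_uri_mime_type(uri) in SUPPORTED_MIME_TYPES
    except ValueError:
        return False
```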
GCS URL MIME Inference
For Vertex AI, MIME types are inferred from file extensions:
- .png → image/png
- .jpg / .jpeg → image/jpeg
- .mp3 → audio/mpeg
- .wav → audio/wav
- .mp4 → video/mp4
- .mov → video/quicktime
- .pdf → application/pdf
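The extension mapping can be sketched as a lookup table. This mirrors the mapping listed above but is not LiteLLM's implementation; EXTENSION_TO_MIME and infer_mime_from_gcs_url are hypothetical names, and treating extensions case-insensitively is an assumption of this sketch.

```python
import posixpath
from typing import Optional

# Extension-to-MIME mapping taken from the list above.
EXTENSION_TO_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".pdf": "application/pdf",
}


def infer_mime_from_gcs_url(url: str) -> Optional[str]:
    """Return the MIME type for a gs:// URL based on its extension, or None."""
    _, extension = posixpath.splitext(url)
    # Lowercasing is an assumption here; the source does not say how case is handled.
    return EXTENSION_TO_MIME.get(extension.lower())
```

A None result signals an extension outside the list, in which case a data URI with an explicit MIME type may be the safer input format.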
Optional Parameters
| Parameter | Description | Maps to |
|---|---|---|
| dimensions | Output embedding size | outputDimensionality |
response = embedding(
    model="gemini/gemini-embedding-2-preview",
    input=["text to embed"],
    dimensions=768,  # Optional: control output vector size
)
