Gemini Image Generation Migration Guide
Who is impacted by this change?
Anyone using the following models with /chat/completions:
- gemini/gemini-2.0-flash-exp-image-generation
- vertex_ai/gemini-2.0-flash-exp-image-generation
Key Change
Gemini models now support image generation through chat completions. Images are returned in response.choices[0].message.image as base64 data URLs.
Before and After
Before
from litellm import completion
response = completion(
model="gemini/gemini-2.0-flash-exp-image-generation",
messages=[{"role": "user", "content": "Generate an image of a cat"}],
modalities=["image", "text"],
)
base_64_image_data = response.choices[0].message.content
After
from litellm import completion
response = completion(
model="gemini/gemini-2.0-flash-exp-image-generation",
messages=[{"role": "user", "content": "Generate an image of a cat"}],
modalities=["image", "text"],
)
# Image is now available in the response
image_url = response.choices[0].message.image["url"] # "data:image/png;base64,..."
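Since the image arrives as a base64 data URL, you can decode it and write it to disk. A minimal sketch using only the standard library (the cat.png output path is illustrative):

import base64

image_url = response.choices[0].message.image["url"]
# Split "data:image/png;base64,<payload>" into header and payload
header, b64_data = image_url.split(",", 1)
with open("cat.png", "wb") as f:
    f.write(base64.b64decode(b64_data))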
Why the change?
The newer gemini-2.5-flash-image-preview model returns both text and image content in a single response. The dedicated message.image field lets a developer explicitly access the image or text components of the response; previously, a developer would have needed to search through the message content to find the image generated by the model.
Usage
Using the Python SDK
Key Change:
# Before
- base_64_image_data = response.choices[0].message.content
# After
+ image_url = response.choices[0].message.image["url"]
Basic Image Generation
from litellm import completion
import os
# Set your API key
os.environ["GEMINI_API_KEY"] = "your-api-key"
# Generate an image
response = completion(
model="gemini/gemini-2.0-flash-exp-image-generation",
messages=[{"role": "user", "content": "Generate an image of a cat"}],
modalities=["image", "text"],
)
# Access the generated image
print(response.choices[0].message.content) # Text response (if any)
print(response.choices[0].message.image) # Image data
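A prompt sent with modalities=["image", "text"] can still come back text-only, so it is safer to check the image field before using it. A hedged sketch, assuming the field is None or absent when no image was generated:

message = response.choices[0].message
if getattr(message, "image", None):
    print("Image data URL:", message.image["url"][:60], "...")
else:
    print("No image returned; text was:", message.content)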
Response Format
The image is returned in the message.image field:
{
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
"detail": "auto"
}
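Because the url value is a standard data URL, it can also be passed back to the model as image input, for example to ask a follow-up question about the generated image. A sketch, assuming the model accepts OpenAI-format image_url content parts:

from litellm import completion

image_url = response.choices[0].message.image["url"]
followup = completion(
    model="gemini/gemini-2.0-flash-exp-image-generation",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the image you just generated"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ],
)
print(followup.choices[0].message.content)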
Using the LiteLLM Proxy Server
Key Change:
# Before
- "content": "base64-image-data..."
# After
+ "image": {
+   "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
+   "detail": "auto"
+ }
Configuration Setup
- Configure your models in config.yaml:
model_list:
- model_name: gemini-image-gen
litellm_params:
model: gemini/gemini-2.0-flash-exp-image-generation
api_key: os.environ/GEMINI_API_KEY
- model_name: vertex-image-gen
litellm_params:
model: vertex_ai/gemini-2.5-flash-image-preview
vertex_project: your-project-id
vertex_location: us-central1
general_settings:
master_key: sk-1234 # Your proxy API key
- Start the proxy server:
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
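Before sending generation requests, you can optionally sanity-check that the proxy is up and your model aliases are registered. A minimal check, assuming the proxy exposes the OpenAI-compatible /v1/models endpoint:

from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# Should list the model_name entries from config.yaml
for model in client.models.list():
    print(model.id)  # e.g. gemini-image-gen, vertex-image-gen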
Making Requests
Using OpenAI SDK:
from openai import OpenAI
# Point to your proxy server
client = OpenAI(
api_key="sk-1234", # Your proxy API key
base_url="http://0.0.0.0:4000"
)
response = client.chat.completions.create(
model="gemini-image-gen",
messages=[{"role": "user", "content": "Generate an image of a cat"}],
extra_body={"modalities": ["image", "text"]}
)
# Access the generated image
print(response.choices[0].message.content) # Text response (if any)
print(response.choices[0].message.image) # Image data
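Note that image is not part of the standard OpenAI response schema, so attribute access on the SDK's response objects may vary by openai version. A more defensive sketch is to dump the message to a dict first:

message = response.choices[0].message.model_dump()
image = message.get("image")  # None if no image was returned
if image:
    print(image["url"][:60], "...")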
Using curl:
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "gemini-image-gen",
"messages": [
{
"role": "user",
"content": "Generate an image of a cat"
}
],
"modalities": ["image", "text"]
}'
Response format from proxy:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1704089632,
"model": "gemini-image-gen",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Here's an image of a cat for you!",
"image": {
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
"detail": "auto"
}
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 8,
"total_tokens": 18
}
}
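If you are scripting against the proxy without the SDK, the same extraction works on the raw JSON body shown above. A sketch assuming the curl response was saved to response.json (an illustrative filename):

import base64
import json

with open("response.json") as f:
    data = json.load(f)

# The data URL lives under choices[0].message.image.url
image_url = data["choices"][0]["message"]["image"]["url"]
b64_data = image_url.split(",", 1)[1]  # drop the "data:image/png;base64," prefix
with open("cat.png", "wb") as f:
    f.write(base64.b64decode(b64_data))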