Agent Gateway (A2A Protocol) - Overview

Add A2A agents to the LiteLLM AI Gateway, invoke them over the A2A protocol, and track request/response logs in LiteLLM Logs. Control which teams and keys can access each onboarded agent.



| Feature | Supported |
|---|---|
| Supported Agent Providers | A2A, Vertex AI Agent Engine, LangGraph, Azure AI Foundry, Bedrock AgentCore, Pydantic AI |
| Logging | ✅ |
| Load Balancing | ✅ |
| Streaming | ✅ |
| Iteration Budgets | ✅ |

Tip: LiteLLM follows the A2A (Agent-to-Agent) Protocol for invoking agents.

Adding your Agent​

Add A2A Agents​

You can add A2A-compatible agents through the LiteLLM Admin UI.

  1. Navigate to the Agents tab
  2. Click Add Agent
  3. Enter the agent name (e.g., ij-local) and the URL of your A2A agent

The URL should be the invocation URL for your A2A agent (e.g., http://localhost:10001).

Add Azure AI Foundry Agents

Follow this guide to add your Azure AI Foundry agent to the LiteLLM Agent Gateway.

Add Vertex AI Agent Engine

Follow this guide to add your Vertex AI Agent Engine agent to the LiteLLM Agent Gateway.

Add Bedrock AgentCore Agents

Follow this guide to add your Bedrock AgentCore agent to the LiteLLM Agent Gateway.

Add LangGraph Agents

Follow this guide to register a LangGraph agent and configure its agent card.

Add Pydantic AI Agents

Follow this guide to add your Pydantic AI agent to the LiteLLM Agent Gateway.

Invoking your Agents​

See the Invoking A2A Agents guide to learn how to call your agents using:

  • A2A SDK - Native A2A protocol with full support for tasks and artifacts
  • OpenAI SDK - Familiar /chat/completions interface with a2a/ model prefix
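
As a rough illustration of the second option, the sketch below builds (but does not send) a /chat/completions request whose model name carries the a2a/ prefix, using only the Python standard library. The proxy URL, the key, and the agent name my-agent are placeholders, and the helper name build_chat_request is hypothetical; the Invoking A2A Agents guide remains the reference for the supported call patterns.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, agent_name: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request that targets an agent."""
    payload = {
        # The a2a/ prefix tells the gateway to route to a registered agent.
        "model": f"a2a/{agent_name}",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your proxy URL, virtual key, and agent name.
req = build_chat_request("http://localhost:4000", "sk-your-litellm-key", "my-agent", "Hello")
# To send it against a running proxy: urllib.request.urlopen(req)
```

The same payload shape works with the OpenAI SDK by passing the prefixed name as the model argument.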

Tracking Agent Logs​

After invoking an agent, you can view the request logs in the LiteLLM Logs tab.

The logs show:

  • Request/Response content sent to and received from the agent
  • User, Key, Team information for tracking who made the request
  • Latency and cost metrics

Forwarding LiteLLM Context Headers​

When LiteLLM invokes your A2A agent, it sends special headers that enable:

  • Trace Grouping: All LLM calls from the same agent execution appear under one trace
  • Agent Spend Tracking: Costs are attributed to the specific agent

| Header | Purpose |
|---|---|
| X-LiteLLM-Trace-Id | Links all LLM calls to the same execution flow |
| X-LiteLLM-Agent-Id | Attributes spend to the correct agent |

To enable these features, your A2A server must forward these headers to any LLM calls it makes back to LiteLLM.

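Implementation Steps

Step 1: Extract headers from incoming A2A request

def get_litellm_headers(request) -> dict:
    """Extract X-LiteLLM-* headers from incoming A2A request."""
    # Headers are read from the call context state of the incoming request
    all_headers = request.call_context.state.get('headers', {})
    # Keep only the LiteLLM context headers (case-insensitive match)
    return {
        k: v for k, v in all_headers.items()
        if k.lower().startswith('x-litellm-')
    }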

Step 2: Forward headers to your LLM calls

Pass the extracted headers when making calls back to LiteLLM:

from openai import OpenAI

# Collect the X-LiteLLM-* headers from the incoming request (Step 1)
headers = get_litellm_headers(request)

client = OpenAI(
    api_key="sk-your-litellm-key",
    base_url="http://localhost:4000",
    default_headers=headers,  # Forward headers
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
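
To check the filtering logic without a running A2A server, the sketch below applies the same case-insensitive prefix filter to a stand-in request object. The SimpleNamespace object only mimics the request.call_context.state shape used in Step 1, and the header values are made up for illustration.

```python
from types import SimpleNamespace


def get_litellm_headers(request) -> dict:
    """Keep only headers whose name starts with x-litellm- (case-insensitive)."""
    all_headers = request.call_context.state.get("headers", {})
    return {k: v for k, v in all_headers.items() if k.lower().startswith("x-litellm-")}


# Stand-in for an incoming A2A request; not a real SDK object.
fake_request = SimpleNamespace(
    call_context=SimpleNamespace(
        state={
            "headers": {
                "X-LiteLLM-Trace-Id": "trace-123",
                "X-LiteLLM-Agent-Id": "agent-456",
                "Content-Type": "application/json",
            }
        }
    )
)

forwarded = get_litellm_headers(fake_request)
print(forwarded)
# {'X-LiteLLM-Trace-Id': 'trace-123', 'X-LiteLLM-Agent-Id': 'agent-456'}
```

Only the two LiteLLM context headers survive the filter; unrelated headers such as Content-Type are dropped before the dictionary is handed to the LLM client.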

Result

With header forwarding enabled, you'll see:

  • Trace grouping in Langfuse
  • Agent spend attribution

API Reference​

Endpoints

| Endpoint | Method | Purpose |
|---|---|---|
| POST /a2a/{agent_id} | JSON-RPC 2.0 | Primary: all A2A methods (see table below) |
| POST /a2a/{agent_id}/message/send | JSON-RPC | Alias for message/send only |
| POST /v1/a2a/{agent_id}/message/send | JSON-RPC | Alias for message/send only |
| GET /a2a/{agent_id}/.well-known/agent.json | Agent card | Discovery (proxy URL in url field) |
| GET /a2a/{agent_id}/.well-known/agent-card.json | Agent card | Discovery (standard path) |

{agent_id} may be the agent UUID or the registered agent name.
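
A minimal sketch of agent card discovery against the two GET paths listed above, using only the standard library. The helper names are hypothetical, and sending the virtual key on the discovery request is an assumption rather than a documented requirement.

```python
import json
import urllib.request
from urllib.parse import quote


def agent_card_url(base_url: str, agent_id: str, standard_path: bool = True) -> str:
    """Build the discovery URL for an agent UUID or registered agent name."""
    filename = "agent-card.json" if standard_path else "agent.json"
    return f"{base_url.rstrip('/')}/a2a/{quote(agent_id, safe='')}/.well-known/{filename}"


def fetch_agent_card(base_url: str, agent_id: str, api_key: str) -> dict:
    """Fetch and decode the agent card (requires a running proxy)."""
    req = urllib.request.Request(
        agent_card_url(base_url, agent_id),
        # Assumption: the key is sent in case the proxy enforces auth here.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


print(agent_card_url("http://localhost:4000", "my-agent"))
# http://localhost:4000/a2a/my-agent/.well-known/agent-card.json
```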

Supported JSON-RPC methods

Send any of these in the method field of POST /a2a/{agent_id}:

| Method | Description |
|---|---|
| message/send | Send a message; returns a task or message (LiteLLM-integrated path) |
| message/stream | Streaming variant (NDJSON/SSE) |
| tasks/get | Get task status by params.id |
| tasks/list | List tasks (optional params.contextId) |
| tasks/cancel | Cancel task by params.id |
| tasks/resubscribe | Subscribe to task updates (streaming) |
| tasks/pushNotificationConfig/set | Register push notification config |
| tasks/pushNotificationConfig/get | Get push config |
| tasks/pushNotificationConfig/list | List push configs for a task |
| tasks/pushNotificationConfig/delete | Delete push config |
| agent/getAuthenticatedExtendedCard | Extended agent card |

PascalCase SDK names (GetTask, ListTasks, …) are normalized to the slash form automatically.

Routing: message/send and message/stream go through LiteLLM's A2A client (logging, guardrails, spend). All other methods are forwarded to the upstream URL in agent_card_params.url. Task APIs require that URL; completion-bridge-only agents support messaging methods only.
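
The routing rule can be pictured with a small client-side model. This is only a sketch of the behavior described in the paragraph above, not LiteLLM's internal implementation; the function names and the returned labels are hypothetical.

```python
# Methods that go through LiteLLM's A2A client (logging, guardrails, spend).
LITELLM_INTEGRATED_METHODS = frozenset({"message/send", "message/stream"})


def routing_path(method: str) -> str:
    """Return which path a JSON-RPC method takes through the gateway."""
    if method in LITELLM_INTEGRATED_METHODS:
        return "litellm_a2a_client"
    return "forwarded_upstream"


def is_callable(method: str, has_upstream_url: bool) -> bool:
    """Non-messaging methods need an upstream URL to be forwarded to."""
    return method in LITELLM_INTEGRATED_METHODS or has_upstream_url


print(routing_path("message/send"))                     # litellm_a2a_client
print(routing_path("tasks/get"))                        # forwarded_upstream
print(is_callable("tasks/get", has_upstream_url=False)) # False
```

The last line reflects the limitation for completion-bridge-only agents: without an upstream URL, only the messaging methods are available.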

See Supported A2A methods for examples, aliases, and limitations.

Authentication​

Include your LiteLLM Virtual Key in either of two headers — x-litellm-api-key is preferred when the inbound Authorization header may carry a token destined for the backend agent (e.g. when using the convention-based passthrough to forward the caller's identity).

Authorization: Bearer sk-your-litellm-key
# or
x-litellm-api-key: Bearer sk-your-litellm-key
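
One way to choose between the two headers in client code is sketched below. The helper name is hypothetical, and the Bearer scheme on the backend token is an assumption; use whatever scheme your backend agent expects.

```python
from typing import Optional


def build_auth_headers(litellm_key: str, backend_token: Optional[str] = None) -> dict:
    """Pick the header that carries the LiteLLM virtual key.

    With no backend token, the key goes in Authorization. When a token for
    the backend agent must travel in Authorization, the virtual key moves
    to x-litellm-api-key so the two do not collide.
    """
    if backend_token is None:
        return {"Authorization": f"Bearer {litellm_key}"}
    return {
        "x-litellm-api-key": f"Bearer {litellm_key}",
        # Assumption: the backend agent expects a Bearer token.
        "Authorization": f"Bearer {backend_token}",
    }


print(build_auth_headers("sk-your-litellm-key"))
# {'Authorization': 'Bearer sk-your-litellm-key'}
```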

Per-agent permission check​

After the virtual key is authenticated, LiteLLM checks whether the calling key (and its team) is allowed to invoke the requested agent. If not, the response is HTTP 403. See Agent Permission Management for the full intersection model and access groups.

Trace ID enforcement (optional, per-agent)​

An agent can require every inbound request to carry a trace ID for cross-system audit threading. Set require_trace_id_on_calls_to_agent: true in the agent's litellm_params. When set, requests missing x-litellm-trace-id (or x-litellm-session-id) are rejected with HTTP 400.

Register an agent that requires inbound trace IDs:

curl -X POST http://localhost:4000/v1/agents \
  -H "Authorization: Bearer sk-master-key" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "audit-critical-agent",
    "agent_card_params": { ... },
    "litellm_params": {
      "require_trace_id_on_calls_to_agent": true
    }
  }'

The reverse direction — enforcing trace ID on outbound calls made by a key owned by an agent — is controlled by require_trace_id_on_calls_by_agent on the same litellm_params block.
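
On the calling side, a client can make sure a trace ID is always present before invoking an agent that enforces it. The sketch below is one possible approach; the helper name is hypothetical, and using a UUID as the trace ID value is an assumption, since the text above does not prescribe a format.

```python
import uuid
from typing import Optional


def with_trace_id(headers: dict, trace_id: Optional[str] = None) -> dict:
    """Return a copy of headers that carries a trace ID.

    An existing x-litellm-trace-id or x-litellm-session-id (any casing)
    is left untouched.
    """
    merged = dict(headers)
    present = {name.lower() for name in merged}
    if "x-litellm-trace-id" not in present and "x-litellm-session-id" not in present:
        # Assumption: any unique string is acceptable as a trace ID.
        merged["x-litellm-trace-id"] = trace_id or str(uuid.uuid4())
    return merged


headers = with_trace_id({"Authorization": "Bearer sk-1234"}, trace_id="audit-run-42")
print(headers["x-litellm-trace-id"])  # audit-run-42
```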

Sub-agent identity propagation​

When the backend agent itself calls LiteLLM (for chat completions or to invoke a sub-agent), LiteLLM forwards two headers to maintain trace continuity:

  • X-LiteLLM-Trace-Id — links all calls in the chain to a single trace
  • X-LiteLLM-Agent-Id — attributes spend to the originating agent

The caller's virtual key and end-user ID are not automatically forwarded. If the downstream agent needs the user's identity, propagate it explicitly via extra_headers or the x-a2a-{agent_name_or_id}-{header} convention.
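
The header-name convention can be assembled mechanically, as in the sketch below. The helper name and the x-user-id header are hypothetical examples; how the gateway maps the prefixed header onto the request sent to the backend is not covered here.

```python
def passthrough_header_name(agent_name_or_id: str, header: str) -> str:
    """Build a header name following x-a2a-{agent_name_or_id}-{header}."""
    return f"x-a2a-{agent_name_or_id}-{header}"


# Hypothetical: carry an end-user ID to the agent registered as "my-agent".
extra_headers = {passthrough_header_name("my-agent", "x-user-id"): "user-123"}
print(extra_headers)
# {'x-a2a-my-agent-x-user-id': 'user-123'}
```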

Request Format

LiteLLM follows the A2A JSON-RPC 2.0 specification:

Request Body
{
  "jsonrpc": "2.0",
  "id": "unique-request-id",
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [{"kind": "text", "text": "Your message here"}],
      "messageId": "unique-message-id"
    }
  }
}
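
A small helper can produce this request body with fresh IDs on each call. The function name is hypothetical, and generating the IDs with uuid is an assumption; any unique strings should do.

```python
import json
import uuid
from typing import Optional


def build_message_send(text: str, request_id: Optional[str] = None, message_id: Optional[str] = None) -> dict:
    """Build a JSON-RPC 2.0 message/send payload with a single text part."""
    return {
        "jsonrpc": "2.0",
        "id": request_id or str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": message_id or uuid.uuid4().hex,
            }
        },
    }


body = build_message_send("Your message here", request_id="req-1", message_id="msg-1")
print(json.dumps(body, indent=2))
```

POST the serialized body to /a2a/{agent_id} with your virtual key in the request headers.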

Response Format

Response
{
  "jsonrpc": "2.0",
  "id": "unique-request-id",
  "result": {
    "kind": "task",
    "id": "task-id",
    "contextId": "context-id",
    "status": {"state": "completed", "timestamp": "2025-01-01T00:00:00Z"},
    "artifacts": [
      {
        "artifactId": "artifact-id",
        "name": "response",
        "parts": [{"kind": "text", "text": "Agent response here"}]
      }
    ]
  }
}

Agent JSON-RPC errors are returned in the error field with the same id as the request when possible. Poll long-running work with tasks/get after message/send returns a submitted task.
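
A minimal sketch of reading such a response in client code: it raises on a JSON-RPC error and otherwise collects the text parts. The function name is hypothetical, and the handling of a message-kind result (text parts directly on the result) is an assumption based on the A2A message shape shown in the request format.

```python
def extract_text(response: dict) -> str:
    """Return the concatenated text parts of a message/send response."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")

    result = response["result"]
    if result.get("kind") == "message":
        # Assumption: a direct message reply carries its parts on the result.
        part_lists = [result.get("parts", [])]
    else:
        # Task reply: text lives in the parts of each artifact.
        part_lists = [a.get("parts", []) for a in result.get("artifacts", [])]

    texts = [p["text"] for parts in part_lists for p in parts if p.get("kind") == "text"]
    return "\n".join(texts)


sample = {
    "jsonrpc": "2.0",
    "id": "unique-request-id",
    "result": {
        "kind": "task",
        "id": "task-id",
        "status": {"state": "completed"},
        "artifacts": [
            {"artifactId": "artifact-id", "parts": [{"kind": "text", "text": "Agent response here"}]}
        ],
    },
}
print(extract_text(sample))  # Agent response here
```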

Example: tasks/get

Poll task after message/send:

curl -X POST "http://localhost:4000/a2a/my-agent" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": "req-2",
    "method": "tasks/get",
    "params": {"id": "task-id-from-send-response"}
  }'
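
The same polling can be wrapped in a loop. The sketch below takes the transport as a parameter so it can be tried without a running proxy; poll_task, build_tasks_get, and the send callable are hypothetical names, and the set of terminal states is an assumption drawn from the A2A task lifecycle rather than from this page.

```python
import time
from typing import Callable

# Assumption: states after which the task will not change again.
TERMINAL_STATES = {"completed", "canceled", "failed", "rejected"}


def build_tasks_get(task_id: str, request_id: str = "req-2") -> dict:
    """Build a JSON-RPC 2.0 tasks/get payload."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tasks/get", "params": {"id": task_id}}


def poll_task(send: Callable[[dict], dict], task_id: str,
              interval_s: float = 1.0, max_attempts: int = 30) -> dict:
    """Call tasks/get until the task reaches a terminal state.

    `send` posts a payload to /a2a/{agent_id} and returns the decoded JSON.
    """
    for attempt in range(max_attempts):
        response = send(build_tasks_get(task_id, request_id=f"poll-{attempt}"))
        if "error" in response:
            raise RuntimeError(f"tasks/get failed: {response['error']}")
        task = response["result"]
        if task["status"]["state"] in TERMINAL_STATES:
            return task
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} attempts")
```

A task that pauses for more input never reaches one of these states, so a production loop would likely also need to surface non-terminal states such as input-required to the caller.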

Agent Registry​

Want to create a central registry so your team can discover what agents are available within your company?

Use the AI Hub to make agents public and discoverable across your organization. This allows developers to browse available agents without needing to rebuild them.