Incident Report: vLLM Embeddings Broken by encoding_format Parameter

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

  • Date: Feb 16, 2026
  • Duration: ~3 hours
  • Severity: High (for vLLM embedding users)
  • Status: Resolved

Summary

A commit (dbcae4a) intended to fix OpenAI SDK behavior broke vLLM embeddings by explicitly passing encoding_format=None in API requests. vLLM rejects this with the error: "unknown variant ``, expected `float` or `base64`".

  • vLLM embedding calls: Complete failure - all requests rejected
  • Other providers: No impact - OpenAI and other providers functioned normally
  • Other vLLM functionality: No impact - only embeddings were affected

Background

The encoding_format parameter for embeddings specifies whether vectors should be returned as float arrays or base64 encoded strings. Different providers have different expectations:

  • OpenAI SDK: If encoding_format is omitted, the SDK adds a default value of "float"
  • vLLM: Strictly validates encoding_format - only accepts "float", "base64", or complete omission. Rejects None or empty string values.
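vLLM's strict behavior can be sketched with a small validator. This is a simplified illustration, not vLLM's actual code (which does this validation in Rust-style serde parsing); the function name is hypothetical:

```python
def validate_encoding_format(params: dict) -> None:
    """Mimic vLLM's strict handling: the key may be absent entirely,
    but if it is present, the value must be exactly "float" or "base64"."""
    if "encoding_format" not in params:
        return  # complete omission is accepted
    value = params["encoding_format"]
    if value not in ("float", "base64"):
        # vLLM's parser raises a similar "unknown variant" error here
        raise ValueError(f"unknown variant {value!r}, expected float or base64")

validate_encoding_format({})                             # OK: parameter omitted
validate_encoding_format({"encoding_format": "float"})   # OK
validate_encoding_format({"encoding_format": "base64"})  # OK
# validate_encoding_format({"encoding_format": None})    # raises ValueError
```

The key asymmetry: OpenAI's SDK treats an omitted parameter and an explicit None similarly, while a validator like this treats them as entirely different cases.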

Root cause

A well-intentioned fix for OpenAI SDK behavior inadvertently broke vLLM embeddings:

The Breaking Change (dbcae4a):

In litellm/main.py, the code was changed to explicitly set encoding_format=None instead of omitting it:

```python
# Added in dbcae4a
if encoding_format is not None:
    optional_params["encoding_format"] = encoding_format
else:
    # Omitting causes openai sdk to add default value of "float"
    optional_params["encoding_format"] = None
```

This fix worked correctly for OpenAI - explicitly passing None prevented the SDK from adding its default value. However, vLLM's strict parameter validation rejected None values, causing all embedding requests to fail.
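The failure mode is visible at the serialization layer. A minimal sketch (the model name is hypothetical) of what the request body looked like after the change:

```python
import json

# What the dbcae4a change produced: encoding_format set explicitly to None
optional_params = {"encoding_format": None}
data = {"model": "my-embedding-model", "input": ["hello"], **optional_params}

# Python's None serializes to JSON null, which vLLM's strict parser
# treats as an invalid variant rather than an omitted parameter
payload = json.dumps(data)
print(payload)
# {"model": "my-embedding-model", "input": ["hello"], "encoding_format": null}
```

The OpenAI SDK happens to tolerate the explicit None (and skips injecting its "float" default), but any provider that validates the raw JSON strictly sees `"encoding_format": null` on the wire and rejects it.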


The Fix

The fix (commit 55348dd) filters None and empty-string values out of optional_params before sending requests to OpenAI-like providers (including vLLM).

In litellm/llms/openai_like/embedding/handler.py:

```python
# Before (broken)
data = {"model": model, "input": input, **optional_params}

# After (fixed)
filtered_optional_params = {k: v for k, v in optional_params.items() if v not in (None, "")}
data = {"model": model, "input": input, **filtered_optional_params}
```

This ensures:

  • Valid values ("float", "base64") are preserved and sent
  • None and empty string values are filtered out (parameter omitted entirely)
  • OpenAI SDK no longer adds defaults because LiteLLM handles the parameter upstream
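The three guarantees above can be exercised directly. A standalone sketch of the filtering logic (the helper name is ours, not LiteLLM's):

```python
def filter_optional_params(optional_params: dict) -> dict:
    # Drop None and empty-string values so the parameter is omitted entirely
    return {k: v for k, v in optional_params.items() if v not in (None, "")}

# None is dropped -> key omitted from the payload, vLLM accepts the request
assert filter_optional_params({"encoding_format": None}) == {}
# Empty string is dropped as well
assert filter_optional_params({"encoding_format": ""}) == {}
# Valid values pass through untouched
assert filter_optional_params({"encoding_format": "base64"}) == {"encoding_format": "base64"}
# Unrelated parameters are unaffected
assert filter_optional_params({"dimensions": 512, "user": None}) == {"dimensions": 512}
```

One subtlety of `v not in (None, "")`: membership is tested by equality, so falsy but legitimate values such as 0 or False are preserved, which a naive `if v` truthiness check would have silently dropped.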

Remediation

| # | Action | Status | Code |
|---|--------|--------|------|
| 1 | Filter None and empty string values in OpenAI-like embedding handler | ✅ Done | handler.py#L108 |
| 2 | Unit tests for parameter filtering (None, empty string, valid values) | ✅ Done | test_openai_like_embedding.py |
| 3 | Transformation tests for hosted_vllm embedding config | ✅ Done | test_hosted_vllm_embedding_transformation.py |
| 4 | E2E tests with actual vLLM endpoint | ✅ Done | test_hosted_vllm_embedding_e2e.py |
| 5 | Validate JSON payload structure matches vLLM expectations | ✅ Done | Tests verify exact JSON sent to endpoint |