Quick Start​

from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(model='text-embedding-ada-002', input=["good morning from litellm"])

Proxy Usage​

NOTE: For vertex_ai, point GOOGLE_APPLICATION_CREDENTIALS at your service account file:

export GOOGLE_APPLICATION_CREDENTIALS="absolute/path/to/service_account.json"

Add model to config​

model_list:
  - model_name: textembedding-gecko
    litellm_params:
      model: vertex_ai/textembedding-gecko

general_settings:
  master_key: sk-1234

Start proxy​

litellm --config /path/to/config.yaml 



curl --location '' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{"input": [" uses"], "model": "textembedding-gecko", "encoding_format": "base64"}'

Image Embeddings​

For models that support image embeddings, you can pass in a base64 encoded image string to the input param.

from litellm import embedding
import os

# set your api key
os.environ["COHERE_API_KEY"] = ""

response = embedding(model="cohere/embed-english-v3.0", input=["<base64 encoded image>"])
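
The base64 string passed as input above has to be produced from the image bytes first. A minimal sketch using only the standard library; the helper name encode_image_base64 is hypothetical and not part of LiteLLM, and some providers may expect a data-URI prefix in front of the encoded string, so check the provider's documentation:

```python
import base64
from pathlib import Path


def encode_image_base64(path: str) -> str:
    """Read an image file and return its contents as a base64 string."""
    raw_bytes = Path(path).read_bytes()
    # b64encode returns bytes; decode to str so it can go into the input list.
    return base64.b64encode(raw_bytes).decode("ascii")
```

The result can then be passed as an element of the input list, for example input=[encode_image_base64("photo.png")], where photo.png is a placeholder path.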

Input Params for litellm.embedding()​


Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params

See Example

Required Fields​

  • model: string - ID of the model to use, e.g. model='text-embedding-ada-002'

  • input: string or array - Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for text-embedding-ada-002), cannot be an empty string, and any array must be 2048 dimensions or less.

input=["good morning from litellm"]
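
The constraints on input can be checked on the client before a request is sent. A minimal sketch, assuming a hypothetical helper named batch_inputs that is not part of LiteLLM, and reading the 2048 limit as a cap on the number of items per request. It does not check the per-model token limit:

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 2048  # assumed cap on items per request, taken from the text above


def batch_inputs(texts: List[str], batch_size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield slices of `texts` that each hold at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Empty strings are rejected up front, since the API does not accept them.
    if any(text == "" for text in texts):
        raise ValueError("input must not contain empty strings")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]
```

Each yielded batch can be passed as the input argument of a separate embedding() call.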

Optional LiteLLM Fields​

  • user: string (optional) - A unique identifier representing your end-user.

  • dimensions: integer (Optional) The number of dimensions the resulting output embeddings should have. Only supported in OpenAI/Azure text-embedding-3 and later models.

  • encoding_format: string (Optional) The format to return the embeddings in. Can be either "float" or "base64". Defaults to encoding_format="float"

  • timeout: integer (Optional) - The maximum time, in seconds, to wait for the API to respond. Defaults to 600 seconds (10 minutes).

  • api_base: string (optional) - The api endpoint you want to call the model with

  • api_version: string (optional) - (Azure-specific) the api version for the call

  • api_key: string (optional) - The API key to authenticate and authorize requests. If not provided, the default API key is used.

  • api_type: string (optional) - The type of API to use.
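
When encoding_format="base64" is used, each embedding comes back as a string rather than a list of floats. A minimal sketch of decoding it, under the assumption that the payload is a packed array of little-endian 32-bit floats (the layout OpenAI uses; other providers may differ). The helper name decode_base64_embedding is hypothetical:

```python
import base64
import struct
from typing import List


def decode_base64_embedding(encoded: str) -> List[float]:
    """Decode a base64 string into a list of floats (assumes little-endian float32)."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" = little-endian, "f" = 32-bit float, repeated `count` times.
    return list(struct.unpack(f"<{count}f", raw))
```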

Output from litellm.embedding()

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [ ... ]
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

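Given a response of the shape shown above, the vectors can be pulled out in input order by sorting on the index field. A minimal sketch that works on a plain dict; the helper name extract_vectors is hypothetical and the numbers in the sample are made up for illustration:

```python
from typing import Any, Dict, List


def extract_vectors(response: Dict[str, Any]) -> List[List[float]]:
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative response with made-up values, not real model output.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "text-embedding-ada-002-v2",
    "usage": {"prompt_tokens": 10, "total_tokens": 10},
}

vectors = extract_vectors(sample_response)
```
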
OpenAI Embedding Models​


from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(
    model="text-embedding-3-small",
    input=["good morning from litellm", "this is another item"],
    metadata={"anything": "good day"},
    dimensions=5  # Only supported in text-embedding-3 and later models.
)

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| text-embedding-3-small | embedding('text-embedding-3-small', input) | os.environ['OPENAI_API_KEY'] |
| text-embedding-3-large | embedding('text-embedding-3-large', input) | os.environ['OPENAI_API_KEY'] |
| text-embedding-ada-002 | embedding('text-embedding-ada-002', input) | os.environ['OPENAI_API_KEY'] |

Azure OpenAI Embedding Models​

API keys​

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ['AZURE_API_KEY'] = ""
os.environ['AZURE_API_BASE'] = ""
os.environ['AZURE_API_VERSION'] = ""


from litellm import embedding
response = embedding(
    model="azure/<your deployment name>",
    input=["good morning from litellm"],
)

| Model Name | Function Call |
|---|---|
| text-embedding-ada-002 | embedding(model="azure/<your deployment name>", input=input) |

h/t to Mikko for this integration

OpenAI Compatible Embedding Models​

Use this for calling /embedding endpoints on OpenAI-compatible servers.

Note: add the openai/ prefix to the model name so LiteLLM knows to route the request to OpenAI.


from litellm import embedding
response = embedding(
    model="openai/<your-llm-name>",  # add `openai/` prefix to model so litellm knows to route to OpenAI
    api_base="",  # set API Base of your Custom OpenAI Endpoint
    input=["good morning from litellm"],
)

Bedrock Embedding​

API keys​

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ["AWS_ACCESS_KEY_ID"] = "" # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2


from litellm import embedding
response = embedding(
    model="amazon.titan-embed-text-v1",  # any model from the table below
    input=["good morning from litellm"],
)

| Model Name | Function Call |
|---|---|
| Titan Embeddings - G1 | embedding(model="amazon.titan-embed-text-v1", input=input) |
| Cohere Embeddings - English | embedding(model="cohere.embed-english-v3", input=input) |
| Cohere Embeddings - Multilingual | embedding(model="cohere.embed-multilingual-v3", input=input) |

Cohere Embedding Models​


from litellm import embedding
import os
os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document"  # optional param for v3 llms
)

| Model Name | Function Call |
|---|---|
| embed-english-v3.0 | embedding(model="embed-english-v3.0", input=["good morning from litellm", "this is another item"]) |
| embed-english-light-v3.0 | embedding(model="embed-english-light-v3.0", input=["good morning from litellm", "this is another item"]) |
| embed-multilingual-v3.0 | embedding(model="embed-multilingual-v3.0", input=["good morning from litellm", "this is another item"]) |
| embed-multilingual-light-v3.0 | embedding(model="embed-multilingual-light-v3.0", input=["good morning from litellm", "this is another item"]) |
| embed-english-v2.0 | embedding(model="embed-english-v2.0", input=["good morning from litellm", "this is another item"]) |
| embed-english-light-v2.0 | embedding(model="embed-english-light-v2.0", input=["good morning from litellm", "this is another item"]) |
| embed-multilingual-v2.0 | embedding(model="embed-multilingual-v2.0", input=["good morning from litellm", "this is another item"]) |

NVIDIA NIM Embedding Models​

API keys​

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ["NVIDIA_NIM_API_KEY"] = "" # api key
os.environ["NVIDIA_NIM_API_BASE"] = "" # nim endpoint url


from litellm import embedding
import os
os.environ['NVIDIA_NIM_API_KEY'] = ""
response = embedding(
    model="nvidia_nim/NV-Embed-QA",  # any model from the table below
    input=["good morning from litellm"]
)

All models listed here are supported:

| Model Name | Function Call |
|---|---|
| NV-Embed-QA | embedding(model="nvidia_nim/NV-Embed-QA", input) |
| nvidia/nv-embed-v1 | embedding(model="nvidia_nim/nvidia/nv-embed-v1", input) |
| nvidia/nv-embedqa-mistral-7b-v2 | embedding(model="nvidia_nim/nvidia/nv-embedqa-mistral-7b-v2", input) |
| nvidia/nv-embedqa-e5-v5 | embedding(model="nvidia_nim/nvidia/nv-embedqa-e5-v5", input) |
| nvidia/embed-qa-4 | embedding(model="nvidia_nim/nvidia/embed-qa-4", input) |
| nvidia/llama-3.2-nv-embedqa-1b-v1 | embedding(model="nvidia_nim/nvidia/llama-3.2-nv-embedqa-1b-v1", input) |
| nvidia/llama-3.2-nv-embedqa-1b-v2 | embedding(model="nvidia_nim/nvidia/llama-3.2-nv-embedqa-1b-v2", input) |
| snowflake/arctic-embed-l | embedding(model="nvidia_nim/snowflake/arctic-embed-l", input) |
| baai/bge-m3 | embedding(model="nvidia_nim/baai/bge-m3", input) |

HuggingFace Embedding Models​

LiteLLM supports all Feature-Extraction + Sentence Similarity Embedding models:


from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model="huggingface/microsoft/codebert-base",
    input=["good morning from litellm"]
)

Usage - Set input_type​

LiteLLM infers input type (feature-extraction or sentence-similarity) by making a GET request to the api base.

Override this, by setting the input_type yourself.

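from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model="huggingface/microsoft/codebert-base",
    input=["good morning from litellm", "you are a good bot"],
    api_base="",
    input_type="sentence-similarity",  # or "feature-extraction"
)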

Usage - Custom API Base​

from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model="huggingface/microsoft/codebert-base",
    input=["good morning from litellm"],
    api_base=""
)

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| microsoft/codebert-base | embedding('huggingface/microsoft/codebert-base', input=input) | os.environ['HUGGINGFACE_API_KEY'] |
| BAAI/bge-large-zh | embedding('huggingface/BAAI/bge-large-zh', input=input) | os.environ['HUGGINGFACE_API_KEY'] |
| any-hf-embedding-model | embedding('huggingface/hf-embedding-model', input=input) | os.environ['HUGGINGFACE_API_KEY'] |

Mistral AI Embedding Models​

All models listed here are supported


from litellm import embedding
import os

os.environ['MISTRAL_API_KEY'] = ""
response = embedding(
    model="mistral/mistral-embed",
    input=["good morning from litellm"],
)

| Model Name | Function Call |
|---|---|
| mistral-embed | embedding(model="mistral/mistral-embed", input) |

Gemini AI Embedding Models​

API keys​

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ["GEMINI_API_KEY"] = ""

Usage - Embedding​

from litellm import embedding
response = embedding(
    model="gemini/text-embedding-004",
    input=["good morning from litellm"],
)

All models listed here are supported:

| Model Name | Function Call |
|---|---|
| text-embedding-004 | embedding(model="gemini/text-embedding-004", input) |

Vertex AI Embedding Models​

Usage - Embedding​

import litellm
from litellm import embedding
litellm.vertex_project = "hardy-device-38811"  # Your Project ID
litellm.vertex_location = "us-central1"  # proj location

response = embedding(
    model="vertex_ai/textembedding-gecko",  # any model from the table below
    input=["good morning from litellm"],
)

Supported Models​

All models listed here are supported

| Model Name | Function Call |
|---|---|
| textembedding-gecko | embedding(model="vertex_ai/textembedding-gecko", input) |
| textembedding-gecko-multilingual | embedding(model="vertex_ai/textembedding-gecko-multilingual", input) |
| textembedding-gecko-multilingual@001 | embedding(model="vertex_ai/textembedding-gecko-multilingual@001", input) |
| textembedding-gecko@001 | embedding(model="vertex_ai/textembedding-gecko@001", input) |
| textembedding-gecko@003 | embedding(model="vertex_ai/textembedding-gecko@003", input) |
| text-embedding-preview-0409 | embedding(model="vertex_ai/text-embedding-preview-0409", input) |
| text-multilingual-embedding-preview-0409 | embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input) |

Voyage AI Embedding Models​

Usage - Embedding​

from litellm import embedding
import os

os.environ['VOYAGE_API_KEY'] = ""
response = embedding(
    model="voyage/voyage-01",  # any model from the table below
    input=["good morning from litellm"],
)

Supported Models​

All models listed here are supported

| Model Name | Function Call |
|---|---|
| voyage-01 | embedding(model="voyage/voyage-01", input) |
| voyage-lite-01 | embedding(model="voyage/voyage-lite-01", input) |
| voyage-lite-01-instruct | embedding(model="voyage/voyage-lite-01-instruct", input) |

Provider-specific Params​


Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params


Cohere v3 models have a required parameter, input_type, which can be one of the following four values:

  • input_type="search_document": (default) Use this for texts (documents) you want to store in your vector database
  • input_type="search_query": Use this for search queries to find the most relevant documents in your vector database
  • input_type="classification": Use this if you use the embeddings as an input for a classification system
  • input_type="clustering": Use this if you use the embeddings for text clustering

from litellm import embedding
import os
os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document"  # 👈 PROVIDER-SPECIFIC PARAM
)
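
Because input_type is passed through as a plain keyword argument, a typo in its value would only surface once the provider rejects the request. A minimal sketch of validating it locally against the four values listed above; the helper name cohere_embedding_kwargs and the default model are illustrative assumptions, not part of LiteLLM:

```python
from typing import Any, Dict, List

# The four input_type values listed above for Cohere v3 models.
COHERE_INPUT_TYPES = {"search_document", "search_query", "classification", "clustering"}


def cohere_embedding_kwargs(
    texts: List[str],
    input_type: str,
    model: str = "embed-english-v3.0",  # assumed default, for illustration only
) -> Dict[str, Any]:
    """Build keyword arguments for embedding(), rejecting unknown input_type values."""
    if input_type not in COHERE_INPUT_TYPES:
        allowed = ", ".join(sorted(COHERE_INPUT_TYPES))
        raise ValueError(f"input_type must be one of: {allowed}")
    return {"model": model, "input": texts, "input_type": input_type}
```

The returned dict can be unpacked into the call, for example embedding(**cohere_embedding_kwargs(["what is litellm?"], "search_query")).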