
Computer Use

Computer use allows models to interact with computer interfaces by taking screenshots and performing actions like clicking, typing, and scrolling. This enables AI models to autonomously operate desktop environments.

Supported Providers:

  • Anthropic API (anthropic/)
  • Bedrock (Anthropic) (bedrock/)
  • Vertex AI (Anthropic) (vertex_ai/)

Supported Tool Types:

  • computer - Computer interaction tool with display parameters
  • bash - Bash shell tool
  • text_editor - Text editor tool
  • web_search - Web search tool

LiteLLM standardizes the computer use tools across all supported providers.

Quick Start

import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# Computer use tool
tools = [
    {
        "type": "computer_20241022",
        "name": "computer",
        "display_height_px": 768,
        "display_width_px": 1024,
        "display_number": 0,
    }
]

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Take a screenshot and tell me what you see"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": ""
                }
            }
        ]
    }
]

response = completion(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=messages,
    tools=tools,
)

print(response)
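When the model decides to use a tool, the request usually shows up as tool calls on the response message rather than as plain text. The sketch below assumes the response follows the OpenAI-style shape (choices[0].message.tool_calls, with arguments as a JSON string). The helper name extract_tool_calls and the stand-in response object are illustrative and not part of LiteLLM; the stand-in only exists so the example runs without an API call.

```python
import json
from types import SimpleNamespace
from typing import Any


def extract_tool_calls(response: Any) -> list:
    """Return [{'name': ..., 'arguments': {...}}, ...] for the first choice.

    Assumes an OpenAI-style response object (an assumption, not confirmed
    by the text above).
    """
    message = response.choices[0].message
    calls = getattr(message, "tool_calls", None) or []
    parsed = []
    for call in calls:
        # Arguments are assumed to arrive as a JSON-encoded string.
        parsed.append(
            {
                "name": call.function.name,
                "arguments": json.loads(call.function.arguments or "{}"),
            }
        )
    return parsed


# Stand-in object with the assumed shape; the argument payload is illustrative.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(
                content=None,
                tool_calls=[
                    SimpleNamespace(
                        function=SimpleNamespace(
                            name="computer",
                            arguments='{"action": "screenshot"}',
                        )
                    )
                ],
            )
        )
    ]
)

print(extract_tool_calls(fake_response))
```

With a real call, the object returned by completion() would be passed in place of fake_response.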

Checking if a model supports computer use

Call litellm.supports_computer_use(model="") with the model name. It returns True if the model supports computer use and False otherwise.

import litellm

assert litellm.supports_computer_use(model="anthropic/claude-3-5-sonnet-latest")
assert litellm.supports_computer_use(model="anthropic/claude-3-7-sonnet-20250219")
assert litellm.supports_computer_use(model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0")
assert litellm.supports_computer_use(model="vertex_ai/claude-3-5-sonnet")
assert not litellm.supports_computer_use(model="openai/gpt-4")

Different Tool Types

Computer use supports several tool types for different interaction modes.

The computer_20241022 tool provides direct screen interaction:

import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

tools = [
    {
        "type": "computer_20241022",
        "name": "computer",
        "display_height_px": 768,
        "display_width_px": 1024,
        "display_number": 0,
    }
]

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Click on the search button in the screenshot"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": ""
                }
            }
        ]
    }
]

response = completion(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=messages,
    tools=tools,
)

print(response)
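The image_url entries in these examples carry an empty url. One common way to supply a screenshot is a base64 data URL, which is assumed here to be accepted in the url field. The helper name image_part_from_bytes is hypothetical, and the bytes below are a placeholder rather than a real image.

```python
import base64


def image_part_from_bytes(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build an image_url content part carrying the image as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes (just the PNG signature); in practice these would be
# the contents of a real screenshot file.
part = image_part_from_bytes(b"\x89PNG\r\n\x1a\n")
print(part["image_url"]["url"])
```

The returned dict can be appended to the content list of a user message alongside the text part.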

Advanced Usage with Multiple Tools

You can combine different computer use tools in a single request:

import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

tools = [
    {
        "type": "computer_20241022",
        "name": "computer",
        "display_height_px": 768,
        "display_width_px": 1024,
        "display_number": 0,
    },
    {
        "type": "bash_20241022",
        "name": "bash"
    },
    {
        "type": "text_editor_20250124",
        "name": "str_replace_editor"
    }
]

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Take a screenshot, then create a file describing what you see, and finally use bash to show the file contents"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": ""
                }
            }
        ]
    }
]

response = completion(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=messages,
    tools=tools,
)

print(response)
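With several tools in one request, the calling application has to route each requested tool call to the right implementation. The following is a minimal sketch of routing by tool name, using the names from the tools list above. The handlers are hypothetical stubs that only return strings, and the argument keys (action, command) are assumptions about the payload rather than something this page specifies.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def dispatch_tool_call(name: str, arguments: dict, handlers: Dict[str, Handler]) -> str:
    """Route one tool call to its handler, or report that none is registered."""
    handler = handlers.get(name)
    if handler is None:
        return f"error: no handler registered for tool '{name}'"
    return handler(arguments)


# Hypothetical handlers; real ones would drive a display, a shell, or an editor.
# The argument keys used here are assumptions for illustration.
handlers: Dict[str, Handler] = {
    "computer": lambda args: f"computer action: {args.get('action')}",
    "bash": lambda args: f"would run: {args.get('command')}",
    "str_replace_editor": lambda args: f"editor command: {args.get('command')}",
}

print(dispatch_tool_call("bash", {"command": "cat notes.txt"}, handlers))
```

Returning an error string for unknown names, instead of raising, keeps the result in a form that could be sent back to the model as a tool result.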

Spec

Computer Tool (computer_20241022)

{
    "type": "computer_20241022",
    "name": "computer",
    "display_height_px": 768, // Required: Screen height in pixels
    "display_width_px": 1024, // Required: Screen width in pixels
    "display_number": 0 // Optional: Display number (default: 0)
}
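The required and optional fields above can be enforced with a small builder before a request is sent. This is a sketch only: make_computer_tool is a hypothetical helper, and the positive-integer check is an added assumption about sensible values, not a rule stated in the spec.

```python
def make_computer_tool(
    display_width_px: int,
    display_height_px: int,
    display_number: int = 0,
) -> dict:
    """Build a computer_20241022 tool definition with basic validation."""
    for label, value in (
        ("display_width_px", display_width_px),
        ("display_height_px", display_height_px),
    ):
        # Assumed sanity check: dimensions must be positive integers.
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{label} must be a positive integer, got {value!r}")
    return {
        "type": "computer_20241022",
        "name": "computer",
        "display_height_px": display_height_px,
        "display_width_px": display_width_px,
        "display_number": display_number,  # optional in the spec, defaults to 0
    }


tools = [make_computer_tool(display_width_px=1024, display_height_px=768)]
print(tools)
```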

Bash Tool (bash_20241022)

{
    "type": "bash_20241022",
    "name": "bash" // Required: Tool name
}

Text Editor Tool (text_editor_20250124)

{
    "type": "text_editor_20250124",
    "name": "str_replace_editor" // Required: Tool name
}

Web Search Tool (web_search_20250305)

{
    "type": "web_search_20250305",
    "name": "web_search" // Required: Tool name
}