Use Computer Use with an Agent
Assign a cloud desktop to an agent so that it can operate graphical applications. The agent works in a loop: it takes a screenshot, analyzes it, and performs a mouse or keyboard action.
Prerequisites
- You are signed in to the XpressAI Platform.
- You have a provisioned desktop in Ready status (see Provision a Cloud Desktop).
- You have a deployed agent.
Steps
- Navigate to the agent's profile page.
- Select the Desktop tab.
- Assign the provisioned desktop to the agent.
- Save the configuration.
How computer use works
When the agent is assigned a desktop, it gains access to the meeseeks_desktop tool. The agent uses a screenshot-action loop:
- The agent takes a screenshot of the desktop.
- It analyzes the screenshot to understand the current state.
- It performs an action (click, type, scroll, etc.).
- It takes another screenshot to verify the result.
- The loop repeats, up to a maximum of 25 iterations per task.
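The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the platform's implementation: `FakeDesktop`, `scripted`, and `run_task` are hypothetical names invented for this example, and the real meeseeks_desktop tool may expose a different interface. The only value taken from this guide is the 25-iteration cap.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

MAX_ITERATIONS = 25  # per-task iteration cap stated in this guide


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a cloud desktop (not the platform API)."""
    screen: str = "empty desktop"
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real desktop would return image data; a string is enough here.
        return self.screen

    def perform(self, action: str) -> None:
        # Record the action and change the screen so the next screenshot differs.
        self.log.append(action)
        self.screen = f"after {action}"


def scripted(actions: List[str]) -> Callable[[str], Optional[str]]:
    """Hypothetical 'analysis' step: replay a fixed plan, then return None."""
    remaining = list(actions)

    def decide(_screenshot: str) -> Optional[str]:
        return remaining.pop(0) if remaining else None

    return decide


def run_task(
    desktop: FakeDesktop,
    decide: Callable[[str], Optional[str]],
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[bool, int]:
    """Run the screenshot-action loop; return (finished, actions_performed)."""
    for performed in range(max_iterations):
        shot = desktop.screenshot()   # take a screenshot
        action = decide(shot)         # analyze it and pick the next action
        if action is None:            # nothing left to do: task finished
            return True, performed
        desktop.perform(action)       # click, type, scroll, ...
        # The screenshot at the top of the next iteration verifies the result.
    # Cap reached before the loop could confirm that the task was finished.
    return False, max_iterations


if __name__ == "__main__":
    desk = FakeDesktop()
    print(run_task(desk, scripted(["click", "type", "scroll"])))
```

In this sketch, a plan with more than 25 actions stops at the cap and reports `finished` as `False`, which mirrors how a long task can run out of iterations.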
Tip
You can watch the agent interact with the desktop in real time by opening a VNC connection to the same desktop (see Connect via VNC).
Common use cases
- Filling out web forms.
- Interacting with legacy desktop software that has no API.
- Performing data entry across multiple applications.
- Taking screenshots for documentation or reporting.
Warning
The 25-iteration limit prevents runaway loops. If the agent's task requires more steps, break it into smaller sub-tasks.
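One way to plan sub-tasks is to group the work into chunks that each fit comfortably under the limit. The sketch below assumes you can list the individual steps in advance and that each step costs roughly one iteration, which will not always hold because a single step may need several screenshots. The function name `split_into_subtasks` and the headroom of 5 iterations are illustrative choices, not platform settings.

```python
from typing import List, Sequence

MAX_ITERATIONS = 25  # per-task iteration cap stated in this guide
HEADROOM = 5         # assumed safety margin for retries; tune as needed


def split_into_subtasks(
    steps: Sequence[str],
    budget: int = MAX_ITERATIONS - HEADROOM,
) -> List[List[str]]:
    """Group ordered steps into chunks of at most `budget` steps each."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    return [list(steps[i:i + budget]) for i in range(0, len(steps), budget)]


if __name__ == "__main__":
    # Example: 45 form fields to fill in, one step per field.
    fields = [f"fill field {n}" for n in range(1, 46)]
    for number, chunk in enumerate(split_into_subtasks(fields), start=1):
        print(f"Sub-task {number}: {len(chunk)} steps")
```

Each chunk can then be given to the agent as a separate request, so that no single task approaches the 25-iteration limit.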
Verify
- Confirm that the agent's Desktop tab shows the assigned desktop.
- Ask the agent to perform a desktop task (for example, "Open the browser and go to example.com").
- Watch via VNC to confirm that the agent is interacting with the desktop.
- Confirm that the agent reports the result of the task in the conversation.