On Tuesday, Hugging Face released a demo of an artificial intelligence (AI) agent that can carry out tasks on the web. Named Open Computer Agent, the tool is free to use through the company's website. It works inside a web browser and can autonomously navigate sites such as Google Search, Google Maps, and ticket booking services to complete the tasks it is given.
Open Computer Agent Is Now Available to Everyone
Aymeric Roucher, Project Lead – Agents at Hugging Face, shared the news of the Open Computer Agent’s launch in a post on X (formerly Twitter). As its name implies, the agent is an open-source solution designed to automatically perform a diverse set of tasks. It operates on a Linux virtual machine equipped with a variety of applications and uses the Mozilla Firefox web browser.
According to Roucher, the AI agent is driven by the Qwen2-VL-72B vision-language model, which enables it to pinpoint elements on the screen based on their coordinates. This capability allows the agent to assess what it sees, take appropriate actions, and proceed to subsequent tasks. Hugging Face has integrated this agentic functionality using the smolagents library introduced earlier this year.
We’re launching Computer Use in smolagents! 🥳
-> As vision models become more capable, they become able to power complex agentic workflows. Especially Qwen-VL models, that support built-in grounding, i.e. ability to locate any element in an image by its coordinates, thus to…
— m_ric (@AymericRoucher) May 6, 2025
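To make the see-then-act cycle described above concrete, here is a minimal Python sketch of such a loop. It is an illustration under stated assumptions, not Hugging Face's implementation and not the smolagents API: `Action`, `run_agent`, `scripted_model`, and the callback parameters are hypothetical names invented for this example, and the vision-language model is replaced by a scripted stand-in so the code runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical structure for this sketch)."""
    kind: str        # "click", "type", or "done"
    x: int = 0       # screen coordinates, used by "click"
    y: int = 0
    text: str = ""   # text to enter, used by "type"


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: look at the screen, ask the model for an action, carry it out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()                     # observe the screen
        action = choose_action(task, screenshot, history)  # model decides
        history.append(action)
        if action.kind == "done":                          # model reports completion
            break
        perform(action)                                    # click or type
    return history


def scripted_model(task: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a vision-language model: replays a fixed script.

    Assumption: a real model would inspect the screenshot and return the
    coordinates of the element to interact with.
    """
    script = [
        Action("click", x=640, y=120),  # e.g. focus a search box
        Action("type", text=task),      # enter the user's request
        Action("done"),
    ]
    return script[len(history)]
```

Calling `run_agent("directions to Paris", lambda: b"", scripted_model, print)` steps through the script and stops when the stand-in model returns a `done` action. The `max_steps` cap is a design choice for this sketch: it keeps the loop from running indefinitely if the model never reports completion.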
The AI agent is available to all users through the Open Computer Agent site. For instance, a user can ask the agent for directions to a location, and it will open Google Maps, enter the starting point and the destination, and request the route so that the results are displayed.
Testing by Gadgets 360 staff members found that while the AI agent largely works as intended, it can be slow. The agent tends to struggle with more complex prompts, which sometimes leads to errors or incomplete tasks. Additionally, because it is a free, cloud-based tool, users may face long wait times before a task starts running.