As the initial excitement surrounding generative artificial intelligence (AI) subsides, a critical question emerges: how can these technologies create tangible benefits in the real world? This inquiry is essential, considering that while AI chatbots serve various purposes—from information retrieval to essay writing and content generation—their utility often hinges on a human operator consistently guiding them to achieve an outcome.
Despite their undeniable contributions to productivity in certain domains, these AI systems fall short in one area that keeps them from functioning as reliable assistants: autonomous decision-making. At present, generative AI can assist with specific work responsibilities but cannot carry out tasks independently.
For example, while one can instruct an AI to draft an email informing a client of an unforeseen delay, the AI lacks the ability to send the communication or address any ensuing frustration from the recipient’s response. Similarly, requesting recommendations for the optimal smartphone for videography may yield answers like the iPhone 16 Pro Max or the Samsung Galaxy S24 Ultra, yet the AI cannot search the web for the best pricing or facilitate a purchase.
Recognizing this limitation, technology firms developing large language models (LLMs) have co-opted the term “AI agent.” Researchers assert that AI agents can advance knowledge-based systems into actionable entities that can accomplish end-to-end tasks without human oversight.
The concept gained traction in the latter half of 2024, and it is now being positioned as a solution for various workplace challenges. While there is merit to this notion, the potential of this technology is complex, warranting a thorough examination to clarify its implications and capabilities.
Defining AI Agents
As this technology evolves, there remains no universally accepted definition for what constitutes an AI agent. IBM characterizes it as a system capable of autonomously executing tasks on behalf of users by designing a workflow and utilizing specific tools. In contrast, Google, which introduced its inaugural AI agent, Project Mariner, describes it as an assistant that aids users in task completion.
A broader understanding is presented by Amazon, which defines an AI agent as a software program that interacts with its environment, gathers data, and undertakes self-directed tasks to achieve set goals. Humans initiate the objectives, but the AI agent autonomously determines the optimal actions necessary to fulfill those goals.
In essence, an AI agent represents a system capable of executing actions rather than merely advising users of potential decisions.
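The observe, decide, and act cycle implied by these definitions can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the Room environment, the choose_action rule, and the run_agent loop are all hypothetical names, and a simple comparison stands in for the decision step that a real agent would delegate to an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Toy environment (hypothetical): a room whose temperature can change."""
    temperature: float
    log: list = field(default_factory=list)

    def observe(self) -> float:
        # Gather data from the environment.
        return self.temperature

    def apply(self, action: str) -> None:
        # Each action nudges the temperature by one degree.
        if action == "heat":
            self.temperature += 1.0
        elif action == "cool":
            self.temperature -= 1.0
        self.log.append(action)


def choose_action(observation: float, goal: float) -> str:
    """Stand-in for the decision step; a real agent would consult an LLM."""
    if observation < goal:
        return "heat"
    if observation > goal:
        return "cool"
    return "stop"


def run_agent(env: Room, goal: float, max_steps: int = 20) -> list:
    """Observe, decide, act, and repeat until the goal is met."""
    for _ in range(max_steps):
        action = choose_action(env.observe(), goal)
        if action == "stop":
            break
        env.apply(action)
    return env.log


# The human sets the goal; the agent works out the actions.
room = Room(temperature=18.0)
actions = run_agent(room, goal=21.0)
print(actions, room.temperature)
```

The human supplies only the goal of 21 degrees. The loop decides which actions to take and when to stop, which is the difference between acting and advising.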
Analyzing AI Agents
Typically, an AI agent utilizes a large language model (LLM) as its core, complemented by additional components that facilitate actionable intelligence. These components often include sensors, mechanical apparatus, encoders, and software integrations.
Sensors enable AI agents to collect diverse forms of data, including images, sound, temperature, and electronic signals. Mechanical components are essential for embodied AI, where physical actions, such as lifting objects or moving from place to place, are necessary. Encoders convert these varied signals into information that LLMs can process. Software integrations, meanwhile, connect the agent to the applications and services it needs to carry out tasks.
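As a rough illustration of the encoder's role, the sketch below turns raw sensor readings into plain text that could be placed in a prompt. It is a deliberate simplification: production systems often encode images or audio into numeric representations rather than text, and the SensorReading type and encode function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One measurement from a hypothetical sensor."""
    kind: str      # e.g. "temperature" or "sound"
    value: float
    unit: str


def encode(readings: list[SensorReading]) -> str:
    """Encoder: turn raw readings into text a language model could read."""
    lines = [f"{r.kind}: {r.value:g} {r.unit}" for r in readings]
    return "Current sensor readings:\n" + "\n".join(lines)


readings = [
    SensorReading("temperature", 31.5, "C"),
    SensorReading("sound", 72, "dB"),
]
prompt = encode(readings)
print(prompt)
```

The resulting text would be passed to the LLM at the agent's core, which could then decide whether any action is needed.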
One critical distinction between AI models and AI agents is that a model’s knowledge is fixed by the data it was trained on. It cannot reliably answer questions about anything beyond that data. An early version of ChatGPT exemplified this limitation: it lacked access to the internet and was restricted by its knowledge cutoff date, so it could not answer questions about recent events.
In contrast, AI agents, when integrated with appropriate systems, can autonomously acquire new data, enabling them to tackle problems that lie beyond their existing knowledge. For instance, Google’s Project Mariner can interact with web browsers to find the best price on smartwatches.
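The contrast between a fixed knowledge base and tool-assisted lookup can be sketched as follows. Everything here is assumed for illustration: STATIC_KNOWLEDGE stands in for what a model learned in training, fake_price_lookup is a stub with made-up offers rather than a real web search, and none of it reflects how Project Mariner works internally. A real model asked about something outside its training data might also guess rather than decline, which this sketch does not capture.

```python
from typing import Callable, Optional

# Stand-in for knowledge frozen at training time (hypothetical content).
STATIC_KNOWLEDGE = {"capital of France": "Paris"}


def model_only(question: str) -> Optional[str]:
    """A model without tools can only draw on what it already holds."""
    return STATIC_KNOWLEDGE.get(question)


def agent(question: str, lookup: Callable[[str], str]) -> str:
    """An agent answers from memory when it can, and calls a tool otherwise."""
    known = model_only(question)
    if known is not None:
        return known
    return lookup(question)


def fake_price_lookup(question: str) -> str:
    """Stub tool: made-up offers standing in for live web data."""
    offers = {"ShopA": 249.0, "ShopB": 229.0, "ShopC": 239.0}
    store = min(offers, key=offers.get)  # store with the lowest price
    return f"{store} at {offers[store]:.2f}"


question = "cheapest smartwatch"
without_tools = model_only(question)
with_tools = agent(question, fake_price_lookup)
print(without_tools, "|", with_tools)
```

Passing the lookup function in as a parameter keeps the agent logic independent of any particular data source, so the stub could be swapped for a real integration.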
AI agents can also handle complex tasks. They can use reasoning to break an intricate assignment into simpler components and complete each step in sequence. This ability to frame a problem and decompose it is central to how AI agents work.
Gemini’s recently launched Deep Research tool illustrates this. It allows users to ask about complex subjects, and the AI can then craft a multi-step research strategy, segment the topic into manageable parts, find pertinent research articles, conduct comprehensive inquiries, and synthesize the findings into an in-depth report.
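The plan-then-execute pattern described above can be sketched in a few lines. This is only a schematic under stated assumptions: the plan, execute, and research functions are hypothetical, the plan is hard-coded where a real agent would ask an LLM to produce it, and nothing here describes how Deep Research is actually built.

```python
def plan(topic: str) -> list[str]:
    """Hypothetical planner; a real agent would ask an LLM for these steps."""
    return [
        f"Split '{topic}' into subtopics",
        f"Find sources on '{topic}'",
        f"Summarize findings on '{topic}'",
    ]


def execute(step: str) -> str:
    """Stub executor: a real agent would call search or reading tools here."""
    return f"done: {step}"


def research(topic: str) -> str:
    """Run each planned step in order, then combine the results into a report."""
    results = [execute(step) for step in plan(topic)]
    return "\n".join(f"{i}. {r}" for i, r in enumerate(results, start=1))


report = research("battery recycling")
print(report)
```

Separating planning from execution means each step can be checked or retried on its own, which is one reason decomposition helps with long tasks.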
Potential Applications for AI Agents
AI companies are heralding AI agents as versatile solutions across various industries and scenarios. These agents can operate as voice assistants on devices, performing specific tasks such as capturing images or playing audio. They can be embedded into applications or software to execute operations, such as making purchases through a browser-based agent. Additionally, they can enhance enterprise systems by identifying fraudulent activity or optimizing operational processes.
Moreover, AI agents are poised to enact transformative changes within specific sectors. In healthcare, they may assist with diagnosis, treatment suggestions, and drug discovery. The automotive industry may deploy them in the creation of self-driving vehicles. Additionally, AI agents could be utilized to navigate drones in disaster-stricken areas, gathering and interpreting data to offer actionable insights for rescue efforts.
Potential uses also span manufacturing, where AI-powered robots may play a significant role, and the gaming industry, where they can function as game developers or non-player characters (NPCs). In education, these agents could develop personalized study curricula and evaluate examinations much as human educators do.
Despite the promising narrative surrounding AI agents as comprehensive solutions for intelligent automation, current technology typically limits their application to specific task-driven roles rather than serving as general-purpose tools.
Forecast for AI Agents in 2025
As we consider the future, it is vital to temper expectations regarding the capabilities of AI agents in the coming year. Their integration into critical sectors such as manufacturing, automotive, healthcare, or education seems unlikely at this stage.
However, significant advancements are anticipated in consumer electronics, mobile and desktop applications, and various online platforms. For example, by the end of this year, Google’s Project Mariner may be incorporated into Google Chrome, assisting users in purchasing decisions and file retrieval from the web.
OpenAI is rumored to launch its own AI agent within the same timeframe, potentially augmenting ChatGPT’s functionality to facilitate specific actions on user devices and the internet. Additionally, Anthropic’s Computer Use tool may see a global rollout, aiding users with everyday tasks across devices.
Looking ahead, we may see AI agents that mimic keystrokes, mouse movements, and clicks to perform a range of functions. By the year’s end, developments in agentic tools, such as the coding agent Devin, may lead to capabilities in writing, testing, debugging, and deploying code autonomously. Expecting all of these features to arrive in 2025, however, is optimistic.
In corporate environments, AI agents might take on more substantial responsibilities, including monitoring extensive datasets, preparing analytical reports, and delivering insights and recommendations. They may also find roles in cybersecurity and content moderation, much as Meta currently uses AI to enforce compliance with its guidelines and YouTube uses AI to monitor copyright infringement.
Nonetheless, organizations remain hesitant to deploy AI agents in essential work functions this year, primarily because the technology is untested and its reliability is in question. Organizations, particularly public ventures or those with substantial funding, are generally cautious about granting access to sensitive information.
Challenges Facing AI Agents
With AI currently at the forefront of technological discourse and holding the potential to revolutionize various industries, the excitement surrounding AI agents is understandable. However, several issues must be addressed before the technology can be widely adopted because, left unchecked, it poses various risks.
One primary concern is bias and discrimination stemming from training data, which can lead to unequal outcomes. A related issue is transparency: the complexity of AI algorithms makes it difficult to discern how and why agents arrive at specific decisions.
Security and privacy concerns are also significant. From a security standpoint, AI agents are susceptible to adversarial attacks where malicious entities manipulate input data to mislead systems. Additionally, because AI agents require connections to multiple systems and need to acquire extensive data to perform tasks, they also present privacy vulnerabilities.
Given these challenges, AI firms face a substantial task in persuading businesses and individuals of the technology’s advantages while also addressing its drawbacks. Nonetheless, AI agents will likely play a central role in the AI developments anticipated for 2025.