Dario Amodei, the CEO of Anthropic, has claimed that artificial intelligence (AI) models hallucinate less often than humans do. He made the remark at the company’s inaugural Code With Claude event on Thursday, where Anthropic unveiled two new Claude 4 models along with several new features, including improved memory and tool use. Amodei also addressed criticism of AI, saying that while people keep looking for hard limits on what the technology can do, “they are nowhere to be seen.”
Anthropic CEO Downplays AI Hallucinations
According to TechCrunch, Amodei made these remarks during a press briefing, arguing that hallucinations are not an obstacle on AI’s path to artificial general intelligence (AGI). In response to a question from the publication, he said, “It really depends on how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways.”
He added that TV reporters, politicians, and professionals in many other fields make mistakes all the time, and argued that the errors AI makes therefore do not count against its intelligence. Nonetheless, Amodei acknowledged that the confidence with which AI models present false information as fact remains a concern.
Earlier this month, Anthropic ran into trouble in court when a citation generated with its Claude chatbot turned out to be incorrect, as reported by Bloomberg. The incident occurred amid the company’s ongoing legal battle with music publishers over alleged copyright violations involving the lyrics of more than 500 songs.
In an essay published in October 2024, Amodei suggested that AGI could arrive as early as 2026. AGI generally refers to AI technology that can understand, learn, and apply knowledge across a wide variety of tasks and act without human input.
At the developer conference, Anthropic introduced Claude Opus 4 and Claude Sonnet 4, which bring significant improvements in coding, tool use, and writing. Notably, Claude Sonnet 4 scored 72.7 percent on the SWE-Bench benchmark, placing it among the state-of-the-art models for coding tasks.