One of the newest entrants in the competitive world of artificial intelligence says its new models can outperform anything built by the world’s biggest tech companies.
AI tech startup Anthropic on Monday announced its newest AI models, collectively called Claude 3, and touted their ability to complete complex tasks almost instantly, including transcribing handwritten notes, analyzing graphs and translating languages.
Claude 3 is made up of three models: Opus, Sonnet and Haiku.
Anthropic said that the most capable of these models, Opus, beat out other industry-leading AI programs, including OpenAI’s GPT-4 and Google’s Gemini 1.0 Ultra, in some of the most common tests given to determine an AI’s capabilities, such as undergraduate-level expert knowledge and graduate-level expert reasoning.
Sonnet and Haiku are the two smaller, less capable models in the Claude 3 family. Opus and Sonnet are now available in 159 countries, while Haiku has not yet been released.
Anthropic co-founder Daniela Amodei told CNBC that Claude 3 better understands how to navigate risk in responses than its predecessor Claude 2, which she said was sometimes overly restrictive in what kind of questions it would answer.
“In our quest to have a highly harmless model, Claude 2 would sometimes over-refuse,” Amodei said. “When somebody would kind of bump up against some of the spicier topics or the trust and safety guardrails, sometimes Claude 2 would trend a little bit conservative in responding to those questions.”
Anthropic, founded in 2021 by former OpenAI employees, has raised billions in venture capital funding — including from major tech players such as Amazon and Google — and become a leading competitor to the biggest AI technology companies vying to stand out in a rapidly growing field.
Unlike previous iterations, Claude 3 lets users upload visual material such as photos, charts and technical diagrams for the models to analyze. The models will not, however, be able to generate images.
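The article does not go into developer details, but for readers curious what "uploading an image for analysis" looks like in practice, the short Python sketch below sends a chart image alongside a text question through Anthropic's publicly documented Messages API. The file name, prompt and model string are illustrative assumptions, not details taken from the announcement.

```python
# Minimal sketch: asking a Claude 3 model to analyze an uploaded chart image
# via Anthropic's Messages API (Python SDK). File name and prompt are illustrative.
import base64

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Read a local chart image and base64-encode it, as the API expects.
with open("quarterly_revenue_chart.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-opus-20240229",  # assumed Opus model identifier
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "What trend does this chart show?"},
            ],
        }
    ],
)

print(message.content[0].text)
```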
“All Claude 3 models show increased capabilities in analysis and forecasting, nuanced content creation, code generation, and conversing in non-English languages like Spanish, Japanese, and French,” Anthropic wrote in a news release.
While Opus exhibits far more intelligence than Claude 2.1, it delivers results at a similar speed, the company wrote. Sonnet, meanwhile, is twice as fast as Claude 2.1 and more intelligent, though not as capable as Opus.
The company said the Claude 3 models will also be able to offer citations so users can verify the accuracy of their answers, and it touted the models’ increased accuracy and improved context recall.
But in a technical whitepaper, Anthropic identified two of Claude 3’s key weaknesses: hallucinations, which tend to occur when the models misinterpret visual data such as what an image portrays, and a failure to acknowledge when an image is harmful.
And as the 2024 presidential election ramps up in a media ecosystem increasingly prone to the spread of misinformation, Anthropic wrote in the paper that the company is developing new policies around using its tools for political purposes, as well as new methods to evaluate how the models respond to “prompts aimed at election misinformation, bias, and other misuses.”
“Of course no model is perfect, and I think that’s a very important thing to say upfront,” Amodei told CNBC. “We’ve tried very diligently to make these models the intersection of as capable and as safe as possible. Of course there are going to be places where the model still makes something up from time to time.”