Chatbot showdown: ChatGPT, Google Bard, and Bing Chat put to a real-world test

June 23, 2023

Chatbot AI on phone with coding in background — Olemedia/Getty Images

Ever since ChatGPT surged in popularity in November, the AI chatbot space has become saturated with ChatGPT alternatives. These chatbots vary in LLMs, pricing, UIs, internet access, and more, making it difficult to decide which to use.

To make comparing them easier, the Large Model Systems Organization (LMSYS Org), an open research organization founded by students and faculty from the University of California, Berkeley, created the Chatbot Arena.

The Chatbot Arena is a benchmark platform for LLMs where users can put two randomized models to the test by inserting a prompt and selecting the best answer without knowing which LLM is behind either answer.

After users pick a chatbot, they get to see which LLMs were used to generate the output.
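
The blind comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not LMSYS Org's actual code: the function names `start_battle` and `record_vote` are hypothetical, and the model list is simply a handful of names mentioned in this article.

```python
import random


def start_battle(models, rng=random):
    """Pick two distinct models at random and hide them behind neutral labels."""
    model_a, model_b = rng.sample(models, 2)
    return {"Model A": model_a, "Model B": model_b}


def record_vote(battle, winner_label):
    """Reveal which model was behind each label only after the vote is cast."""
    loser_label = "Model B" if winner_label == "Model A" else "Model A"
    return {"winner": battle[winner_label], "loser": battle[loser_label]}


# Illustrative model names only (assumption: any list of two or more works).
models = ["vicuna-7b", "gpt4all-13b-snoozy", "GPT-4", "Claude-v1"]

battle = start_battle(models, random.Random(0))
# The user sees only "Model A" and "Model B" while reading the answers.
result = record_vote(battle, "Model B")
print(result)
```

Keeping the names hidden until after the vote is what stops a user's opinion of a brand from influencing the choice.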

The results of the user ratings are used to rank the LLMs on a leaderboard based on the Elo rating system, a rating system widely used in chess, according to LMSYS Org.
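
For readers unfamiliar with Elo, here is a minimal sketch of how a single vote could move two ratings. It uses the textbook Elo formulas from chess. The K-factor of 32 and the starting rating of 1000 are assumptions chosen for illustration, and the leaderboard's real parameters and computation may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one head-to-head vote.

    score_a is 1.0 if A wins, 0.0 if B wins, and 0.5 for a tie.
    k (the K-factor) controls how far one result moves the ratings;
    32 is a common chess default and an assumption here.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two hypothetical models start level, and the user votes for model A.
a, b = update_elo(1000.0, 1000.0, score_a=1.0)
print(a, b)  # 1016.0 984.0
```

A win against a higher-rated model earns more points than a win against a lower-rated one, so a model climbs the leaderboard fastest by beating strong opponents.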

When trying the arena for myself, I used the prompt, “Can you write me an email telling my boss that I will be out because I am going on a vacation that was planned months ago.”

The two responses were very different, with one providing much more context, greater length, and fill-in-the-blank placeholders that were appropriate for the email.

Chatbot Arena — Screenshot by Sabrina Ortiz/ZDNET

After picking “Model B” as the winner, I found out it was “vicuna-7b,” an LLM created by LMSYS Org and based on Meta’s LLaMA model. The losing LLM was “gpt4all-13b-snoozy,” an LLM developed by Nomic AI and fine-tuned from LLaMA 13B.

Unsurprisingly, the leaderboard currently places GPT-4, OpenAI’s most advanced LLM, in first place with an Arena Elo rating of 1227. In second place with a rating of 1227 is Claude-v1, an LLM developed by Anthropic.

Leaderboard for best AI chatbots — LMSYS Org

GPT-4 is found in both Bing Chat and ChatGPT Plus, making those two chatbots the best available right now, which aligns with ZDNET’s own AI chatbot rankings.

Anthropic’s second-ranked Claude is not available to the public just yet, but it has a waitlist where users can sign up for early access.

Ranked number eight on the leaderboard is PaLM-Chat-Bison-001, a submodel of PaLM 2, the LLM behind Google Bard. This ranking parallels the general sentiment about Bard: not the worst, but not one of the best.

The Chatbot Arena site also has an option that lets you select the two models you want to compare. This feature could be helpful if you want to experiment with specific LLMs.