These AI models reason better than their open-source peers – but still can’t rival humans

October 15, 2024

Can artificial intelligence (AI) solve the cognitive puzzles designed for human IQ tests? The results are mixed.

Researchers from the USC Viterbi School of Engineering’s Information Sciences Institute (ISI) investigated whether multi-modal large language models (MLLMs) can solve abstract visual tests usually reserved for humans.

Presented at the Conference on Language Modeling (COLM 2024) in Philadelphia last week, the research tested “the nonverbal abstract reasoning abilities of open-source and closed-source MLLMs” by seeing if image-processing models could go a step further and demonstrate reasoning skills when presented with visual puzzles.

“For example, if you see a yellow circle turning into a blue triangle, can the model apply the same pattern in a different scenario?” explained Kian Ahrabian, a research assistant on the project, according to Neuroscience News. This task requires the model to use visual perception and logical reasoning similar to how humans think, making it a more complex challenge.
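
To make the quoted example concrete, here is a minimal sketch of one way to read "apply the same pattern in a different scenario": note which attributes changed between two shapes, then carry those changes over to a new shape. The `Shape` class and the `infer_rule` and `apply_rule` helpers are hypothetical names invented for this illustration. They are not taken from the study, and real Raven's-style puzzles involve far richer rules than this toy version.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Shape:
    """A toy puzzle element with two attributes (an assumption for this sketch)."""
    color: str
    form: str


def infer_rule(before, after):
    """Record each attribute that changed as an (old value, new value) pair."""
    rule = {}
    for name in ("color", "form"):
        old, new = getattr(before, name), getattr(after, name)
        if old != new:
            rule[name] = (old, new)
    return rule


def apply_rule(rule, shape):
    """Apply each recorded change only where the shape has the old value."""
    changes = {
        name: new
        for name, (old, new) in rule.items()
        if getattr(shape, name) == old
    }
    return replace(shape, **changes)


# A yellow circle turns into a blue triangle...
rule = infer_rule(Shape("yellow", "circle"), Shape("blue", "triangle"))

# ...so under this reading, a yellow square keeps its form but changes color.
print(apply_rule(rule, Shape("yellow", "square")))
```

A person does this kind of transfer almost without thinking. The point of the study is that a model has to both perceive the attributes correctly in an image and work out the rule, and either step can fail.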

The researchers tested 24 different MLLMs on puzzles developed from Raven’s Progressive Matrices, a standard test of abstract reasoning, and the AI models didn’t exactly succeed.

“They were really bad. They couldn’t get anything out of it,” Ahrabian said. The models struggled both to understand the visuals and to interpret patterns.

However, the results varied. Overall, the study found that open-source models had more difficulty with visual reasoning puzzles than closed-source models like GPT-4V, though even those fell short of human cognitive abilities. The researchers were able to help some models perform better using a technique called Chain of Thought prompting, which guides the model step by step through the reasoning portion of the test.
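
As a rough illustration of what Chain of Thought prompting changes, the sketch below builds two text prompts for the same puzzle: one that asks for the answer outright and one that asks the model to reason through it first. The function names, the wording of the steps, and the sample puzzle are all assumptions made for this example. They are not the prompts the researchers used, and no model is actually called.

```python
from string import ascii_uppercase


def format_options(options):
    """Label answer choices as A, B, C, ... one per line."""
    return "\n".join(
        f"{label}. {text}" for label, text in zip(ascii_uppercase, options)
    )


def direct_prompt(puzzle, options):
    """Ask for the answer only, with no intermediate reasoning."""
    return (
        f"{puzzle}\n{format_options(options)}\n"
        "Reply with the letter of the correct option."
    )


def chain_of_thought_prompt(puzzle, options):
    """Ask the model to reason through the puzzle before answering."""
    return (
        f"{puzzle}\n{format_options(options)}\n"
        "Let's think step by step:\n"
        "1. Describe each panel in the grid.\n"
        "2. State the pattern across rows and columns.\n"
        "3. Apply that pattern to the missing panel.\n"
        "Then give the letter of the correct option."
    )


# Hypothetical puzzle description and answer choices.
puzzle = "A 3x3 grid of shapes has its bottom-right panel missing."
options = ["blue triangle", "yellow circle", "red square"]

print(chain_of_thought_prompt(puzzle, options))
```

The only difference between the two prompts is the added instruction to spell out intermediate steps before committing to an answer.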

Closed-source models are thought to perform better in tests like these because they are specially developed, trained on bigger datasets, and backed by private companies’ computing power. “Specifically, GPT-4V was relatively good at reasoning, but it’s far from perfect,” Ahrabian noted.

“We still have such a limited understanding of what new AI models can do, and until we understand these limitations, we can’t make AI better, safer, and more useful,” said Jay Pujara, a research associate professor and an author of the paper. “This paper helps fill in a missing piece of the story of where AI struggles.”

By pinpointing the weaknesses in AI models’ ability to reason, research like this can help direct efforts to strengthen those skills down the line, with the goal of achieving human-level logic. But don’t worry: For the time being, AI models’ reasoning is not comparable to human cognition.
