Meta releases OpenEQA to test how AI understands the world, for home robots and smart glasses

An example of how Meta's OpenEQA could create embodied intelligence in the home. (Image: Meta)

Meta wants to help AI understand the world around it — and get smarter in the process. The company on Thursday unveiled Open-Vocabulary Embodied Question Answering (OpenEQA) to showcase how AI could understand the spaces around it. The open-source framework is designed to give AI agents sensory inputs that allow them to gather clues from their environment, "see" the space they're in, and otherwise provide value to humans who ask for assistance in abstract, everyday terms.

“Imagine an embodied AI agent that acts as the brain of a home robot or a stylish pair of smart glasses,” Meta explained. “Such an agent needs to leverage sensory modalities like vision to understand its surroundings and be capable of communicating in clear, everyday language to effectively assist people.”

Also: Meta unveils second-gen AI training and inference chip

Meta provided a host of examples of how OpenEQA could work in the wild, including asking AI agents where an item was last left or whether there's any food remaining in the pantry.

“Let’s say you’re getting ready to leave the house and can’t find your office badge. You could ask your smart glasses where you left it, and the agent might respond that the badge is on the dining table by leveraging its episodic memory,” Meta wrote. “Or if you were hungry on the way back home, you could ask your home robot if there’s any fruit left. Based on its active exploration of the environment, it might respond that there are ripe bananas in the fruit basket.”

It sounds like we're well on our way to an at-home robot or pair of smart glasses that could help run our lives. There's still a significant challenge in developing such a technology, however: Meta found that today's vision+language models (VLMs) fall woefully short. "In fact, for questions that require spatial understanding, today's VLMs are nearly 'blind': access to visual content provides no significant improvement over language-only models," Meta said.
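To make that "nearly blind" finding concrete, the comparison it implies can be sketched as follows: pose the same question to a model once with its episodic video frames and once without them, and check whether the visual context actually moves the score. The `VisionLanguageModel` interface and the scoring callback below are hypothetical placeholders for illustration, not OpenEQA's actual evaluation harness.

```python
from typing import Callable, Optional, Protocol, Sequence


class VisionLanguageModel(Protocol):
    """Hypothetical interface for a VLM agent; OpenEQA's real harness differs."""

    def answer(self, question: str, frames: Optional[Sequence[bytes]]) -> str:
        ...


def blind_vs_sighted(
    model: VisionLanguageModel,
    question: str,
    frames: Sequence[bytes],
    score: Callable[[str], float],  # compares a prediction to the human reference answer
) -> dict:
    """Answer the same question with and without visual context.

    If the two scores come out roughly equal, the model is effectively
    'blind': its visual episodic memory is not improving its answers.
    """
    with_vision = model.answer(question, frames)   # full sensory input
    language_only = model.answer(question, None)   # blind, language-only baseline
    return {
        "with_vision": score(with_vision),
        "language_only": score(language_only),
    }
```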

This is precisely why Meta made OpenEQA open source. The company says that an AI model that can truly "see" the world around it as humans do, recall where things were placed and when, and then give a human contextual answers to abstract queries is extremely difficult to build. The company believes a community of researchers, technologists, and experts will need to work together to make it a reality.

Also: Meta will add AI labels to Facebook, Instagram, and Threads

Meta says that OpenEQA contains more than 1,600 "non-templated" question-and-answer pairs meant to represent how a human would interact with an AI agent. Although the company has validated the pairs to ensure they can be answered correctly, more work needs to be done.

“As an example, for the question ‘I’m sitting on the living room couch watching TV. Which room is directly behind me?’, the models guess different rooms essentially at random without significantly benefitting from visual episodic memory that should provide an understanding of the space,” Meta wrote. “This suggests that additional improvement on both perception and reasoning fronts is needed before embodied AI agents powered by such models are ready for primetime.”
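For a sense of what evaluating an agent against such a benchmark might look like, here is a minimal sketch that loads a file of question-answer pairs and averages a judge's scores over an agent's free-form answers. The file layout, field names, and callback signatures are assumptions made for illustration, not OpenEQA's actual schema or tooling.

```python
import json
from statistics import mean
from typing import Callable, Dict, List


def load_benchmark(path: str) -> List[Dict]:
    """Load QA pairs; assumed schema: {"question", "answer", "episode"} per item."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def evaluate(
    qa_pairs: List[Dict],
    agent_answer: Callable[[str, str], str],  # (question, episode id) -> agent's answer
    judge: Callable[[str, str, str], float],  # (question, reference, prediction) -> score in [0, 1]
) -> float:
    """Average correctness over the benchmark.

    Because answers are free-form rather than multiple choice, the judge
    callback stands in for whatever open-vocabulary matching protocol the
    evaluator uses; it is deliberately left abstract here.
    """
    scores = [
        judge(
            item["question"],
            item["answer"],
            agent_answer(item["question"], item["episode"]),
        )
        for item in qa_pairs
    ]
    return mean(scores)
```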

So, it’s still early days. If OpenEQA shows anything, however, it’s that companies are working really hard to get us AI agents that can reshape how we live.
