Google’s new Infini-attention technique lets you input infinite text into LLMs

April 15, 2024

Today’s large language models (LLMs) limit how much information you can input before they give you a result. Google has unveiled a way to change that: a method that allows LLMs to accept an infinite amount of text. The technique, called Infini-attention, does so without requiring additional memory and computational power, which could make LLMs both more efficient and more capable.

“An effective memory system is crucial not just for comprehending long contexts with LLMs, but also for reasoning, planning, continual adaptation for fresh knowledge, and even for learning how to learn,” the authors wrote in a research paper accompanying their announcement.

Context windows play a central role in how LLMs operate, and as of this writing, all popular AI models, including OpenAI’s GPT-4 and Anthropic’s Claude 3, have a finite context window. Claude 3, for example, allows for up to 200,000 tokens in a single query. Tokens are the units of text a model processes, typically whole words or pieces of words rather than individual characters. GPT-4’s context window allows for 128,000 tokens.

The context window matters a lot for LLMs. The more tokens allowable in the context window, the more data users can input to generate their desired result. LLM creators therefore try to increase the number of tokens with each new iteration to make their models more effective at learning, understanding, and delivering results.

To do so, however, tech companies need to account for memory and computing requirements. With every doubling of an LLM’s context window, the memory and computational requirements increase by a factor of four, the Google researchers wrote. Each increase in memory and computational power is not just resource intensive, but exceedingly expensive.
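The factor of four follows from quadratic scaling: if cost grows with the square of the context length, doubling the length multiplies cost by 2 × 2. A minimal sketch of that arithmetic is below. The function names are hypothetical, and the cost model is a simplification that counts only token-to-token comparisons, not a measurement of any real model.

```python
def attention_cost(context_tokens: int) -> int:
    """Relative cost when every token is compared with every other token.

    Simplifying assumption: cost is exactly the square of the context length.
    """
    return context_tokens ** 2


def cost_ratio(base_tokens: int, growth: int) -> float:
    """How many times the cost grows when the window grows `growth` times."""
    return attention_cost(base_tokens * growth) / attention_cost(base_tokens)


if __name__ == "__main__":
    # Doubling the window at several starting sizes always quadruples the cost.
    for window in (32_000, 64_000, 128_000):
        ratio = cost_ratio(window, 2)
        print(f"{window:>7} -> {2 * window:>7} tokens: cost x{ratio:.0f}")
```

Under the same assumption, growing a window eightfold would multiply the cost by 64, which is why simply enlarging the window gets expensive quickly.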

Google’s Infini-attention addresses this problem by working within a model’s existing memory and compute budget. When the researchers fed in more input than the context windows of the models they tested could hold, they moved all of the data up to the limit into what’s called “compressive memory” and removed it from active memory, freeing that space for the additional context. Once all of the data had been entered, the model combined the compressive memory with the input still in its active memory to deliver a response. This technique enables “a natural extension of existing LLMs to infinitely long contexts via continual pre-training and finetuning,” the researchers wrote.
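To make the idea of a compressive memory more concrete, here is a rough sketch of a fixed-size store that absorbs one segment of input at a time. The article does not spell out the update rule, so everything here is an assumption for illustration: the sketch uses a linear-attention-style associative memory with an ELU + 1 feature map, and the class and function names are hypothetical. It also leaves out the step that blends the memory readout with ordinary attention over the active segment.

```python
import math


def elu_plus_one(x: float) -> float:
    """Positive feature map (ELU + 1), so the normalizer stays above zero."""
    return x + 1.0 if x > 0 else math.exp(x)


class CompressiveMemory:
    """Fixed-size associative memory: a d_key x d_value matrix plus a normalizer."""

    def __init__(self, d_key: int, d_value: int) -> None:
        self.matrix = [[0.0] * d_value for _ in range(d_key)]
        self.norm = [0.0] * d_key

    def store(self, keys: list, values: list) -> None:
        """Fold one segment's key/value pairs into the memory; its size never changes."""
        for key, value in zip(keys, values):
            features = [elu_plus_one(k) for k in key]
            for i, f in enumerate(features):
                self.norm[i] += f
                for j, v in enumerate(value):
                    self.matrix[i][j] += f * v

    def retrieve(self, query: list) -> list:
        """Read a normalized, weighted blend of every value stored so far."""
        d_value = len(self.matrix[0])
        features = [elu_plus_one(q) for q in query]
        denom = sum(f * n for f, n in zip(features, self.norm))
        if denom == 0.0:  # nothing stored yet
            return [0.0] * d_value
        return [
            sum(f * row[j] for f, row in zip(features, self.matrix)) / denom
            for j in range(d_value)
        ]

    def size(self) -> int:
        """Number of floats held, independent of how much input was stored."""
        return len(self.matrix) * len(self.matrix[0]) + len(self.norm)


def process_stream(segments, d_key: int, d_value: int) -> CompressiveMemory:
    """Consume (keys, values) segments one by one, keeping only the memory."""
    memory = CompressiveMemory(d_key, d_value)
    for keys, values in segments:
        memory.store(keys, values)  # the segment can now leave active memory
    return memory


if __name__ == "__main__":
    # 1,000 toy segments of one key/value pair each (made-up numbers).
    stream = (([[0.01 * i, -0.2, 0.3]], [[1.0, 2.0]]) for i in range(1000))
    memory = process_stream(stream, d_key=3, d_value=2)
    print("floats held:", memory.size())
    print("readout:", memory.retrieve([0.5, 0.5, -0.5]))
```

The point of the sketch is the storage pattern rather than the exact math: no matter how many segments pass through `store`, the memory holds the same number of values, which is what would let input length grow without a matching growth in memory.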

Armed with the ability to put as much context into their models as they wished, the researchers compared their Infini-attention technique against existing LLMs and found their option was superior. “Our approach can naturally scale to a million length regime of input sequences, while outperforming the baselines on long-context language modeling benchmark and book summarization tasks,” the researchers wrote.

The researchers didn’t share their data or proof that their method indeed performs better than existing models. It stands to reason, however, that if they can eliminate context window limitations, models equipped with this technique should outperform those with limits in place.

Google’s technique could pave the way for dramatic improvements in LLM performance, allowing companies to create new applications, generate additional insights, and more. For now, though, Infini-attention is purely research. It’s unclear whether the technique will make its way to broadly available LLMs.
