Google’s DataGemma is the first large-scale Gen AI with RAG – why it matters

September 17, 2024

The increasingly popular generative artificial intelligence technique known as retrieval-augmented generation — or RAG, for short — has been a pet project of enterprises, but now it’s coming to the AI main stage.

Google last week unveiled DataGemma, which is a combination of Google’s Gemma open-source large language models (LLMs) and its Data Commons project for publicly available data. DataGemma uses RAG approaches to fetch the data before giving an answer to a query prompt.

The premise is to ground generative AI, to prevent “hallucinations,” says Google, “by harnessing the knowledge of Data Commons to enhance LLM factuality and reasoning.”

While RAG is becoming a popular approach for enabling enterprises to ground LLMs in their proprietary corporate data, DataGemma's use of Data Commons represents the first implementation to date of RAG at the scale of cloud-based Gen AI.

Data Commons is an open-source development framework that lets one build publicly available databases. It also gathers actual data from institutions such as the United Nations that have made their data available to the public.

In connecting the two, Google notes, it is taking “two distinct approaches.”

The first approach is to use the publicly available statistical data of Data Commons to fact-check specific questions entered into the prompt, such as, “Has the use of renewables increased in the world?” Google’s Gemma will respond to the prompt with an assertion that cites particular stats. Google refers to this as “retrieval-interleaved generation,” or RIG.
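The RIG idea can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not Google's implementation: the `[STAT("query") -> "guess"]` marker format, the `interleave` function, and the `TOY_STATS` dictionary standing in for Data Commons are all hypothetical, and the numbers are invented placeholders rather than real statistics.

```python
import re

# Hypothetical stand-in for a statistics store such as Data Commons.
# The value is an invented placeholder, not a real figure.
TOY_STATS = {
    "renewables share of world electricity": "12.3%",
}

# Hypothetical marker the model is assumed to emit next to each statistic:
#   [STAT("natural-language query") -> "model's own guess"]
MARKER = re.compile(r'\[STAT\("([^"]+)"\)\s*->\s*"([^"]+)"\]')


def interleave(draft: str, stats: dict[str, str]) -> str:
    """Replace each marker with the retrieved value, keeping the guess visible."""

    def fix(match: re.Match) -> str:
        query, guess = match.group(1), match.group(2)
        retrieved = stats.get(query)
        if retrieved is None:
            # Nothing found in the store: keep the guess but flag it.
            return f"{guess} (unverified)"
        return f"{retrieved} (retrieved; model draft said {guess})"

    return MARKER.sub(fix, draft)


draft = (
    'Renewables supplied '
    '[STAT("renewables share of world electricity") -> "10%"] '
    'of world electricity.'
)
print(interleave(draft, TOY_STATS))
```

The point of the sketch is the ordering: the model writes its answer first, and each statistic in that answer is then checked against the store and replaced or flagged.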

In the second approach, full-on RAG is used to cite sources of the data, “and enable more comprehensive and informative outputs,” states Google. The Gemma AI model draws upon the “long-context window” of Google’s closed-source model, Gemini 1.5. The context window is the amount of input, measured in tokens (usually words or pieces of words), that the AI model can hold in temporary memory to act on.

Google advertises Gemini 1.5 at a context window of 128,000 tokens, though versions of it can juggle as many as a million tokens of input. Having a larger context window means that more data retrieved from Data Commons can be held in memory and perused by the model when preparing a response to the query prompt.

“DataGemma retrieves relevant contextual information from Data Commons before the model initiates response generation,” states Google, “thereby minimizing the risk of hallucinations and enhancing the accuracy of responses.”
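To make the retrieve-then-generate ordering concrete, here is a rough Python sketch of how retrieved records might be packed into a prompt under a token budget. Everything in it is an assumption for illustration: the function names are hypothetical, retrieval is a naive keyword overlap, tokens are approximated by whitespace-separated words, and the records are toy strings rather than Data Commons output.

```python
def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an approximation)."""
    return len(text.split())


def retrieve(question: str, records: list[str]) -> list[str]:
    """Rank records by how many words they share with the question."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(r.lower().split())), r) for r in records]
    scored = [(score, r) for score, r in scored if score > 0]
    # Stable sort: records with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]


def build_prompt(question: str, records: list[str], max_tokens: int) -> str:
    """Retrieve first, then add context until the token budget is used up."""
    header = "Answer using only the context below."
    tail = f"Question: {question}"
    used = count_tokens(header) + count_tokens(tail)
    context = []
    for record in retrieve(question, records):
        cost = count_tokens(record)
        if used + cost > max_tokens:
            break  # a larger context window would let more records in
        context.append(record)
        used += cost
    return "\n".join([header, *context, tail])


records = [
    "renewables output rose in region A",
    "coal output fell in region B",
    "unrelated note about cats",
]
question = "Has renewables output increased?"
small = build_prompt(question, records, max_tokens=20)
large = build_prompt(question, records, max_tokens=100)
print(small)
print("---")
print(large)
```

With the small budget only the best-matching record fits; with the larger one both relevant records reach the model, which is the practical benefit of a long context window described above.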

The research is still in development; you can dig into the details in the formal research paper by Google researcher Prashanth Radhakrishnan and colleagues.

Google says there’s more testing and development to be done before DataGemma is made available publicly in Gemma and Google’s closed-source model, Gemini.

Already, claims Google, the RIG and RAG approaches have led to improvements in the quality of output such that “users will experience fewer hallucinations for use cases across research, decision-making or simply satisfying curiosity.”

DataGemma is the latest example of how Google and other dominant AI firms are building out their offerings with things that go beyond LLMs.

OpenAI last week unveiled its project internally code-named “Strawberry” as two models that use a machine learning technique called “chain of thought,” where the AI model is directed to spell out in statements the factors that go into a particular prediction it is making.
