Meta Llama 3 AI Models With 8B and 70B Parameters Launched, Said to Outperform Google’s Gemini 1.5 Pro

April 19, 2024

Meta introduced the next generation of its artificial intelligence (AI) models, Llama 3 8B and 70B, on Thursday. Short for Large Language Model Meta AI, Llama 3 comes with improved capabilities over its predecessor. The company also adopted new training methods to optimise the efficiency of the models. With Llama 2, the largest model had 70 billion parameters, but this time the company said its largest models will contain more than 400 billion parameters. Notably, a report last week said that Meta would unveil its smaller AI models in April and its larger models later in the summer.

Those interested in trying out the new AI models are in luck, as Meta is taking a community-first approach with Llama 3. The new foundation models will be open source, just like previous models. Meta stated in its blog post, “Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.”

The list includes all major cloud, hosting, and hardware platforms, which should make it easier for enthusiasts to get their hands on the AI models. Meta has also integrated Llama 3 into its own Meta AI assistant, which can be accessed via Facebook Messenger, Instagram, and WhatsApp in supported countries.

Turning to performance, the social media giant shared benchmark scores for both the pre-trained and instruct versions of Llama 3. For reference, the pre-trained models are the base models, whereas the instruct models have been fine-tuned to follow instructions, as in chat and assistant use. The pre-trained Llama 3 70B model outscored Google’s Gemini 1.0 Pro in the MMLU (79.5 vs 71.8), BIG-Bench Hard (81.3 vs 75.0), and DROP (79.7 vs 74.1) benchmarks, whereas the 70B Instruct model outscored the Gemini 1.5 Pro model in the MMLU, HumanEval, and GSM-8K benchmarks, based on data shared by the company.

Meta has opted for a decoder-only transformer architecture for the new AI models but has made several improvements over the predecessor. Llama 3 now uses a tokeniser with a vocabulary of 128K tokens, and the company has adopted grouped query attention (GQA) to improve inference efficiency. In GQA, groups of query heads share a single set of key and value heads, which shrinks the key-value cache that must be held in memory during generation and so makes inference cheaper. The social media giant has pre-trained the models on more than 15T tokens, which it claims to have sourced from publicly available data.
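To make the efficiency argument concrete, here is a minimal Python sketch of the head-sharing idea behind GQA and its effect on the size of the key-value cache. The function names and the configuration values (32 layers, 32 query heads, 8 key-value heads, a head dimension of 128, an 8,192-token context, 16-bit values) are illustrative assumptions and are not taken from Meta's announcement.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head that a given query head reads from."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must divide evenly into key/value groups")
    group_size = n_q_heads // n_kv_heads  # query heads sharing one key/value head
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Approximate key-value cache size for one sequence, in bytes."""
    # Factor of 2: every layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, chosen only for illustration.
N_LAYERS, N_Q_HEADS, N_KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 8, 128, 8192

# Standard multi-head attention keeps one key/value head per query head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps only N_KV_HEADS key/value heads.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha_bytes / 2**30:.1f} GiB")
print(f"GQA cache: {gqa_bytes / 2**30:.1f} GiB")
print(f"Reduction: {mha_bytes // gqa_bytes}x")
```

Under these assumed numbers, four query heads share each key-value head, so the cache is a quarter of the size it would be with one key-value head per query head. The saving scales with the ratio of query heads to key-value heads, whatever the actual model configuration is.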
