Comprehensive RAG Benchmark Aims to Advance Retrieval-Augmented Question Answering

Meta AI Introduces New Dataset and Evaluation to Push Limits of Knowledge-Grounded Language Models

Researchers at Meta AI have created a new benchmark called CRAG (Comprehensive Retrieval-Augmented Generation Benchmark) to spur advancements in retrieval-augmented question answering systems that combine large language models with external knowledge sources. 

The goal is to develop more reliable and trustworthy question answering capabilities that overcome hallucinations and knowledge gaps in today’s language models.

The CRAG benchmark consists of 4,409 question-answer pairs spanning finance, sports, music, movies, and general topics. 

It includes diverse question types such as comparisons, aggregations, multi-hop queries, and questions built on false premises. The dataset covers facts that change at different rates, from real-time to static, and entities of varying popularity, from head to long-tail.

Crucially, CRAG provides mock web search results and mock APIs to simulate retrieving information from the internet and from knowledge graphs. This makes it possible to benchmark the full pipeline of retrieval, synthesis, and generation required for knowledge-grounded question answering.
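The retrieve-then-generate pipeline described above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the `Page` record, the `MOCK_PAGES` list, the word-overlap ranking, and the `generate` placeholder are hypothetical and do not reflect CRAG's actual data format, mock APIs, or any real model.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one mock search result."""
    url: str
    text: str


# Invented example pages; not taken from the CRAG dataset.
MOCK_PAGES = [
    Page("https://example.com/a", "The film Inception was released in 2010."),
    Page("https://example.com/b", "Interstellar was released in 2014."),
]


def tokens(text):
    """Lowercased set of word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, pages, k=1):
    """Toy retrieval: rank pages by word overlap, drop pages with none."""
    q = tokens(question)
    scored = [(len(q & tokens(p.text)), p) for p in pages]
    ranked = sorted((s for s in scored if s[0] > 0),
                    key=lambda s: s[0], reverse=True)
    return [p for _, p in ranked[:k]]


def generate(question, context):
    """Placeholder for a language model call: echo the top passage or abstain."""
    if not context:
        return "I don't know"
    return context[0].text


def answer(question, pages):
    """Full pipeline: retrieval followed by generation."""
    return generate(question, retrieve(question, pages))
```

In a real system the `generate` step would call a language model with the retrieved passages as context. The abstain branch reflects the idea that declining to answer is preferable to producing an unsupported one.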

Evaluations highlighted major gaps in current systems. On their own, the most advanced language models achieved only 34% accuracy on CRAG, and adding retrieval augmentation in a straightforward way raised this to just 44%.

Even industry-leading retrieval-augmented systems answered only 63% of questions without hallucinations, struggling especially with dynamic, long-tail, and complex queries.
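To make the accuracy and hallucination figures concrete, here is a minimal sketch of how such rates could be tallied from per-question judgments. The three labels (`correct`, `incorrect`, `missing`) and the `summarize` function are assumptions for illustration only. The article does not describe CRAG's scoring procedure, and this sketch does not reproduce the reported numbers.

```python
from collections import Counter


def summarize(labels):
    """Turn per-question labels into rates.

    Assumed labels (hypothetical, for illustration):
      "correct"   - the answer matches the reference
      "incorrect" - the answer is wrong, i.e. a hallucination
      "missing"   - the system abstained
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n = len(labels)
    return {
        "accuracy": counts["correct"] / n,
        "hallucination_rate": counts["incorrect"] / n,
        "missing_rate": counts["missing"] / n,
    }
```

Tracking abstentions separately from wrong answers matters because a system that declines to answer and a system that hallucinates have the same accuracy but very different trustworthiness.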

“CRAG reveals the challenges in building fully trustworthy question answering systems that can reliably incorporate information from the real world,” said Xiao Yang, a research scientist at Meta AI and co-lead of the project. “We hope this benchmark spurs innovation and allows tracking progress toward this critical goal.”

The CRAG dataset formed the basis for the KDD Cup 2024 challenge hosted by Meta AI, attracting thousands of participants working to advance retrieval-augmented generation capabilities. The researchers plan to continue expanding and improving CRAG to push forward research in this area.

Gopika Raj

With a Master's degree in Journalism & Mass Communication, Gopika Raj infuses her technical writing with a distinctive flair. Intrigued by advancements in AI technology and its future prospects, her writing offers a fresh perspective in the tech domain, captivating readers along the way.