LLMs hallucinate: they generate incorrect, misleading, or nonsensical information. Some, like OpenAI CEO Sam Altman, see AI hallucinations as a form of creativity, and others believe they might even help make new scientific discoveries. In most cases where a correct response matters, however, hallucinations are a bug, not a feature.
So, what’s the way to reduce LLM hallucinations? Long-context? RAG? Fine-tuning?
Well, long-context LLMs are not foolproof, plain vector-search RAG often retrieves irrelevant or incomplete context, and fine-tuning comes with its own challenges and limitations.
Here are some advanced techniques that you can use to reduce LLM hallucinations.
Using Advanced Prompts
There’s a lot of debate on whether using better or more advanced prompts can solve LLM hallucinations.
While some believe that writing more detailed prompts does not help, others, like Google Brain co-founder Andrew Ng, see potential there.
Ng believes that the reasoning capability of GPT-4 and other advanced models makes them quite good at interpreting complex prompts with detailed instructions.
“With many-shot learning, developers can give dozens, even hundreds of examples in the prompt, and this works better than few-shot learning,” he wrote.
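For illustration, here is a minimal sketch of how such a many-shot prompt might be assembled, assuming a generic `llm()` completion helper and a list of labelled examples (both hypothetical placeholders, not any particular vendor's API):

```python
# Hypothetical helper: wraps whatever chat/completions client you use.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def build_many_shot_prompt(instructions, examples, query):
    """Assemble a many-shot prompt: detailed instructions followed by
    dozens (or hundreds) of worked examples, then the actual query."""
    shots = "\n\n".join(
        f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in examples
    )
    return f"{instructions}\n\n{shots}\n\nInput: {query}\nOutput:"

# Usage: the more (and more varied) the examples, the closer this gets
# to many-shot learning rather than few-shot.
prompt = build_many_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    [{"input": "Great product", "output": "positive"},
     {"input": "Terrible support", "output": "negative"}],  # ...plus many more
    "The battery died after a week",
)
# answer = llm(prompt)
```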
New tools are also being built to improve prompts, such as Anthropic's recently released 'Prompt Generator', which turns simple descriptions into advanced prompts optimised for LLMs.
Recently, Marc Andreessen also said that with the right prompting, we can unlock the latent super genius in AI models. “Prompting crafts in many different domains such that you’re kind of unlocking the latent super genius,” he added.
CoVe by Meta AI
Chain-of-Verification (CoVe) by Meta AI is another technique. This method reduces hallucination in LLMs by breaking down fact-checking into manageable steps, enhancing response accuracy, and aligning with human-driven fact-checking processes.
CoVe involves generating an initial response, planning verification questions, answering these questions independently, and producing a final verified response. This method significantly improves the accuracy of the model by systematically verifying and correcting its own outputs.
It enhances performance across various tasks, such as list-based questions, closed-book QA, and long-form text generation, by reducing hallucinations and increasing factual correctness.
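As a rough illustration of those four stages, here is a sketch of a CoVe-style loop. The prompts and the `llm()` helper are assumptions for illustration, not Meta AI's exact templates:

```python
# Hypothetical helper: wraps whatever chat/completions client you use.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def chain_of_verification(question: str) -> str:
    # 1. Draft an initial (possibly hallucinated) baseline answer.
    baseline = llm(f"Answer the question:\n{question}")

    # 2. Plan verification questions that probe individual facts in the draft.
    plan = llm(
        "List short fact-checking questions, one per line, that would verify "
        f"this answer.\nQuestion: {question}\nAnswer: {baseline}"
    )
    checks = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Answer each verification question independently, without showing the
    #    draft, so errors in the draft cannot leak into the checks.
    verifications = [(q, llm(f"Answer concisely: {q}")) for q in checks]

    # 4. Produce a final answer revised in light of the verification results.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in verifications)
    return llm(
        f"Original question: {question}\n"
        f"Draft answer: {baseline}\n"
        f"Verification Q&A:\n{evidence}\n"
        "Rewrite the draft so it is consistent with the verification answers."
    )
```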
Knowledge Graphs
RAG is no longer limited to vector-database matching; many advanced RAG techniques are being introduced that significantly improve retrieval.
One example is the integration of Knowledge Graphs (KGs) into RAG. By leveraging the structured, interlinked data in a KG, the reasoning capabilities of current RAG systems can be greatly enhanced.
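As a toy illustration of the idea, the sketch below retrieves triples from a tiny in-memory graph and grounds the prompt in them. The triples, the `llm()` helper, and the prompt wording are invented for illustration; a real system would query a graph store (for example via SPARQL or Cypher) instead:

```python
# Hypothetical helper: wraps whatever chat/completions client you use.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

# Toy knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("France", "currency", "Euro"),
    ("Paris", "population", "2.1 million"),
]

def retrieve_triples(entity: str):
    """Return every triple that mentions the entity, in either position."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]

def kg_rag_answer(question: str, entity: str) -> str:
    # Ground the prompt in structured facts pulled from the graph.
    facts = "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in retrieve_triples(entity))
    return llm(
        "Answer using only the facts below; say 'unknown' if they are "
        f"insufficient.\nFacts:\n{facts}\nQuestion: {question}"
    )
```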
RAPTOR
Another technique is RAPTOR, which addresses questions that span multiple documents by building higher levels of abstraction: document chunks are embedded, clustered, and summarised, and the process is repeated recursively to form a tree of summaries. It is particularly useful for answering queries that involve concepts drawn from multiple documents.
Methods like RAPTOR also pair well with long-context LLMs, because full documents or their high-level summaries can be placed in context without aggressive chunking.
This approach reduces hallucinations by grounding generation in retrieved material. When a query is received, retrieval runs over the tree, so both fine-grained passages and broader summaries can be pulled in as needed.
The retrieved text is then embedded into the model's context alongside the original query. By grounding the model's responses in pertinent source material, RAPTOR helps keep the generated content accurate and contextually appropriate.
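Below is a minimal sketch of the RAPTOR idea: recursively cluster and summarise chunks into a tree, then retrieve over every node. The `embed()` and `llm()` helpers are hypothetical placeholders for your own embedding and chat models, and the clustering setup is simplified relative to the paper:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical placeholders: plug in your embedding and chat models.
def embed(text: str) -> np.ndarray: raise NotImplementedError
def llm(prompt: str) -> str: raise NotImplementedError

def build_tree(chunks, levels=2, clusters_per_level=4):
    """RAPTOR-style tree: repeatedly cluster nodes and summarise each cluster,
    keeping every node (leaf chunks and summaries) for retrieval."""
    nodes = list(chunks)
    all_nodes = list(chunks)
    for _ in range(levels):
        if len(nodes) <= clusters_per_level:
            break
        vecs = np.stack([embed(n) for n in nodes])
        labels = KMeans(n_clusters=clusters_per_level, n_init=10).fit_predict(vecs)
        summaries = []
        for c in range(clusters_per_level):
            members = [n for n, l in zip(nodes, labels) if l == c]
            summaries.append(llm("Summarise:\n" + "\n---\n".join(members)))
        all_nodes.extend(summaries)
        nodes = summaries  # the next level clusters the summaries
    return all_nodes

def retrieve(all_nodes, query, k=3):
    """Collapsed-tree retrieval: rank every node (chunk or summary) by
    cosine similarity to the query and return the top k."""
    q = embed(query)
    def score(n):
        v = embed(n)
        return float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
    return sorted(all_nodes, key=score, reverse=True)[:k]
```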
Mitigating LLM Hallucinations via Conformal Abstention
The paper ‘Mitigating LLM Hallucinations via Conformal Abstention’ introduces a method to reduce hallucinations in LLMs by employing conformal prediction techniques to determine when the model should abstain from providing a response.
By using self-consistency to evaluate response similarity and leveraging conformal prediction for rigorous guarantees, the method ensures that the model only responds when confident in its accuracy.
This approach effectively bounds the hallucination rate while maintaining a balanced abstention rate, particularly benefiting tasks requiring long-form answers. It significantly improves the reliability of model outputs by avoiding incorrect or nonsensical responses.
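A rough sketch of the underlying idea follows. The agreement heuristic and threshold calibration here are simplified stand-ins for the paper's conformal procedure, and the `llm()` and `similar()` helpers are hypothetical:

```python
# Hypothetical placeholders: a model client and a response-similarity check
# (the latter could itself be an LLM judge).
def llm(prompt: str) -> str: raise NotImplementedError
def similar(a: str, b: str) -> bool: raise NotImplementedError

def self_consistency_score(question: str, k: int = 5) -> tuple[str, float]:
    """Sample k answers and score the first one by how many of the
    other samples agree with it (a simple self-consistency measure)."""
    samples = [llm(question) for _ in range(k)]
    agreement = sum(similar(samples[0], s) for s in samples[1:]) / (k - 1)
    return samples[0], agreement

def calibrate_threshold(calibration_set, alpha: float = 0.1) -> float:
    """Simplified split-conformal-style calibration: pick the smallest
    agreement threshold such that, on held-out questions with references,
    accepted answers are wrong at most an alpha fraction of the time."""
    scored = []
    for question, reference in calibration_set:
        answer, score = self_consistency_score(question)
        scored.append((score, similar(answer, reference)))
    for t in sorted({s for s, _ in scored}):
        accepted = [correct for s, correct in scored if s >= t]
        if accepted and (1 - sum(accepted) / len(accepted)) <= alpha:
            return t
    return 1.0  # abstain unless the model is fully self-consistent

def answer_or_abstain(question: str, threshold: float) -> str:
    answer, score = self_consistency_score(question)
    return answer if score >= threshold else "I don't know."
```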
Reducing Hallucination in Structured Outputs via RAG
Recently, ServiceNow reduced hallucinations in structured outputs through RAG, enhancing LLM performance and enabling out-of-domain generalisation while minimising resource usage.
The technique involves a RAG system, which retrieves relevant JSON objects from external knowledge bases before generating text. This ensures the generation process is grounded in accurate and relevant data.
By incorporating this pre-retrieval step, the model is less likely to produce incorrect or fabricated information, thereby reducing hallucinations. Additionally, this approach allows the use of smaller models without compromising performance, making it efficient and effective.
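The sketch below shows the general shape of such a pipeline, not ServiceNow's implementation: a small store of previously validated JSON examples is searched before generation, with `embed()` and `llm()` as hypothetical helpers and the workflow-style payloads invented for illustration:

```python
import json
import numpy as np

# Hypothetical placeholders: plug in your embedding and chat models.
def embed(text: str) -> np.ndarray: raise NotImplementedError
def llm(prompt: str) -> str: raise NotImplementedError

# Small store of previously validated JSON outputs keyed by their descriptions.
EXAMPLES = [
    {"description": "notify the on-call engineer",
     "json": {"action": "notify", "target": "on_call"}},
    {"description": "open a ticket for a failed backup",
     "json": {"action": "create_ticket", "category": "backup_failure"}},
]

def retrieve_examples(request: str, k: int = 2):
    """Rank stored JSON objects by similarity to the request, return top k."""
    q = embed(request)
    def score(ex):
        v = embed(ex["description"])
        return float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
    return sorted(EXAMPLES, key=score, reverse=True)[:k]

def generate_structured_output(request: str) -> dict:
    # Ground generation in retrieved, known-good JSON examples.
    shots = "\n".join(
        f"Request: {ex['description']}\nJSON: {json.dumps(ex['json'])}"
        for ex in retrieve_examples(request)
    )
    raw = llm(
        "Produce a JSON object for the request, reusing only fields that "
        f"appear in the retrieved examples.\n{shots}\nRequest: {request}\nJSON:"
    )
    return json.loads(raw)  # fails loudly if the model drifts from valid JSON
```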
All these methods, and more, can help reduce hallucinations and build more robust LLM systems.