
The rise of RAGs: from traditional RAGs to GraphRAGs

RAG (Retrieval-Augmented Generation) is a promising technology that overcomes key constraints of LLMs, such as knowledge frozen at training time and the well-known issue of hallucinations.


Today, large language models (LLMs) have taken the world by storm: new models are released every week, constantly pushing back the boundaries to the delight of tech enthusiasts and ordinary users alike.

However, impressive as their performance is, these models are generally not trained continuously: their knowledge is limited to what they saw during training and, worse, they may also suffer from hallucinations.

Fortunately, a technology that overcomes these limitations of LLMs has recently become all the rage: Retrieval-Augmented Generation (RAG).

Architecture of a classic Retrieval Augmented Generation (RAG)

Traditional RAGs and their challenges

A RAG answers a user query based on a corpus of data.

It enhances the performance of generative AI models by feeding the context of the LLM (pretrained or fine-tuned) with “relevant” facts fetched from external sources. The retriever model queries a vector database containing embedded chunks and returns the embeddings closest to the user query, with respect to the selected distance metric (dot product, cosine similarity, Euclidean distance, etc.).
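The retrieval step described above can be sketched as follows. Here a toy bag-of-words vector stands in for a trained embedding model, so the mechanics of cosine-similarity retrieval stay visible; a real RAG would call an actual embedder:

```python
import re

import numpy as np

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def embed(text, vocab):
    # Toy bag-of-words vector; a real RAG would use a trained embedding model.
    vec = np.zeros(len(vocab))
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query, chunks, k=2):
    """Return the k chunks with the highest cosine similarity to the query."""
    vocab = {t: i for i, t in enumerate(sorted({t for c in chunks for t in tokenize(c)}))}
    q = embed(query, vocab)
    return sorted(chunks, key=lambda c: float(np.dot(q, embed(c, vocab))), reverse=True)[:k]

chunks = [
    "The Eiffel Tower is located in Paris.",
    "Photosynthesis converts sunlight into chemical energy.",
    "Paris is the capital of France.",
]
top = retrieve("Where is the Eiffel Tower?", chunks, k=1)
```

Since the vectors are normalised, the dot product here equals cosine similarity; swapping in another metric only changes the scoring function.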

While hallucination is the main weakness of LLMs, RAGs face important challenges of their own, such as:

  • The relevance of the retrieved documents, which can dramatically impact the overall performance of the RAG. As stated in the paper How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs’ internal prior, a weak LLM may deviate from its prior knowledge if the retrieved documents are irrelevant. Our experience with Lettria's real customer data shows that the majority of wrong answers are due to poorly retrieved documents, which raises the issue of how to find and evaluate a good retriever. That said, these findings may vary depending on the domain of the corpus. Note that a good reranker model can significantly improve the quality of the returned documents.
  • The number of retrieved documents and the order in which they are assembled, which can also significantly impact the final output of the LLM. As detailed in the paper Lost in the Middle: How Language Models Use Long Contexts, LLMs tend to prioritise relevant information when it is situated at the beginning or the end of a long context.
  • Chunk splitting at the very beginning of the pipeline and prompting at the very end, two other key levers for optimising the overall performance.
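To make the chunk-splitting issue concrete, here is a minimal sketch of fixed-size splitting with overlap. The sizes are illustrative, and production splitters usually respect sentence or section boundaries rather than raw character counts:

```python
def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows.

    The overlap reduces the risk of cutting a relevant fact in half
    exactly at a chunk boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Too-small chunks lose context, too-large chunks dilute the embedding, and both hurt the retriever downstream, which is why this step deserves tuning.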

GraphRAGs: structuring knowledge for better insights

An alternative to traditional RAGs involves utilising a knowledge graph to provide contextual information for the LLM, thereby populating the context with more relevant content. Indeed, relying on a vector search for semantically similar text can perform poorly when nothing in the query points towards the correct information.

The graph approach, known as GraphRAG, can unlock new possibilities for data analysis by connecting distant topics or by taking a broader view of the whole document collection. Incidentally, it is not surprising to see Microsoft Research harnessing the full capabilities of LLMs by focusing on GraphRAGs. To learn more about their project, you can visit this page.

In addition, GraphRAGs are less prone to hallucination than standalone LLMs because knowledge graphs offer more relevant, varied, engaging, coherent, and reliable data to the LLM, ultimately leading to the generation of accurate and factual text.
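As an illustration, graph-based context retrieval can be sketched as a neighbourhood expansion over a toy knowledge graph. The adjacency-list format and the entity names below are assumptions made for the example, not Lettria's actual representation:

```python
def graph_context(query, graph, hops=1):
    """Collect (subject, relation, object) facts around entities found in the query.

    graph: dict mapping an entity to a list of (relation, neighbour) edges.
    The returned triples would then be serialised into the LLM context.
    """
    frontier = {e for e in graph if e.lower() in query.lower()}
    facts = set()
    for _ in range(hops):
        next_frontier = set()
        for entity in frontier:
            for relation, neighbour in graph.get(entity, []):
                facts.add((entity, relation, neighbour))
                next_frontier.add(neighbour)
        frontier = next_frontier
    return sorted(facts)

graph = {
    "Eiffel Tower": [("located_in", "Paris")],
    "Paris": [("capital_of", "France")],
}
facts = graph_context("In which country is the Eiffel Tower?", graph, hops=2)
```

With two hops the fact about France is reached even though the query never mentions Paris — exactly the kind of connection between distant topics that a pure vector search can miss.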

At Lettria, we are also convinced of the benefits of this approach, as it echoes the original purpose of the company: structuring and extracting information from unstructured texts (leading to text-to-graph approaches) thanks to syntactic and morphological text analysis, part-of-speech analysis, disambiguation, the study of relations, etc.

Our recent work on GraphRAGs marks the beginning of an exciting journey full of technical challenges and opportunities. For the time being, we are developing a hybrid approach that takes advantage of the solid performance of classic RAGs on extractive questions as well as the promising level of comprehension that GraphRAGs offer, mainly because our work on GraphRAG is still under development.
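A hybrid setup of this kind can be sketched as a simple query router. The keyword cues below are a placeholder heuristic invented for the example; a real router could be a small classifier trained on labelled queries:

```python
MULTI_HOP_CUES = ("compare", "relationship", "between", "across", "overall", "summarise")

def route(query, vector_rag, graph_rag):
    """Send extractive questions to the classic RAG pipeline and broader,
    relational questions to the GraphRAG pipeline."""
    q = query.lower()
    return graph_rag(query) if any(cue in q for cue in MULTI_HOP_CUES) else vector_rag(query)
```

The design keeps both pipelines independent, so either one can be swapped out or retired as the GraphRAG side matures.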

Evaluation of a RAG model

However, before going any further, it is essential to emphasize that model evaluation is a necessary step in all areas of machine learning, whether it be standard classification models or more advanced models such as RAGs, for which literature and tutorials remain quite scarce to this day.

Several components in the pipeline of a RAG are worth evaluating. However, based on our experience at Lettria and given the significant impact of the retrieval component, we have decided to delve into the creation of dataset(s) to evaluate embedder models.

We will also see to what extent these datasets can be used to fine-tune embedding models.
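For reference, evaluating a retriever on such a dataset typically boils down to rank-based metrics like recall@k and mean reciprocal rank (MRR). Here is a minimal sketch; the (query, relevant chunk id) dataset format is an assumption made for the example:

```python
def evaluate_retriever(retrieve, dataset, k=3):
    """Compute recall@k and MRR over a dataset of (query, relevant_chunk_id) pairs.

    retrieve(query) must return a ranked list of chunk ids.
    """
    hits, rr_sum = 0, 0.0
    for query, relevant_id in dataset:
        ranked = retrieve(query)
        if relevant_id in ranked[:k]:
            hits += 1  # the relevant chunk made it into the top k
        if relevant_id in ranked:
            rr_sum += 1.0 / (ranked.index(relevant_id) + 1)
    n = len(dataset)
    return {"recall@k": hits / n, "mrr": rr_sum / n}
```

Because both metrics only need a ranked list of ids, the same harness can compare embedding models before and after fine-tuning on identical queries.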

If you are eager to learn more about our approach to designing evaluation datasets for embedding models, stay tuned!

