To address the challenges faced by Large Language Models (LLMs) in handling specialized and up-to-date information, Retrieval-Augmented Generation (RAG) systems have been developed. They combine a model's internal parametric knowledge with external non-parametric knowledge. By retrieving from an external corpus such as Wikipedia, this approach gives LLMs access to world knowledge, yet how these models respond to queries, such as factual questions, is still poorly understood in the NLP research community. How much of the generated output is grounded in the retrieved documents, and how much comes from the model's internal knowledge? A further open question arises because the retriever and the LLM are typically not trained jointly, which makes attributing their respective contributions to the output difficult.
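The retrieve-then-generate pattern described above can be illustrated with a minimal sketch. The toy corpus, the word-overlap retriever, and the template-based stand-in for the LLM are illustrative placeholders, not any specific system's components; a real RAG system would use a dense or sparse retriever and condition an actual LLM on the retrieved passages.

```python
# Minimal sketch of the retrieve-then-generate (RAG) pattern.
# The corpus, the overlap-based retriever, and the template "generator"
# are toy placeholders standing in for real components.

CORPUS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
]

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank passages by word overlap with the query (non-parametric knowledge)."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(query: str, passages: list[str]) -> str:
    """Stand-in for an LLM: conditions the answer on retrieved context."""
    context = " ".join(passages)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

query = "What is the capital of France?"
prompt = generate(query, retrieve(query, CORPUS))
```

Even in this toy setting, the attribution question from above is visible: once the prompt interleaves retrieved context with the query, nothing in the output marks which tokens were driven by the context and which by the model's parameters.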