I intend to continue my investigations of retrieval-augmented language models and interpretability methods for NLP during the final year of my PhD studies.
Since many of my ongoing projects build on previous ones, and some of those previous projects are still under review, I need to keep earlier finished projects live (among them, a project requiring pre-computed Wikipedia indices of around 1 TB).
Apart from this, I am working with models such as LLaMA (https://github.com/meta-llama/llama) with up to 70B parameters, as well as knowledge graphs based on Wikidata, both of which require additional storage. I am also working with the Natural Questions dataset, which requires about 50 GB of storage.
Therefore, my request is to retain the same storage allocation size as in the previous period. Only a few months remain of my doctoral studies, and my current focus is on wrapping things up as efficiently as possible.