Storage for Instruction-Tuned Language Models
Dnr:

NAISS 2024/23-248

Type:

NAISS Small Storage

Principal Investigator:

Ehsan Doostmohammadi

Affiliation:

Linköpings universitet

Start Date:

2024-04-15

End Date:

2025-05-01

Primary Classification:

10208: Language Technology (Computational Linguistics)

Webpage:

Allocation

Abstract

Our research investigates the behavior of instruction-tuned language models, focusing on how perplexity varies across different training and testing language pairs. We have identified notable patterns, such as increased perplexity when LLaMA2 7b is trained on BactrianX and tested on various datasets, which correlates with language similarity rather than with the distribution of the training data. We further plan to examine downstream task performance across languages and model sizes, hypothesizing that certain languages are more vulnerable to forgetting and that this forgetting also affects lexically similar languages. To obtain comprehensive results, we require GPU resources for extensive model training and evaluation, particularly for LLaMA 13b and GPT-SW3, as well as for perplexity analysis on Wikipedia articles to assess forgetting of facts. This research will provide insights into how model size, language similarity, and instruction tuning affect language model behavior, aiding the development of more robust and efficient language models.
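To illustrate the kind of perplexity measurement the abstract refers to, the following is a minimal sketch, assuming evaluation through Hugging Face transformers with a sliding-window loss over held-out text. The checkpoint name, window length, and stride are illustrative assumptions, not details taken from the proposal.

```python
# Hypothetical sketch of per-text perplexity evaluation for a causal LM.
# Checkpoint, max_length, and stride are placeholder choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text, max_length=2048, stride=512):
    """Compute perplexity of `text` under a causal LM with a sliding window."""
    input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    nlls, n_scored, prev_end = [], 0, 0
    for begin in range(0, input_ids.size(1), stride):
        end = min(begin + max_length, input_ids.size(1))
        target_len = end - prev_end          # tokens scored in this window
        ids = input_ids[:, begin:end]
        targets = ids.clone()
        targets[:, :-target_len] = -100      # mask context tokens already scored
        with torch.no_grad():
            loss = model(ids, labels=targets).loss  # mean NLL over scored tokens
        nlls.append(loss * target_len)
        n_scored += target_len
        prev_end = end
        if end == input_ids.size(1):
            break
    return torch.exp(torch.stack(nlls).sum() / n_scored).item()

if __name__ == "__main__":
    name = "meta-llama/Llama-2-7b-hf"  # placeholder model name
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="auto"
    )
    print(perplexity(lm, tok, "Some Wikipedia article text ..."))
```

In practice, such a score would be computed per article (e.g., Wikipedia pages in different languages) before and after instruction tuning, so that increases in perplexity can be compared across language pairs.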