SUPR
Investigating the Impact of Factual Editing in Language Models using Rank-One Model Editing (ROME)
Dnr: NAISS 2024/22-202
Type: NAISS Small Compute
Principal Investigator: Mehrdad Farahani
Affiliation: Chalmers tekniska högskola
Start Date: 2024-02-12
End Date: 2025-03-01
Primary Classification: 10208: Language Technology (Computational Linguistics)

Abstract

This study explores how factual knowledge is organized in language models, focusing on the Rank-One Model Editing (ROME) technique. Previous research has shown that specific factual associations in autoregressive transformer language models can be edited, but the broader effects of such edits on related facts remain unclear. Our research, a collaboration with KTH Royal Institute of Technology, investigates how altering one fact (e.g., changing "Eiffel Tower is located in Paris" to "Eiffel Tower is located in Berlin") affects the model's recall and consistency on related factual associations (e.g., "Louvre Museum is located in"). By analyzing the model's responses to a series of systematically edited prompts, the study seeks to determine how far a single factual edit propagates through the interconnected knowledge within the model. The findings are intended to provide insight into the associative learning mechanisms of language models and to contribute to the development of more reliable and contextually aware AI systems.
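
The comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: `ripple_report`, `toy_before`, and `toy_after` are hypothetical names, and the two lookup tables stand in for a language model before and after a ROME edit. Their contents are invented for illustration and are not measured results.

```python
from typing import Callable, Dict, List

# A "completer" maps a prompt to the model's continuation.
# Assumption: in a real study this would wrap a language model's generation call.
Completer = Callable[[str], str]


def ripple_report(before: Completer, after: Completer,
                  prompts: List[str]) -> Dict[str, object]:
    """Compare completions on related prompts before and after an edit.

    Returns the prompts whose completion changed and the fraction that changed.
    """
    changed = [p for p in prompts if before(p).strip() != after(p).strip()]
    rate = len(changed) / len(prompts) if prompts else 0.0
    return {"changed": changed, "rate": rate}


# Toy stand-ins for the unedited and edited model (hypothetical data,
# invented purely so the example runs; not an observed outcome).
_BASE = {
    "Eiffel Tower is located in": "Paris",
    "Louvre Museum is located in": "Paris",
    "Brandenburg Gate is located in": "Berlin",
}
_EDITED = {
    **_BASE,
    "Eiffel Tower is located in": "Berlin",   # the intended edit
    "Louvre Museum is located in": "Berlin",  # an imagined side effect
}


def toy_before(prompt: str) -> str:
    return _BASE[prompt]


def toy_after(prompt: str) -> str:
    return _EDITED[prompt]


# Related prompts that were NOT the target of the edit.
NEIGHBORS = [
    "Louvre Museum is located in",
    "Brandenburg Gate is located in",
]

report = ripple_report(toy_before, toy_after, NEIGHBORS)
print(report["changed"], report["rate"])
```

Keeping the edited prompt out of the neighbor list separates two questions: whether the edit itself took effect, and how many related facts shifted along with it.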