Multistability of Self-Attention Dynamics in Transformers
Dnr: NAISS 2025/22-1730
Type: NAISS Small Compute
Principal Investigator: Chenglong Li
Affiliation: Linköpings universitet
Start Date: 2025-12-12
End Date: 2026-07-01
Primary Classification: 20202: Control Engineering

Abstract

In machine learning, self-attention dynamics is a continuous-time, multiagent-like model of the attention mechanism of transformers. In this paper we show that this dynamics is related to a multiagent version of the Oja flow, a dynamical system that computes the principal eigenvector of a matrix, which for transformers is the value matrix. We classify the equilibria of the “single-head” self-attention system into four classes: consensus, bipartite consensus, clustering, and polygonal equilibria. Multiple asymptotically stable equilibria from the first three classes often coexist in the self-attention dynamics. Interestingly, equilibria from the first two classes are always aligned with the eigenvectors of the value matrix, often but not exclusively with the principal eigenvector.
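
As an illustration of the Oja flow mentioned above, the following minimal sketch integrates the classic single-vector form dx/dt = A x - (x^T A x) x with a forward-Euler scheme. It is not the multiagent version studied in the paper. The function name `oja_flow`, the step size, the number of steps, and the 2x2 symmetric matrix `A` are all assumptions chosen for this example only.

```python
def oja_flow(A, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the Oja flow dx/dt = A x - (x^T A x) x.

    A  : square symmetric matrix as a list of rows (illustrative assumption)
    x0 : initial vector, not orthogonal to the principal eigenvector
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Matrix-vector product A x
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Scalar x^T A x, which rescales x toward unit norm
        rayleigh = sum(x[i] * Ax[i] for i in range(n))
        # One explicit Euler step of the flow
        x = [x[i] + dt * (Ax[i] - rayleigh * x[i]) for i in range(n)]
    return x


# Hypothetical example matrix with eigenvalues 3 and 1
A = [[2.0, 1.0], [1.0, 2.0]]
x = oja_flow(A, [1.0, 0.0])
print(x)
```

For this choice of `A`, the principal eigenvector is (1, 1)/sqrt(2), and the iterate started at (1, 0) approaches it. Which eigenvector sign is reached depends on the initial condition.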