Refining Transformers for Structured Equation Generation in Physics

Dnr: NAISS 2026/4-1000

Type: NAISS Small

Principal Investigator: Eliel Camargo-Molina

Affiliation: Uppsala universitet

Start Date: 2026-05-27

End Date: 2027-06-01

Primary Classification: 10301: Subatomic Physics

Webpage: https://huggingface.co/JoseEliel/BART-Lagrangian

Allocation

Arrhenius Disk at NAISS: 1000 GiB
Arrhenius GPU at NAISS: 200 GPU-h/month
Arrhenius CPU at NAISS: 1 x 1000 core-h/month

Abstract

This continuation project extends our recent work on transformer models for symbolic mathematics in particle physics. Our previous NAISS project built on our paper showing that transformers can predict particle-physics Lagrangians from lists of particles. During that allocation we made good progress, but the project developed in a more efficient direction than expected: instead of starting large-scale retraining immediately, we explored activation steering, which guides the model at inference time by modifying its internal activations. This worked well and reduced our GPU and CPU usage, because we could reuse existing trained models rather than repeatedly training new ones.

The results also clarified the next step. The model clearly contains useful internal structure, but the present model and dataset are too limited for a clean study of theory space. We now need to train larger and more controlled models on expanded datasets in order to separate genuine physics structure from training artifacts.

The continuation will therefore focus on larger-scale training, activation-level analysis, and more realistic symbolic-generation tasks relevant for phenomenology. We will also extend the approach beyond particle-physics Lagrangians to broader symbolic mathematics and equation-generation tasks. We expect higher NAISS usage in this cycle because the next stage requires new large training runs and larger CPU-generated datasets.
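
As a rough illustration of what activation steering means, the sketch below applies it to a toy one-layer network in plain Python. It is not the project's code or model: the weights, the inputs, and every function name (`hidden`, `steering_vector`, `steer`, `readout`) are hypothetical, and the difference-of-means construction of the steering vector is assumed here as one common variant of the technique. The point is only that the model's weights stay fixed while its hidden activations are shifted at inference time.

```python
import math

def hidden(x, W):
    """Toy 'layer': tanh of a matrix-vector product (hypothetical stand-in for a model layer)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def steering_vector(pos_inputs, neg_inputs, W):
    """Difference of mean hidden activations between two contrasting input sets."""
    pos = mean_vector([hidden(x, W) for x in pos_inputs])
    neg = mean_vector([hidden(x, W) for x in neg_inputs])
    return [p - n for p, n in zip(pos, neg)]

def steer(h, v, alpha):
    """Shift activations h along direction v with strength alpha; weights are untouched."""
    return [hi + alpha * vi for hi, vi in zip(h, v)]

def readout(h, u):
    """Projection of activations h onto direction u (dot product)."""
    return sum(hi * ui for hi, ui in zip(h, u))

# Fixed toy weights: 2 inputs -> 3 hidden units (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]

# Two contrasting sets of inputs standing in for prompts with and without a property.
pos_inputs = [[1.0, 0.0], [0.8, 0.1]]
neg_inputs = [[0.0, 1.0], [0.1, 0.8]]

v = steering_vector(pos_inputs, neg_inputs, W)
h = hidden([0.2, 0.2], W)            # activations for a new input
h_steered = steer(h, v, alpha=1.0)   # same weights, shifted activations
```

In a real transformer the same shift would be applied to the activations of a chosen layer during generation, and the strength `alpha` would need to be tuned; both choices are left open here.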