In this continuation project, we will extend our recent work on transformer models for symbolic mathematics in particle physics. Our previous NAISS project built on our paper showing that transformers can predict particle-physics Lagrangians from particle lists.
During the previous allocation, we made good progress, and the project developed in a more compute-efficient direction than expected. Instead of immediately retraining at large scale, we explored activation steering: guiding the model at inference time by modifying its internal activations. This worked well and reduced our GPU and CPU usage, since we could reuse existing trained models rather than repeatedly training new ones.
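To illustrate the idea of activation steering, the following is a minimal sketch on a toy two-layer network, not our actual transformer or pipeline. All names, weights, and inputs (`hidden`, `readout`, `forward`, `W1`, `W2`, the contrast sets, and the scale `alpha`) are hypothetical and chosen only for illustration. It assumes one common way to build a steering vector, namely the difference between mean hidden activations on two contrasting input sets, and adds that vector to the hidden layer at inference time without changing any weights.

```python
import math

def hidden(x, W1):
    """Hidden activations of a toy network: tanh(x @ W1), W1 given as rows."""
    return [math.tanh(sum(xi * w for xi, w in zip(x, col))) for col in zip(*W1)]

def readout(h, W2):
    """Linear output layer: h @ W2, W2 given as rows."""
    return [sum(hi * w for hi, w in zip(h, col)) for col in zip(*W2)]

def forward(x, W1, W2, steer=None, alpha=0.0):
    """Forward pass; optionally shift the hidden activations by alpha * steer."""
    h = hidden(x, W1)
    if steer is not None:
        h = [hi + alpha * si for hi, si in zip(h, steer)]
    return readout(h, W2)

def mean_activation(inputs, W1):
    """Mean hidden activation over a set of inputs."""
    acts = [hidden(x, W1) for x in inputs]
    return [sum(col) / len(acts) for col in zip(*acts)]

# Hypothetical fixed weights: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]]
W2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.2]]

# Hypothetical contrast sets standing in for "with" and "without" a property.
pos = [[1.0, 0.5], [0.8, 1.2]]
neg = [[-1.0, -0.5], [-0.7, -1.1]]

# Steering vector: difference of mean hidden activations between the two sets.
steer = [p - n for p, n in zip(mean_activation(pos, W1), mean_activation(neg, W1))]

x = [0.2, -0.4]
baseline = forward(x, W1, W2)
steered = forward(x, W1, W2, steer, alpha=2.0)
```

The weights are identical in both calls; only the hidden activations differ. This is why steering experiments can reuse an existing trained model instead of requiring a new training run.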
This also clarified the next step. The model clearly contains useful internal structure, but the present model and dataset are too limited for a clean study of theory space. We now need to train larger and more controlled models on expanded datasets to separate genuine physics structure from training artifacts.
The continuation will therefore focus on larger-scale training, activation-level analysis, and more realistic symbolic-generation tasks relevant for phenomenology. We will also extend the approach beyond particle-physics Lagrangians to broader symbolic mathematics and equation-generation tasks.
We expect higher NAISS usage in this cycle because the next stage requires new large training runs and larger CPU-generated datasets.