Fine-tuning large language models (LLMs) poses a growing computational and storage challenge, as these models now contain billions of parameters. Low-rank adaptation (LoRA) is a recent framework for parameter-efficient fine-tuning that addresses this bottleneck by constraining task-specific weight updates to be low rank. Concretely, for a given weight matrix W ∈ ℝ^{d×k}, the update is parameterized as ΔW = BA, where B ∈ ℝ^{d×r} and A ∈ ℝ^{r×k} with r ≪ min(d, k); only A and B are trained, while the pretrained weights remain frozen. This reparameterization substantially reduces memory and compute requirements while maintaining performance competitive with full fine-tuning.
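To make the parameterization concrete, the following is a minimal sketch of a LoRA-adapted linear layer in PyTorch. The rank r, the scaling factor alpha, and the zero/Gaussian initialization of B and A follow the common LoRA convention; they are illustrative choices, not prescribed by this project description.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update ΔW = BA."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pretrained weights; only A and B are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        # Conventional LoRA init: A gets a small random init, B starts at
        # zero so that ΔW = BA is zero at the start of fine-tuning.
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank update: x Wᵀ + scale · x (BA)ᵀ.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

With r ≪ min(d_in, d_out), the trainable parameter count drops from d_out · d_in to r · (d_out + d_in), which is the source of LoRA's memory and compute savings.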
Despite strong empirical results, the theoretical foundations of LoRA remain limited. In particular, a key hyperparameter is the rank r, which directly controls the number of trainable parameters and is often selected using ad hoc rules of thumb. The aim of this project is to derive a theoretically grounded, data-driven rule for choosing the rank, together with a principled initialization strategy for the adaptation matrices A and B. To this end, we will investigate the properties of the neural tangent kernel (NTK) induced by the gradients of the pretrained model. We expect this procedure to reduce the performance gap between full fine-tuning and LoRA while further improving efficiency.
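To illustrate one possible instantiation of this direction, the sketch below estimates the empirical NTK of a pretrained model on a small sample of task data and reads a candidate rank off its eigenvalue spectrum. The per-example gradient construction, the scalar surrogate output, and the 90% spectral-energy threshold are all illustrative assumptions, not the rule this project aims to derive.

```python
import torch

def empirical_ntk(model: torch.nn.Module, xs: torch.Tensor) -> torch.Tensor:
    """Gram matrix K[i, j] = <∇_θ f(x_i), ∇_θ f(x_j)> over the sample xs."""
    grads = []
    for x in xs:
        model.zero_grad()
        # Scalar surrogate output per example (simplification for the sketch).
        out = model(x.unsqueeze(0)).sum()
        out.backward()
        g = torch.cat([p.grad.flatten() for p in model.parameters()
                       if p.grad is not None])
        grads.append(g)
    G = torch.stack(grads)   # shape (n, num_params)
    return G @ G.T           # (n, n) empirical NTK

def rank_from_spectrum(K: torch.Tensor, energy: float = 0.9) -> int:
    """Smallest r whose top-r eigenvalues capture `energy` of the trace."""
    eigvals = torch.linalg.eigvalsh(K).flip(0)          # descending order
    cumulative = torch.cumsum(eigvals, dim=0) / eigvals.sum()
    return int(torch.searchsorted(cumulative, torch.tensor(energy))) + 1
```

The underlying intuition is that a rapidly decaying NTK spectrum suggests the fine-tuning dynamics are effectively low-dimensional, so a small rank r should suffice; making this link rigorous, and deriving the corresponding initialization for A and B, is the subject of the project.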
Main supervisor: Alexandre Proutiere, KTH.