Knowledge Distillation and Efficient LLMs
Dnr: NAISS 2025/22-1727

Type: NAISS Small Compute

Principal Investigator: Xaver Raphael Davey

Affiliation: Kungliga Tekniska högskolan

Start Date: 2025-12-10

End Date: 2026-12-01

Primary Classification: 10210: Artificial Intelligence

Webpage:


Abstract

As Large Language Models (LLMs) continue to dominate the deep learning landscape, their deployment is increasingly hindered by prohibitive computational and memory costs. Knowledge Distillation (KD) offers a powerful remedy; however, prevailing paradigms are becoming computationally intractable at the billion-parameter scale. The KD ecosystem also lacks the standardized tooling required to implement more efficient approaches, forcing researchers to build brittle, ad-hoc pipelines to align tensors and manage intermediate supervision signals. This project bridges that gap by developing a novel, modular toolkit that supports granular, block-wise distillation out of the box. Unlike existing libraries that prioritize global, end-to-end logit matching, this toolkit targets the internal feature representations of deep networks, enabling precise architectural transformations. The requested computational resources will directly support the development of this software. The toolkit will in turn enable systematic study of more efficient LLM variants, obtained through compression, quantization, or architectural changes such as linearizing the self-attention mechanism. Access to high-performance GPU resources is critical for this phase of the research: validating the KD framework requires extensive benchmarking of teacher-student pairs at the billion-parameter scale, along with rigorous ablation studies to quantify the trade-offs among training regimes. This grant would enable a thorough evaluation of these compression paradigms, ultimately providing the community with a standardized methodology for creating efficient, accessible foundation models.
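
To make the block-wise approach concrete, below is a minimal sketch, in PyTorch, of the kind of per-block distillation objective the toolkit targets: each selected student block is projected into the teacher's hidden dimension and supervised directly, on top of conventional logit matching. All names here (BlockAligner, blockwise_kd_loss) and the specific loss choices (MSE for feature matching, temperature-scaled KL for logits) are illustrative assumptions, not the toolkit's actual API.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BlockAligner(nn.Module):
    """Hypothetical helper: projects a student block's hidden states
    into the teacher's hidden dimension so the two can be compared."""
    def __init__(self, d_student: int, d_teacher: int):
        super().__init__()
        self.proj = nn.Linear(d_student, d_teacher)

    def forward(self, h_student: torch.Tensor) -> torch.Tensor:
        return self.proj(h_student)

def blockwise_kd_loss(student_feats, teacher_feats, aligners,
                      student_logits, teacher_logits,
                      alpha: float = 0.5, tau: float = 2.0):
    """Combines per-block feature matching (MSE on aligned hidden
    states) with end-to-end logit matching (KL divergence)."""
    # Per-block intermediate supervision: align each selected student
    # block to its paired teacher block, then penalize the mismatch.
    feat_loss = sum(
        F.mse_loss(align(h_s), h_t.detach())
        for align, h_s, h_t in zip(aligners, student_feats, teacher_feats)
    ) / len(aligners)

    # Standard temperature-scaled logit distillation.
    kld = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits.detach() / tau, dim=-1),
        reduction="batchmean",
    ) * tau**2

    return alpha * feat_loss + (1 - alpha) * kld

Teacher tensors are detached so gradients flow only into the student and the aligners, and the tau**2 factor preserves gradient magnitudes under temperature scaling, following standard KD practice.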