NAISS
SUPR
NAISS Projects
Diversity and Calibration of Large Language Models in Supervised Fine-Tuning
Dnr:

NAISS 2025/22-1607

Type:

NAISS Small Compute

Principal Investigator:

Oleksandr Cherednichenko

Affiliation:

UmeƄ universitet

Start Date:

2025-11-19

End Date:

2026-12-01

Primary Classification:

10208: Natural Language Processing

Webpage:

Allocation

Abstract

Our goal is to address the degradation of both diversity and calibration during Supervised Fine-Tuning (SFT) of Large Language Models (LLMs). Several recent studies investigate this issue and propose alternative loss functions to preserve or increase diversity. However, we found that all of them fail to strike a reasonable trade-off between quality, diversity, and calibration. To bridge this gap, we propose a modification of the standard loss function: cross-entropy (CE) with a regularized entropy term. To validate our approach, we aim to fine-tune 8 different quantized LLMs on an instruction dataset and evaluate them on two novel diversity datasets. Finally, we will compare our loss function to the alternatives and report the differences in both diversity and calibration metrics (ECE-informed).
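The objective named in the abstract (CE plus a regularized entropy term) could be sketched as follows. This is a minimal single-token illustration, not the project's actual implementation: it assumes the regularizer is the predictive entropy subtracted with a hypothetical weight `beta` (so that `beta > 0` rewards flatter, better-calibrated distributions), since the abstract does not specify the exact form.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def ce_with_entropy_reg(logits, target, beta=0.1):
    # Cross-entropy for one token minus a weighted entropy bonus.
    # Subtracting entropy counteracts the overconfidence that plain
    # CE induces during SFT; `beta` is a hypothetical hyperparameter.
    probs = softmax(logits)
    ce = -math.log(probs[target])
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return ce - beta * entropy
```

With `beta = 0` the loss reduces to plain cross-entropy; increasing `beta` trades a small amount of likelihood for higher-entropy (more diverse, less overconfident) predictions.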