Fine-Tuning an LLM to Improve Code Quality

Dnr:

NAISS 2026/4-1030

Type:

NAISS Small

Principal Investigator:

Nadim Hagatulah

Affiliation:

Lunds universitet

Start Date:

2026-06-03

End Date:

2027-07-01

Primary Classification:

10201: Computer Sciences

Allocation

Arrhenius GPU at NAISS: 300 GPU-h/month
Arrhenius Disk at NAISS: 250 GiB

Abstract

This project aims to improve software maintainability by post-training an LLM for advanced code refactoring with Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). Models in the 7B to 30B parameter range currently struggle with structural code improvements and frequently default to superficial edits, such as variable renaming, rather than resolving underlying design flaws. Our objective is to post-train NVIDIA’s Nemotron-3 Nano to accurately identify poor code design and transform it structurally. Early benchmarking of the base model shows a 10% pass rate (tests still pass after refactoring) and an 8% effective refactoring rate (tests pass and code quality objectively increases). To drive this improvement, we will use the CodeScene CLI tool to generate a CodeHealth score (1–10). This objective metric will be used both to curate a distilled synthetic dataset for SFT and to define the reward signal for RL. Because full fine-tuning at this scale is highly resource-intensive, we will use LoRA adapters. Even so, hardware requirements remain significant: we estimate a need for at least 130 GB of VRAM for the SFT phase and 200 GB of VRAM for the RL phase. I (Nadim Hagatulah) am a doctoral student at Lund University. My supervisor is Markus Borg, an adjunct at Lund University.
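The two benchmark metrics and the RL reward signal described in the abstract could be computed roughly as in the sketch below. It is illustrative only and is not the project's implementation. The names `RefactoringOutcome`, `is_effective`, `reward`, and `summarize` are hypothetical, and the reward shaping (a fixed penalty for failing tests, otherwise the CodeHealth gain scaled by the 1-10 score range) is an assumption. The CodeHealth scores are assumed to have been obtained separately from the CodeScene CLI, which the sketch does not invoke.

```python
from dataclasses import dataclass
from typing import Dict, List

# CodeHealth is reported on a 1-10 scale, so the largest possible gain is 9.
MAX_HEALTH_DELTA = 9.0


@dataclass(frozen=True)
class RefactoringOutcome:
    """Result of one refactoring attempt (hypothetical record type)."""
    tests_passed: bool    # did the test suite still pass after refactoring?
    health_before: float  # CodeHealth of the original code (1-10)
    health_after: float   # CodeHealth of the refactored code (1-10)


def is_effective(outcome: RefactoringOutcome) -> bool:
    """Effective refactoring: tests pass AND the CodeHealth score increased."""
    return outcome.tests_passed and outcome.health_after > outcome.health_before


def reward(outcome: RefactoringOutcome) -> float:
    """Assumed reward shaping: -1.0 if tests fail, else the scaled health gain."""
    if not outcome.tests_passed:
        return -1.0
    delta = outcome.health_after - outcome.health_before
    # Clip to [-1, 1] in case a score falls outside the expected range.
    return max(-1.0, min(1.0, delta / MAX_HEALTH_DELTA))


def summarize(outcomes: List[RefactoringOutcome]) -> Dict[str, float]:
    """Compute the pass rate and the effective refactoring rate."""
    total = len(outcomes)
    if total == 0:
        return {"pass_rate": 0.0, "effective_rate": 0.0}
    passed = sum(1 for o in outcomes if o.tests_passed)
    effective = sum(1 for o in outcomes if is_effective(o))
    return {"pass_rate": passed / total, "effective_rate": effective / total}


if __name__ == "__main__":
    # Made-up example attempts, for illustration only.
    attempts = [
        RefactoringOutcome(True, 6.0, 8.0),   # passes and improves
        RefactoringOutcome(True, 6.0, 6.0),   # passes, no improvement
        RefactoringOutcome(False, 6.0, 9.0),  # improves but breaks tests
        RefactoringOutcome(False, 5.0, 5.0),  # breaks tests
    ]
    print(summarize(attempts))
    print([round(reward(a), 3) for a in attempts])
```

Under this shaping, a refactoring that breaks the tests is always scored below one that passes them, regardless of its CodeHealth gain, which mirrors the abstract's definition of an effective refactoring.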