SUPR
Software testing using large language models
Dnr: NAISS 2025/22-880
Type: NAISS Small Compute
Principal Investigator: Muhammad Laiq
Affiliation: Mittuniversitetet
Start Date: 2025-06-11
End Date: 2026-07-01
Primary Classification: 10205: Software Engineering

Allocation

Abstract

In this project, we will investigate the application of large language models (LLMs) to software testing, focusing on the automatic generation of unit test cases. Our goal is to evaluate how effective LLMs are at understanding code, writing compilable test cases, and achieving adequate test coverage with minimal human input. We will conduct experiments using open-source LLMs, specifically models from the LLaMA, Mistral, and DeepSeek families, to assess their performance and limitations in this context. Additionally, we aim to explore the impact of different prompting strategies, such as basic prompting versus prompt aggregation, on the quality and reliability of the generated test cases. The outcomes of this work will help inform the design of more robust LLM-assisted tools for automated testing.
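As a point of reference for the kind of experiment described above, the sketch below (not part of the proposal itself) illustrates a basic-prompting setup: a single instruction asks an open-weight instruction-tuned model, here assumed to be Mistral-7B-Instruct loaded through the Hugging Face transformers library, to write a unit test for a small function, and the output is then checked for compilability. The model name, prompt wording, function under test, and decoding settings are illustrative assumptions, not choices made by the project.

    # Minimal sketch of basic prompting for unit-test generation.
    # Assumptions: Mistral-7B-Instruct as the example model, a toy
    # function under test, greedy decoding; none of these are fixed
    # by the project plan.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed example model

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

    function_under_test = """
    def add(a: int, b: int) -> int:
        return a + b
    """

    prompt = (
        "Write a Python unittest test class for the following function. "
        "Return only compilable Python code.\n\n" + function_under_test
    )

    # Basic prompting: one instruction, one completion.
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
    generated_test = tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

    # One cheap automatic check in such a pipeline: does the generated
    # test at least compile? (Coverage would be measured separately.)
    try:
        compile(generated_test, "<generated_test>", "exec")
        print("Generated test compiles.")
    except SyntaxError as err:
        print(f"Generated test does not compile: {err}")

Prompt aggregation, by contrast, would combine or vote over several such completions (for example from different prompt phrasings) before selecting a final test case; the sketch covers only the single-prompt baseline.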