SUPR
CyberSec
Dnr: NAISS 2025/22-1115
Type: NAISS Small Compute
Principal Investigator: Srijita Basu
Affiliation: Göteborgs universitet
Start Date: 2025-08-20
End Date: 2026-04-01
Primary Classification: 10205: Software Engineering


Abstract

The increasing reliance on machine learning and large language models (LLMs) for automated code generation raises critical questions about their ability to produce secure, standards-compliant software in safety-critical domains such as automotive systems. In this study, we investigate the effectiveness of LLMs in vulnerability remediation and secure code synthesis. Using well-established datasets that pair vulnerable code with its corrected counterpart, we evaluate ten state-of-the-art LLMs on their ability to generate secure fixes when prompted with vulnerable inputs. Based on these results, the top-performing models are further tested in the context of automotive system-on-chip (SoC) development, specifically on functions such as Wi-Fi frame parsing in automotive SoC driver code. We examine the correctness and quality of the generated code under varying prompt conditions: general natural-language instructions, CWE-guided prompts, and MISRA C guideline-oriented prompts. Our findings provide comparative insights into the security awareness and reliability of LLMs, highlighting both their potential and their limitations in supporting vulnerability remediation and safety-critical code generation. This work contributes to the growing body of knowledge on integrating LLMs into secure automotive software development lifecycles, offering guidance on their suitability for different use cases and prompt strategies.
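To illustrate the kind of target function the study evaluates, the sketch below shows a hypothetical bounds-checked parser for a small subset of an 802.11 MAC header, written in the defensive, MISRA-leaning style (single exit point, explicit null and length checks, no implicit conversions) that the guideline-oriented prompts aim to elicit. The struct layout, field subset, and function name are illustrative assumptions for this sketch, not taken from the project's actual driver code or datasets.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of an 802.11 MAC header: frame control,
   duration/ID, and the first address field (receiver address). */
typedef struct {
    uint16_t frame_control;
    uint16_t duration;
    uint8_t  addr1[6];
} wifi_hdr_t;

/* Parse the first 10 bytes of a frame buffer into `out`.
   Returns 0 on success, -1 on any invalid input.
   MISRA-leaning style: single exit point, all inputs validated
   before use, little-endian fields assembled byte by byte. */
int parse_wifi_hdr(const uint8_t *buf, size_t len, wifi_hdr_t *out)
{
    int rc = -1;
    if ((buf != NULL) && (out != NULL) && (len >= 10U)) {
        out->frame_control = (uint16_t)((uint16_t)buf[0] |
                                        ((uint16_t)buf[1] << 8));
        out->duration      = (uint16_t)((uint16_t)buf[2] |
                                        ((uint16_t)buf[3] << 8));
        (void)memcpy(out->addr1, &buf[4], 6U);
        rc = 0;
    }
    return rc;
}
```

A prompt-conditioned model would be judged on whether it produces exactly this kind of explicit length check before the copy; omitting it yields the classic out-of-bounds read (CWE-125) that the vulnerable-input prompts probe for.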