This project uses large-scale compute to automatically discover new reinforcement-learning (RL) paradigms that can outperform today's hand-designed algorithms. Instead of tuning the parameters of existing approaches, we evolve the underlying learning rules themselves, with a large language model proposing new algorithmic updates and GPU-accelerated simulations providing objective performance scores. The search requires running many RL training runs in parallel, each spanning millions of interaction steps, making HPC resources essential. The goal is to build a scalable system that can explore a vast algorithmic space, identify promising learning strategies, and deepen our understanding of what makes RL methods stable and effective across diverse tasks.
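The propose-evaluate-select loop described above can be sketched in miniature. This is a hypothetical illustration, not the project's actual system: a random parameter mutator stands in for the LLM proposer, and a toy two-armed bandit stands in for the GPU-accelerated RL trainings; all function names (`make_update_rule`, `evaluate`, `propose_variant`, `search`) are invented for this sketch.

```python
import random

def make_update_rule(step_size, optimism):
    """A candidate 'learning rule': incremental value update with an
    optimism bonus. In the real system, candidates would be full
    algorithmic updates proposed by an LLM."""
    def update(value, reward):
        return value + step_size * (reward + optimism - value)
    return update

def evaluate(rule, episodes=200, seed=0):
    """Score a candidate rule on a 2-armed bandit (a stand-in for a
    full RL training run). Returns average reward in [0, 1]."""
    rng = random.Random(seed)
    arm_means = [0.2, 0.8]
    values = [0.0, 0.0]
    total = 0.0
    for _ in range(episodes):
        arm = max(range(2), key=lambda a: values[a])  # greedy action choice
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        values[arm] = rule(values[arm], reward)
        total += reward
    return total / episodes

def propose_variant(params, rng):
    """Stand-in for the LLM proposer: perturb the rule's parameters."""
    step_size, optimism = params
    return (min(1.0, max(0.01, step_size + rng.uniform(-0.05, 0.05))),
            max(0.0, optimism + rng.uniform(-0.02, 0.02)))

def search(generations=20, seed=1):
    """Evolutionary outer loop: keep whichever candidate scores best.
    The real system would evaluate many candidates in parallel."""
    rng = random.Random(seed)
    best_params = (0.1, 0.0)
    best_score = evaluate(make_update_rule(*best_params))
    for _ in range(generations):
        cand = propose_variant(best_params, rng)
        score = evaluate(make_update_rule(*cand))
        if score > best_score:  # select: keep only improvements
            best_params, best_score = cand, score
    return best_params, best_score

if __name__ == "__main__":
    params, score = search()
    print(params, score)
```

Because each candidate evaluation is independent, the loop parallelizes naturally across nodes, which is why the scale of the search is bounded mainly by available compute.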