Model-based Distribution Agents for Randomly Delayed Reinforcement Learning

Dnr:

NAISS 2025/22-675

Type:

NAISS Small Compute

Principal Investigator:

John Wikman

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2025-05-12

End Date:

2026-06-01

Primary Classification:

10210: Artificial Intelligence

Webpage:

Allocation

Alvis at C3SE: 1000 GPU-h/month
Mimer at C3SE: 500 GiB
Klemming at PDC: 500 GiB
Dardel-GPU at PDC: 200 GPU-h/month
Dardel at PDC: 1 x 1000 core-h/month

Abstract

We are developing a new kind of reinforcement learning algorithm in which agents adapt to random, unobservable delays. The agent learns a model of the system and uses it to compute distributions over possible future states, so that it can make more informed decisions. The models are represented by deep neural networks.
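
As a rough illustration of computing a distribution over states under an unobserved delay, the sketch below swaps the deep neural network model for a small tabular transition model. It is not the project's algorithm. The function name `current_state_distribution`, its arguments, and the assumption that the delay distribution is known are all hypothetical choices made for this example.

```python
import numpy as np


def current_state_distribution(transitions, observed_state, recent_actions, delay_probs):
    """Mix model rollouts over every possible delay (illustrative assumption).

    transitions:    array of shape (n_actions, n_states, n_states), where
                    transitions[a, s, s2] is the probability of s -> s2 under a.
    observed_state: index of the last state the agent observed.
    recent_actions: actions taken so far, oldest first.
    delay_probs:    delay_probs[d] is the assumed probability that the
                    observation is d steps old.
    """
    n_states = transitions.shape[1]
    max_delay = len(delay_probs) - 1
    if len(recent_actions) < max_delay:
        raise ValueError("need at least max_delay recent actions")

    belief = np.zeros(n_states)
    for delay, prob in enumerate(delay_probs):
        if prob == 0.0:
            continue
        # Start from a point mass on the observed state.
        rollout = np.zeros(n_states)
        rollout[observed_state] = 1.0
        # Apply the last `delay` actions through the model, oldest first.
        for action in recent_actions[len(recent_actions) - delay:]:
            rollout = rollout @ transitions[action]
        # Weight this rollout by the probability of that delay.
        belief += prob * rollout
    return belief


# Toy model with 3 states: action 0 stays put, action 1 moves to the next state.
stay = np.eye(3)
shift = np.roll(np.eye(3), 1, axis=1)
toy_transitions = np.stack([stay, shift])

toy_belief = current_state_distribution(
    toy_transitions, observed_state=0, recent_actions=[1, 1],
    delay_probs=[0.5, 0.25, 0.25],
)
```

Because the delay is not observed, the result is a mixture over delays rather than a single rollout. An agent could then choose actions against this mixture instead of against the stale observation.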