Generalization in Reinforcement Learning with Pretrained Neural Policies under Environment Variation

Dnr:

NAISS 2026/4-1156

Type:

NAISS Small

Principal Investigator:

Abbas Pasdar

Affiliation:

Linköpings universitet

Start Date:

2026-06-18

End Date:

2027-07-01

Primary Classification:

20202: Control Engineering

Allocation

Arrhenius Disk at NAISS: 5000 GiB
Arrhenius GPU at NAISS: 500 GPU-h/month

Abstract

Reinforcement learning has achieved strong results in simulated control, robotics, and sequential decision-making. However, current methods often remain sensitive to changes in environment dynamics, task parameters, reward structure, and observation distribution. This lack of generalization limits the use of reinforcement learning in real-world control and robotic systems, where operating conditions are rarely identical to those of the training environment.

This project studies generalization in reinforcement learning under environment variation and distribution shift. The central goal is to train, adapt, and evaluate neural policies that transfer across related control and robotic tasks rather than solving only a single fixed environment. The project will investigate standard reinforcement learning algorithms together with different policy architectures, including multilayer perceptrons, convolutional policies, transformer-based policies, and pretrained neural policies where appropriate.

We will use simulated environments with randomized dynamics, task parameters, initial conditions, and observation settings. Policies will be trained on families of environments and evaluated on unseen variations to study robustness, sample efficiency, and transfer performance. The experiments will include repeated runs over several random seeds, comparisons between policy architectures, and limited adaptation of pretrained models using parameter-efficient methods.
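The evaluation protocol described above (train on a family of environments, test on unseen variations, repeat over several seeds) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of the project description: the 1-D point-mass dynamics, the parameter ranges, the names `EnvParams`, `sample_params`, `rollout`, and `evaluate`, and the fixed PD controller that stands in for a trained neural policy.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvParams:
    """Hypothetical dynamics parameters of a 1-D point mass."""
    mass: float
    friction: float


def sample_params(rng, mass_range, friction_range):
    """Draw one environment variation uniformly from the given ranges."""
    return EnvParams(rng.uniform(*mass_range), rng.uniform(*friction_range))


def rollout(policy, params, steps=200, dt=0.05):
    """Episode return for driving the mass from x=1 towards the origin."""
    x, v, total = 1.0, 0.0, 0.0
    for _ in range(steps):
        force = policy(x, v)
        accel = (force - params.friction * v) / params.mass
        v += accel * dt
        x += v * dt
        total -= x * x * dt  # quadratic tracking cost used as negative reward
    return total


def pd_policy(x, v, kp=4.0, kd=2.0):
    """Stand-in for a trained policy (assumption): a fixed PD controller."""
    return -kp * x - kd * v


def evaluate(policy, mass_range, friction_range, seeds, episodes=20):
    """Mean return per seed, then mean and standard deviation across seeds."""
    per_seed = []
    for seed in seeds:
        rng = random.Random(seed)  # one independent generator per seed
        returns = [
            rollout(policy, sample_params(rng, mass_range, friction_range))
            for _ in range(episodes)
        ]
        per_seed.append(statistics.mean(returns))
    return statistics.mean(per_seed), statistics.stdev(per_seed)


# Hypothetical parameter ranges: the unseen family lies outside the training one.
TRAIN = {"mass_range": (0.8, 1.2), "friction_range": (0.1, 0.3)}
UNSEEN = {"mass_range": (1.5, 2.5), "friction_range": (0.4, 0.8)}
SEEDS = [0, 1, 2, 3, 4]

train_mean, train_std = evaluate(pd_policy, seeds=SEEDS, **TRAIN)
unseen_mean, unseen_std = evaluate(pd_policy, seeds=SEEDS, **UNSEEN)
gap = train_mean - unseen_mean  # positive when the policy degrades off-distribution

print(f"train  return: {train_mean:.3f} +/- {train_std:.3f}")
print(f"unseen return: {unseen_mean:.3f} +/- {unseen_std:.3f}")
print(f"generalization gap: {gap:.3f}")
```

In an actual study the PD controller would be replaced by each policy architecture under comparison, and the gap between training-family and unseen-family returns, reported with its spread across seeds, would serve as one possible robustness measure.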