Healthy AI Lab Medium Compute
Dnr: NAISS 2025/5-568
Type: NAISS Medium Compute
Principal Investigator: Fredrik Johansson
Affiliation: Chalmers tekniska högskola
Start Date: 2025-11-01
End Date: 2026-11-01
Primary Classification: 10201: Computer Sciences

Allocation

Abstract

This project supports the computing needs of the Healthy AI Lab, led by Fredrik Johansson. Among other things, it merges several existing projects, as instructed by C3SE earlier in 2025. The group consistently hosts 7-12 members, depending on the number of active PhD students, MSc students, and research assistants. The project covers several research directions, all active during 2025-2026. Illustrative sketches of some of the techniques involved follow the abstract.

## Distributional shift & federated learning

Generalization under distributional shift is a central problem in machine learning. The Healthy AI Lab investigates many aspects of this problem, including (i) verifying theoretical advances that guarantee performance in realistic learning problems with distributional shift, (ii) studying the value of learning with side/auxiliary information for increasing model robustness, and (iii) developing algorithms for federated learning (see the first sketch below).

## Tabular representation learning & out-of-variable generalization

Classical supervised machine learning with pre-defined inputs and outputs relies heavily on the assumption that the system is trained on the same task it is intended to solve. In many cases, this paradigm is limiting, such as when (a) some variables are unavailable at training or test time (missing values) or (b) the task of interest changes after deployment. In this line of work, we study the use of large language models to exploit task and variable descriptions for generalization between tasks.

## Identifiable latent-variable models and representation learning

In this direction, we develop architectures for identifiable latent-variable models (LVMs) that can be used as representations for a variety of downstream tasks. Identifiable LVMs are a special case of models that are guaranteed to converge to a form consistent with the data-generating process under a given set of assumptions. Such guarantees allow users to reason about the quality that a pre-trained representation will bring to a new task (see the identifiability sketch below).

## Prediction models for tasks with missing values

Missing values are a pervasive threat to the performance of machine learning algorithms across a wide variety of applications and domains. The Healthy AI Lab develops methods especially suited to cope with such problems and to support robust decision-making when important features are missing at test time (see the missing-values sketch below).

## Image classification with text-based privileged information

In this project, we explore how the sample efficiency of image classifiers can be improved by incorporating privileged or auxiliary information during training that is not available at test time. For example, reports written about the images require a human to interpret and would therefore not be available when the image classifier is used, but they can be provided as part of the training set (see the distillation sketch below).
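To make the federated-learning direction concrete, here is a minimal sketch of federated averaging (FedAvg) on a synthetic linear-regression task. This is a generic textbook baseline, not the lab's own algorithm; the clients, model, and hyperparameters are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """Run a few epochs of least-squares SGD on one client's data."""
    w = w.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            grad = (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w

# Synthetic clients whose data share one underlying linear model.
w_true = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 3))
    y = X @ w_true + 0.1 * rng.normal(size=50)
    clients.append((X, y))

# Federated averaging: broadcast the global model, train locally on
# each client, then average the updates weighted by client size.
w_global = np.zeros(3)
for _ in range(10):
    updates = [local_sgd(w_global, X, y) for X, y in clients]
    sizes = np.array([len(X) for X, _ in clients], dtype=float)
    w_global = np.average(updates, axis=0, weights=sizes)

print("estimated:", np.round(w_global, 2), "true:", w_true)
```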
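For the identifiable-LVM direction, linear ICA is the textbook example of an identifiable latent-variable model: when the sources are non-Gaussian, they are recoverable up to permutation and scaling. The sketch below (assuming scikit-learn is available) illustrates that kind of guarantee; it is not one of the architectures the lab develops.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(3)

# Two non-Gaussian sources mixed linearly: a classic identifiable LVM.
n = 2000
S = np.column_stack([rng.laplace(size=n), rng.uniform(-1, 1, size=n)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])    # unknown mixing matrix
X = S @ A.T

# Under non-Gaussianity, ICA recovers the sources up to permutation
# and scaling -- the kind of guarantee identifiability provides.
S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)

# Cross-correlation between true and estimated sources: one entry per
# row should be close to +/-1 after matching components.
C = np.corrcoef(S.T, S_hat.T)[:2, 2:]
print(np.round(np.abs(C), 2))
```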
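For the missing-values direction, a common robust baseline is to zero-impute absent entries and append missingness indicators, so the model can adjust its predictions when a feature is unobserved rather than treating the imputed zero as a real value. The sketch below illustrates this generic encoding on synthetic data; it is not the lab's method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic regression data; roughly 30% of entries go missing
# (completely at random here, for simplicity).
n, d = 1000, 3
X = rng.normal(size=(n, d))
y = X @ np.array([2.0, -1.0, 1.5]) + 0.1 * rng.normal(size=n)
mask = rng.random((n, d)) < 0.3           # True where a value is missing
X_obs = np.where(mask, 0.0, X)            # zero-impute missing entries

# Augment zero-imputed inputs with missingness indicators and fit a
# linear model on the augmented representation.
Z = np.hstack([X_obs, mask.astype(float)])
w, *_ = np.linalg.lstsq(Z, y, rcond=None)

# At prediction time, apply the same encoding to an incomplete input.
x_new = np.array([0.5, np.nan, -1.0])
m_new = np.isnan(x_new)
z_new = np.concatenate([np.where(m_new, 0.0, x_new), m_new.astype(float)])
print("prediction with feature 2 missing:", z_new @ w)
```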
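For the privileged-information direction, one standard technique is generalized distillation: a teacher trained on the privileged features (standing in here for text reports) produces soft labels for a student that sees only the regular inputs. The sketch below is a minimal synthetic illustration of that idea, not necessarily the lab's approach.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic task: x plays the role of image features, x_priv the role
# of text reports that exist only for the training set.
n = 500
x = rng.normal(size=(n, 5))
x_priv = x @ rng.normal(size=(5, 2)) + 0.05 * rng.normal(size=(n, 2))
y = (x_priv @ np.array([1.0, -1.0]) > 0).astype(float)

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression (bias folded in)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

# Teacher sees the privileged features; its soft predictions become
# the targets for a student trained on the regular features only.
w_teacher = fit_logistic(x_priv, y)
soft = predict(x_priv, w_teacher)
w_student = fit_logistic(x, soft)         # distill soft labels
w_plain = fit_logistic(x, y)              # baseline on hard labels

acc = lambda w: ((predict(x, w) > 0.5) == y).mean()
print(f"student: {acc(w_student):.2f}  plain: {acc(w_plain):.2f}")
```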