Information-Theoretic Modelling for Machine Learning

Dnr:

NAISS 2026/3-443

Type:

NAISS Medium

Principal Investigator:

Ragnar Thobaben

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2026-06-01

End Date:

2027-06-01

Primary Classification:

20205: Signal Processing

Secondary Classification:

10202: Information Systems (Social aspects at 50804)

Tertiary Classification:

10106: Probability Theory and Statistics (Statistics with medical aspects at 30118 and with social aspects at 50907)

Webpage:

Allocation

Arrhenius GPU at NAISS: 1500 GPU-h/month
Arrhenius Disk at NAISS: 750 GiB

Abstract

Across the different applications of machine learning (ML), including supervised, semi-supervised, and generative modelling, the perspective of inductive biases has become important for explaining learning. Inductive biases are induced, either implicitly or explicitly, by a practitioner’s choice of optimizer, architecture, or loss function. This project seeks to establish a rigorous theoretical framework for key ML design decisions and to use high-performance computing to test the importance of various inductive biases.
We have previously used well-developed mathematical tools from communication theory and related fields to design and analyze inductive biases. These tools have enabled both theoretical and empirical investigations of common inductive biases. In a previous NAISS project (2023/22-844), we induced an explicit inductive bias on the signal space representations: we characterized the optimal separation between signal space representations and provided multiple approaches to achieve near-optimal separation in practice [3]. In subsequent work (NAISS projects 2024/22-1101 and 2025/22-665), we scaled this approach to large-scale computer vision benchmarks [4].
In this project, we seek to explore two research directions. Firstly, we will run follow-up experiments on large-scale computer vision benchmarks to extend our previous results in supervised settings to generative modelling. Secondly, we aim to conduct a new exploration in the domain of generative modelling. Recently, generative adversarial networks (GANs) have been shown to be a competitive baseline in the generative modelling space [2]. Key to this resurgence is the choice of loss function, which differs from the loss function used in the original GAN models. We seek to theoretically analyze and verify the importance of the inductive bias induced by this change of loss function.
In addition, we seek to explore the literature on gradient flow methods, where the introduction of (de)-regularized loss functions has been shown to be a key inductive bias for this family of generative models [1].

[1] Zonghao Chen, Aratrika Mustafi, Pierre Glaser, Anna Korba, Arthur Gretton, and Bharath K Sriperumbudur. (De)-regularized Maximum Mean Discrepancy Gradient Flow. Journal of Machine Learning Research, 26(235):1–77, 2025.
[2] Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, and James Tompkin. The GAN is dead; long live the GAN! A Modern GAN Baseline. Advances in Neural Information Processing Systems, 37:44177–44215, 2024.
[3] Martin Lindström, Borja Rodríguez-Gálvez, Ragnar Thobaben, and Mikael Skoglund. A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry. In Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), volume 251 of Proceedings of Machine Learning Research, pages 78–91. PMLR, 2024.
[4] Martin Lindström, Ragnar Thobaben, and Mikael Skoglund. On the Importance of Separation and Labelling on the Hypersphere. Journal of Machine Learning Research, 2026. Under review.