Gauge Equivariant Convolutional Neural Networks

Dnr:

NAISS 2025/5-613

Type:

NAISS Medium Compute

Principal Investigator:

Daniel Persson

Affiliation:

Chalmers tekniska högskola

Start Date:

2025-11-01

End Date:

2026-11-01

Primary Classification:

10799: Other Natural Sciences

Webpage:

https://gapindnns.github.io/

Allocation

Alvis at C3SE: 2000 GPU-h/month

Abstract

Despite the overwhelming success of deep neural networks, we are still at a loss to explain exactly how deep learning works and why it works so well. What is the mathematical framework underlying deep learning? One promising direction is to consider symmetries as an underlying design principle for network architectures. This can be implemented by constructing deep neural networks on a group G that acts transitively on the input data. This is directly relevant, for instance, for spherical signals, where G is a rotation group. More generally, it is natural to ask how to train neural networks on "non-Euclidean data". Relevant applications include omnidirectional computer vision, biomedicine, and climate observations, to mention just a few situations where data is naturally "non-flat". Mathematically, this calls for developing a theory of deep learning on manifolds, or on even more exotic structures such as graphs or algebraic varieties.
The aim of this project is to use techniques and theorems from mathematics and physics to develop a general framework for efficiently applying convolutional neural networks (CNNs) to non-Euclidean data. The project aims to apply this formalism to concrete problems arising in autonomous driving, where such a framework is highly desirable. In particular, it will be applied to image recognition and object detection from fisheye cameras, as well as to interpolated point clouds from the lidars mounted on the self-driving vehicle. This part of the project will be pursued in collaboration with Zenseact.
Another direction of our research is to apply geometric deep learning (GDL) frameworks tailored to medical imaging modalities, with an emphasis on tasks that require understanding the geometry of the human body and of the acquisition process.
A first concrete application of this agenda is in ultrasound imaging, a modality that is highly operator-dependent and widely used for bedside diagnostics and point-of-care screening. Ultrasound image quality is strongly tied to probe placement and orientation, yet current AI tools mostly target fully automated measurement once a “good” view has been found. There is a notable gap for objective, geometry-aware image quality assessment that can serve as a training signal for novice operators or even for fully automated probe guidance systems. We propose to build an AI model that predicts a continuous “distance-to-perfect” score (e.g. 0–100 %) for an ultrasound frame or short video snippet, measuring how close the current view is to a standard, expert-accepted plane of a given organ. We will train and evaluate the model on a curated database of ultrasound videos that capture the trajectory from first contact to the final expert-accepted frame. We collaborate with Dr. Carl Hallgren at Sahlgrenska, who will provide the ultrasound database for the project.
This project is a continuation of the ongoing project (NAISS 2023/5-393), which has resulted in the following publications:
https://arxiv.org/abs/2105.13926
https://arxiv.org/abs/2105.05400
https://arxiv.org/abs/2202.03990
https://arxiv.org/abs/2307.07313
https://arxiv.org/abs/2406.06504
https://arxiv.org/abs/2502.15376
https://arxiv.org/abs/2505.17720