Equivariant Neural Differential Equations

Dnr:

NAISS 2023/22-869

Type:

NAISS Small Compute

Principal Investigator:

Fredrik Ohlsson

Affiliation:

Umeå universitet

Start Date:

2023-08-31

End Date:

2024-09-01

Primary Classification:

10799: Other Natural Sciences not elsewhere specified

Webpage:

Allocation

Mimer at C3SE: 500 GiB
Alvis at C3SE: 500 GPU-h/month

Abstract

Deep learning models, in particular deep convolutional neural networks (CNNs), have enjoyed tremendous success on an impressive number of complex problems. A key feature of neural networks is the compositional structure obtained by stacking multiple layers to create complex nonlinear functions. Indeed, even though rigorous results on mathematical properties are typically restricted to single-layer networks, empirical evidence suggests that depth, obtained by composing many layers, is crucial for performance. Unfortunately, with increasing depth CNNs quickly become computationally expensive, and the representations they extract become difficult to interpret. In addition, deep neural networks are prone to instability under perturbations of the input, and competitive performance usually requires heavy fine-tuning of the architecture and poorly understood problem-specific modifications. Neural differential equations (NDEs) constitute a recent development in deep learning, exploring the connection to differential equations and dynamical systems in the continuum limit of infinitely deep networks. Because the network is formulated in terms of continuous dynamics propagating information, powerful numerical techniques for differential equations and the extensive theory of dynamical systems can be brought to bear on the learning problem, in an attempt to alleviate several of the common issues encountered with discrete network architectures. However, a fundamental understanding and mathematical description of these models is still lacking, presenting interesting research problems spanning several areas of mathematics related to differential geometry and dynamical systems.
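
As an illustration of the continuum limit referred to above, the following is the commonly used residual-network form, not necessarily the formulation adopted in this project. The symbols x_k (hidden state at layer k), h (step size), f_θ (learned layer function), N (number of layers), T (total "time") and x_in (network input) are assumptions introduced here. With h = T/N and t_k = kh, a residual layer is an explicit Euler step, and letting N tend to infinity gives an ordinary differential equation for the hidden state:

```latex
\begin{align}
  x_{k+1} &= x_k + h\, f_\theta(x_k, t_k), \qquad k = 0, \dots, N-1, \\
  \frac{\mathrm{d}x}{\mathrm{d}t} &= f_\theta\bigl(x(t), t\bigr), \qquad x(0) = x_{\mathrm{in}}, \quad t \in [0, T].
\end{align}
```

In this picture the network output is the solution x(T), so the depth of the discrete network is replaced by the integration interval and the choice of numerical solver.
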
A promising approach to these open problems is to develop a more fundamental understanding of NDEs and their solutions by incorporating the geometry of the input data, and of the differential equations themselves, into the mathematical description. This approach is generally referred to as geometric deep learning and has not previously been systematically considered for NDEs. Making geometric structures of the data manifold manifest, using differential geometry and group theory to construct models that are equivariant (or invariant) under the action of a symmetry group, amounts to incorporating prior knowledge of the system to facilitate learning and maximize the amount of information extracted from finite data sets. The purpose of this project is to develop the mathematical foundations of the emerging field of neural differential equations by incorporating symmetries and investigating the properties of solutions. To this end, we will extend geometric deep learning to the continuum limit of neural networks by developing a manifestly geometric theory of equivariant NDEs, incorporating symmetry groups of the data manifold and the differential equations. Subsequently, we will extend the theory to partial NDEs and explore the connections of our equivariant NDEs to other equivariant models in geometric deep learning and to constrained physical systems. The theoretical models developed will be applied to concrete classification and segmentation problems for standard datasets in the ML/AI research domain (e.g. MNIST). Such practical applications are crucial both to guide the development of the mathematical framework and to demonstrate its viability and competitiveness in practice.
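
To make the notion of an equivariant NDE concrete, here is a minimal sketch of a toy example that is not taken from the project. It assumes the plane is identified with the complex numbers, so that a rotation is multiplication by a unit complex number, and it uses a hand-written vector field in place of a learned one. The names `vector_field`, `flow`, and the coefficients `a` and `b` are hypothetical. Because the field commutes with rotations, rotating the input and then integrating should agree with integrating and then rotating, up to floating-point rounding.

```python
import cmath
import math


def vector_field(z, a=-0.5, b=1.0):
    """Toy rotation-equivariant field on the plane (a stand-in for a learned f).

    Both terms are z multiplied by a factor that depends only on |z|,
    so vector_field(g * z) == g * vector_field(z) for any unit complex g.
    """
    r = abs(z)
    radial = a * math.tanh(r) * z          # pushes along the radial direction
    angular = b * 1j * z / (1.0 + r * r)   # 1j * z is z rotated by 90 degrees
    return radial + angular


def flow(z0, t_end=1.0, steps=100):
    """Integrate dz/dt = vector_field(z) from t = 0 to t_end with explicit Euler."""
    h = t_end / steps
    z = complex(z0)
    for _ in range(steps):
        z = z + h * vector_field(z)
    return z


z0 = complex(1.0, 0.5)        # arbitrary input point
g = cmath.exp(1j * 0.7)       # rotation by 0.7 radians

rotate_then_flow = flow(g * z0)
flow_then_rotate = g * flow(z0)

# The two results should differ only by rounding error.
equivariance_gap = abs(rotate_then_flow - flow_then_rotate)
print(equivariance_gap)
```

Here each Euler step is itself equivariant, so the property holds for the discretized flow and not only in the continuum limit. Replacing `vector_field` with a field that does not commute with rotations (for example, adding a constant drift) would generally make the gap clearly nonzero.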