Geometry of Linear Regions for Deep ReLU Networks

NAISS 2023/5-476


NAISS Medium Compute

Principal Investigator:

Mårten Björkman


Kungliga Tekniska högskolan

Start Date:


End Date:


Primary Classification:

10207: Computer Vision and Robotics (Autonomous Systems)




Large deep networks achieve state-of-the-art performance on several classification tasks, while at the same time being able to fully memorize arbitrary labelings of the training data. Recently, state-of-the-art models have been observed to express smooth functions of their input data, and this regularity has been connected to a potential implicit regularization effect induced by the model architecture and stochastic gradient optimizers. In this project, we study how the geometry of learning is affected by model size in the overparameterized regime, and how it relates to test error, for deep neural networks trained in practice. Our goal is to understand how model size biases learning towards smooth interpolating functions of the training data that nevertheless generalize to unseen data. Having established a relationship between robust interpolation and generalization in supervised learning, we extend our study in this project to self-supervised learning.
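To make the central object of study concrete: a ReLU network partitions its input space into linear regions, one per distinct pattern of active units, and the density of these regions is one measure of the geometry discussed above. The following is a minimal illustrative sketch (not the project's actual code or network sizes) that counts the regions crossed along a line segment in input space for a small random two-hidden-layer ReLU network, using the fact that two inputs lie in the same region exactly when they induce the same activation pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random ReLU network with two hidden layers (sizes are arbitrary
# choices for illustration, not taken from the project).
W1 = rng.standard_normal((16, 2)); b1 = rng.standard_normal(16)
W2 = rng.standard_normal((16, 16)); b2 = rng.standard_normal(16)

def activation_pattern(x):
    """Binary on/off pattern of every ReLU unit at input x.

    Each distinct pattern corresponds to one linear region of the network.
    """
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0.0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

# Sample densely along a 1D segment through input space and count how
# many distinct linear regions the segment crosses.
ts = np.linspace(-3.0, 3.0, 2001)
patterns = {activation_pattern(np.array([t, 1.0])) for t in ts}
print(f"linear regions crossed along the segment: {len(patterns)}")
```

Region counts obtained this way, tracked over training and across model sizes, give an empirical handle on how "smooth" or finely partitioned the learned function is near the data.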