Midlevel representation feedforward reconstructions
Dnr:

NAISS 2026/3-512

Type:

NAISS Medium

Principal Investigator:

Magnus Oskarsson

Affiliation:

Lunds universitet

Start Date:

2026-08-15

End Date:

2027-03-01

Primary Classification:

10207: Computer graphics and computer vision (System engineering aspects at 20208)

Secondary Classification:

20208: Computer Vision and learning System (Computer Sciences aspects in 10207)

Tertiary Classification:

10210: Artificial Intelligence

Abstract

This project will develop efficient, robust, and interpretable methods for 3D scene understanding from visual data. It addresses fundamental problems in computer vision, including 3D reconstruction, Structure from Motion (SfM), and Simultaneous Localization and Mapping (SLAM), with the aim of bridging the gap between precise geometric estimation and higher-level scene understanding. The project focuses on learning mid-level 3D representations, that is, representations of local scene geometry that are not tied to semantic object categories but still support full 3D scene understanding. The central hypothesis is that such representations provide a principled intermediate level between sparse geometric features and high-level semantic representations, while remaining computationally efficient and interpretable. The project will investigate explicit and implicit models for mid-level 3D geometry, including learned feedforward reconstruction methods based on different geometric primitives, such as lines, curves, and cylinders. It will develop geometric multi-view constraints for bootstrapping the learning of 3D scene geometry, reducing the need for dense supervision. It will also develop datasets and benchmark protocols designed to improve robustness and generalization across scenes and conditions. Finally, the developed methods will be validated in realistic SLAM scenarios. The project brings together geometric modeling, optimization, and learning, and is expected to generate new methods for pose estimation, reconstruction, and scene understanding, with applications in robotics, localization, autonomous navigation, and augmented reality.