SUPR
3D Perception for Autonomy
Dnr:

NAISS 2025/5-425

Type:

NAISS Medium Compute

Principal Investigator:

Patric Jensfelt

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2025-09-01

End Date:

2026-03-01

Primary Classification:

20201: Robotics and automation

Secondary Classification:

10207: Computer graphics and computer vision (System engineering aspects at 20208)

Tertiary Classification:

10210: Artificial Intelligence

Webpage:

Abstract

3D perception is a multidisciplinary research field dedicated to extracting spatial information from two-dimensional images or point clouds, enabling three-dimensional representations of the visual world. By combining computer vision with geometric reasoning, it facilitates applications ranging from object recognition and scene reconstruction to autonomous navigation. In autonomous driving, multi-modal sensor data, including lidar and camera, is often available. Fusing this multi-modal information into a reliable scene representation is crucial in traffic scenarios involving multiple agents, where the risk of collision is significant. Researchers are developing better data representations using neural networks to improve modeling and prediction in multi-agent scenarios. While this task is manageable with ground-truth (GT) labels, it becomes considerably harder in their absence. Furthermore, because sensor setups differ between self-driving datasets, models trained on one dataset cannot easily be deployed on another dataset or vehicle. Self-supervised representation learning can help models extract meaningful features without GT labels, and, in combination with contrastive and adversarial learning approaches, it can contribute to developing data-invariant models.
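As a minimal sketch of the contrastive self-supervised setting the abstract refers to (the function names and values below are illustrative, not part of the project), an InfoNCE-style loss scores an embedding of one view of a scene against an embedding of a second view of the same scene (the positive) and embeddings of other scenes (the negatives), with no GT labels involved:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # InfoNCE loss: softmax cross-entropy over similarities,
    # with the positive pair placed at index 0. Low loss means
    # the anchor is closer to its positive than to the negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))
```

In practice the embeddings would come from a neural encoder over lidar or camera data; here plain vectors suffice to show that a well-aligned positive yields a lower loss than a mismatched one.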