Real-Time Scene Graph Generation

Dnr:

NAISS 2026/4-349

Type:

NAISS Small

Principal Investigator:

Maëlic Neau

Affiliation:

Umeå universitet

Start Date:

2026-02-18

End Date:

2026-09-01

Primary Classification:

10210: Artificial Intelligence

Webpage:

Allocation

Alvis at C3SE: 500 GPU-h/month
Mimer at C3SE: 500 GiB

Abstract

Symbolic representations of scenes, also known as Scene Graphs, can be used in various downstream tasks, such as Visual Question Answering (VQA) or Image Captioning, to understand scene dynamics at a fine-grained level. Recently, Scene Graphs have also been adopted for reasoning in embodied agents and in Robotics. These new applications require real-time, low-resource approaches, which has sparked the new field of Real-Time Scene Graph Generation. In this work, we address this need by proposing a new model for Real-Time Scene Graph Generation based on the latest developments in transformer architectures (DINO, Deformable Transformers, Low-Rank Adapters, etc.). We aim to push the boundaries of relationship modeling while maintaining a low parameter count for an efficient trade-off between accuracy and latency. We will explore different methods such as hyperparameter tuning, fine-tuning, or low-rank adaptation of object detector backbones.