Semantic Communications for Collaborative Multi-agent Perception

Dnr:

NAISS 2026/3-254

Type:

NAISS Medium

Principal Investigator:

Carlo Fischione

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2026-04-29

End Date:

2027-05-01

Primary Classification:

20208: Computer Vision and Learning Systems (Computer Sciences aspects in 10207)

Secondary Classification:

20203: Communication Systems

Tertiary Classification:

10210: Artificial Intelligence

Webpage:

Allocation

Alvis at C3SE: 800 GPU-h/month
Mimer at C3SE: 500 GiB

Abstract

This project studies semantic (task-oriented) communication for collaborative perception in connected autonomous vehicles. We investigate a vehicle-to-infrastructure (V2I) setting in which multiple vehicles upload compact semantic representations to a roadside unit under a strict decision interval and a communication budget. The central research question is how to balance perception accuracy against communication cost when the shared information must be optimized for the final task rather than for raw signal fidelity. Our current formulation uses a two-round mechanism within a 100 ms decision interval. In the first round, vehicles transmit compact metadata and a coarse importance map to support communication decisions. In the second round, they transmit refined semantic features, with more bits allocated to task-relevant regions. On the infrastructure side, we study a multi-gate mixture-of-experts (MMoE) design with two complementary fusion experts: a dense expert for high-fidelity cooperative fusion and a sparse expert for communication-efficient fusion. Task-specific gates combine these experts for two tasks: important-region prediction and additional patch generation. The project will initially use the OPV2V dataset together with the OpenCOOD framework to emulate the agent-center architecture and to establish reproducible baselines. It will then extend to more realistic V2X settings, using datasets such as V2X-Sim and DAIR-V2X. The main experimental outputs will be communication-accuracy tradeoff curves, comparisons against no-communication and reduced-expert baselines, budget sweeps, and agent-number sweeps. The expected outcome is a scalable learning-based framework for collaborative perception under bandwidth constraints, together with empirical evidence on when sparse or dense communication is preferable under different resource budgets.
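
As an illustration only, the second-round step of the two-round mechanism (more bits for task-relevant regions, under a fixed communication budget) could be sketched as a proportional bit allocation. This is not the project's method: the function name `allocate_bits`, the integer-quantized importance scores, and the per-region cap `cap_bits` are all assumptions made for the example.

```python
from typing import List


def allocate_bits(importance: List[int], budget_bits: int, cap_bits: int) -> List[int]:
    """Split a bit budget across regions in proportion to importance.

    Hypothetical helper: `importance` holds non-negative integer scores,
    e.g. a quantized coarse importance map from the first round. Each
    region receives at most `cap_bits` bits in the second round.
    """
    if budget_bits < 0 or cap_bits < 0:
        raise ValueError("budget_bits and cap_bits must be non-negative")
    if any(w < 0 for w in importance):
        raise ValueError("importance scores must be non-negative")

    n = len(importance)
    total = sum(importance)
    if total == 0 or budget_bits == 0:
        return [0] * n

    # Proportional share per region (integer floor), clipped at the cap.
    bits = [min(cap_bits, budget_bits * w // total) for w in importance]

    # Hand any unspent bits to the most important regions first,
    # skipping regions that scored zero in the first round.
    leftover = budget_bits - sum(bits)
    for i in sorted(range(n), key=lambda k: importance[k], reverse=True):
        if leftover <= 0:
            break
        if importance[i] == 0:
            continue
        extra = min(cap_bits - bits[i], leftover)
        bits[i] += extra
        leftover -= extra
    return bits


# Four regions, 1000 bits in total, at most 500 bits per region.
example = allocate_bits([6, 3, 1, 0], budget_bits=1000, cap_bits=500)
# -> [500, 400, 100, 0]
```

Integer scores keep the arithmetic exact, so the allocation never exceeds the budget because of rounding. A learned allocation policy, as the abstract suggests, would replace the proportional rule while keeping the same budget and cap constraints.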