We request a NAISS “Medium” Compute allocation on Alvis to train and evaluate graph‑neural‑network (GNN) methods for image super‑resolution (SR), object detection, and image classification at scale. On the graph side we use large OGBN datasets (e.g., ogbn‑products, ogbn‑papers100M) to develop graph backbones and message‑passing components; on the vision side we run large‑batch pretraining and fine‑tuning on ImageNet‑1k and comparable detection/SR corpora. Workloads are CUDA/NCCL‑based (PyTorch, DGL/PyG), containerized, and engineered for multi‑GPU A40/A100 execution with robust checkpointing; a representative training skeleton is sketched below. The requested 20,000 weighted GPU‑hours per month will sustain iterative training, hyperparameter sweeps, and ablations.
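To make the execution model concrete, the following is a minimal sketch, under the stated assumptions, of the DDP/NCCL training pattern with rank‑0 checkpointing that our jobs follow when launched via torchrun on A40/A100 nodes. The model, dataset, hyperparameters, and checkpoint filenames are placeholders, not project code.

```python
# Minimal sketch of the intended execution pattern: PyTorch DistributedDataParallel
# over NCCL with periodic rank-0 checkpointing, launched via torchrun.
# Model, data, and hyperparameters below are placeholders, not project code.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    dist.init_process_group(backend="nccl")            # NCCL for multi-GPU A40/A100
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model and data; real runs use ImageNet/OGBN pipelines.
    model = torch.nn.Linear(512, 1000).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    data = TensorDataset(torch.randn(4096, 512), torch.randint(0, 1000, (4096,)))
    sampler = DistributedSampler(data)
    loader = DataLoader(data, batch_size=256, sampler=sampler)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(10):
        sampler.set_epoch(epoch)                        # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        if dist.get_rank() == 0:                        # rank-0 checkpoint each epoch
            torch.save({"epoch": epoch,
                        "model": model.module.state_dict(),
                        "optimizer": opt.state_dict()},
                       f"ckpt_{epoch:04d}.pt")          # placeholder filename
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```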
Goals & scope (what we will do)
Unify GNNs with CV tasks. Use GNNs to encode region‑level relations (detection), refine feature graphs (classification), and enforce cross‑pixel/patch consistency (super‑resolution), comparing GNN blocks against pure CNN/Transformer baselines (a minimal example of such a graph block is sketched after this list).
Large‑scale training & transfer. Pretrain CV backbones on ImageNet; insert graph modules and measure gains on detection and SR. Train graph models on OGBN (ogbn‑products, ogbn‑papers100M) and transfer relational priors to vision tasks.
Thorough evaluation. Report Top‑1 accuracy, mAP, and PSNR/SSIM for the vision tasks; accuracy and AUROC for the graph benchmarks; and compute/energy usage alongside multi‑GPU scaling curves.
Reproducible artifacts. Release containers, configs, and checkpoints; document graph‑CV integration recipes.
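As a concrete illustration of the graph‑CV integration referenced above, the sketch below refines region‑level detection features with message passing in PyTorch Geometric. The RegionGraphBlock class, the fully‑connected region graph, and all dimensions are illustrative assumptions, not the finalized architecture; graph‑side experiments load OGBN benchmarks through the standard OGB loaders, as noted in the trailing comment.

```python
# Illustrative graph module: region-level features from a detector are refined by
# two rounds of message passing (PyTorch Geometric GCNConv) before the task head.
# Class name, graph construction, and dimensions are illustrative assumptions.
import torch
from torch import nn
from torch_geometric.nn import GCNConv

class RegionGraphBlock(nn.Module):
    """Refine N region embeddings with two rounds of message passing."""
    def __init__(self, dim: int):
        super().__init__()
        self.conv1 = GCNConv(dim, dim)
        self.conv2 = GCNConv(dim, dim)

    def forward(self, region_feats: torch.Tensor, edge_index: torch.Tensor):
        h = torch.relu(self.conv1(region_feats, edge_index))
        return region_feats + self.conv2(h, edge_index)   # residual refinement

def fully_connected_edges(num_regions: int) -> torch.Tensor:
    # Dense region-to-region graph; real experiments may use k-NN or learned sparsity.
    idx = torch.arange(num_regions)
    src, dst = torch.meshgrid(idx, idx, indexing="ij")
    mask = src != dst
    return torch.stack([src[mask], dst[mask]], dim=0)

# Example: 32 candidate regions with 256-d features from a detection backbone.
feats = torch.randn(32, 256)
block = RegionGraphBlock(256)
refined = block(feats, fully_connected_edges(32))
print(refined.shape)   # torch.Size([32, 256])

# Graph-side benchmarks are loaded with the standard OGB interface, e.g.:
# from ogb.nodeproppred import PygNodePropPredDataset
# dataset = PygNodePropPredDataset(name="ogbn-products")
```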