Machine Learning for Computer Vision

NAISS 2023/1-24


NAISS Large Compute

Principal Investigator:

Michael Felsberg


Linköpings universitet

Start Date:


End Date:


Primary Classification:

10207: Computer Vision and Robotics (Autonomous Systems)



Recently, image representations based on convolutional neural networks (CNNs) have significantly improved on the previous state of the art in many computer vision applications, including image classification, object detection, scene recognition, semantic segmentation, action recognition, and visual tracking. CNNs consist of a series of convolution and pooling operations followed by one or more fully connected (FC) layers. Deep networks are trained on raw image pixels with a fixed input size or on sparse point clouds in a finite volume. These networks require large amounts of labelled training data. The introduction of large datasets (e.g. ImageNet with 14 million images, semantic 3D datasets, and synthetic datasets) and the parallelism enabled by modern GPUs have facilitated the rapid deployment of deep networks for many visual tasks. This development has led to what many peers call the deep learning revolution in computer vision.

The Computer Vision Laboratory (CVL) is currently working on seven research tasks within the DeepVision project, for which GPU resources are requested:

1. Visual object tracking and segmentation challenge (VOTS2023)
2. Human motion analysis from videos (new assistant professor with three PhD students)
3. Deep learning for large-scale remote sensing scene analysis (new assistant professor)
4. Probabilistic 3D computation from time-of-flight measurements
5. Spatio-temporal networks for scene flow estimation
6. Injection of geometry into deep learning
7. WASP NEST _main_ (hybrid machine learning)