SUPR
Hybrid Inference
Dnr: NAISS 2024/22-790
Type: NAISS Small Compute
Principal Investigator: Shashi Nagarajan
Affiliation: Linköpings universitet
Start Date: 2024-06-01
End Date: 2025-06-01
Primary Classification: 10201: Computer Sciences

Abstract

Overall Project Description: Spatio-temporal prediction is a central task in many domains. It is usually addressed either by data-driven methods or by sophisticated simulation models, and each approach has its own failure modes: simulation is limited by its underlying (inevitably simplifying) assumptions, while machine learning degrades in boundary cases with scarce data or non-stationary data distributions. We therefore aim to integrate both approaches optimally so as to eliminate these failure modes. In principle, the system will take both the real data and the simulator as inputs and learn a predictor end-to-end.

The two specific sub-projects we wish to work on are:

I. Neural Kalman Filters

Two modalities have traditionally been employed for the problem of dynamics learning: simulation-based (probabilistic) inference and machine-learning-based (data-driven) inference. Both modalities, however, are limited. For the former to be most accurate, learning must make use of a (probabilistic) model that faithfully represents all the complex true dynamics of the system, which can be a formidable or impractical task. For the latter to be most accurate, learning may require a very large sample of data, so that edge cases of the true dynamics are sufficiently and proportionally represented; such data may likewise be difficult to obtain. Recent research [1-2] explores the merits of methods that combine principles of both modalities. The benefits of these methods come at the cost of resource-intensive training of over-parameterised models. Our algorithm better exploits the temporal dependencies in the data and delivers better performance while using far fewer resources.

[1] V. Garcia Satorras, Z. Akata, and M. Welling, "Combining Generative and Discriminative Models for Hybrid Inference," NeurIPS, 2019.
[2] P. Becker et al., "Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces," ICML, 2019.

II. Land Cover Classification

Mapping land cover classes/segments from global satellite imagery has proven to be a non-trivial problem. More specifically, modern image classification and segmentation models that are pretrained on large image corpora and then fine-tuned on annotated satellite images fail to account for distribution shifts in the data. For instance, urban land cover in Brasília, Brazil looks very different from urban land cover in Paris, France. Recent research on related problems [1-2] uses meta-learning to first obtain a region-agnostic model and then fine-tunes this model with a few annotated examples from a region of interest to improve predictive performance. Meta-learning, however, is well known to be computationally expensive and often involves wasteful training computations. We propose a model that uses a spatial Bayesian prior to account for local data distributions, which lets us do away with the meta-learning step proposed earlier.

[1] M. Rußwurm et al., "Meta-Learning for Few-Shot Land Cover Classification," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.
[2] G. Tseng, H. Kerner, and D. Rolnick, "TIML: Task-Informed Meta-Learning for Agriculture," arXiv preprint arXiv:2202.02124, 2022.
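
The abstract does not specify the form of the spatial Bayesian prior. As a purely illustrative sketch (not the project's model), one simple way a region-dependent prior could adjust a region-agnostic classifier is a Bayes-rule reweighting of its per-class scores. The function name `apply_spatial_prior`, the two-class setup, and all numbers below are hypothetical.

```python
import numpy as np

def apply_spatial_prior(likelihoods, region_prior):
    """Reweight per-class scores by a region-specific class prior (Bayes' rule).

    likelihoods:  array of shape (n_pixels, n_classes), scores proportional
                  to p(pixel | class) from a region-agnostic classifier.
    region_prior: array of shape (n_classes,), a hypothetical p(class | region).
    Returns p(class | pixel, region) with shape (n_pixels, n_classes).
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    region_prior = np.asarray(region_prior, dtype=float)
    # Posterior is proportional to likelihood times prior; normalise per pixel.
    unnormalised = likelihoods * region_prior
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)

# Made-up scores for two pixels and two classes (say, "urban" and "forest").
likelihoods = np.array([[0.5, 0.5],
                        [0.9, 0.1]])
prior_a = np.array([0.8, 0.2])  # assumed prior for a mostly urban region
prior_b = np.array([0.2, 0.8])  # assumed prior for a mostly forested region

post_a = apply_spatial_prior(likelihoods, prior_a)
post_b = apply_spatial_prior(likelihoods, prior_b)
```

In this toy setup the first pixel, which the classifier finds ambiguous, is assigned to a different class in each region, while the second pixel keeps its label because the classifier evidence outweighs the prior. No retraining per region is involved, which is the kind of saving the abstract alludes to; whether the proposed model works this way is an assumption.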