Because objects in the world may be of different sizes and at different distances from the camera, image data generated from a natural environment may in general contain substantial (a priori unknown) scaling variabilities. Traditional deep networks are, however, by default not robust to such scaling variabilities. To address this problem, we will in this project develop scale-covariant deep networks, which obey provable covariance properties under spatial scaling transformations. Specifically, we will study extensions of a previously proposed notion of scale-covariant and scale-invariant Gaussian derivative networks, to enable classification at scales that are not spanned by the training data.
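To illustrate the underlying idea (not the actual network implementation from the cited papers), the following minimal sketch computes scale-normalised Gaussian derivative responses over a ladder of scale channels. With the scale normalisation factor sigma**n applied to derivatives of total order n, scaling the image spatially by a factor s maps the response at scale sigma onto the response at scale s*sigma, which is the covariance property exploited by such architectures. The function name, the choice of derivative orders, and the use of scipy's `gaussian_filter` are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_derivative_features(image, sigmas,
                                 orders=((0, 1), (1, 0), (0, 2), (2, 0), (1, 1))):
    """Scale-normalised Gaussian derivative responses over multiple scale channels.

    Illustrative sketch only: multiplying each derivative of total order n
    by sigma**n makes the responses covariant under spatial rescaling, so
    that rescaling the input by s shifts the responses along the scale axis
    from sigma to s*sigma rather than changing them unpredictably.
    """
    feats = []
    for sigma in sigmas:
        for oy, ox in orders:
            # Gaussian derivative of order (oy, ox) at scale sigma
            resp = gaussian_filter(image, sigma=sigma, order=(oy, ox))
            feats.append(sigma ** (oy + ox) * resp)  # scale normalisation
    return np.stack(feats)
```

A scale-invariant prediction can then be obtained by, for example, max-pooling a classifier's outputs over the scale channels, so that the result does not depend on which channel the object's scale happens to fall in.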
The research that we will perform will comprise extensions of the network architecture, including its design parameters, to handle more complex data sets than considered in our previous work (Lindeberg 2022; Perzanowski and Lindeberg 2024), as well as extensive experimental work comparing different scale-covariant network architectures on both single-scale image classification tasks and multi-scale scale generalisation tasks.
The reason why we need better GPUs than those we currently have access to is to explore larger networks, with more parameters and thereby a much greater ability to learn the image structures needed to handle more complex datasets, which we believe could substantially improve the scale generalisation properties.
References:
Lindeberg (2022) "Scale-covariant and scale-invariant Gaussian derivative networks", Journal of Mathematical Imaging and Vision, 64(3): 223-242.
Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", arXiv preprint arXiv:2409.11140.