Rotation-equivariant denoising of cryo-EM projection images using data-driven models

NAISS 2023/5-194


NAISS Medium Compute

Principal Investigator:

Joakim Andén-Pantera


Kungliga Tekniska högskolan

Start Date:


End Date:


Primary Classification:

10207: Computer Vision and Robotics (Autonomous Systems)




Cryo-electron microscopy (cryo-EM) is a method for 3D imaging of biological macromolecules using transmission electron microscopes. The problem has a natural geometric structure but involves a high degree of noise, requiring sophisticated mathematical methods to obtain high accuracy. Standard methods for 3D reconstruction typically make only minimal assumptions on the nature of the images, instead leveraging the geometry of the problem to reconstruct the molecular structure. As a result, these methods break down at very high noise levels, where the regularity induced by the imaging geometry is lost. It has therefore become necessary to develop algorithms that leverage additional information about the problem, such as prior information about the particular objects that are imaged, in this case biomolecules.

An important step in the cryo-EM processing pipeline is denoising. Several methods for this have been proposed, some based on traditional Wiener filtering, while others rely on class averaging, which identifies similar images (up to rotation and translation) and then aligns and averages them to reduce the noise variance. While these methods have enjoyed significant success, they break down at high noise levels, since similar images can no longer be accurately identified.

We propose an approach based on neural networks, where prior information on the structure of the projection images is used to denoise the images. Since the distribution of projection images is invariant to in-plane rotation, the networks must be equivariant to such transformations. In an earlier project (SNIC 2022/22-498), we investigated various neural network architectures for single-image denoising. The purpose of this project is to extend these into networks that can simultaneously denoise sets of images (starting with pairs, then working with larger sets), building on the conventional class averaging method. This will be done using a transformer architecture, where each token is a projection image and the self-attention block serves to share information between images and improve denoising. Once this works satisfactorily, we will evaluate these networks on experimental data. Given the larger neural network architectures involved in this proposal, more computational resources are required.
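
To illustrate the idea of treating each projection image as a token and sharing information across a set through attention, the following is a minimal NumPy sketch. It is not the proposed architecture. It has no learned parameters: the raw pixels serve as queries, keys, and values, so it reduces to a soft form of class averaging. The function names (`attention_denoise`, `make_toy_dataset`), the toy data (two random "classes" of images with white Gaussian noise), and the scaling of the scores by the number of pixels are all assumptions made for this illustration. The last lines check equivariance only for a common 90-degree rotation of all images, which is a much weaker property than equivariance to arbitrary in-plane rotations.

```python
import numpy as np


def attention_denoise(images, temperature=1.0):
    """Denoise a set of images by softmax-weighted averaging across the set.

    images: array of shape (n, h, w); each image acts as one token.
    Returns an array of the same shape.
    """
    n, h, w = images.shape
    tokens = images.reshape(n, h * w)
    # Pairwise similarity between images. Queries and keys are the raw pixels
    # here; a trained network would use learned projections instead.
    # Dividing by the pixel count (an assumption of this sketch) keeps the
    # self-similarity term from dominating the softmax.
    scores = tokens @ tokens.T / (h * w * temperature)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to one
    # Each output image is a weighted average of all images in the set.
    return (weights @ tokens).reshape(n, h, w)


def make_toy_dataset(rng, n_per_class=8, size=32, sigma=1.0):
    """Two random clean images, each repeated and corrupted by white noise."""
    clean_classes = rng.standard_normal((2, size, size))
    clean = np.repeat(clean_classes, n_per_class, axis=0)
    noisy = clean + sigma * rng.standard_normal(clean.shape)
    return clean, noisy


rng = np.random.default_rng(0)
clean, noisy = make_toy_dataset(rng)
denoised = attention_denoise(noisy)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE noisy: {mse_noisy:.3f}, MSE denoised: {mse_denoised:.3f}")

# Sanity check: rotating every image by 90 degrees and then denoising gives
# the same result as denoising and then rotating.
rotated_then_denoised = attention_denoise(np.rot90(noisy, axes=(1, 2)))
denoised_then_rotated = np.rot90(denoised, axes=(1, 2))
print("Equivariant to a common 90-degree rotation:",
      np.allclose(rotated_then_denoised, denoised_then_rotated))
```

In this toy setting the images within a class are already aligned, so no rotation or translation search is needed. On real cryo-EM data the similarity between images would have to be rotation-invariant, which is one reason equivariant architectures are of interest.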