SUPR
Data-driven methods for cryo-EM reconstruction
Dnr:

NAISS 2024/22-618

Type:

NAISS Small Compute

Principal Investigator:

Joakim Andén-Pantera

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2024-05-02

End Date:

2025-06-01

Primary Classification:

10207: Computer Vision and Robotics (Autonomous Systems)

Abstract

Cryo-EM is a method for 3D imaging of biological macromolecules using transmission electron microscopes. The problem has a natural geometric structure but involves a high degree of noise, requiring sophisticated mathematical methods to obtain high accuracy. Standard methods for 3D reconstruction typically make only minimal assumptions about the nature of the images, instead leveraging the geometry of the problem to reconstruct the molecular structure. As a result, these methods break down at very high noise levels, where the regularity induced by the imaging geometry is lost. It has therefore become necessary to develop algorithms that leverage additional information about the problem, such as prior information about the particular objects being imaged, in this case biomolecules.

An important step in the cryo-EM processing pipeline is denoising. While conventional methods have enjoyed significant success for this task, they all break down at high noise levels. We propose an approach based on neural networks, where prior information on the structure of the projection images is used to denoise the images. In an earlier project (NAISS 2023/5-195), we investigated various neural network architectures for multiple-image denoising, building on a conventional class-averaging method. For this, a transformer architecture was found to perform well, where each projection image is represented by a token and the self-attention block serves to share information between the images. This resulted in improved performance when aligning and averaging small sets (up to thirty-two) of already classified cryo-EM images. This proof of concept indicates that the transformer architecture has significant potential for this task and merits further investigation.

We plan to study several extensions of this basic network architecture. The first is to scale the method to the larger sets of images needed to handle standard cryo-EM datasets. The second is to incorporate a classification step, allowing the network itself to identify images that are the same up to rotation and then align them. Finally, we aim to extend these results to experimental data. If time permits, we also plan to use this denoising step as the first part of a pipeline for full 3D reconstruction based on method-of-moments estimators. It should then be possible to train the resulting pipeline end-to-end to yield improved accuracy in the final 3D reconstruction.

In parallel with the above, we also aim to continue our work on using atomic models for 3D model fitting and reconstruction. This was also investigated in the previous project (NAISS 2023/5-195) and involved parameterizing the 3D reconstruction using bond angles between atoms in a molecular structure. This structure was then fitted to a given 3D density obtained from cryo-EM reconstruction. This method has also shown promise in incorporating a strong (in this case physical) prior on the data. We aim to further develop this method and extend it to the full 3D reconstruction case, where an atomic model is fitted to projection images instead of reconstructed densities.
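
To make the "one token per projection image" idea from the abstract concrete, the following is a minimal sketch of a single self-attention step over a set of images. It is an illustration only and not the project's network: the function name, the single-head layout, the linear embedding, and the untrained random weights are all assumptions introduced here, and no alignment or rotation handling is included.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend_over_images(images, w_embed, w_q, w_k, w_v, w_out):
    """Single-head self-attention where each image is one token (hypothetical helper).

    images: array of shape (n_images, height, width).
    Returns the mixed images and the (n_images, n_images) attention matrix.
    """
    n, h, w = images.shape
    # Flatten each image and embed it as one token of dimension d.
    tokens = images.reshape(n, h * w) @ w_embed            # (n, d)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v     # each (n, d)
    # Scaled dot-product attention: every image attends to every other image.
    scores = q @ k.T / np.sqrt(q.shape[-1])                # (n, n)
    attn = softmax(scores, axis=-1)
    mixed = attn @ v                                       # (n, d)
    # Residual connection, then project back to pixel space.
    out = (tokens + mixed) @ w_out                         # (n, h * w)
    return out.reshape(n, h, w), attn


# Toy usage: 32 noisy 16x16 "projection images" and random (untrained) weights.
rng = np.random.default_rng(0)
n_images, size, d = 32, 16, 64
images = rng.standard_normal((n_images, size, size))
w_embed = rng.standard_normal((size * size, d)) / np.sqrt(size * size)
w_q = rng.standard_normal((d, d)) / np.sqrt(d)
w_k = rng.standard_normal((d, d)) / np.sqrt(d)
w_v = rng.standard_normal((d, d)) / np.sqrt(d)
w_out = rng.standard_normal((d, size * size)) / np.sqrt(d)

denoised, attn = attend_over_images(images, w_embed, w_q, w_k, w_v, w_out)
```

The attention matrix is of size n_images by n_images, so memory and compute grow quadratically with the number of images in a set. This is one reason why scaling from sets of thirty-two images to full datasets is not a trivial extension.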