Data-driven methods for cryo-EM reconstruction

Dnr:

NAISS 2025/5-397

Type:

NAISS Medium Compute

Principal Investigator:

Joakim Andén-Pantera

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2025-07-01

End Date:

2026-07-01

Primary Classification:

10210: Artificial Intelligence

Secondary Classification:

10601: Structural Biology

Tertiary Classification:

10106: Probability Theory and Statistics (Statistics with medical aspects at 30118 and with social aspects at 50907)

Webpage:

https://people.kth.se/~janden/

Allocation

Alvis at C3SE: 1500 GPU-h/month

Abstract

Cryo-EM is a method for 3D imaging of biological macromolecules using transmission electron microscopes. The reconstruction problem has a natural geometric structure but involves a high degree of noise, requiring sophisticated mathematical methods to obtain high accuracy. Standard methods for 3D reconstruction typically make only minimal assumptions on the nature of the images, instead leveraging the geometry of the problem to reconstruct the molecular structure. As a result, these methods break down at very high noise levels, where the regularity induced by the imaging geometry is lost. It has therefore become necessary to develop algorithms that leverage additional information about the problem, such as prior information about the particular objects being imaged, in this case biomolecules.
An important step in the cryo-EM processing pipeline is denoising. While conventional methods have enjoyed significant success for this task, they all break down at high noise levels. We propose an approach based on neural networks, where prior information on the structure of the projection images is used to denoise the images. In an earlier project (NAISS 2024/5-366), we investigated various neural network architectures for multiple-image denoising, building on a conventional class-averaging method. A transformer architecture was found to perform well for this task, with the self-attention block serving to share information between the images. This resulted in improved performance when aligning and averaging small sets of cryo-EM images that had already been classified. The architecture was also able to classify images, automatically clustering them and aligning the images within each cluster. Additionally, the denoised images were used as input to an ab initio reconstruction algorithm (based on common lines), where they were found to yield higher-resolution reconstructions than those obtained with the alternative denoising methods (DnCNN, U-Net).
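To illustrate how a self-attention block can share information between images in a set, here is a minimal NumPy sketch. It is not the architecture from the earlier project: the function name `set_self_attention`, the single-head layout, the image sizes, and the random projection matrices are all assumptions made for illustration. Each flattened image is treated as one token, so the attention weights describe how much each image borrows from every other image. With zero query and key projections the weights become uniform and the output reduces to a plain average of the set, which is the simplest form of class averaging for images that are already aligned.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def set_self_attention(images, w_q, w_k, w_v):
    """Single-head self-attention with one token per image (illustrative only).

    images: (n_images, n_pixels) array of flattened images.
    w_q, w_k: (n_pixels, d) projections; w_v: (n_pixels, n_pixels).
    Returns the mixed images and the (n_images, n_images) attention weights.
    """
    q, k, v = images @ w_q, images @ w_k, images @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Hypothetical toy data: 8 noisy copies of one "clean" 64-pixel image.
rng = np.random.default_rng(0)
n_images, n_pixels, d = 8, 64, 16
clean = rng.standard_normal(n_pixels)
noisy = clean + 3.0 * rng.standard_normal((n_images, n_pixels))

# Zero query/key projections give uniform weights, i.e. a plain average.
zeros = np.zeros((n_pixels, d))
averaged, uniform = set_self_attention(noisy, zeros, zeros, np.eye(n_pixels))

# Random (untrained) projections give non-uniform, data-dependent weights.
w_q = rng.standard_normal((n_pixels, d)) / np.sqrt(n_pixels)
w_k = rng.standard_normal((n_pixels, d)) / np.sqrt(n_pixels)
mixed, weights = set_self_attention(noisy, w_q, w_k, np.eye(n_pixels))
```

In a trained network the projections would be learned, so that images showing similar views receive large mutual weights; this sketch only shows the mixing mechanism.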
We plan to study several extensions of this basic network architecture. The first is to scale the method to the larger sets of images needed to handle standard cryo-EM datasets. The second is to extend these results to experimental data, which will require fine-tuning the networks on experimental datasets and possibly leveraging more sophisticated (deeper) architectures. If time permits, we also plan to use this denoising step as the first part of a pipeline for full 3D reconstruction based on method-of-moments estimators. Such a pipeline would then be trained end-to-end to yield improved 3D reconstruction accuracy.
In parallel with the above, we also aim to continue our work on using atomic models for 3D reconstruction. This was also investigated in the previous project and involved parameterizing the 3D reconstruction using bond angles between atoms in a molecular structure. This structure was then fitted to a dataset of cryo-EM projection images and found to perform well. The method has also shown promise as a way of incorporating a strong (in this case physical) prior on the data. We are currently finishing a manuscript describing this method but need to conduct some additional experiments before submitting.
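As a rough illustration of what an angle-based parameterization of a structure can look like, the sketch below builds 3D atom positions for a simple chain from internal coordinates. This is not the method from the manuscript: the fixed bond length of 1.5, the use of torsion angles alongside bond angles, the standard internal-to-Cartesian placement step, and the names `place_atom`, `build_chain`, and `measured_bond_angles` are all assumptions made for illustration. Fitting such parameters to projection images is not shown.

```python
import numpy as np


def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place a new atom bonded to c, given the three preceding atoms a, b, c."""
    bc = (c - b) / np.linalg.norm(c - b)
    n = np.cross(b - a, bc)
    n /= np.linalg.norm(n)   # unit normal of the plane through a, b, c
    m = np.cross(n, bc)      # completes the orthonormal frame (bc, m, n)
    local = bond_length * np.array([
        -np.cos(bond_angle),
        np.sin(bond_angle) * np.cos(torsion),
        np.sin(bond_angle) * np.sin(torsion),
    ])
    return c + local[0] * bc + local[1] * m + local[2] * n


def build_chain(bond_angles, torsions, bond_length=1.5):
    """Build a chain with len(bond_angles) + 2 atoms (hypothetical setup)."""
    first = np.zeros(3)
    second = np.array([bond_length, 0.0, 0.0])
    theta0 = bond_angles[0]
    # The third atom lies in the xy-plane at the first bond angle.
    third = second + bond_length * np.array([-np.cos(theta0), np.sin(theta0), 0.0])
    coords = [first, second, third]
    for theta, phi in zip(bond_angles[1:], torsions):
        coords.append(place_atom(coords[-3], coords[-2], coords[-1],
                                 bond_length, theta, phi))
    return np.array(coords)


def measured_bond_angles(coords):
    """Recover the angle at each interior atom from Cartesian coordinates."""
    u = coords[:-2] - coords[1:-1]
    v = coords[2:] - coords[1:-1]
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    return np.arccos(np.clip(cos, -1.0, 1.0))


# Hypothetical angles in radians; one torsion fewer than bond angles.
bond_angles = np.deg2rad([110.0, 112.0, 109.0, 111.0])
torsions = np.deg2rad([180.0, -60.0, 60.0])
coords = build_chain(bond_angles, torsions)
bond_lengths = np.linalg.norm(np.diff(coords, axis=0), axis=1)
```

A parameterization of this kind keeps bond lengths fixed by construction, so only a small number of angles vary. This is one way in which a physical prior can be built into the model rather than imposed afterwards.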