Deep learning for protein design

NAISS 2023/5-230


NAISS Medium Compute

Principal Investigator:

Leo Hanke


Karolinska Institutet

Start Date:


End Date:


Primary Classification:

10606: Microbiology (medical to be 30109 and agricultural to be 40302)

Secondary Classification:

10602: Biochemistry and Molecular Biology

Tertiary Classification:

10601: Structural Biology



Viral glycoproteins play crucial roles in viral entry and immune evasion, making them important for vaccine development, virology research, antiviral therapy, and diagnostics. However, their inherent instability and complex folding requirements often make them difficult to produce recombinantly in large quantities. These glycoproteins usually need to be engineered before they are suitable for structural analysis or biomedical applications. We have recently begun integrating advances in computational protein design into our vaccine antigen design work, focusing primarily on increased stability. We use guided diffusion models and deep-learning-based protein sequence design to explore sequence and structure space while preserving the immunologically relevant epitopes. We use AlphaFold2 for computational validation of the designed proteins and then validate the top-ranking designs experimentally by expression and biophysical characterization. In addition, we use a similar approach to classify single-domain antibody (nanobody) libraries that target these viral antigens. This allows us to better select suitable candidates and increases the likelihood of identifying broadly neutralizing antiviral agents.