SUPR
Generation of training data for AI
Dnr: NAISS 2023/22-1235
Type: NAISS Small Compute
Principal Investigator: Louise Persson
Affiliation: Uppsala universitet
Start Date: 2023-11-16
End Date: 2024-12-01
Primary Classification: 10603: Biophysics

Abstract

Molecular dynamics (MD) simulations are widely used together with native mass spectrometry experiments to study proteins. However, how charges are distributed on a protein in these experiments remains unclear: the distribution cannot currently be determined experimentally, and existing methods for predicting it computationally are very expensive. I plan to train a deep learning model to predict the distribution of charges on proteins under native mass spectrometry conditions, so that the predictions can be used to set up this type of MD simulation. The first step is to generate training data, which is what the resources in this project will be used for. I will generate multiple charge configurations for a set of proteins and run short MD simulations to capture their conformational flexibility. I will then use density functional theory (DFT) to compute the total energy of each charge configuration, which will serve as the ground truth when training the model.
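
The step of generating multiple charge configurations per protein could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's actual pipeline: it assumes positive-ion mode, treats a charge configuration as a choice of which candidate basic sites carry a proton, and uses made-up site labels and a made-up net charge. The function name `sample_charge_configurations` is hypothetical.

```python
import math
import random


def sample_charge_configurations(sites, n_charges, n_samples, seed=0):
    """Draw distinct random placements of n_charges protons on candidate sites.

    Assumption: each configuration is simply a set of protonated sites.
    Returns a sorted list of sorted site lists, so output is reproducible.
    """
    rng = random.Random(seed)
    # Never ask for more distinct configurations than exist.
    total = math.comb(len(sites), n_charges)
    n_samples = min(n_samples, total)
    seen = set()
    while len(seen) < n_samples:
        seen.add(frozenset(rng.sample(sites, n_charges)))
    return sorted(sorted(config) for config in seen)


if __name__ == "__main__":
    # Hypothetical candidate sites (residue name + index) and net charge.
    candidate_sites = ["LYS3", "ARG7", "HIS12", "LYS20", "ARG31"]
    for config in sample_charge_configurations(candidate_sites, 3, n_samples=4):
        print(config)
```

Each sampled configuration would then be the starting point for a short MD simulation and a DFT energy calculation. Random sampling is used here because the number of possible configurations grows combinatorially with the number of candidate sites, which makes full enumeration impractical for larger proteins.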