Generative AI for peptides against Pneumonia causing pathogens
Dnr:

NAISS 2025/5-213

Type:

NAISS Medium Compute

Principal Investigator:

Vaibhav Srivastava

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2025-04-29

End Date:

2025-11-01

Primary Classification:

10203: Bioinformatics (Computational Biology) (Applications at 10610)

Abstract

Artificial intelligence has become an integral part of problem-solving in many life science projects, including drug discovery. Machine learning and deep learning models that predict drug-like properties such as binding affinity, permeability, toxicity and solubility help identify hits and lead compounds for treating various diseases. In this project, we aim to develop such models for small molecules and peptides that can serve as therapeutics against pneumonia-causing pathogens (viruses, bacteria and fungi). The models will be based on empirical descriptors (1D, 2D and 3D) and molecular fingerprints, and will be used to screen large chemical libraries such as IMPPAT, ZINC and the Enamine REAL database, as well as peptide libraries.

In addition, generative-AI-based design of molecules and peptides will be undertaken. For small molecules, models will be trained on ZINC and ChEMBL and fine-tuned on small molecules with known antimicrobial activity. For peptides, models will be trained on a curated peptide database and fine-tuned on an antimicrobial peptide database. The aim is to create novel antimicrobial peptides (AMPs) that target the lungs and act against microbial species. We will use generative deep learning models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Recurrent Neural Networks (RNNs) and Transformers. GANs will iteratively improve peptide quality through generator-discriminator training, while VAEs, RNNs and Transformers will generate sequences via masked prediction, complete generation and partial sequence completion.

Once peptides are generated, their antimicrobial activity will be assessed with classification models built using machine learning and deep learning techniques. These classifiers will determine whether a peptide exhibits the properties of an AMP and, if so, identify the specific microbes it targets. In addition, the small molecules and peptides will be validated through structure-based approaches such as molecular docking, molecular dynamics and free energy calculations.
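
As a minimal illustration of the kind of sequence-level (1D) descriptors that could feed an AMP classifier, the sketch below turns a peptide sequence into a small feature vector. It is not the project's actual feature set: the function name `peptide_descriptors`, the choice of features (amino acid composition, a crude net charge, mean Kyte-Doolittle hydropathy) and the example sequence are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_descriptors(seq):
    """Return a 22-element feature vector for a peptide sequence.

    Layout: 20 amino acid fractions, then net charge, then mean hydropathy.
    This is a hypothetical, deliberately simple descriptor set.
    """
    seq = seq.strip().upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must contain only the 20 standard residues")

    counts = Counter(seq)
    n = len(seq)

    # Fraction of each residue type in the sequence.
    composition = [counts[aa] / n for aa in AMINO_ACIDS]

    # Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu.
    # Assumption: histidine and the termini are ignored.
    net_charge = counts["K"] + counts["R"] - counts["D"] - counts["E"]

    # Average hydropathy over all residues.
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n

    return composition + [float(net_charge), mean_hydropathy]


# Toy cationic, amphipathic-looking sequence (made up for this example).
features = peptide_descriptors("KKLLKKLLKK")
print(len(features), features[20], round(features[21], 2))
```

Vectors like this could be passed to any standard classifier; richer 2D and 3D descriptors or molecular fingerprints would need a cheminformatics toolkit and are outside the scope of this sketch.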