AlphaFold predictions of unknown protein products from biosynthetic gene clusters
Dnr:

NAISS 2024/22-1167

Type:

NAISS Small Compute

Principal Investigator:

Tom Resink

Affiliation:

Karolinska Institutet

Start Date:

2024-10-28

End Date:

2025-11-01

Primary Classification:

10610: Bioinformatics and Systems Biology (methods development to be 10203)

Webpage:

Allocation

Abstract

The human microbiome consists of an assortment of commensal and mutualistic microbes, which use secondary metabolites, among other signals, to shape their microbial ecosystem and to mediate microbe-microbe, microbe-host, and microbe-environment interactions. These secondary metabolites can be an excellent source of antibiotics and other medically relevant bioactive compounds. Their synthesis can often be attributed to clusters of spatially neighboring genes, known as biosynthetic gene clusters (BGCs). Despite increased interest in BGCs within the clinical and biotechnological fields, functionally characterizing the members of these clusters remains challenging. Specifically, the shift from homology-based BGC annotation towards deep learning annotation approaches, while powerful for identifying evolutionarily unique or unknown clusters, has hindered functional annotation and the subsequent identification of interesting bioactive secondary metabolites. Our project will begin to investigate the likely function of currently uncharacterized BGCs, first through bioinformatic approaches and then by selecting promising candidates for further biochemical and structural experiments. We intend to use the allocated computational resources to derive ab initio protein structure predictions for translated coding sequences identified in BGCs. The resulting structural models, together with the input sequences, will be used to assist the functional annotation of these BGCs with other bioinformatic tools and methods.