Large-scale structure prediction of resistance proteins
Dnr:

NAISS 2024/5-66

Type:

NAISS Medium Compute

Principal Investigator:

Johan Bengtsson-Palme

Affiliation:

Chalmers tekniska högskola

Start Date:

2024-03-27

End Date:

2025-04-01

Primary Classification:

10203: Bioinformatics (Computational Biology) (applications to be 10610)

Secondary Classification:

10606: Microbiology (medical to be 30109 and agricultural to be 40302)

Tertiary Classification:

30109: Microbiology in the medical area

Abstract

Biocides and metals are used extensively to control bacterial growth in many human activities and environments, such as food production, healthcare facilities, and industry. As with antibiotics, bacteria can develop resistance to these compounds. Genes conferring biocide and metal resistance are often carried on mobile genetic elements, alongside antibiotic resistance genes. Consequently, exposing bacteria to biocides can potentially exert selective pressure for biocide/metal and antibiotic resistance genes simultaneously, worsening the antibiotic resistance crisis. BacMet is a centralized database of biocide and metal resistance genes and is the most comprehensive database of its kind to date. It consists of two main parts: an experimentally validated database and a predicted database. The experimentally validated database includes genes whose role in biocide and metal resistance has been confirmed by experiments documented in the literature. The predicted database uses the experimentally validated database as a reference to identify homologous proteins in genomic and metagenomic data available in public databases. BacMet is currently being updated. This update needs to include homologous proteins for both previously added and new genes, because the amount of genomic and metagenomic data in public databases has increased substantially since the last BacMet update. Previous updates of the predicted database were carried out with protein BLAST, with homologous proteins selected based on sequence identity thresholds. However, this approach has limitations: it may miss distantly related homologues that have low sequence identity but highly similar structure, and therefore potentially the same function. To overcome this limitation, structure comparison can be used instead: protein structures are predicted and then compared through superposition to assess how similar they are and thereby infer homology.
This approach allows homology to be identified between proteins with minimal amino acid identity but closely matching tertiary structures. Consequently, BacMet's predicted database can become far more comprehensive and useful for researchers, giving access to both closely and distantly related homologues within a single database. This expands research possibilities not only for proteins with well-defined annotations but also for the vast array of hypothetical proteins and proteins of unknown function encoded in bacterial genome sequences.
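
The identity-threshold selection used in previous updates can be sketched roughly as follows. This is a minimal illustration, not BacMet's actual pipeline: it assumes BLAST tabular output in the default `-outfmt 6` column order (query ID, subject ID, percent identity, and so on), and the function names, the 85% cutoff, and the example rows are all hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float


def parse_tabular_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse BLAST tabular lines, assuming the default -outfmt 6 column order."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Columns 1-3 are query ID, subject ID, and percent identity.
        hits.append(Hit(fields[0], fields[1], float(fields[2])))
    return hits


def filter_by_identity(hits: List[Hit], min_identity: float) -> List[Hit]:
    """Keep only hits at or above the identity cutoff (in percent)."""
    return [h for h in hits if h.percent_identity >= min_identity]


# Made-up example rows; identifiers and values are purely illustrative.
EXAMPLE_LINES = [
    "geneA\tprot_001\t92.5\t200\t15\t0\t1\t200\t1\t200\t1e-100\t380",
    "geneA\tprot_002\t31.0\t195\t120\t5\t3\t197\t2\t196\t1e-10\t60",
]

hits = parse_tabular_hits(EXAMPLE_LINES)
kept = filter_by_identity(hits, min_identity=85.0)  # hypothetical cutoff
print([h.subject for h in kept])
```

In this toy input the second hit is discarded purely because of its low sequence identity. Hits of that kind are the ones a structure-based comparison is intended to recover when the predicted structures superpose well.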