Infectious diseases remain among the leading causes of global mortality: tuberculosis (TB) alone was responsible for over 1.2 million deaths in 2024, and antimicrobial resistance (AMR) is projected to cause millions of deaths annually by 2050. Our group works on three interconnected challenges, each requiring analysis of large sequencing and multiomics datasets. On the host side, we study the immune mechanisms that determine whether Mycobacterium tuberculosis (Mtb) infection is contained or progresses to disease, with the aim of identifying biomarkers of imminent progression risk, which are still lacking. On the pathogen side, we are extending this work to population-scale genomic analysis of Mtb to resolve the processes driving its evolution and transmission. In parallel, to connect antibiotic use with the spread of resistance, we are developing computational models that link mobile genetic elements to their microbial hosts and thereby track the plasmid-mediated dissemination of antimicrobial resistance genes (ARGs) across bacterial hosts and environments. Together, these workstreams span host immunity, pathogen genomics, and AMR, and rely on a shared computational infrastructure dominated by GPU-based language-model inference and large-scale sequence mining.
The work is organised around three complementary axes. First, the host-response axis will combine plasma proteomics, targeted transcriptomics, single-cell RNA sequencing (scRNA-seq), cellular phenotyping, and Mtb-specific antibody profiling in the Swedish TB cohort and the prospective African ERASE-TB cohort. In preliminary work, integrative analyses identified correlates of Mtb infection control and refined an early-progression signature, which is now being validated against WHO Target Product Profile thresholds. Second, the pathogen-genomics axis will use whole-genome sequencing data from Mtb isolates of Swedish TB patients to reconstruct transmission networks in a low-incidence country, characterise lineage distribution and drug-resistance mutations, and link pathogen genotypes to the host immune phenotypes profiled in the first axis. This integration will enable host–pathogen association analyses that test whether specific Mtb lineages or genomic features modulate host response trajectories and progression risk. Third, the AMR axis will track resistance spread with a plasmid host-prediction model that integrates sequence features from the DNA language model Evo2 with bacterial DNA methylation patterns, enabling species-level attribution of ARG-carrying plasmids in real-world metagenomic data from clinical, environmental, and One Health settings. Cross-axis methodological transfer, particularly petabase-scale mining of public sequence data with the recently developed tool MetaGraph, will ensure that signatures and pipelines generalise beyond the discovery cohorts.
Collectively, the project will deliver clinically validated host-response signatures of Mtb infection control and progression, a genomic epidemiology framework linking Mtb transmission and lineage diversity to host immune phenotypes in Sweden, and a generalisable pipeline for tracing plasmid-mediated ARG dissemination across ecosystems. By coupling host immunology, pathogen evolution, and resistance ecology within a shared computational platform, this work supports the WHO End TB Strategy by enabling presymptomatic intervention and transmission interruption, and it establishes a methodological foundation transferable to other chronic intracellular infections and to broader AMR surveillance.