SUPR
CNN model of DNA methylation prediction
Dnr:

NAISS 2025/22-612

Type:

NAISS Small Compute

Principal Investigator:

Marek Bartosovic

Affiliation:

Stockholms universitet

Start Date:

2025-04-17

End Date:

2026-05-01

Primary Classification:

10616: Molecular Biology

Abstract

In our lab, we’ve developed a new method called TRADE-seq (Transposase-assisted profiling of DNA methylation), which lets us profile DNA methylation at single-cell resolution in a more scalable and accessible way. Traditional methods such as bisulfite sequencing require harsh chemical treatment and extremely high sequencing depth. TRADE-seq instead relies on engineered MBD-Tn5 fusion proteins that selectively bind methylated CpG regions and perform in situ tagmentation in intact nuclei. This lets us generate methylation-enriched libraries with minimal processing, streamlined into a single-tube reaction. We’ve optimized the system using LAND (Lithium Assisted Nucleosome Depletion) to reduce background from open chromatin and improve specificity. TRADE-seq works efficiently on bulk samples and on droplet-based single-cell platforms such as 10x Chromium, and we’ve shown that it can distinguish between cell types, such as K562 and IMR90, based on their methylation profiles.

One exciting development is that we’re now training a convolutional neural network (CNN) model to predict actual DNA methylation percentages from TRADE-seq enrichment data. Because TRADE-seq is an enrichment-based method, it does not provide single-nucleotide resolution or absolute methylation levels. To address this, we’re using matched whole-genome bisulfite sequencing (WGBS) data as ground truth and training the CNN to learn the relationship between signal enrichment patterns and true % methylation across genomic regions. Early results are promising: the model captures both regional methylation density and chromatin context to refine predictions.

Additionally, we’ve extended TRADE-seq into a multi-omic framework (multi-TRADE-seq) that enables simultaneous profiling of DNA methylation, histone modifications, and open chromatin in the same sample. This is particularly powerful for studying epigenetic regulation in complex tissues such as the adult brain, where we’ve begun mapping multiple layers of regulation in parallel. Our goal is to provide the field with a scalable, multimodal, and machine learning-enhanced approach to interrogate epigenetic regulation at single-cell resolution, with applications in development, disease, and precision medicine.
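
To make the CNN idea concrete, here is a minimal sketch of how binned enrichment signal for one genomic region could be mapped to a methylation fraction and scored against a WGBS value. It is an illustration only, not the project's actual model: the single-channel input, bin count, layer sizes, and the function names `conv1d`, `init_params`, `predict_methylation`, and `mse_loss` are all assumptions, and the weights are random rather than trained.

```python
import numpy as np


def conv1d(x, kernels, bias):
    """Valid 1D cross-correlation.

    x: (channels_in, length)
    kernels: (channels_out, channels_in, width)
    bias: (channels_out,)
    Returns an array of shape (channels_out, length - width + 1).
    """
    width = kernels.shape[2]
    # windows has shape (channels_in, length - width + 1, width)
    windows = np.lib.stride_tricks.sliding_window_view(x, width, axis=1)
    return np.einsum("clw,ocw->ol", windows, kernels) + bias[:, None]


def init_params(rng, channels_in=1, channels_hidden=8, width=5):
    """Random, untrained weights (hypothetical layer sizes)."""
    return {
        "k1": rng.normal(0.0, 0.1, size=(channels_hidden, channels_in, width)),
        "b1": np.zeros(channels_hidden),
        "w_out": rng.normal(0.0, 0.1, size=channels_hidden),
        "b_out": 0.0,
    }


def predict_methylation(counts, params):
    """Map binned enrichment counts for one region to a fraction in [0, 1]."""
    signal = np.log1p(counts)                      # dampen the range of read counts
    hidden = np.maximum(conv1d(signal, params["k1"], params["b1"]), 0.0)  # ReLU
    pooled = hidden.mean(axis=1)                   # global average pool over bins
    logit = pooled @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid keeps output in [0, 1]


def mse_loss(predicted, target):
    """Mean squared error against WGBS methylation fractions."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng)
    # Simulated stand-in for one region: 1 channel x 200 bins of read counts.
    counts = rng.poisson(3.0, size=(1, 200)).astype(float)
    wgbs_fraction = 0.8  # made-up ground-truth value for illustration
    prediction = predict_methylation(counts, params)
    print("predicted fraction:", prediction)
    print("loss:", mse_loss([prediction], [wgbs_fraction]))
```

The sigmoid output is one way to keep predictions inside the valid 0 to 1 range. Chromatin context, which the abstract says the model also uses, could be supplied as additional input channels, although this sketch uses only one.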