Electrocardiogram foundation models
Dnr:

NAISS 2025/22-1227

Type:

NAISS Small Compute

Principal Investigator:

Jiawei Li

Affiliation:

Uppsala universitet

Start Date:

2025-09-12

End Date:

2026-10-01

Primary Classification:

10610: Bioinformatics and Computational Biology (Methods development to be 10203)

Abstract

Electrocardiography (ECG) is one of the most widely used and cost-effective tools for detecting cardiovascular disease. However, existing machine learning approaches are typically trained on limited, homogeneous datasets and struggle to generalize across populations, leads, and recording lengths. This project addresses these limitations by developing and benchmarking ECG foundation models, which are large neural architectures pretrained on diverse ECG corpora and designed to provide generalizable representations of cardiac signals.

The project pursues three main objectives. First, we will benchmark existing ECG foundation models [1–3] on standardized arrhythmia classification tasks, with a particular focus on atrial fibrillation detection. Second, we will pretrain new ECG foundation models on large, heterogeneous datasets by exploring novel tokenization strategies. Third, we will evaluate transferability of these models across multiple downstream tasks, assessing robustness with respect to different ECG lengths, numbers of leads, and patient populations.

The expected outcomes of this project are threefold: (1) we will provide a framework for training and evaluating ECG foundation models that is transparent and reproducible, (2) we aim to deliver models that are more robust and generalizable than current approaches, and (3) we plan to disseminate the results through two publications: one journal submission and one conference paper in the fields of artificial intelligence and bioinformatics.

Methodologically, our work requires significant computational resources for pretraining, hyperparameter exploration, and evaluation across multiple datasets. Resource allocation on the server will enable us to scale models to millions of beats, compare architectures systematically, and establish reproducible benchmarks.

[1] Coppola, Edoardo, et al. "HuBERT-ECG as a self-supervised foundation model for broad and scalable cardiac applications." medRxiv (2024): 2024-11.
[2] McKeen, Kaden, et al. "ECG-FM: An open electrocardiogram foundation model." arXiv preprint arXiv:2408.05178 (2024).
[3] Li, Jun, et al. "An electrocardiogram foundation model built on over 10 million recordings with external evaluation across multiple domains." arXiv preprint arXiv:2410.04133 (2024).