Machine Learning models of infant Acute Lymphoblastic Leukemia




Principal Investigator:

Jan Komorowski


Uppsala universitet

Start Date:


End Date:


Primary Classification:

30203: Cancer and Oncology


  • Castor /proj at UPPMAX: 14000 GiB
  • Cygnus /proj at UPPMAX: 14000 GiB
  • Castor /proj/nobackup at UPPMAX: 6000 GiB
  • Cygnus /proj/nobackup at UPPMAX: 6000 GiB
  • Bianca at UPPMAX: 2 x 1000 core-h/month


Cancer is a leading cause of morbidity and, according to the World Health Organization (WHO), accounts for nearly one in six deaths. Approximately 400 000 children and adolescents aged 0 to 19 years develop cancer each year; the most common types include leukaemia, brain cancers, lymphomas, and solid tumours [1]. Importantly, many cancers can be cured if detected early and treated effectively, preferably using a precision medicine strategy directed against the cause of the disease. Infant acute lymphoblastic leukaemia (iALL) is a rare (<5% of all childhood ALL) haematological disease that arises during the first year of life and has a poor prognosis, with a six-year event-free survival of 46% (Interfant-06) [2, 3]. Mixed lineage leukaemia (MLL/KMT2A) rearrangements are the strongest prognostic factor and occur in 74% of iALL patients. The MLL/KMT2A gene encodes a methyltransferase acting on histone H3 lysine 4 (H3K4), resulting in increased protein stability and gene activation. Importantly, large sequencing efforts have concluded that iALL patients uniquely harbour very few additional mutations [4]. The observation that MLL rearrangements (MLL-r) occur at a higher frequency in infant than in paediatric cases has prompted the hypothesis that the malignant clone arises at an early stage of development, e.g. in a haematopoietic progenitor cell, where strict developmental control is key.
We will use scRNA-seq and BCR clonotyping to identify cells descended from a common ancestor and to connect the identified gene expression signatures to the epigenome in iALL patient samples with and without MLL-r. Functional studies of potential target genes will be performed in iALL cell lines that constitute relevant models of the disease. In collaboration with Professor Helene Jernberg Wiklund, we now present innovative approaches to study iALL and its heterogeneity in unique model systems.
With these we aim to increase our understanding of the underlying mechanisms that cause drug resistance in iALL, to define groups of iALL patients that may benefit from drugs already in clinical use, to evaluate intra-tumoral subpopulations that contribute to leukemogenesis, and to identify new targets by mapping the transcriptome at single-cell resolution. To fulfil these aims, methods for global epigenetic analysis in primary patient cells and strategies enabling functional analysis of candidate genes by physiological and pharmacological approaches have already been successfully implemented, and biobank approval for unique patient samples has been obtained. In the initiated projects, large-scale scRNA-seq data from a cohort of 15 iALL patients, 12 carrying KMT2A rearrangements (KMT2A-r) and 3 with wild-type KMT2A (KMT2A-wt), will be analysed. In addition, we will analyse bulk RNA-seq data from a total of 26 subtypes of childhood leukaemia (n = 1665) together with scRNA-seq data from normal foetal bone marrow and CD34+ cord blood samples. This strategy will allow us to unravel mechanisms underlying aberrant transcriptomic regulation in iALL with or without genetic alterations.