Development of machine learning models on scRNA-seq data

NAISS 2023/22-569


NAISS Small Compute

Principal Investigator:

Adam Malik


Chalmers tekniska högskola

Start Date:


End Date:


Primary Classification:

10105: Computational Mathematics




Advances in next-generation sequencing are generating ever larger amounts of data. One such technology is RNA sequencing, which can now be performed at the level of a single cell (scRNA-seq). A more recent development is Perturb-seq, also known as CRISP-seq, a high-throughput method for performing single-cell RNA sequencing following genetic perturbations. Methods like these have applications in cancer research, where the overarching goal is to understand cell heterogeneity, both between patients and within tumors, and how cells respond to treatment. The resulting datasets are large and high-dimensional, typically containing expression levels for tens of thousands of genes and millions of cells. Such datasets call for methods that reduce the dimensionality and provide biologically interpretable insights. To this end, our project is devoted to developing novel machine learning methods tailored to RNA sequencing datasets. Our aims include:

1. A comparative study of current methods developed for scRNA-seq datasets, and their applicability to Perturb-seq datasets.
2. Extension of current methods to fully utilize multiple datasets, both with and without genetic perturbations.