SUPR
Ross King Lab - ErikB RNA analysis storage
Dnr: NAISS 2024/23-155
Type: NAISS Small Storage
Principal Investigator: Erik Bjurström
Affiliation: Chalmers tekniska högskola
Start Date: 2024-03-28
End Date: 2025-04-01
Primary Classification: 10610: Bioinformatics and Systems Biology (methods development to be 10203)

Abstract

The Ross King lab at Chalmers requests a storage and compute allocation through SUPR to perform RNAseq analysis. Our lab is developing an automated knowledge-generating system, Genesis, which will be used in yeast systems biology to improve model accuracy. Genesis will autonomously run the experimental loop (generate a hypothesis, perform the experiment, interpret the results, validate the hypothesis, generate a new hypothesis based on the result, and so on), which could generate new knowledge at an accelerated pace.
The workflow of Genesis is as follows. A yeast cultivation platform generates cells (grown under differing conditions and with specific genes deleted from their genomes). The cells are then analyzed on a multi-omics platform (metabolomics, transcriptomics, and phenomics), and the data are processed and fitted to a mathematical model of the cell. An AI then interprets the results and compares them to the previous iteration of the model. The AI generates a new hypothesis from the model discrepancies by abduction and tests it by designing a cultivation experiment, thus closing the experimental loop.
However, the lead times of the in silico components of the loop are far shorter than those of the in vitro ones, so the achievable speed of the system is limited by the wet-lab components. To keep up with the potential speed of an AI, a high-throughput cultivation and analysis method must be developed. The most time-consuming omics analysis in our case is transcriptomics. We are therefore exploring an RNAseq method that can, in the future, keep up with a high-throughput cultivation platform (10,000 microchemostats). Our current plan is to extract the RNA with pre-made kits, sequence it using Illumina sequencing, and then process the raw reads with nf-core/rnaseq.
We would therefore eventually need storage for the raw .fastq files and the output files (MultiQC reports and tabular gene counts), as well as computational resources to process up to 10,000 RNAseq samples per week. However, we have not yet scaled up to that number of samples and probably will not for a couple of years. The scale of our current experiments is quite small: our most recent experiments comprised between 12 and 15 samples, from WT vs. deletant studies sampled during the diauxic shift. We will at some point submit a request for larger storage and computational resources as a group, but for now it would be helpful for me to have an individual small storage and compute allocation to process the small data sets I currently have.