There is great interest in generating data and developing methods to study single-cell responses to perturbations of gene expression, particularly quantitative modulation of gene dosage (Jost et al., 2020; Domingo et al., 2025). To address this analytical need, we have developed ‘bayesDREAM’ (a Bayesian model for studying Dosage Response Effects Across Modalities).
bayesDREAM leverages the inherent cell-to-cell variability in cis gene expression in single-cell CRISPRi and CRISPRa screens to fit a dosage-response function to various trans modalities. The model proceeds in three steps. First, it corrects for batch effects using non-targeted cells. Second, it uses cis gene expression data to estimate the true, underlying cis gene dosage at single-cell resolution. Third, it uses the fitted cis gene dosage to model how the measured trans effect responds to dosage, by fitting the sum of two Hill functions. Multiple distributions for the trans response are implemented, allowing various types of single-cell response data to be modelled. bayesDREAM is implemented in Pyro and fitted using stochastic variational inference (SVI); univariate distributions run on CPU, whereas multivariate distributions require GPU-accelerated fitting.
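As an illustration of the third step, a dosage-response curve built from the sum of two Hill functions can be sketched as follows. This is a minimal sketch, not the bayesDREAM implementation: the function names, the parameterisation (amplitude, half-maximal dosage k, Hill coefficient n, plus a baseline) and the example values are all assumptions made for illustration. Allowing the two amplitudes to differ in sign is one way such a sum can describe a non-monotonic response.

```python
import numpy as np


def hill(x, amplitude, k, n):
    """Single Hill term: amplitude * x^n / (k^n + x^n), for x >= 0, k > 0, n > 0."""
    x = np.asarray(x, dtype=float)
    xn = np.power(x, n)
    return amplitude * xn / (np.power(k, n) + xn)


def two_hill_response(x, baseline, amp_a, k_a, n_a, amp_b, k_b, n_b):
    """Hypothetical trans response: a baseline plus the sum of two Hill terms."""
    return baseline + hill(x, amp_a, k_a, n_a) + hill(x, amp_b, k_b, n_b)


# Illustrative (made-up) parameters: an activating term and a repressive term.
dosage = np.linspace(0.0, 4.0, 9)
response = two_hill_response(
    dosage,
    baseline=1.0,
    amp_a=2.0, k_a=1.0, n_a=2.0,   # activating term, half-maximal at dosage 1
    amp_b=-1.5, k_b=3.0, n_b=4.0,  # repressive term, half-maximal at dosage 3
)
```

In a full model, a curve of this kind would set the mean of the chosen trans distribution (for example, a negative binomial for expression counts), with the parameters inferred rather than fixed.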
We have tested bayesDREAM on a preliminary dataset (Domingo et al.) at both the trans gene expression level (using a negative binomial distribution) and the individual splice junction level (using a binomial distribution); permutation simulations show excellent power and specificity at both levels. Compatibility with multinomially distributed data (e.g. isoform or donor site usage) and Student's t-distributed data (continuous scores) is in development.
We are currently optimising bayesDREAM to scale to more typical single-cell (sc)CRISPR screen data. Most scCRISPR screens are limited to 2-3 guides per target gene (instead of the ~20-40 in the preliminary dataset), but sequence the whole transcriptome (instead of <100 putative trans genes). Preliminary analyses of data from Morris et al. (2023) and of unpublished pilot data from our lab have given promising results. Computation is currently the bottleneck in scaling our analysis to these larger datasets. To address this limitation, we have recently implemented GPU-accelerated fitting and can now run bayesDREAM on whole-transcriptome data. We now want to benchmark the method's power and accuracy using a large number of simulations, which requires substantial GPU compute.