Abstract
Single-cell RNA-sequencing (scRNA-seq) is a powerful tool for quantifying transcriptional states in thousands to millions of cells. It is increasingly common for scRNA-seq data to be collected under multiple experimental conditions, yet quantifying differences between scRNA-seq datasets remains an analytical challenge. Previous efforts at quantifying such differences have focused on discrete regions of the transcriptional state space, such as clusters of cells. Here, we describe a continuous measure of the effect of an experiment across the transcriptomic space. First, we use the manifold assumption to model the cellular state space as a graph (or network) in which cells are nodes and edges connect cells with similar transcriptomic profiles. Next, we create an Enhanced Experimental Signal (EES) that estimates the likelihood of observing cells from each condition at every point on the manifold. We show that the EES can be used to identify how gene expression is affected by a given perturbation, including non-monotonic changes detected from only two conditions. We also introduce an algorithm, which we call vertex frequency clustering, that uses both the magnitude and the frequency of the EES to derive subsets of cells that are enriched in the experimental condition, enriched in the control condition, or unaffected between conditions. The granularity of these subsets is tailored to the regions of the manifold that change. We demonstrate both algorithms using a combination of biological and synthetic datasets. Implementations are provided in the MELD Python package, which is available at https://github.com/KrishnaswamyLab/MELD.
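
The two steps described above (building a cell similarity graph, then smoothing the condition labels over it) can be illustrated with a minimal NumPy sketch. This is not the MELD package's API or its exact filter. The function names `knn_graph` and `smooth_condition_signal`, the parameters `k` and `beta`, the Tikhonov-style low-pass filter, and the synthetic data are all assumptions chosen to keep the example short.

```python
import numpy as np

def knn_graph(X, k=5):
    """Hypothetical helper: symmetric k-nearest-neighbor adjacency matrix."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between cells.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)  # keep an edge if either cell chose the other

def smooth_condition_signal(A, labels, beta=1.0):
    """Hypothetical helper: low-pass filter a 0/1 condition indicator.

    Solves (I + beta * L) y = x with L = D - A, an assumed stand-in
    for the graph filter used to produce the EES.
    """
    L = np.diag(A.sum(axis=1)) - A          # combinatorial graph Laplacian
    x = np.asarray(labels, dtype=float)     # 1 = experimental, 0 = control
    return np.linalg.solve(np.eye(len(x)) + beta * L, x)

# Synthetic example: two well-separated cell populations, the first mostly
# control cells and the second mostly experimental cells.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(5, 1, (50, 10))])
labels = np.concatenate([rng.random(50) < 0.1, rng.random(50) < 0.9]).astype(int)

A = knn_graph(X, k=5)
ees = smooth_condition_signal(A, labels, beta=1.0)
```

In this sketch each smoothed value lies between 0 and 1 and can be read as a local estimate of how likely a cell at that position is to come from the experimental condition. Larger values of the assumed `beta` parameter average over wider graph neighborhoods.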