Abstract
Many single-cell RNA-sequencing studies have collected time-series data to investigate transcriptional changes concerning various notions of biological time, such as cell differentiation, embryonic development, and response to stimulus. Accordingly, several unsupervised and supervised computational methods have been developed to construct single-cell pseudotime embeddings for extracting the temporal order of transcriptional cell states from these time-series scRNA-seq datasets. However, existing methods, such as psupertime, suffer from low predictive accuracy, and this problem becomes even worse when we try to generalize to other data types such as scATAC-seq or microscopy images. To address this problem, we propose Sceptic, a support vector machine model for supervised pseudotime analysis. Whereas psupertime employs a single joint regression model, Sceptic simultaneously trains multiple classifiers with separate score functions for each time point and also allows for non-linear kernel functions. Sceptic first generates a probability vector for each cell and then aims to predict chronological age via conditional expectation. We demonstrate that Sceptic achieves significantly improved prediction power (accuracy improved by 1.4 − 38.9%) for six publicly available scRNA-seq data sets over state-of-the-art methods, and that Sceptic also works well for single-nucleus image data. Moreover, we observe that the pseudotimes assigned by Sceptic show stronger correlations with nuclear morphology than the observed times, suggesting that these pseudotimes accurately capture the heterogeneity of nuclei derived from a single time point and thus provide more informative time labels than the observed times. Finally, we show that Sceptic accurately captures sex-specific differentiation timing from both scATAC-seq and scRNA-seq data.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Figure 2 revised. Supplemental files updated.