Abstract
Single-cell RNA sequencing (scRNA-seq) is expanding rapidly, and multiple scRNA-seq analysis packages and tools are becoming available. Currently, however, it is difficult to go all the way from raw data to biomarker discovery without advanced programming skills. Here we present DIscBIO, an open, multi-algorithmic pipeline for easy, fast and efficient analysis of cellular sub-populations and the molecular signatures that characterize them. The pipeline consists of four successive steps: data pre-processing, cellular clustering with pseudo-temporal ordering, identification of differentially expressed genes, and biomarker identification. All steps are integrated into an interactive notebook in which comprehensive explanatory text, code, output data and figures are displayed in a coherent and sequential narrative. Advanced users can fully personalise DIscBIO with new algorithms and outputs. We also provide a user-friendly cloud version of the notebook that allows non-programmers to go efficiently from raw scRNA-seq data to biomarker discovery without downloading any of the software or packages used in the pipeline. We showcase all pipeline features using a myxoid liposarcoma dataset.
Availability: GitHub: https://github.com/SystemsBiologist/PSCAN; interactive demo on Binder: https://mybinder.org/v2/gh/SystemsBiologist/PSCAN/discbio-pub?filepath=DIscBIO.ipynb
Contact: salim.ghannoum{at}medisin.uio.no








