PT - JOURNAL ARTICLE
AU - Yang, Andrian
AU - Troup, Michael
AU - Lin, Peijie
AU - Ho, Joshua W. K.
TI - Falco: A quick and flexible single-cell RNA-seq processing framework on the cloud
AID - 10.1101/064006
DP - 2016 Jan 01
TA - bioRxiv
PG - 064006
4099 - http://biorxiv.org/content/early/2016/10/27/064006.short
4100 - http://biorxiv.org/content/early/2016/10/27/064006.full
AB - Summary: Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to process scRNA-seq data efficiently, owing to their limited scalability. Here we introduce Falco, a cloud-based framework that enables parallelisation of existing RNA-seq processing pipelines using the big data technologies Apache Hadoop and Apache Spark, for massively parallel analysis of large-scale transcriptomic data. Using two public scRNA-seq data sets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6 to 145.4 times faster using Falco than a highly optimised single-node analysis. Falco also allows users to utilise low-cost spot instances of Amazon Web Services (AWS), providing a 65% reduction in the cost of analysis. Availability: Falco is available under a GNU General Public License at https://github.com/VCCRI/Falco/ Contact: j.ho{at}victorchang.edu.au Supplementary information: Supplementary data are available at bioRxiv online.