Abstract
Recent technical improvements in single-cell RNA sequencing (scRNA-seq) have enabled massively parallel profiling of transcriptomes, thereby promiting large-scale studies encompassing a wide range of cell types of multicellular organisms. With this background, we propose CellFishing.jl, a new method for searching atlas-scale data sets for similar cells with high accuracy and throughput. Using multiple scRNA-seq data sets, we validate that our method demonstrates comparable accuracy to and is markedly faster than the state-of-the-art software. Moreover, CellFishing.jl is scalable to more than one million cells, and the throughput of the search is approximately 1,350 cells per second (i.e., 0.74 ms per cell).
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.