RT Journal Article
SR Electronic
T1 geneBasis: an iterative approach for unsupervised selection of targeted gene panels from scRNA-seq
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.08.10.455720
DO 10.1101/2021.08.10.455720
A1 Alsu Missarova
A1 Jaison Jain
A1 Andrew Butler
A1 Shila Ghazanfar
A1 Tim Stuart
A1 Maigan Brusko
A1 Clive Wasserfall
A1 Harry Nick
A1 Todd Brusko
A1 Mark Atkinson
A1 Rahul Satija
A1 John Marioni
YR 2021
UL http://biorxiv.org/content/early/2021/08/10/2021.08.10.455720.abstract
AB The problem of selecting targeted gene panels that capture maximum variability encoded in scRNA-sequencing data has become of great practical importance. scRNA-seq datasets are increasingly being used to identify gene panels that can be probed using alternative molecular technologies, such as spatial transcriptomics. In this context, the number of genes that can be probed is an important limiting factor, so choosing the best subset of genes is vital. Existing methods for this task are limited by either a reliance on pre-existing cell type labels or by difficulties in identifying markers of rare cell types. We resolve this by introducing an iterative approach, geneBasis, for selecting an optimal gene panel, where each newly added gene captures the maximum distance between the true manifold and the manifold constructed using the currently selected gene panel. We demonstrate, using a variety of metrics and diverse datasets, that our approach outperforms existing strategies, and can not only resolve cell types but also more subtle cell state differences. Our approach is available as an open source, easy-to-use, documented R package (https://github.com/MarioniLab/geneBasisR). Competing Interest Statement: The authors have declared no competing interest.
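
The iterative selection described in the abstract can be illustrated with a minimal sketch. This is not the geneBasisR implementation: it is a simplified stand-in under stated assumptions. Here the "true manifold" is approximated by a k-nearest-neighbour graph built on all genes, the "selected manifold" by a k-NN graph built on the current panel, and a gene's score is the Euclidean distance between its neighbour-averaged expression in the two graphs. The function names (`knn`, `neighbor_mean`, `greedy_panel`), the variance-based seed gene, and the default `k=5` are all hypothetical choices made for illustration.

```python
import math


def knn(cells, genes, k):
    """For each cell, return the indices of its k nearest cells,
    measuring squared Euclidean distance over the given genes only."""
    neighbours = []
    for i, ci in enumerate(cells):
        dists = []
        for j, cj in enumerate(cells):
            if i != j:
                d = sum((ci[g] - cj[g]) ** 2 for g in genes)
                dists.append((d, j))
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def neighbor_mean(cells, neighbours, gene):
    """Mean expression of one gene over each cell's neighbours."""
    return [sum(cells[j][gene] for j in nb) / len(nb) for nb in neighbours]


def variance(cells, gene):
    """Population variance of one gene across all cells."""
    values = [c[gene] for c in cells]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def greedy_panel(cells, n_select, k=5):
    """Greedily pick n_select genes from a cells x genes matrix (list of lists)."""
    n_genes = len(cells[0])
    if n_select > n_genes:
        raise ValueError("n_select exceeds the number of genes")
    all_genes = list(range(n_genes))

    # Reference graph built from every gene (stand-in for the "true" manifold).
    true_nbrs = knn(cells, all_genes, k)
    true_avg = [neighbor_mean(cells, true_nbrs, g) for g in all_genes]

    # Assumption: seed the panel with the most variable gene.
    selected = [max(all_genes, key=lambda g: variance(cells, g))]

    while len(selected) < n_select:
        # Graph built from the current panel only.
        panel_nbrs = knn(cells, selected, k)
        best_gene, best_score = None, -1.0
        for g in all_genes:
            if g in selected:
                continue
            panel_avg = neighbor_mean(cells, panel_nbrs, g)
            # How badly does the current panel's graph predict this gene?
            score = math.dist(true_avg[g], panel_avg)
            if score > best_score:
                best_gene, best_score = g, score
        selected.append(best_gene)
    return selected
```

Calling `greedy_panel(cells, 4)` on a cells-by-genes matrix returns four distinct gene indices in the order they were added. The brute-force neighbour search is quadratic in the number of cells, so this sketch is only suitable for small toy inputs.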