Abstract
Feature selection is an important step in the analysis of single-cell RNA sequencing datasets. Triku is a feature selection method that favours genes defining the main cell populations. It does so by selecting genes expressed by groups of cells that are close in the nearest neighbor graph. Triku efficiently recovers cell populations present in artificial and biological benchmarking datasets, as measured by mutual information and silhouette coefficient. Additionally, gene sets selected by triku are more likely to be related to relevant Gene Ontology terms, and contain fewer ribosomal and mitochondrial genes. Triku is available at https://gitlab.com/alexmascension/triku.
Competing Interest Statement
The authors have declared no competing interest.
Abbreviations
- scRNA-seq: Single-cell RNA sequencing
- FS: Feature Selection
- FE: Feature Extraction
- PCA: Principal Component Analysis
- NB: Negative Binomial
- NMI: Normalized Mutual Information
- FACS: Fluorescence Activated Cell Sorting
- GO: Gene Ontology
- GOEA: Gene Ontology Enrichment Analysis
- PBMC: Peripheral Blood Mononuclear Cells
- UMAP: Uniform Manifold Approximation and Projection
- kNN: k-Nearest Neighbors
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.