PT - JOURNAL ARTICLE
AU - April R. Kriebel
AU - Joshua D. Welch
TI - Nonnegative matrix factorization integrates single-cell multi-omic datasets with partially overlapping features
AID - 10.1101/2021.04.09.439160
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.04.09.439160
4099 - http://biorxiv.org/content/early/2021/04/11/2021.04.09.439160.short
4100 - http://biorxiv.org/content/early/2021/04/11/2021.04.09.439160.full
AB - Single-cell genomic technologies provide an unprecedented opportunity to define molecular cell types in a data-driven fashion, but present unique data integration challenges. Integration analyses often involve datasets with partially overlapping features, including both shared features that occur in all datasets and features exclusive to a single experiment. Previous computational integration approaches require that the input matrices share the same number of either genes or cells, and thus can use only shared features. To address this limitation, we derive a novel nonnegative matrix factorization algorithm for integrating single-cell datasets containing both shared and unshared features. The key advance is incorporating an additional metagene matrix that allows unshared features to inform the factorization. We demonstrate that incorporating unshared features significantly improves integration of single-cell RNA-seq, spatial transcriptomic, SHARE-seq, and cross-species datasets. We have incorporated the UINMF algorithm into the open-source LIGER R package (https://github.com/welch-lab/liger). Competing Interest Statement: A patent application on LIGER has been submitted by The Broad Institute, Inc., and The General Hospital Corporation with J.D.W. listed as an inventor. The remaining authors declare no competing interests.