bioRxiv
New Results

scvi-tools: a library for deep probabilistic analysis of single-cell omics data

Adam Gayoso, Romain Lopez, Galen Xing, Pierre Boyeau, Katherine Wu, Michael Jayasuriya, Edouard Melhman, Maxime Langevin, Yining Liu, Jules Samaran, Gabriel Misrachi, Achille Nazaret, Oscar Clivio, Chenling Xu, Tal Ashuach, Mohammad Lotfollahi, Valentine Svensson, Eduardo da Veiga Beltrame, Carlos Talavera-López, Lior Pachter, Fabian J. Theis, Aaron Streets, Michael I. Jordan, Jeffrey Regier, Nir Yosef
doi: https://doi.org/10.1101/2021.04.28.441833
Adam Gayoso
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
Romain Lopez
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
Galen Xing
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
3 Chan Zuckerberg Biohub, San Francisco, USA
Pierre Boyeau
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
4 École Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
Katherine Wu
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
Michael Jayasuriya
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
Edouard Melhman
4 École Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
5 Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
Maxime Langevin
5 Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
Yining Liu
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
Jules Samaran
6 Mines Paristech, PSL University, Paris, France
Gabriel Misrachi
5 Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
Achille Nazaret
5 Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
Oscar Clivio
4 École Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
Chenling Xu
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
Tal Ashuach
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
Mohammad Lotfollahi
7 Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
8 School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
Valentine Svensson
9 Serqet Therapeutics, Cambridge, USA
Eduardo da Veiga Beltrame
10 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
Carlos Talavera-López
11 Cellular Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
12 EMBL - EBI, Wellcome Genome Campus, Hinxton, UK
Lior Pachter
10 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
13 Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, USA
Fabian J. Theis
7 Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
8 School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
Aaron Streets
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
3 Chan Zuckerberg Biohub, San Francisco, USA
14 Department of Bioengineering, University of California, Berkeley, Berkeley, USA
Michael I. Jordan
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
15 Department of Statistics, University of California, Berkeley, Berkeley, USA
Jeffrey Regier
16 Department of Statistics, University of Michigan, Ann Arbor, USA
Nir Yosef
1 Center for Computational Biology, University of California, Berkeley, Berkeley, USA
2 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, USA
3 Chan Zuckerberg Biohub, San Francisco, USA
17 Ragon Institute of MGH, MIT and Harvard, Cambridge, USA
  • For correspondence: niryosef@berkeley.edu

Abstract

Probabilistic models have provided the underpinnings for state-of-the-art performance in many single-cell omics data analysis tasks, including dimensionality reduction, clustering, differential expression, annotation, removal of unwanted variation, and integration across modalities. Many of the models being deployed are amenable to scalable stochastic inference techniques, and accordingly they are able to process single-cell datasets of realistic and growing sizes. However, the community-wide adoption of probabilistic approaches is hindered by a fractured software ecosystem resulting in an array of packages with distinct and often complex interfaces. To address this issue, we developed scvi-tools (https://scvi-tools.org), a Python package that implements a variety of leading probabilistic methods. These methods, which cover many fundamental analysis tasks, are accessible through a standardized, easy-to-use interface with direct links to Scanpy, Seurat, and Bioconductor workflows. By standardizing the implementations, we were able to develop and reuse novel functionalities across different models, such as support for complex study designs through nonlinear removal of unwanted variation due to multiple covariates and reference-query integration via scArches. The extensible software building blocks that underlie scvi-tools also enable a developer environment in which new probabilistic models for single-cell omics can be efficiently developed, benchmarked, and deployed. We demonstrate this through a code-efficient reimplementation of Stereoscope for deconvolution of spatial transcriptomics profiles. By catering to both the end user and developer audiences, we expect scvi-tools to become an essential software dependency and serve to formulate a community standard for probabilistic modeling of single-cell omics.

Competing Interest Statement

O.C. is supported by the EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning (EP/S023151/1) and Novo Nordisk. V.S. is a full-time employee of Serqet Therapeutics and has ownership interest in Serqet Therapeutics. F.J.T. reports receiving consulting fees from Roche Diagnostics GmbH and Cellarity, Inc., and ownership interest in Cellarity, Inc.

Footnotes

  • † Work done while interning in the Yosef Lab, UC Berkeley

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted April 29, 2021.
Subject Area

  • Bioinformatics