bioRxiv
New Results

NEBULA: a fast negative binomial mixed model for differential expression and co-expression analyses of large-scale multi-subject single-cell data

Liang He, Alexander M. Kulminski
doi: https://doi.org/10.1101/2020.09.24.311662
Liang He
1Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA
  • For correspondence: lh235@duke.edu alexander.kulminski@duke.edu
Alexander M. Kulminski
1Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA

Abstract

The growing availability of large-scale single-cell data is revolutionizing our understanding of biological mechanisms at a finer resolution. In differential expression and co-expression analyses of multi-subject single-cell data, it is important to account for both subject-level and cell-level overdispersions through negative binomial mixed models (NBMMs). However, applying NBMMs to large-scale single-cell data is computationally demanding. In this work, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA), which analytically solves the high-dimensional integral in the marginal likelihood instead of using the Laplace approximation. Our benchmarks show that NEBULA reduces the running time by orders of magnitude compared to existing tools. We show that NEBULA controls false positives in identifying marker genes, whereas a simple negative binomial model produces spurious associations. Leveraging NEBULA, we decomposed between-subject and within-subject overdispersions of an snRNA-seq data set from the frontal cortex comprising ∼80,000 cells from a cohort of 48 individuals studied for Alzheimer's disease (AD). We observed that subpopulations and known subject-level covariates contributed substantially to the overdispersions. We carried out a cell-type-specific, transcriptome-wide, within-subject co-expression analysis of APOE. The results revealed that APOE was most co-expressed with multiple AD-related genes, including CLU and CST3 in astrocytes, TREM2 and C1q genes in microglia, and ITM2B, an inhibitor of amyloid-beta peptide aggregation, in both cell types. We found that the co-expression patterns differed between APOE2+ and APOE4+ cells in microglia, which suggests an isoform-dependent regulatory role in the immune system through the complement system. NEBULA opens up a new avenue for the broad application of NBMMs in the analysis of large-scale multi-subject single-cell data.
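To make the two layers of overdispersion concrete, the following is a minimal simulation sketch, not the NEBULA estimation procedure. It assumes a gamma-distributed multiplicative subject effect and gamma-Poisson (negative binomial) counts within each subject; the function names, parameter names, and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_counts(n_subjects, n_cells, mu, sigma2_subject, phi_cell, seed=0):
    """Simulate counts with subject-level and cell-level overdispersion.

    Illustrative assumption: gamma subject effects and gamma-Poisson counts.
    Returns an integer array of shape (n_subjects, n_cells).
    """
    rng = np.random.default_rng(seed)
    # Subject-level multiplicative random effect: mean 1, variance sigma2_subject.
    w = rng.gamma(shape=1.0 / sigma2_subject, scale=sigma2_subject,
                  size=(n_subjects, 1))
    # Cell-level gamma noise: mean 1, variance phi_cell. Mixing a Poisson over
    # this gamma yields a negative binomial count given the subject effect.
    g = rng.gamma(shape=1.0 / phi_cell, scale=phi_cell,
                  size=(n_subjects, n_cells))
    return rng.poisson(mu * w * g)


def between_subject_overdispersion(counts):
    """Crude moment estimate: variance of subject means over squared grand mean.

    Slightly biased upward, because each subject mean also carries
    within-subject sampling noise.
    """
    subject_means = counts.mean(axis=1)
    return subject_means.var(ddof=1) / counts.mean() ** 2


counts = simulate_counts(n_subjects=200, n_cells=500, mu=2.0,
                         sigma2_subject=0.2, phi_cell=1.0)
print("grand mean:", counts.mean())
print("between-subject overdispersion (moment estimate):",
      between_subject_overdispersion(counts))
```

Under these assumptions the marginal variance of a count is mu + mu^2 (sigma2_subject + phi_cell + sigma2_subject * phi_cell), so a model with only the cell-level term treats cells from the same subject as independent and absorbs the subject-level variation into a single dispersion parameter.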

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://www.synapse.org/#!Synapse:syn3219045

  • https://www.synapse.org/#!Synapse:syn18485175

  • https://github.com/lhe17/nebula

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted September 25, 2020.

Subject Area

  • Bioinformatics