New Results

TWO-SIGMA-G: A New Competitive Gene Set Testing Framework for scRNA-seq Data Accounting for Inter-Gene and Cell-Cell Correlation

Eric Van Buren, Ming Hu, Liang Cheng, John Wrobel, Kirk Wilhelmsen, Lishan Su, Yun Li, Di Wu
doi: https://doi.org/10.1101/2021.01.24.427979
Eric Van Buren
1Department of Biostatistics, Harvard T.H. Chan School of Public Health
Ming Hu
2Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation
Liang Cheng
3Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill
4Department of Microbiology and Immunology, The University of North Carolina at Chapel Hill
5Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Wuhan University
John Wrobel
3Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill
Kirk Wilhelmsen
6Departments of Genetics and Neurology, Renaissance Computing Institute, University of North Carolina at Chapel Hill
Lishan Su
3Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill
4Department of Microbiology and Immunology, The University of North Carolina at Chapel Hill
7Departments of Pharmacology, Microbiology & Immunology, University of Maryland School of Medicine
Yun Li
8Department of Biostatistics, The University of North Carolina at Chapel Hill
9Department of Genetics, The University of North Carolina at Chapel Hill
10Department of Computer Science, The University of North Carolina at Chapel Hill
  • For correspondence: did@email.unc.edu yun_li@med.unc.edu
Di Wu
8Department of Biostatistics, The University of North Carolina at Chapel Hill
11Division of Oral and Craniofacial Health Sciences, Adams School of Dentistry, The University of North Carolina at Chapel Hill
  • For correspondence: did@email.unc.edu yun_li@med.unc.edu

Abstract

We propose TWO-SIGMA-G, a competitive gene set test designed for scRNA-seq data. TWO-SIGMA-G uses the mixed-effects regression modelling approach of our previously published TWO-SIGMA to test for differential expression at the gene level. This regression-based approach can analyze complex designs while accommodating zero-inflated and overdispersed counts and within-sample cell-cell correlation. At the set level, TWO-SIGMA-G uses a novel approach to adjust for inter-gene correlation (IGC), which can inflate type-I error when ignored. Simulations demonstrate that, in the presence of IGC, TWO-SIGMA-G preserves type-I error and increases power compared to other methods designed for bulk and single-cell RNA-seq data. Application to two real datasets, one of HIV infection in mice and one of Alzheimer’s disease progression in humans, reveals biologically meaningful results. TWO-SIGMA-G is available at https://github.com/edvanburen/twosigma.
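To illustrate why ignoring IGC can inflate type-I error, the following minimal sketch simulates gene-level statistics that share a common pairwise correlation and compares the variance of their set-level mean with the value expected under independence. It is a generic illustration and not the TWO-SIGMA-G adjustment itself: the equicorrelated normal model, the function names, and the values of the set size and correlation are all assumptions made for this example.

```python
import random
import statistics


def variance_inflation_factor(set_size: int, mean_pairwise_corr: float) -> float:
    """Factor by which the variance of the mean of `set_size` unit-variance
    statistics grows when their average pairwise correlation is
    `mean_pairwise_corr` (relative to independent statistics)."""
    return 1.0 + (set_size - 1) * mean_pairwise_corr


def simulate_set_mean_variance(set_size: int, rho: float,
                               n_reps: int = 20000, seed: int = 0) -> float:
    """Empirical variance of the set-level mean of equicorrelated N(0, 1)
    gene-level statistics (hypothetical model, for illustration only)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_reps):
        shared = rng.gauss(0.0, 1.0)  # component common to every gene in the set
        stats = [
            (rho ** 0.5) * shared + ((1.0 - rho) ** 0.5) * rng.gauss(0.0, 1.0)
            for _ in range(set_size)
        ]
        means.append(sum(stats) / set_size)
    return statistics.variance(means)


if __name__ == "__main__":
    m, rho = 20, 0.1  # assumed set size and inter-gene correlation
    print("variance assumed under independence:", 1.0 / m)
    print("variance expected with correlation: ",
          variance_inflation_factor(m, rho) / m)
    print("variance observed in simulation:    ",
          simulate_set_mean_variance(m, rho))
```

A set-level test that assumes the independence variance would treat the set mean as less variable than it is, so null gene sets would be rejected more often than the nominal level.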

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • Funding: This work was supported by the National Institutes of Health [R01GM105785 and U54HD079124 to YL, R01HL129132 to YL and EVB, UM1HG011585 to MH, R03DE028983 to DW], the National Cancer Institute [R35CA197449 to EVB], and the University of North Carolina Computational Medicine Program [to DW and LS].

  • https://github.com/edvanburen/twosigma

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 26, 2021.

Subject Area

  • Genomics