New Results

Finding associations in a heterogeneous setting: Statistical test for aberration enrichment

Aziz M. Mezlini, Sudeshna Das, Anna Goldenberg
doi: https://doi.org/10.1101/2020.03.23.002972
Aziz M. Mezlini
1 Harvard Medical School, Boston
2 Massachusetts General Hospital, Boston
3 Department of Computer Science, University of Toronto
4 Hospital for Sick Children, Toronto
  • For correspondence: mmezlini@mgh.harvard.edu
Sudeshna Das
1 Harvard Medical School, Boston
2 Massachusetts General Hospital, Boston
Anna Goldenberg
3 Department of Computer Science, University of Toronto
4 Hospital for Sick Children, Toronto

Abstract

Most two-group statistical tests implicitly look for a broad pattern, such as an overall shift in mean, median, or variance between the two groups. They therefore perform best when the effect of interest uniformly affects everyone in one group relative to the other. In real-world applications, however, the effect of interest is often heterogeneous. For example, a drug may work very well in only a proportion of patients and be equivalent to a placebo in the rest, or a disease-associated dysregulation of gene expression may occur in only a proportion of cases, while the remaining cases have expression levels indistinguishable from those of controls for the gene considered. In such examples, we believe that classical two-group statistical tests may not be the most powerful way to detect the signal. In this paper, we developed a statistical test targeting heterogeneous effects and demonstrated its power, compared to existing methods, in a controlled simulation setting. We focused on the problem of finding meaningful associations in complex genetic diseases using omics data such as gene expression, miRNA expression, and DNA methylation. In simulated and real data, we showed that our test is complementary to the traditionally used statistical tests and is able to detect disease-relevant genes with heterogeneous effects that would not be detectable with previous approaches.
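
The heterogeneous scenario described above can be illustrated with a small simulation. The sketch below is not the test developed in this paper: it simply contrasts a Welch t-statistic, which targets an overall mean shift, with a hypothetical tail-count statistic that asks whether cases are over-represented among the k largest pooled values, scored with a hypergeometric upper tail. The function names, the choice of k, the effect size, and the 10% affected proportion are all assumptions made for illustration.

```python
import math
import random
import statistics


def welch_t(cases, controls):
    """Welch t-statistic: sensitive to an overall shift in mean."""
    m1, m0 = statistics.mean(cases), statistics.mean(controls)
    v1, v0 = statistics.variance(cases), statistics.variance(controls)
    return (m1 - m0) / math.sqrt(v1 / len(cases) + v0 / len(controls))


def tail_enrichment_p(cases, controls, k):
    """Hypergeometric upper-tail p-value for the number of cases among the
    k largest pooled values. Illustrative statistic only (not the paper's
    test); assumes continuous measurements with no ties."""
    pooled = [(x, 1) for x in cases] + [(x, 0) for x in controls]
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    observed = sum(label for _, label in pooled[:k])
    n_cases, n_total = len(cases), len(pooled)
    total = math.comb(n_total, k)
    # P(X >= observed) when k samples are drawn without replacement.
    return sum(
        math.comb(n_cases, i) * math.comb(n_total - n_cases, k - i)
        for i in range(observed, min(k, n_cases) + 1)
    ) / total


# Hypothetical setting: only 10% of cases carry a shift of 3 standard
# deviations; the remaining cases are drawn like the controls.
rng = random.Random(0)
controls = [rng.gauss(0, 1) for _ in range(200)]
cases = [rng.gauss(3, 1) if i < 20 else rng.gauss(0, 1) for i in range(200)]

print("Welch t statistic:", welch_t(cases, controls))
print("Tail enrichment p:", tail_enrichment_p(cases, controls, k=20))
```

Fixing k in advance is a simplification. A practical test would need a principled way to choose or scan over the tail size and to calibrate the resulting p-value.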

Footnotes

  • We fixed a link to the Supplement in the main paper.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted March 25, 2020.
Subject Area

  • Bioinformatics