bioRxiv
New Results

A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies

Andrew E Teschendorff, Charles E Breeze, Shijie C Zheng, Stephan Beck
doi: https://doi.org/10.1101/101709
Andrew E Teschendorff
1CAS Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai 20031, China.
2Department of Women’s Cancer, University College London, 74 Huntley Street, London WC1E 6AU, United Kingdom.
3Statistical Cancer Genomics, Paul O’Gorman Building, UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, United Kingdom.
  • For correspondence: a.teschendorff@ucl.ac.uk
Charles E Breeze
4Medical Genomics, Paul O’Gorman Building, UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, United Kingdom.
Shijie C Zheng
1CAS Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai 20031, China.
5University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China.
Stephan Beck
4Medical Genomics, Paul O’Gorman Building, UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, United Kingdom.

Abstract

Background: Intra-sample cellular heterogeneity presents numerous challenges to the identification of biomarkers in large Epigenome-Wide Association Studies (EWAS). While a number of reference-based deconvolution algorithms have emerged, their potential remains underexplored, and a comparative evaluation of these algorithms in tissues other than blood is still lacking.

Results: Here we present a novel framework for reference-based inference that leverages cell-type-specific DNase Hypersensitive Site (DHS) information from the NIH Epigenomics Roadmap to construct an improved reference DNA methylation database. We show that this leads to a marginal but statistically significant improvement in cell-count estimates in whole blood as well as in mixtures involving epithelial cell types. Using this framework, we compare a widely used state-of-the-art reference-based algorithm (called constrained projection) to two non-constrained approaches: CIBERSORT and a method based on robust partial correlations. We conclude that the widely used constrained projection technique may not always be optimal. Instead, we find that the method based on robust partial correlations is generally more robust across a range of tissue types and for realistic noise levels. We call the combined algorithm, which uses DHS data and robust partial correlations for inference, EpiDISH (Epigenetic Dissection of Intra-Sample Heterogeneity). Finally, we demonstrate the added value of EpiDISH in an EWAS of smoking.

Conclusions: Estimating cell-type fractions and subsequent inference in EWAS may benefit from the use of non-constrained reference-based cell-type deconvolution methods.
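
To make the idea of non-constrained reference-based deconvolution more concrete, the following is a minimal sketch and not the EpiDISH implementation. It assumes a reference matrix of DNAm values (CpGs by cell types) and a single mixed-sample profile, fits an unconstrained robust regression, and only afterwards sets negative coefficients to zero and rescales them to sum to one. The function name `estimate_fractions`, the use of Huber-weighted iteratively reweighted least squares as the robust fit, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def estimate_fractions(reference, profile, n_iter=50, huber_k=1.345):
    """Estimate cell-type fractions for one mixed sample (illustrative only).

    reference: (n_cpgs, n_cell_types) array of reference DNAm values.
    profile:   (n_cpgs,) array of DNAm values measured in the mixed sample.
    """
    # Intercept plus one column per cell type; no constraints during fitting.
    X = np.column_stack([np.ones(reference.shape[0]), reference])
    weights = np.ones(X.shape[0])
    for _ in range(n_iter):
        # Weighted least squares via row scaling by sqrt(weight).
        sw = np.sqrt(weights)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], profile * sw, rcond=None)
        resid = profile - X @ coef
        # Robust residual scale from the median absolute deviation.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale < 1e-12:
            break  # essentially perfect fit, nothing left to reweight
        # Huber weights: 1 for small residuals, down-weighted beyond huber_k.
        u = np.abs(resid) / (huber_k * scale)
        weights = 1.0 / np.maximum(u, 1.0)
    # Post hoc step: drop the intercept, clip negatives, rescale to sum to one.
    frac = np.clip(coef[1:], 0.0, None)
    total = frac.sum()
    return frac / total if total > 0 else frac


# Hypothetical usage on simulated data (not data from the study).
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(200, 3))
true_fractions = np.array([0.6, 0.3, 0.1])
profile = reference @ true_fractions
estimated = estimate_fractions(reference, profile)
```

The design choice illustrated here is that non-negativity and normalization are applied after the fit rather than imposed as constraints during it, which is the distinction the abstract draws between constrained projection and the non-constrained approaches.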

Abbreviations

DMC: differentially methylated CpG
EWAS: Epigenome-Wide Association Study
DNAm: DNA methylation
FDR: false discovery rate
RUV: Removing Unwanted Variation
DHS: DNase Hypersensitive Site
sDMC: smoking-associated differentially methylated CpG
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted January 19, 2017.