Abstract
Adjusting for cell composition is critical in epigenome-wide association studies of whole blood samples. Using DNA methylation data from whole blood samples (as opposed to purified cell types) together with complete blood count and flow cytometry data from 2530 participants in the Health and Retirement Study, we trained and tested a computational model that estimates fifteen leukocyte subtypes, compared with the six or seven cell types covered by established models. Our model, which can be applied to both Illumina 450k and EPIC microarrays, explained a larger proportion of the observed variance in whole blood DNA methylation levels than popular reference-based cell deconvolution approaches and vastly reduced the number of false-positive findings in a reanalysis of an epigenome-wide association study of chronological age.
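As a rough illustration of the general idea behind reference-based cell deconvolution (not the model presented here, whose algorithm the abstract does not describe), whole blood DNA methylation at each CpG site can be treated as a proportion-weighted mixture of cell-type-specific methylation levels, and the proportions recovered by non-negative least squares. The sketch below makes several assumptions: the reference profiles, the three cell-type labels, and the function names `mix` and `estimate_proportions` are all hypothetical, and projected gradient descent stands in for whatever solver a real tool would use.

```python
# Hypothetical reference profiles (beta values between 0 and 1).
# Rows are CpG sites, columns are cell types. Values are invented for illustration.
CELL_TYPES = ["neutrophil", "lymphocyte", "monocyte"]
REFERENCE = [
    [0.90, 0.10, 0.20],
    [0.15, 0.85, 0.30],
    [0.20, 0.25, 0.95],
    [0.80, 0.70, 0.10],
    [0.05, 0.60, 0.50],
]


def mix(reference, proportions):
    """Model whole-blood beta values as a proportion-weighted sum of cell-type profiles."""
    return [sum(r * p for r, p in zip(row, proportions)) for row in reference]


def estimate_proportions(reference, sample, steps=5000, lr=0.1):
    """Non-negative least squares via projected gradient descent (an assumed solver)."""
    k = len(reference[0])
    weights = [1.0 / k] * k  # start from equal proportions
    for _ in range(steps):
        residual = [m - s for m, s in zip(mix(reference, weights), sample)]
        # Gradient of 0.5 * ||reference @ weights - sample||^2 with respect to weights.
        grad = [sum(row[j] * r for row, r in zip(reference, residual)) for j in range(k)]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, grad)]
    return weights


if __name__ == "__main__":
    true_proportions = [0.6, 0.3, 0.1]  # invented mixture used to simulate a sample
    sample = mix(REFERENCE, true_proportions)
    for name, value in zip(CELL_TYPES, estimate_proportions(REFERENCE, sample)):
        print(f"{name}: {value:.3f}")
```

In this toy setting the simulated sample is noise-free, so the estimated proportions should closely match the ones used to build it. With real data, measurement noise and cell types missing from the reference would leave residual variance unexplained.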
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
<heiss.jonathan{at}gmail.com>, <bakulski{at}umich.edu>, <thya0003{at}umn.edu>, <crimmin{at}usc.edu>, <jfaul{at}umich.edu>, <jonahfisher{at}hsph.harvard.edu>, <allan.just{at}mssm.edu>
Abbreviations
- CBC: complete blood count
- CPT: cryopreservation tube
- DNAm: DNA methylation
- EWAS: epigenome-wide association study
- HRS: Health and Retirement Study
- pp: percentage points