Abstract
High costs and technical limitations of cell sorting and single-cell techniques currently restrict the collection of cell-type-specific DNA methylation data from large numbers of individuals. This, in turn, impedes our ability to tackle key biological questions that pertain to variation within a population, such as the identification of disease-associated genes at cell-type-specific resolution. Here, we show mathematically and experimentally that the cell-type-specific methylation levels of an individual can be learned from that individual's tissue-level bulk data, as if the sample had been profiled at single-cell resolution and the signals then aggregated within each cell population separately. Our proposed approach thus provides an unprecedented way to perform powerful large-scale epigenetic studies at cell-type-specific resolution using large tissue-level datasets, which are relatively easy to obtain. We revisit previous methylation studies and reveal novel associations with leukocyte composition in blood, as well as multiple novel cell-type-specific associations with rheumatoid arthritis (RA). For the latter, further evidence demonstrates that the associated CpGs correlate with cell-type-specific expression of known RA risk genes, rendering our results consistent with the possibility that contributors to RA pathogenesis are regulated by cell-type-specific changes in methylation.