TY  - JOUR
T1  - Reference-free deconvolution of complex DNA methylation data – a systematic protocol
JF  - bioRxiv
DO  - 10.1101/853150
SP  - 853150
AU  - Michael Scherer
AU  - Petr V. Nazarov
AU  - Reka Toth
AU  - Shashwat Sahay
AU  - Tony Kaoma
AU  - Valentin Maurer
AU  - Christoph Plass
AU  - Thomas Lengauer
AU  - Jörn Walter
AU  - Pavlo Lutsik
Y1  - 2020/01/01
UR  - http://biorxiv.org/content/early/2020/02/16/853150.abstract
N2  - Epigenomic profiling enables unique insights into human development and diseases. Often the analysis of bulk samples remains the only feasible option for studying complex tissues and organs in large patient cohorts, masking the signatures of important cell populations in convoluted signals. DNA methylomes are highly cell type-specific, and enable recovery of hidden components using advanced computational methods without the need for reference profiles. We propose a three-stage protocol for reference-free deconvolution of DNA methylomes comprising: (i) data preprocessing, confounder adjustment and feature selection, (ii) deconvolution with multiple parameters, and (iii) guided biological inference and validation of deconvolution results. Our protocol simplifies the analysis and integration of DNA methylomes derived from complex samples, including tumors. Applying this protocol to lung cancer methylomes from TCGA revealed components linked to stromal cells, tumor-infiltrating immune cells, and associations with clinical parameters. The protocol takes less than four days to complete and requires basic R skills.
ER  - 