PT - JOURNAL ARTICLE
AU - Timothy J. Durham
AU - Maxwell W. Libbrecht
AU - J. Jeffry Howbert
AU - Jeff Bilmes
AU - William Stafford Noble
TI - PREDICTD: PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition
AID - 10.1101/123927
DP - 2017 Jan 01
TA - bioRxiv
PG - 123927
4099 - http://biorxiv.org/content/early/2017/04/04/123927.short
4100 - http://biorxiv.org/content/early/2017/04/04/123927.full
AB - The Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Project have produced thousands of data sets mapping the epigenome in hundreds of cell types. However, the number of cell types remains too great to comprehensively map given current time and financial constraints. We present a method, PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition (PREDICTD), to address this issue by computationally imputing missing experiments in collections of epigenomics experiments. PREDICTD leverages an intuitive and natural model called “tensor decomposition” to impute many experiments simultaneously. Compared with the current state-of-the-art method, ChromImpute, PREDICTD produces lower overall mean squared error, and combining methods yields further improvement. We show that PREDICTD data can be used to investigate enhancer biology at non-coding human accelerated regions. PREDICTD provides reference imputed data sets and open-source software for investigating new cell types, and demonstrates the utility of tensor decomposition and cloud computing, two technologies increasingly applicable in bioinformatics.