RT Journal Article
SR Electronic
T1 Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics and epigenetics data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 143990
DO 10.1101/143990
A1 Quan Nguyen
A1 Ross L. Tellam
A1 Marina Sanchez-Naval
A1 Laercio R. Porto-Neto
A1 James Kijas
A1 William Barendse
A1 Antonio Reverter
A1 Ben Hayes
A1 Brian P. Dalrymple
YR 2017
UL http://biorxiv.org/content/early/2017/06/05/143990.abstract
AB Genome sequences for hundreds of mammalian species are available, but an understanding of genomic regulatory regions for non-model species is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The pipeline utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to predict homologous regions in another mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in non-model species. Importantly, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits, and identifying potential genome editing targets.