PT - JOURNAL ARTICLE
AU - Olivera Grujic
AU - Tanya N. Phung
AU - Soo Bin Kwon
AU - Adriana Arneson
AU - Yuju Lee
AU - Kirk E. Lohmueller
AU - Jason Ernst
TI - Identification and characterization of constrained non-exonic bases lacking predictive epigenomic and transcription factor binding annotations
AID - 10.1101/722876
DP - 2020 Jan 01
TA - bioRxiv
PG - 722876
4099 - http://biorxiv.org/content/early/2020/02/29/722876.short
4100 - http://biorxiv.org/content/early/2020/02/29/722876.full
AB - Annotations of evolutionary constraint provide important information for variant prioritization. Genome-wide maps of epigenomic marks and transcription factor binding provide complementary information for interpreting a subset of such prioritized variants. Here we developed the Constrained Non-Exonic Predictor (CNEP) to quantify the evidence of each base in the human genome being in a constrained non-exonic element from over 60,000 epigenomic and transcription factor binding features. We find that the CNEP score outperforms baseline and related existing scores at predicting constrained non-exonic bases from such data. However, a subset of such bases is still not well predicted by CNEP. We developed a complementary Conservation Signature Score by CNEP (CSS-CNEP) using conservation state and constrained element annotations that is predictive of those bases. Using human genetic variation, regulatory sequence motifs, mouse epigenomic data, and retrospectively considered additional human data, we further characterize the nature of constrained non-exonic bases with low CNEP scores.