Abstract
Large-scale efforts like the Encyclopedia of DNA Elements (ENCODE) Project have made tremendous progress in cataloging the genomic binding patterns of DNA-associated proteins (DAPs), such as transcription factors (TFs). However, most chromatin immunoprecipitation-sequencing (ChIP-seq) analyses have focused on a few immortalized cell lines whose activities and physiology deviate in important ways from endogenous cells and tissues. Consequently, binding data from primary human tissue are essential to improving our understanding of in vivo gene regulation. Here we analyze ChIP-seq data for 20 DAPs assayed in two healthy human liver tissue samples, identifying more than 450,000 binding sites. We integrate these binding data with transcriptome and phased whole-genome data to investigate allelic DAP interactions and the impact of heterozygous sequence variation on the expression of neighboring genes. We find that our tissue-based dataset shows binding patterns more consistent with liver biology than those from cell lines, and we describe uses of these data to better prioritize impactful non-coding variation. Collectively, our rich dataset offers novel insights into genome function in healthy liver tissue and provides a valuable research resource for assessing disease-related disruptions.