ABSTRACT
Sequence-specific DNA binding recruits transcription factors (TFs) to the genome to regulate gene expression. Here, we perform high-resolution mapping of CEBP proteins to determine how sequence dictates genomic occupancy. Surprisingly, the sequence determinants for CEBPs diverge from classical models. In vivo, CEBPs recognize the fusion of a degenerate and a canonical half-site, which is atypical for CEBP homodimers and implies altered DNA specificity through heterodimerization. Furthermore, the minimum sequence determinants for CEBP binding are encoded by a 10-mer motif rather than the commonly annotated 8-bp sequence. This extended motif definition is broadly important. First, motif optimization within the 10-mer is strongly correlated with cell-type-independent recruitment of CEBPβ. Second, selection bias at core-motif-flanking nucleotides occurs for multiple bZip proteins. This study sheds new light on DNA-sequence specificity for bZip proteins and provides key insights into how sequence sub-optimization affects genomic occupancy of CEBPs across cell types.