PT - JOURNAL ARTICLE
AU - Cai Li
AU - Boris Lenhard
AU - Nicholas M. Luscombe
TI - Integrated analysis sheds light on evolutionary trajectories of young transcription start sites in the human genome
AID - 10.1101/192757
DP - 2017 Jan 01
TA - bioRxiv
PG - 192757
4099 - http://biorxiv.org/content/early/2017/09/22/192757.short
4100 - http://biorxiv.org/content/early/2017/09/22/192757.full
AB - Previous studies revealed widespread transcription initiation and fast turnover of transcription start sites (TSSs) in mammalian genomes. Yet how new TSSs originate and how they evolve over time remain poorly understood. To address these questions, we analyzed ∼200,000 human TSSs by integrating evolutionary and functional genomic data, focusing particularly on TSSs that emerged in the primate lineages. We found that intrinsic factors of repetitive sequences and their proximity to established regulatory modules (extrinsic factors) contribute significantly to the origin of new TSSs. In early periods, young TSSs experience rapid sequence evolution driven by endogenous mutational mechanisms that reduce the instability of associated repetitive sequences. In later periods, the regulatory functions of young TSSs are gradually modified, with evolutionary changes subject to temporal constraints (fewer regulatory changes in younger TSSs) and spatial constraints (fewer regulatory changes in more isolated TSSs). These findings advance our understanding of how regulatory innovations arise in the genome throughout evolution and highlight the roles of repetitive sequences in these processes.