Natural Selection has Shaped Coding and Non-coding Transcription in Primate CD4+ T-cells

Transcriptional regulatory changes have been shown to contribute to phenotypic differences between species, but many questions remain about how gene expression evolves. Here we report the first comparative study of nascent transcription in primates. We used PRO-seq to map actively transcribing RNA polymerases in resting and activated CD4+ T-cells in multiple human, chimpanzee, and rhesus macaque individuals, with rodents as outgroups. This approach allowed us to measure transcription separately from post-transcriptional processes. We observed general conservation in coding and non-coding transcription, punctuated by numerous differences between species, particularly at distal enhancers and non-coding RNAs. We found evidence that transcription factor binding sites are a primary determinant of transcriptional differences between species, that stabilizing selection maintains gene expression levels despite frequent changes at distal enhancers, and that adaptive substitutions have driven lineage-specific transcription. Finally, we found strong correlations between evolutionary rates and long-range chromatin interactions. These observations clarify the role of primary transcription in regulatory evolution.

Transcriptional regulatory changes have been shown to contribute to phenotypic differences 29 between species, but many questions remain about how gene expression evolves. Here we report 30 the first comparative study of nascent transcription in primates. We used PRO-seq to map actively 31 transcribing RNA polymerases in resting and activated CD4+ T-cells in multiple human, 32 chimpanzee, and rhesus macaque individuals, with rodents as outgroups. This approach allowed us 33 to measure transcription separately from post-transcriptional processes. We observed general 34 conservation in coding and non-coding transcription, punctuated by numerous differences between 35 species, particularly at distal enhancers and non-coding RNAs. We found evidence that 36 transcription factor binding sites are a primary determinant of transcriptional differences between 37 species, that stabilizing selection maintains gene expression levels despite frequent changes at 38 distal enhancers, and that adaptive substitutions have driven lineage-specific transcription. Finally, 39 we found strong correlations between evolutionary rates and long-range chromatin interactions. 40 These observations clarify the role of primary transcription in regulatory evolution. 41 or candidate enhancers based on their proximity to gene annotations. The predicted TREs in each 126 group were highly concordant with other marks of regulatory function in human CD4+ T-cells used 127 previously to define groups of candidate enhancers, including acetylation of histone 3 lysine 27 128 (H3K27ac), mono-and trimethylation of histone 3 lysine 4 (H3K4me1 and H3K4me3), and DNase-I-129 seq signal 58 (Fig. 1e). Notably, dREG identified >83% of DNase-I hypersensitive sites (DHS) marked 130 by H3K27 acetylation in human CD4+ T-cells, consistent with prior work suggesting that 131 transcription patterns alone can recover the majority of active enhancers defined by independent 132 criteria 34,52,54 . Furthermore, we identified 88% and 91% of DHSs marked, respectively, by H3K9ac 133 and H4K16ac, two other markers of regulatory function. Taken together, these data suggest that 134 PRO-seq patterns reveal the locations of TREs with high sensitivity. In the analysis that follows, we 135 will refer to these dREG-identified distal TREs as "enhancers" for simplicity. 136 Extending our dREG analysis to untreated CD4+ T-cells from additional species revealed 137 71,748 TREs that were active in untreated T-cells in at least one species (ranging between 27,581 138 and 39,387 TREs in each species). We defined two types of changes between species: (1) changes 139 in the abundance of Pol II at TREs that were present across all species, and (2) complete gains or 140 losses in at least one species (see Supplementary Note 2, Supplementary Fig. 6, and Online 141 Methods). We found that 52% of enhancers showed evidence of changes in Pol II transcriptional 142 activity in at least one of the three primate species and 81% showed changes at the longer 143 evolutionary distance between primates and rodents (Fig. 2a), similar to recent observations in 144 other systems 10 ,24 (Supplementary Note 3). Enhancers were predicted to be completely gained or 145 lost at nearly eight times the rate of promoters (35% of enhancers; 12% of promoters; p < 2.2e-16 146 by Fisher's exact test), consistent with recent observations based on H3K27ac and H3K4me3 24 . By 147 contrast, TREs induced by π treatment were much more likely to be conserved, and showed similar 148 conservation at promoters and enhancers (Fig. 2a). 149 Next we tested whether evolutionary changes in transcriptional activity correlate with the 150 enrichment of other marks of active enhancers. Predicted lineage-specific human enhancers were 151 enriched for both active and repressive enhancer marks ( Fig. 2b; Supplementary Fig. 7). Whereas 152 apparent human gains were enriched for high levels of the active enhancer marker H3K27ac, sites 153 with reduced transcriptional activity in humans showed much lower enrichments of H3K27ac. 154 Furthermore, locations at which the dREG signal was completely lost in a human-specific fashion 155 displayed levels of H3K27ac approaching those of randomly selected background sites (Fig. 2b). 156 Intriguingly, many of the losses on the human branch retained H3K4me1, which marks both active 157 and inactive enhancers 59,60 , and these losses displayed higher levels of chromatin marks associated 158 with transcriptional repression than a random background (Fig. 2b), indicating that, at least in 159 some cases, an active ancestral primate enhancer retains a 'poised' chromatin state in human, 160 despite losing both transcriptional activity and H3K27ac. Thus, evolutionary changes in poised and 161 active marks may commonly occur as distinct events.

163
Transcriptional changes correlate with DNA sequence differences 164 To investigate whether changes in TRE activity are accompanied by changes in DNA 165 sequence, we compared phyloP sequence conservation scores 61 at transcriptionally conserved TREs 166 with phyloP scores at TREs that display evolutionary changes in transcription. Because signatures 167 of sequence conservation in TREs are likely to be most pronounced in transcription factors binding 168 sites (TFBS), we restricted our sequence conservation analyses to matches to 567 clusters of TF 169 binding motifs selected based on their distinct DNA binding specificities (see Online Methods) 62,63 . 170 We required that motif matches were present in dREG sites, and adopted a threshold at which 171 nearly half of the motifs discovered were bona fide TFBSs, as measured by ChIP-seq (positive 172 predictive value [PPV]= 0.47). 173 TFBSs found in transcriptionally conserved dREG sites showed a marked enrichment for 174 higher phyloP scores relative to surrounding regions, indicating local sequence conservation (Fig.  175  3a). By contrast, TFBSs in lineage-specific dREG sites had substantially reduced enrichments in 176 phyloP scores (Fig. 3a, cyan/blue). Notably, TFBSs in dREG sites lost on the human lineage 177 showed enhanced conservation compared with those in human-specific gains. This observation is 178 consistent with losses evolving under conservation in other mammalian species (which contribute 179 to the phyloP scores) and gains emerging relatively recently. Each of these patterns was robust to 180 corrections for potentially confounding differences in the distribution of sites, as well to choices of 181 motif score thresholds (Supplementary Fig. 8a). Relaxing the motif score threshold to provide 182 sensitivity for larger numbers of TFBSs at the expense of specificity, revealed patterns of 183 conservation that correlate with the information content of positions within the DNA sequence 184 motif ( Supplementary Fig. 8b), further supporting TF binding as the functional property driving 185 sequence conservation at these sites. Together, these analyses support the hypothesis that the 186 sequences in TFBSs are a primary driver of transcriptional differences between species. 187 We searched for examples of DNA sequence differences that might be responsible for 188 transcriptional changes following π treatment, hypothesizing that they might be characterized 189 more easily than sequences responsible for transcription in the untreated condition, because these 190 transcriptional changes were likely driven by a limited number of master TFs 64 (Fig. 1e). In one 191 example, we found nucleotide substitutions in three apparent NF-kB binding sites in the proximal 192 promoter and an internal enhancer of SGPP2 that correlate with differences in SGPP2 expression 193 ( Fig. 3b; Supplementary Fig. 9). Two of these putative binding sites were bound by NF-kB in 194 human cell lines according to ChIP-seq data from ENCODE. Moreover, substitutions observed in 195 human were found to disrupt the same position in the motif as NF-kB binding QTLs 65 (see Online  196 Methods), and showed a general trend toward higher NF-kB binding in the human alleles 197 (Supplementary Fig. 9). To test the hypothesis that observed DNA sequence changes produced 198 differential transcriptional activity, we cloned DNA from each primate species into a reporter 199 vector driving luciferase activity in an MCF-7 cell model, which recapitulates the primary 200 transcriptional features of the SGPP2 locus 66 (Supplementary Fig. 9). Differences in basal 201 luciferase activity were generally concordant with those observed between species 202 (Supplementary Fig. 10). Moreover, both the proximal promoter of SGPP2 and the internal 203 enhancer both activated luciferase expression more strongly following NF-kB activation when 204 human DNA was cloned, but not with orthologous DNA from the other primates (Fig. 3c).

205
To determine whether these TREs affect the expression of SGPP2 in its native genomic 206 context, we silenced each TRE by using CRISPRi, which targets a catalytically dead CAS9 fused to 207 the Krüppel-associated box repressor (dCAS9-KRAB), to specifically tri-methylate lysine 9 of 208 histone 3 (H3K9me3) 67 . Three independent single-guide RNAs (sgRNAs) targeting the internal 209 enhancer and two designed for the proximal promoter reduced SGPP2 transcription to 50-60% of 210 its resting level (p = 1.5e-3 and 2.6e-2, respectively, by a two-tailed t-test; Fig. 3d), consistent with 211 these TREs directly contributing to SGPP2 expression in MCF-7 cells. Three sgRNAs targeting the 212 upstream enhancer also had a significant effect on SGPP2 expression (p = 1.8e-4). Notably, the 213 genome assemblies for chimpanzee and rhesus macaque harbor deletions of this upstream TRE that 214 appear likely to affect its activity (Supplementary Fig. 9). However, although silencing individual 215 enhancers reduced the transcription level of SGPP2 following NF-kB activation, silencing individual 216 enhancers was insufficient to completely abolish induction of SGPP2 (Fig. 3d). Taken together, our 217 findings show that at least two of the three TREs regulating SGPP2 drove expression patterns 218 matching PRO-seq data in a reporter assay, but none completely explained SGPP2 activation in situ. 219 These observations suggest that that multiple causal substitutions in NF-kB binding sites may work 220 in concert to achieve human-specific activation of SGPP2. 221 In several cases, as with SGPP2, we observed numerous nucleotide substitutions within 222 individual or clustered TFBSs. These clusters of substitutions are highly unlikely to occur by chance 223 and suggest that positive selection may have driven adaptation of these binding sites. Indeed, 224 SGPP2 falls in a region recently identified as having an excess of derived alleles in modern humans 225 compared with the sequenced Neanderthal ( Fig. 3d)  polymorphism and between-species sequence divergence in TREs that had undergone human 229 lineage-specific transcriptional changes. This analysis indicated that dREG sites are most strongly 230 influenced by weak negative selection (Fig. 3f), based on an excess of low-frequency derived alleles 231 in human populations, as has been reported previously for regulatory sequences 9 . Nevertheless, 232 TREs with lineage-specific transcriptional changes in human CD4+ T-cells showed reduced weak 233 negative selection and were strikingly enriched for adaptive nucleotide substitutions (p < 0.01 234 INSIGHT likelihood ratio test; Fig. 3f), consistent with positive selection at these sites. We estimate 235 a total of at least 121 adaptive substitutions since the human/chimpanzee divergence within TFBSs 236 that undergo transcriptional changes in human CD4+ T-cells. Despite limited power to detect the 237 specific contributions of many individual TFs at our stringent motif match score threshold, we did 238 note significant excesses of putatively adaptive substitutions in the predicted binding sites of 239 several TFs, including motifs recognized by forkhead box family, POU-domain containing, and ELF/ 240 ETS family (Supplementary Fig. 11; p < 0.01, INSIGHT likelihood ratio test). These estimates 241 highlight the substantial contribution of adaptive evolutionary changes in TFBSs that may influence 242 the transcriptional activity of TREs.

244
Correlation between protein-coding and non-coding transcription 245 We noticed that evolutionary changes in protein-coding gene transcription frequently 246 correlate with changes in non-coding transcription units (TU) located nearby. To examine this 247 pattern more generally, we adapted our recently reported hidden Markov model (HMM) 70 to 248 estimate the location of TUs genome-wide, based on patterns of aligned PRO-seq reads and the 249 location of TREs. Using this method, we annotated 54,793 TUs active in CD4+ T-cells of at least one 250 of the primate species, approximately half of which overlap existing GENCODE annotations or their 251 associated upstream antisense RNAs (Supplementary Fig. 12a). A cross-species comparison of 252 the transcription levels for various TU classes (Fig. 4a)

278
Despite this overall positive correlation, transcription at enhancers evolves rapidly and is 279 frequently unaccompanied by transcriptional changes at nearby protein-coding genes. For example, 280 CCR7 transcription is highly conserved among both primate and rodent species ( To explain this effect, we searched for genomic features correlated with conservation of 286 transcription at enhancers, focusing on untreated CD4+ T-cells in order to leverage the large 287 amount of public data available for this cell type. Not surprisingly, one of the features most strongly 288 correlated with transcriptional conservation at enhancers is the distance from the nearest 289 transcription start site of a protein-coding gene (Fig. 5b). In particular, more than half of 290 enhancers located within 10 kbp of an annotated TSS are shared across all three primate species, 291 whereas for distal enhancers located between 100 kbp to 1 Mbp from a TSS that fraction drops to 292 roughly a third. This relationship is driven by lineage-specific gains or losses of enhancer activity, 293 and to a lesser extent by changes in TRE activity levels, rather than by differences in the alignability 294 of orthologous DNA (Supplementary Fig. 13).

295
These simple distance-based observations, however, ignore the critical issue of chromatin 296 interactions between enhancers and promoters. To account for such loop interactions, we extracted 297 6,520 putative TRE interactions from Chromatin Interaction Analysis with Paired End Tag  298 sequencing (ChIA-PET) data recognizing loops marked with H3K4me2 in human CD4+ T-cells 72 . 299 We found that 55% of enhancers that participate in these loops were conserved between primate 300 species compared to only 47% of non-looped enhancers ( Fig. 5c;

316
Distal loop interactions do not fully account for the disparity between enhancers and 317 promoters in evolutionary rates-even looped enhancers still evolve significantly faster than 318 promoters (p = 3e-3, Fisher's exact test). We hypothesized that redundancy in enhancers may help 319 to explain rapid enhancer evolution. More specifically, we asked whether redundancy makes 320 protein-coding genes regulated by multiple distal TREs, such as CCR7 (Fig. 5a), more robust to 321 enhancer turnover than those influenced by fewer distal TREs. Indeed, we found that evolutionary 322 conservation of promoter TRE transcription is remarkably strongly correlated with the number of 323 loop interactions a promoter has with distal sites (Fig. 6a, weighted Pearson's correlation = 0.87; p 324 < 1e-3 by a bootstrap test). A similar trend was observed between the number of loop interactions 325 made by a target promoter and DNA sequence conservation in transcription factor binding motifs at 326 the promoter, although the effect was weaker and did not meet our criteria for statistical 327 significance (Fig. 6b, weighted Pearson's correlation = 0.65; p = 0.07 by a bootstrap test).

328
But how does redundancy in distal TREs relate to the evolutionary conservation of the 329 distal TREs themselves? If redundant distal TREs compensate for one another in some way, 330 perhaps each one will be less, rather than more, conserved when their associated promoters have 331 larger numbers of loop interactions. To address this question, we examined the rate of 332 conservation of looped distal TREs as a function of the number of loops in which their gene-333 proximal partners participated. We found that DNA sequence conservation in putative TFBSs 334 negatively correlates with the number of loops at the proximal end ( Fig. 6d; weighted Pearson's 335 correlation = -0.80; p = 2e-3 by a bootstrap test). We noted a similar trend toward a negative 336 correlation between the conservation of distal TRE transcription and the number of loop 337 interactions (weighted Pearson's correlation = -0.67; p = 0.059, two-sided bootstrap test, Fig. 6c).

338
These results suggest that each associated distal TFBS is individually less essential at genes having 339 larger numbers of loop interactions with distal sites, and they are therefore consistent with a model 340 in which such TFBSs are more freely gained and lost during evolution. Taken together, our results 341 imply that distance, looping, and redundancy of enhancers all contribute to constraints on the 342 evolutionary rates of changes in gene transcription.

344
Discussion: 345 We have carried out the first comparative analysis of primary transcription in any 346 phylogenetic group, focusing on CD4+ T-cells in primates. Using PRO-seq and various 347 computational tools, we estimated the locations and abundance of transcription units with high 348 resolution and accuracy. In comparison to previous studies in primates 29,33,76-78 , this approach 349 separated primary transcription from post-transcriptional processing, allowing us to study eRNAs, 350 lincRNAs, and other rapidly degraded non-coding RNAs, as well as protein-coding genes. We found 351 clear relationships between the DNA sequences of TFBSs and differential transcription across 352 species and treatment conditions. We also found evidence that some transcriptional changes in 353 humans were driven by adaptive evolution in nearby binding sites. Overall, our study provides new 354 insights into the mode and tempo of recent evolutionary changes in transcription in primates.

355
Perhaps our most striking observation is that many non-coding transcription units, 356 particularly eRNAs and lincRNAs, have undergone rapid evolutionary changes in comparison to 357 protein-coding genes. Similar observations have been reported previously for lincRNAs 79 , but, to 358 our knowledge, the observation for eRNAs is new, and it raises a number of questions. First, why 359 are some enhancers more conserved than others? In particular, we find that enhancers proximal to, 360 or that loop to, annotated promoters tend to be most constrained (Fig 5b-c). These enhancers may 361 simply be most crucial for activating their target genes, but other factors may also contribute to 362 their constraint. For example, perhaps these enhancers are enriched for tissue-specific functions, 363 and are less constrained due to reduced pleiotropy 80 . Or perhaps many of them simply are not 364 functional at all, and are transcribed as a by-product of other processes. 365 Second, how do protein-coding genes maintain stable transcription levels across species 366 despite the rapid turnover of associated enhancers? One possibility is that many rapidly evolving 367 enhancers are either not functional or act on targets other than the ones we have identified. 368 However, several of our findings argue against this possibility; for example, we find that even 369 looped enhancers, for which we have direct evidence of a promoter interaction, evolve significantly 370 faster than promoters, and that eRNA conservation is strongly correlated with the number of loop 371 interactions at associated promoters (Fig. 6). An alternative explanation, which appears more 372 plausible to us, is that stabilizing selection on transcription levels drives enhancers to compensate 373 for one another as they undergo evolutionary flux. This explanation would be compatible with 374 reports from model systems 36,38 . Our finding that sequence conservation at distal enhancers is 375 negatively correlated with the number of loop interactions at associated promoters 376 (Supplementary Fig. 14) is also consistent with this explanation. The possibility of pervasive 377 stabilizing selection on transcription levels in primates has been noted previously based on RNA-378 seq data 81 , but our data allow for more direct observations of both active transcription and 379 associated regulatory elements.

380
A third question is, if most transcribed enhancers do indeed influence gene expression, then 381 why are so many of them weakly maintained by natural selection? At present, we can only 382 speculate on the answer to this question. One possibility is that some of the apparent turnover 383 events we have observed actually represent enhancers that have simply switched cell types in 384 activity, as has been reported in some cases 19 . But it is also possible that selection tends to act 385 diffusely on enhancers across an entire locus, rather than strongly on individual enhancers, as has 386 been proposed in cancer evolution 82 . Our observation that multiple DNA sequence changes at the 387 SGPP2 locus appear functional provides some initial support this this hypothesis. It will be possible 388 to evaluate these hypotheses more rigorously as better data describing enhancers and enhancer-389 promoter interactions across many cell types become available for these and other groups of 390 species. 391

Methods: 392 393
Multiple species PRO-seq library generation. Isolation of CD4+ T-cells from mouse and rat. Spleen samples were collected from one male mouse 412 (FVB) and one male rat (Albino Oxford) that had been sacrificed for IACUC-approved research not 413 related to the present study. Dissected spleen was mashed through a cell strainer using a sterile 414 glass pestle and suspended in 20 mL RPMI-1640. Cells were pelleted at 800xg for 3 minutes and 415 resuspended in 1-5mL of ACK lysis buffer for 10 minutes at room temperature to lyse red blood 416 cells. RPMI-1640 was added to a final volume 10 times that used for ACK lysis (10-40 mL). Cells 417 were pelleted at 800xg for 3 minutes, counted in a hemocytometer, and resuspended in RPMI-1640 418 to a final concentration of 250,000 cells per ml. CD4+ T-cells were isolated from splenocytes using 419 products specific for mouse and rat coordinates of mapped reads, dREG scores, and other parameters of interest were converted to the 455 human assembly (hg19) using CrossMap 87 . We converted genomic coordinates between genome 456 assemblies using reciprocal-best (rbest) nets 88 . Reciprocal-best nets have the advantage that 457 comparisons between species are constrained to 1:1 orthologues. This constraint on mapping is 458 enforced by requiring each position to map uniquely in a reciprocal alignment between the human 459 reference and the other species in the comparison. We downloaded rbest nets for hg19-mm10, 460 hg19-panTro4, hg19-rn6 from the UCSC Genome Browser. We created rbest nets for hg19-rheMac3 461 using the doRecipBets.pl script provided as part of the Kent Source software package.

463
Analysis of transcriptional regulatory elements. Defining a consensus set of transcriptional 464 regulatory elements. We predicted TREs using dREG 54 separately in each species' reference 465 genome. dREG uses a support vector regression model to score each site covered in a PRO-seq 466 dataset based on its resemblance to features associated with transcription start sites in a reference 467 training dataset. The dREG model was trained to recognize DNase-I-hypersensitive sites that also 468 show substantial evidence of GRO-cap data in six PRO-seq or GRO-seq datasets measuring 469 transcription in resting K562 cells. dREG scores were computed in the reference genome of each 470 species in order to provide as much information as possible on the native context of each locus. In 471 all cases, we combined the reads from all individuals for each species in order to maximize power 472 for the discovery of TREs. In the primate species, treated and untreated CD4+ T-cells were 473 analyzed separately. 474 We then defined a consensus set of TRE annotations, each of which bore the signature of an 475 active TRE in at least one species and treatment condition. To define such a set, dREG scores were 476 first converted to human reference genome (hg19) coordinates using CrossMap and the reciprocal-477 best nets. The advantage of converting dREG scores between the reference genome is that 478 individual bases transfer more completely than genomic intervals using CrossMap and related 479 tools. We then identified TREs in each species separately by thresholding the dREG scores. In all 480 analyses, we selected a threshold of 0.3, which corresponds to a predicted false discovery rate of 481 <7% compared with other sources of genomic data in human CD4+ T-cells. In addition, parallel 482 analyses at separate thresholds (0.25 and 0.35) provided results that were in all cases consistent 483 with those reported in the main manuscript (Supplementary Table 2). The set of overlapping 484 TREs from each species were reduced to a single element containing the union of all positions 485 covered by the set using bedops, and sites within 500 bp of each other were further merged. We 486 assigned each putative TRE the maximum dREG score for each species and for each treatment 487 condition.

489
Identifying differences in TREs between species. Differences in TRE transcription in 3-way (human-490 chimp-rhesus macaque) or 5-way (human-chimp-rhesus macaque-mouse-rat) species comparisons 491 were identified using a combination of heuristics and statistical tests. Starting with the consensus 492 set of TREs in hg19 coordinates, we first excluded potential one-to-many orthologs, by eliminating 493 TREs that overlapped gaps in the reciprocal-best nets that were not classified as gaps in the 494 standard nets. The remaining TREs were classified as unmappable when no orthologous position 495 was defined in the rbest nets. Complete gains and losses were defined as TREs that were mappable 496 in all species and for which the dREG score was less than 0.05 in at least one species and greater 497 than 0.30 in at least one other species (see Supplementary Note 1). Gains and losses were 498 assigned to a lineage based on an assumption of maximum parsimony under the known species 499 phylogeny. We defined a set of TREs that displayed high-confidence changes in activity by 500 comparing differences in PRO-seq read counts between species using deSeq2 and thresholding at a 501 1% false discovery rate (as described below). Changes in TRE activities were compared to histone 502 modification ChIP-seq, DNase-I-seq, and DNA methyl immunoprecipitation data from the 503 Epigenome Roadmap project 58 . 504 505 TRE classification. For some analyses, TREs were classified as likely promoters or enhancers on the 506 basis of their distance from known protein-coding gene annotations (GENCODE v.19). TRE classes 507 of primary interest include (see also Supplementary Fig. 7): (1) Promoters: near an annotated 508 transcription start site (<100 bp); (2) Enhancers: distal to an annotated transcription start site 509 (>5,000 bp) 510 511 Covariates that correlate with TRE changes. We compared the frequency at which evolutionary 512 changes in transcription occur at TREs in a variety of different genomic contexts. We examined the 513 rate of change as a function of distance from the nearest annotated transcription start site in 514 GenCode v.19. TREs were binned by distance in increments of 0.02 on a log10 scale and we 515 evaluated the mean rate at which evolutionary changes in TRE transcription arise in each bin. We 516 also compared the rate of changes in TRE transcription across a variety of functional associations, 517 including loop interactions, within the same topological associated domain, and in super-enhancers. 518 H3K4me2 ChIA-PET data describing loop interactions were downloaded from the Gene Expression 519 Omnibus (GEO) database (GSE32677) and the genomic locations of loops were converted from 520 hg18 to hg19 coordinates using the liftOver tool. We also analyzed a separate dataset profiling loop 521 interactions based on promoter capture Hi-C data in human CD4+ T-cells taken from the 522 supplementary materials of ref. 74 . Topological associated domains (TADs) based on Hi-C data for 523 GM12878 cells were also downloaded from GEO (GSE63525). Super-enhancers in CD4+ T-cells 524 were taken from the supplementary data for ref. 75 . In all cases we excluded sites with potential 525 one-to-many orthology in any of the species included in the comparison (typically just the three 526 primates). Potential one-to-many orthologs were defined based on differences in the standard and 527 reciprocal-best nets for each species pair. 528 529 Refining the location of active TREs using dREG-HD. During analyses of transcription factor binding 530 motifs we further refined the location of TREs to the region between divergent paused RNA 531 polymerase using a strategy that we call dREG-HD (manuscript in preparation, preliminary version Next we identified peaks in the imputed DNase-I hypersensitivity profile by fitting the 542 imputed DNase-I signal using a cubic spline and identifying local maxima. We optimized two free 543 parameters that control the (1) smoothness of spline curve fitting, and (2) threshold on the 544 imputed DNase-I signal intensity. Parameters were optimized to achieve an appropriate trade-off 545 between FDR and sensitivity on the testing K562 dataset. Parameters were tuned using a grid 546 optimization over free parameters. 547 548 DNA sequence analysis. Finding candidate transcription factor binding motifs. All motif analyses 549 focused on 1,964 human TF binding motifs from the CisBP database 62 clustered using an affinity 550 propagation algorithm into 567 maximally distinct DNA binding specificities (see ref 63 ). Scores, 551 which reflect a log e -odds ratio comparing each candidate motif model to a third-order Markov 552 background model, were calculated using the RTFBSDB package 63 . We selected two separate motif 553 thresholds for different analyses. Scores >10 were used in analyses which mix multiple TF binding 554 motifs, and strike a tradeoff that focuses on minimizing false positives at the expense of sensitivity. 555 We dropped the cutoff score to motifs >8 in analyses that use individual motifs in order to increase 556 statistical power. For each of these thresholds, we estimated the mean genome-wide positive 557 predictive values to be 0.47 and 0.38, respectively, for motif cutoffs of 10 and 8, by comparing 558 motifs to ChIP-seq peak calls in K562 cells. During comparative analyses we scanned each primate 559 reference genome separately with each motif to allow the detection of a putative binding site in any 560 of the species included in the analysis, and then moved scores to a human (hg19) reference genome 561 using the CrossMap tool. We chose this strategy because changes in TRE activity may reflect 562 changes in binding in any of the primate species. For example, human gains may be explained by 563 either a new binding site for a transcriptional activator in the human genome, or a loss in binding of 564 a transcriptional repressor that was bound in both primate species.

566
Motif enrichment in TREs that change during CD4+ T-cell activation. Motifs enriched in up-or down-567 regulated dREG-HD TREs during CD4+ T-cell activation (p < 0.01) were selected using Fisher's exact 568 test with a Bonferroni correction for multiple hypothesis testing. Up-or down-regulated TREs 569 were compared to a background set of >2,500 GC-content matched TREs that do not change 570 transcription levels following π treatment (log-2 fold change <0.5-fold in magnitude and p > 0.25) 571 using the enrichmentTest function in RTFBSDB 63 . To test for motif robustness, the background 572 resampling was repeated 100 times and motifs were selected that achieve a significant result in 573 >90%. 574 575 DNA sequence conservation analysis. For our evolutionary conservation analysis, we used phyloP 576 scores 61 based on the 100-way genome alignments available in the UCSC Genome Browser (hg19). 577 In all cases, bigWig files were obtained from the UCSC Genome Browser and processed using the 578 bigWig package in R. We represented evolutionary conservation as the mean phyloP score in each 579 identified TFBS in the indicated set of dREG-HD sites.

581
Enrichment of DNA sequence changes in motifs. We identified single-nucleotide DNA sequence 582 differences at sites at which two of three primate species share one base and the third species 583 diverges. We intersected these species-specific divergences with matches to transcription factor 584 binding motifs found within dREG-HD sites that undergo transcriptional changes between primate 585 species. Because many motifs in Cis-BP are similar to one another, we first partitioned the motifs 586 using clustering (as described above), and examined enrichments at the level of these clusters. 587 Motifs were ranked by the Fisher's exact test p-value of the enrichment of species divergences in 588 dREG-HD sites that change transcription status (where changes in DNA sequence and transcription 589 occur on the same branch) to dREG-HD sites that do not change. We also compute the enrichment 590 ratio, which we define as the number of species divergences in each TF binding motif in dREG-HD 591 sites that change on the same branch normalized to the same statistic in sites that do not change. 592 593 INSIGHT analysis. We examined the modes by which DNA sequences evolve in human lineage-594 specific dREG-HD sites or DHSs using INSIGHT 69 . We passed INSIGHT either complete DHSs, dREG-595 HD sites, or TFBS within dREG-HD sites that undergo the changes (see Identifying differences in 596 TREs between species) indicated in the comparison. Human gains and losses, for example, were 597 comprised of 4,384 dREG-HD sites with 9,924 separate regions (median length of 16 bp) after 598 merging overlapping TFBSs with a log-odds score greater than 10. We also analyzed 24 599 transcription factors each of which has more than 900 occurrences in dREG-HD sites that change on 600 the human branch (log-odds score >8). All analyses were conducted using the INSIGHT web server the Frasier lab and converted to a queryable database filtered to include only variants with 605 coverage by 25 reads (75 th percentile) or more to avoid noise at low read counts. For each 606 sequence/variant query, a set of four equivalent sequences/alternate allele pairs was constructed 607 by swapping which allele was the reference and getting the reverse complement for both alleles. 608 For example, given a sequence:variant:position combination of AATCGAA:C:3, the other queries 609 produced were AACCGAA:T:3 (allele swap), TTCGATT:G:5 (reverse complement), and TTCGGTT:A:5 610 (reverse complement allele swap). Frequency shifts were computed by taking the post-ChIP 611 frequency minus the pre-ChIP frequency for the human reference allele. Since k-mers longer than 7 612 had few hits, we allowed for wildcards (N) in longer sequences that would match any base. 613 Wildcards were introduced into a k-mer by matching the k-mer sequence to the NF-kB motif and 614 replacing the 3 lowest information content positions with N(s). Systematic shifts from 0 were 615 tested using a one-tailed t-test. P-values for systematic differences at multiple sites were combined 616 using Fisher's method.

618
De novo discovery of transcription units. using dREG and continues through the entire region inferred to be transcribed, which can covers 622 tens-to hundreds-of kilobases. Three states were used to represent background (i.e., outside of a 623 transcription unit), the TU body, and a post-polyA decay region. The HMM transition structure is 624 shown in Supplementary Fig. 13a. We allow skipping over the post-polyA state, as unstable 625 transcripts do not have these two-phase profiles. We took advantage of dREG as a potential signal 626 for transcription initiation by incorporating the dREG score (maximum value in the interval from a 627 given positive read-count position until the next, clamped to the zero-one interval) as a transition 628 probability from the background to the transcription body state. PRO-seq data is generally sparse, 629 so we applied a transformation that encoded only non-zero positions and the distance between 630 such consecutive positions (Supplementary Fig. 13a). Our model described this transformed data 631 using emissions distributions based on two types of variables. The first type of emission variable 632 defines the PRO-seq read counts in non-zero positions. These counts were modeled using Poisson 633 distributions in the background and post-polyA states, and using a Negative Binomial distribution 634 in the transcription body state. The negative binomial distribution can be seen as a mixture of 635 Poisson distributions with gamma-distributed rates and therefore allows for variation in TU 636 expression levels across the genome. The second type of emission variable describes the 637 distribution of distances in base pairs between positions having non-zero read counts. This 638 distribution was modeled using a separate geometric distribution for each of the three states. 639 Maximum likelihood estimates of all free parameters were obtained via Expectation Maximization, 640 on a per-chromosome basis. TU predictions were then obtained using the Viterbi algorithm with 641 parameters fixed at their maximum-likelihood values. Finally these predictions were mapped from 642 the transformed coordinates back to genomic coordinates. Source code for our implementation is 643 publicly available on GitHub: https://github.com/andrelmartins/tunits.nhp.

645
Inferring TU boundaries in the common great ape ancestor. We identified the most likely TU 646 locations in the great ape ancestor by maximum parsimony. TUs were identified and compared in 647 human reference coordinates (hg19) for all species. We used the bedops package to find the 648 intersection between the predicted TU intervals in each pair of species (i.e., human-chimp, human-649 rhesus macaque, and chimp-rhesus macaque). Intersections (>= 1bp) between pairs of species 650 were merged, resulting in a collection of TUs shared by any two pairs of species, and therefore 651 likely to be a TU in the human-chimp ancestor. All steps were applied independently on the plus 652 and minus strands. These steps identified 37,626 putative TUs active in CD4+ T-cells of the primate 653 ancestor. We added 17,167 TUs that did not overlap ancestral TUs but were found in any one of the 654 three primate species.

656
Transcription unit classification. TUs were classified by annotation type using a pipeline similar to 657 ones that we have described recently 51,70,89 . Before classifying TUs we applied a heuristic to refine 658 TUs on the basis of known annotations. TUs that completely overlap multiple gene annotations 659 were broken at the transcription start site provided that a dREG site overlapped that transcription 660 start site. Classification was completed using a set of rules to iteratively refine existing annotations, 661 as shown in Supplementary Fig. 13A. Unless otherwise stated, overlap between a TU and a 662 transcript annotation was defined such that >50% of a TU matched a gene annotation and covers at 663 least 50% of the same annotation. TUs overlapping GENCODE annotations (>50% overlap, defined 664 as above) were classified using the biotype in the GENCODE database into protein coding, lincRNA 665 (lincRNA or processed transcript), or pseudogene. The remaining transcripts were classified as 666 annotated RNA genes using GENCODE annotations, the rnaGenes UCSC Genome Browser track 667 (converted from hg18 to hg19 coordinates), and miRBase v20 90 . As many RNA genes are processed 668 from much longer TUs, we required no specific degree of overlap for RNA genes. Upstream 669 antisense (i.e., divergent) TUs were classified as those within 500bp of the transcription start site of 670 any GENCODE or higher level TU annotation (including lincRNAs). Antisense transcripts were 671 defined as those with a high degree of overlap (>50%) with annotated protein coding genes in the 672 opposite orientation. The remaining transcripts with a high degree of overlap (>50%) to annotated 673 repeats in the repeatmasker database (rmsk) were classified as repeat transcription. Finally, any 674 TUs still remaining were classified as unannotated, and were further divided into those which are 675 intergenic or that partially overlapping existing annotations.

677
Comparing transcription between conditions and species. Comparing transcription before and 678 after CD4+ T-cell activation. We compared π treated and untreated CD4+ T-cells within each of the 679 primate species using gene annotations (GENCODE v19). We counted reads in the interval between 680 500 bp downstream of the annotated transcription start site and either the end of the gene or 681 60,000 bp into the gene body (whichever was shorter). This window was selected to avoid (1) 682 counting reads in the pause peak near the transcription start site, and (2) to focus on the 5' end of 683 the gene body affected by changes in transcription during 30 minutes of π treatment assuming a 684 median elongation rate of 2 kb/ minute 51,91 . We limited analyses to gene annotations longer than 685 500 bp in length. To quantify transcription at enhancers, we counted reads in the window covered 686 by each dREG-HD site plus an additional 250 bp on each end. Differential expression analysis was 687 conducted using deSeq2 57 .

689
Comparing transcription between species. Read counts were compared between different species in 690 hg19 coordinates. In all analyses, reads were transferred to the hg19 reference genome using 691 CrossMap with rbest nets. Our analysis focused on transcription units or on the union of dREG sites 692 across species. We focused our analysis of transcription units on the interval between 250 bp 693 downstream of the annotated transcription start site and either the end of the gene or 60,000 bp 694 into the gene body (whichever was shorter). We limited our analyses to TUs longer than 500 bp in 695 length. Reads counts were obtained within each transcription unit, gene annotation, or enhancer, 696 abbreviated here as a 'region of interest' (ROI), that has confident one-to-one orthology in all 697 species examined in the analysis. This strategy of focusing on blocks of one-to-one orthology avoids 698 errors caused by systematic differences in mappability or repeat content of species-specific 699 genomic segments. We broke each each ROI into segments that have conserved orthology between 700 hg19 and all species examined in the analysis, which included either a three-way (human-chimp-701 rhesus macaque) or five-way (human-chimp-rhesus macaque-mouse-rat) species comparison. We 702 defined intervals of one-to-one orthology as those represented in levels 1, 3, and 5 of the reciprocal 703 best nets (with gaps defined in levels 2, 4, and 6) 88 . Reads that map to regions that have orthology 704 defined in all species were counted using the bigWig package in R using reads mapped to hg19 705 coordinates. Final counts for each ROI were defined as the sum of read counts within the regions of 706 orthology that intersect that ROI. ROIs without confident one-to-one orthologs in all species 707 analyzed were discarded. Our pipeline makes extensive use of the bigWig R package, Kent source 708 tools, as well as the bedops and bedtools software packages 85,92 . Differential expression was 709 conducted between species using the deSeq2 package for R, as described above. with itself. A second gel extraction was used as before to purify the linearized plasmid. The purified 740 dephosphorylated linear plasmid and phosphorylated annealed oligos were ligated together using 741 the Quick Ligation Kit (NEB). The ligated product was transformed into One Shot Stbl3 Chemically 742 Competent E. coli (ThermoFisher Scientific). 100ul of the transformed bacteria were plated on 743 Ampicillin (200ug/ml) plates. Single colonies were picked, sequenced, and the plasmid was isolated 744 using endo free midi-preps from Omega.

746
Transfection of MCF-7 G11 cell lines. We used lentivirus to transfect MCF-7 cells. Lentivirus was 747 made using lipofectamine 3000 from Invitrogen. Phoenix Hek cells (grown in DMEM with 10% FBS 748 and antibiotics) were seeded in a 6-well plate at 400,000 cells/plate. Cells were grown until ~90% 749 confluent. 1ug of pHAGE_EF1a_dCas9-KRAB plasmid from addgene (#50919) plasmid or the pLenti 750 SpBsmBI sgRNA Hygro (addgene #62205) containing each sgRNA, 0.5ug of psPAX (addgene 751 #12260), and 0.25ug pMD2.G (addgene #12259) were mixed. 752 MCF-7 G11 cells were plated at ~200,000 cells/well in a 6-well plate. 24 hours later 753 3ml/well of virus was mixed with 10ug/ml polybrene and incubated for 5 minutes at room 754 temperature. This mix was added to the cells and centrifuged for 40 minutes at 800g at 32C (Viju 755 Vijayan Pillai, personal communication). 12-24 hours later the virus was removed and fresh media 756 was added. 24-48 hours later the cells were selected with 2ug/ml puromycin for 2 weeks. The MCF-757 7 G11 dCas9-KRAB stable cell lines was grown and maintained in puromycin. A second lentiviral 758 infection was done using the stable MCF-7 G11 dCas9-KRAB cells. The same protocol was used. 24-759 48 hours later the cells were selected with 150ug/ml Hygromycin B. New stable cell lines were 760 grown and maintained in hygromycin B. 761 762 TNFa treatment. Prior to TNFa treatment, cells were grown for 3 days in DMEM with 5% FBS, 763 antibiotics, tamoxifen and hygromycin. Cells were then left untreated or treated for 40 min with 764 25ng/ml TNFa. RNA was extracted using TRIzol Reagent (Invitrogen). We reverse transcribed 1ug 765 of RNA and used this as input for real-time quantitative PCR (RT-PCR) to analyze SGPP2 expression. 766 Primers for SGPP2 were designed targeting a sequence in intron 1, upstream of the intronic 767 enhancer. Raw Cp values were transferred to units of expression using a standard dilution curve 768 comprised of a mixture of cDNA from each sample within the biological replicate. We included four 769 serial dilutions, each of which covered a two-fold difference in expression. Each sample was further 770 normalized for differences in RNA content by primers recognizing the 18S rRNA control. The ratio 771 between normalized SGPP2 expression in each sgRNA-transfected MCF-7 cell line and the empty 772 vector control was log-2 transformed and tested for differences from 0 using a two-sided t-test.   represents Spearman's rank correlation between normalized transcription levels in active gene 7 bodies. Colored boxes (top) represents the species and treatment condition of each sample. (c) MA 8 plot shows the log 2 fold-change following π treatment in human CD4+ T-cells (y-axis) as a function 9 of the mean transcription level in GENCODE annotated genes (x-axis). Red points indicate 10 statistically significant changes (p < 0.01). Several classical response genes that undergo well-11 documented changes in transcript abundance following CD4+ T-cell activation (e.g., IL2, IFNG, 12 TNFα, and EGR3) are marked. (d) Enrichment of TF binding motifs in TREs that increase 13 transcription levels following π treatment in the indicated species compared to TREs whose 14 transcription abundance does not change. in untreated human CD4+ T-cells using dREG-HD (see Online Methods). All plots are ordered based 21 on the maximum dREG score in the window. (c) MA plot shows the log2 fold-change following π treatment in human CD4+ T-cells (y-axis) as a function of the mean transcription level in GENCODE annotated genes (x-axis). Red points indicate statistically significant changes (p < 0.01). Several classical response genes that undergo well-documented changes in transcript abundance following CD4+ Tcell activation (e.g., IL2, IFNG, TNFa, and EGR3) are marked. (d) Enrichment of TF binding motifs in TREs that increase transcription levels following π treatment in the indicated species compared to TREs whose transcription abundance does not change. Table shows the Bonferroni corrected p-value, based on a Fisher's exact test (circle size), and the fold-enrichment over a group of unchanged background sequences (color scale). Motif logos and the candidate transcription factor or Cis-BP ID are shown. (e) Heatmaps show the distribution of PRO-seq (red and blue indicate transcription on the plus and minus strand, respectively), ChIP-seq for H3K27ac, H3K4me1, and H3K4me3, and DNase-I-seq signal intensity. Plots are centered on transcriptional regulatory elements (TREs) predicted in untreated human CD4+ T-cells using dREG-HD (see Online Methods). All plots are ordered based on the maximum dREG score in the window. , are not detectable and are therefore inferred as gains or losses (teal-white) or undergo significant changes (green) in at least one species, or fall in regions for which no ortholog occurs in at least one of the indicated genomes (pink). Inferred gains or losses are colored according to the FDR corrected p-value associated with changes in RNA polymerase abundance (deSeq2). Plots labeled "Primate" illustrate frequency of changes in a three-way comparison of human, chimpanzee, and rhesus macaque focusing on the untreated condition, whereas those labeled "Mammal" summarize a five-way comparison also including rat and mouse. π treatment denotes a comparison between human untreated and PMA+Ionomycin treated CD4+ T-cell samples. (b) Boxplots show the ChIP-seq signal near dREG sites classified as conserved, gains, losses, or complete losses for the indicated chromatin or DNA modification in units of reads per kilobase. The box represents the 25th and 75th percentile. Whiskers represent 1.5 times the interquartile range, and points outside of this range are not shown.  where Neanderthal has the ancestral allele. Points below the line represent a statistically significant number of derived alleles in modern human (line indicates a Z-score of -2). Net synteny tracks show the position of regions that have one-to-one orthologs in the chimpanzee and rhesus macaque genomes. (c) Luciferase signal driven by the SGPP2 promoter or the internal enhancer in MCF-7 cells using DNA from each primate species. Bars show the mean fold-induction following 3 hours of stimulation with TNFa. Error bars represent the standard error of the mean. Red ** denotes p < 1e-3 by a two-tailed t-test. (d) Transcription of SGPP2 using primers targeting intron 1 following 0 or 40 min. of TNFa treatment after silencing the indicated TRE using dCAS9-KRAB. Bars represent the median of three independent biological replicates of two gRNAs targeting the promoter, three targeting the internal enhancer, and four targeting the upstream enhancer. Error bars represent the standard error. Red * denotes p < 5e-2 and ** p < 5e-3 by a two-tailed t-test.