High-resolution mapping of DNA alkylation damage and base excision repair at yeast transcription factor binding sites

DNA base damage arises frequently in living cells and needs to be removed by base excision repair (BER) to prevent mutagenesis and genome instability. Both the formation and repair of base damage occur in chromatin and are conceivably affected by DNA-binding proteins such as transcription factors (TFs). However, to what extent TF binding affects base damage distribution and BER in cells is unclear. Here, we used a genome-wide damage mapping method, N-methylpurine-sequencing (NMP-seq), to characterize alkylation damage distribution and BER at TF binding sites in yeast cells treated with the alkylating agent methyl methanesulfonate (MMS). Our data show that alkylation damage formation was mainly suppressed at the binding sites of yeast TFs Abf1 and Reb1, but individual hotspots with elevated damage levels were also found. Additionally, Abf1 and Reb1 binding strongly inhibits BER in vivo and in vitro, causing slow repair both within the core motif and in its adjacent DNA. The observed effects are caused by the TF-DNA interaction, because damage formation and BER can be restored by depletion of Abf1 or Reb1 protein from the nucleus. Thus, our data reveal that TF binding significantly modulates alkylation base damage formation and inhibits repair by the BER pathway. The interplay between base damage formation and BER may play an important role in affecting mutation frequency in gene regulatory regions.


Introduction
DNA in living cells is exposed to an array of genotoxic agents, both endogenous and exogenous. Alkylating agents comprise a large number of reactive chemicals present in cells and in the environment (Fu et al., 2012), which can react with the nitrogen and oxygen atoms of DNA bases to induce formation of alkylation damage. Some alkylation [...] prevention and therapy.
The most common alkylation lesions are N-methylpurines (NMPs), including 7-methylguanine (7meG) and, to a lesser extent, 3-methyladenine (3meA) (Kondo et al., 2010). Although 7meG is not genotoxic by itself, it is prone to spontaneous depurination to form a mutagenic apurinic (AP) site (Fu et al., 2012). 7meG can also form deleterious DNA-protein crosslinks with the lysine-rich histone tails (Yang et al., 2018). The 3meA damage is even more harmful than 7meG, as 3meA lesions block DNA polymerases and affect DNA replication (Plosky et al., 2008). Hence, NMP lesions need to be repaired in a timely manner to avoid detrimental outcomes such as cell death or mutations. The primary repair pathway for NMPs is base excision repair (BER), which is initiated by alkyladenine-DNA glycosylase (AAG; also known as MPG and ANPG) in human cells or its yeast ortholog Mag1 (Wyatt et al., 1999). During BER, AAG/Mag1 removes the alkylated base and generates an AP site, which is then cleaved by the apurinic/apyrimidinic endonuclease (APE1) (Whitaker and Freudenthal, 2018). Subsequently, DNA polymerase and ligase are recruited to the nick to conduct repair synthesis and ligation, respectively (Krokan and Bjørås, 2013). [...] mutation rates in non-UV exposed cancers remains elusive.
Since base damage (e.g., oxidative, alkylation, uracil, and so on) caused by endogenous and exogenous damaging sources is prevalently associated with cancer mutations (Tubbs and Nussenzweig, 2017; Wallace et al., 2012), a potential mechanism for mutation elevation in non-UV exposed tumors is increased base damage formation and/or suppressed BER in TF-bound DNA. However, this hypothesis has not been tested and it is unclear to what extent TF binding affects base damage formation and BER.
Alkylation damage has been widely used as a model lesion for BER studies (Fu et al., 2012; Li et al., 2015). We previously developed an alkylation damage mapping method, N-methylpurine-sequencing (NMP-seq), to precisely map 7meG and 3meA lesions in cells treated with methyl methanesulfonate (MMS) (Mao et al., 2017). Here, we used NMP-seq to analyze alkylation damage formation and BER at the binding sites of ARS binding factor 1 (Abf1) and rDNA enhancer binding protein 1 (Reb1), two essential yeast TFs that have been extensively characterized. The genome-wide binding sites for Abf1 and Reb1 have been identified at near base-pair resolution (Kasinathan et al., 2014; Rossi et al., 2021) and the DNA-binding mechanisms were analyzed in previous studies (Jaiswal et al., 2016; McBroom and Sadowski, 1994a [...] complementary to the adaptor generates a genome-wide profile of NMP lesions at single-nucleotide resolution (Mao et al., 2017).
To determine how TF binding affects NMP lesion formation, we analyzed initial NMP lesions at Abf1 and Reb1 binding sites in yeast immediately after 10 min MMS treatment (i.e., no repair incubation). The ongoing BER during the period of MMS exposure may repair some of the damage and affect analysis of NMP formation. To minimize the effect of endogenous BER, we used a BER-deficient mag1 deletion strain (i.e., mag1Δ) to profile the initial NMP distribution.
We obtained a total of ~44 million sequencing reads in MMS-treated mag1Δ cells. The majority of the reads (~56%) were associated with G nucleotides (G reads), followed by A nucleotides (A reads) (Supplemental Fig. S1B), consistent with the expected trend of 7meG and 3meA lesion formation after MMS treatment (Friedberg et al., 2006).
Since 7meG is the major class of lesion induced by MMS, we first characterized 7meG formation at Abf1 and Reb1 binding sites. To account for potential DNA sequence bias at the binding sites, we also mapped NMP damage in naked yeast genomic DNA, in which all proteins were removed and the purified DNA was damaged by incubating with MMS (Supplemental Fig. S1C and S1D). Normalization of cellular G reads by the naked DNA G reads enables us to elucidate the modulation of 7meG formation by TF proteins. Importantly, we found that formation of 7meG was significantly inhibited at Abf1 and Reb1 binding sites relative to the flanking DNA (Fig. 1A and 1B, Supplemental Fig. S2A). Analysis of the average 7meG levels in 5 bp non-overlapping moving windows indicates that 7meG was reduced by up to 40% and 70% for Abf1 and Reb1 binding sites, respectively. Furthermore, the extent of damage reduction was correlated with the level of TF occupancy, as Reb1 binding sites with low occupancy (occupancy <10) (Kasinathan et al., 2014) only slightly reduced 7meG formation (Fig. 1C).
Damage formation was further analyzed in the TF core motif and its immediately adjacent DNA (20 bp on each side of the motif midpoint). This analysis shows that 7meG formation was strongly suppressed in the conserved regions of the motif sequences (Fig. 1D, 1E and Supplemental Fig. S2B) where Abf1 and Reb1 proteins directly contact DNA (Jaiswal et al., 2016; McBroom and Sadowski, 1994a).
In contrast, 7meG damage levels were not affected outside of the core motif (e.g., -20 to -10 and 10 to 20 bp relative to the motif midpoint). 7meG levels were relatively even across the 'low-occupancy' Reb1 binding sites (Fig. 1F and Supplemental Fig. S2B), even though these sites have a motif sequence nearly identical to that of the 'high-occupancy' binding sites.
While damage formation was mainly suppressed in the core motif, we also saw increased 7meG levels (~1.5 fold) at a few positions (e.g., -7, -3, -2, and 0) at the edge of the Abf1 motif or between the two highly conserved regions within the motif (Fig. 1D). Moreover, analysis of A reads indicates that 3meA formation was increased at the -3 position of the 'high-occupancy' Reb1 sites, but not at the same position of the 'low-occupancy' Reb1 sites (Supplemental Fig. S3A and S3B). Intriguingly, the increased 3meA formation appears to be position dependent, because the adjacent -2 and -1 positions (both are conserved in A or T) did not show elevated 3meA damage formation. To understand why the -3 position is sensitive to MMS treatment, we analyzed the published Reb1-DNA complex structure (Jaiswal et al., 2016). Analysis of the structural data indicates that Reb1 protein binding causes a large curvature (~56°) in DNA and significantly compresses the minor groove near the -3 position (Supplemental Fig. S3C and S3D). These structural changes caused by Reb1 protein binding may play a role in modulating 3meA formation.
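The normalization used for the damage formation plots (cellular lesion counts divided by naked DNA counts, averaged in 5 bp non-overlapping windows) can be illustrated with a minimal Python sketch. The function names and the dictionary input format are hypothetical, and scaling the ratios so that their mean across the plotted region equals 1.0 is an assumption; the text states only that the ratio was scaled to 1.0.

```python
from typing import Dict, List, Optional


def window_sums(counts: Dict[int, int], start: int, end: int, width: int = 5) -> List[int]:
    """Sum per-position lesion counts in consecutive non-overlapping windows over [start, end)."""
    sums = []
    for left in range(start, end, width):
        right = min(left + width, end)
        sums.append(sum(counts.get(p, 0) for p in range(left, right)))
    return sums


def scaled_damage_ratio(
    cell_counts: Dict[int, int],
    naked_counts: Dict[int, int],
    start: int,
    end: int,
    width: int = 5,
) -> List[Optional[float]]:
    """Cellular / naked-DNA lesion ratio per window, scaled to a mean of 1.0.

    Positions are relative to the binding-site midpoint. Windows with no
    naked-DNA reads are reported as None. The mean-based scaling is an
    assumption made for this sketch.
    """
    cell = window_sums(cell_counts, start, end, width)
    naked = window_sums(naked_counts, start, end, width)
    ratios = [c / n if n > 0 else None for c, n in zip(cell, naked)]
    valid = [r for r in ratios if r is not None]
    if not valid:
        return ratios
    mean = sum(valid) / len(valid)
    if mean == 0:
        return ratios
    return [r / mean if r is not None else None for r in ratios]
```

With this layout, a window covering the core motif that has half the cellular-to-naked ratio of a flanking window is plotted at half the flanking value, regardless of differences in sequencing depth between the two libraries.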

Abf1 and Reb1 binding inhibits repair of 7meG lesions
To address how TF binding affects 7meG repair in cells, we analyzed NMP-seq data generated after repair incubation (e.g., 1 and 2 h repair). Repair analysis was conducted by normalizing 7meG lesions at each time point to the initial 7meG damage (i.e., 0 h repair). This analysis considers the variable amounts of initial damage along the motif sequence, which can conceivably impact remaining damage after repair. The normalization (i.e., damage after repair / initial damage) results in fraction of remaining damage, which is inversely correlated with DNA repair activity (Mao et al., 2016, 2017). Our analysis indicates that repair of 7meG lesions was strongly suppressed at both [...]
Repair of 7meG by BER is initiated by the Mag1 glycosylase in yeast (Wyatt et al., 1999). To test if the inhibited repair of 7meG at TF binding sites is due to reduced BER, we analyzed 7meG repair in the mag1Δ mutant strain. NMP-seq analysis in this mutant revealed higher levels of unrepaired 7meG lesions at 2 h than in WT (Fig. 2F), consistent with deficient BER for NMPs in the mutant. Moreover, there was no difference in remaining damage between the TF binding sites and flanking DNA in mag1Δ cells (Fig. 2F), confirming that BER is inhibited by TF binding.
The above analyses were performed using TF binding data generated with occupied regions of genomes from affinity-purified naturally isolated chromatin (ORGANIC), a [...]

Depletion of Abf1 or Reb1 protein restores 7meG formation and elevates BER at their binding sites
Our data suggest that TF binding acts as a barrier to the damaging chemical MMS and BER enzymes. We hypothesize that removal of the TF would expose the binding sites to MMS and repair enzymes. As both Abf1 and Reb1 are essential for yeast survival and cannot be knocked out, we used the published Anchor-Away strategy (Haruki et al., 2008) to conditionally and rapidly export the protein from the nucleus to the cytoplasm. We then performed NMP-seq experiments in the TF-depleted yeast strains to analyze 7meG formation and repair. Both Abf1 and Reb1 anchor-away strains (Abf1-AA and Reb1-AA) were previously generated and used to study their impacts on gene transcription (Kubik et al., 2015, 2018). We followed the published protocol to deplete Abf1 or Reb1 from the nucleus with rapamycin. Moreover, growth of Abf1-AA or Reb1-AA strain was inhibited on rapamycin-containing plates (Supplemental Fig. S5A), confirming that nuclear depletion of either protein is lethal for yeast cells (Kubik et al., 2015).
In the control strain (WT-AA), in which no target protein is tagged for depletion, analysis of the NMP-seq data indicates that 7meG damage formation was still suppressed at the conserved motif sequences upon rapamycin treatment (Supplemental Fig. S5B), indicating that rapamycin itself had little effect on NMP damage formation. However, TF depletion in Abf1-AA or Reb1-AA cells restored damage formation at their corresponding binding sites (Supplemental Fig. S5C and S5D). For example, Abf1 depletion increased 7meG formation at Abf1 binding sites to a level comparable to the flanking DNA; however, no damage restoration was seen at Reb1 binding sites in Abf1-AA cells (Supplemental Fig. S5C). Similarly, damage was restored at Reb1 binding sites in Reb1-AA cells, but not at Abf1 sites (Supplemental Fig. S5D).
Therefore, these data indicate that nuclear depletion of each TF specifically affects damage formation at its own binding sites, but has no effect on the binding sites of the other TF.
Analysis of 7meG repair in the anchor-away strains indicates that BER was restored [...] all genes) was generally faster in NDR relative to the coding region where DNA is organized into +1, +2, and so on nucleosomes (Fig. 4A), a pattern consistent with our previous studies (Mao et al., 2017). Hence, the global BER pattern revealed by our analysis indicates that Abf1 and Reb1 do not inhibit repair in NDR when all genes were included.
As Abf1 and Reb1 do not affect BER globally, we hypothesized that they may specifically affect BER in target genes. To test this hypothesis, we linked Abf1 and Reb1 binding sites to the closest TSS of annotated genes (Park et al., 2014). This association identified 697 Abf1-linked and 708 Reb1-linked genes (see Methods for detail). We then aligned Abf1-linked and Reb1-linked genes at their TSS and plotted 7meG repair in accordance with the transcriptional direction. For each subset of genes (i.e., Abf1-linked or Reb1-linked genes), we found a prominent damage peak in NDR after 2 h repair in WT cells (Fig. 4B and 4C, black arrows). The damage peak was located ~100 bp upstream of the TSS and overlapped with Abf1 or Reb1 binding peak (Supplemental [...] promoters. This finding was further confirmed by analyzing NMP-seq data generated in the anchor-away cells. We found that depletion of Abf1 in Abf1-AA cells did not change the global BER pattern when all genes were included (Fig. 4D), but it restored repair in the NDR of Abf1 target genes (Fig. 4E). As expected, repair in Reb1 target genes was still inhibited in the Abf1-AA cells (Fig. 4F, black arrow). Similar results were found in the Reb1-AA cells (Fig. 4G to 4I). The damage peaks in NDR were not as high as those seen in the repair analysis at the mapped TF binding sites (e.g., compare Fig. 4B with Fig. 2A), likely because the gene analysis was performed in each subset of genes aligned on their TSS, not the midpoint of the TF binding sites. In summary, these data indicate that Abf1 and Reb1 inhibit BER in their target promoters.
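The gene-level analysis above has two steps: linking binding sites to nearby TSSs, then aligning lesion positions by transcriptional direction. A minimal Python sketch of both follows. The `Gene` tuple and function names are hypothetical, the 300 bp cutoff is the one given in the Methods, and all coordinates are assumed to lie on a single chromosome for simplicity.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    tss: int      # genomic coordinate of the transcription start site
    strand: str   # '+' or '-'


def link_sites_to_genes(site_midpoints: List[int], genes: List[Gene],
                        max_dist: int = 300) -> List[Gene]:
    """Return genes whose TSS lies within max_dist bp of any binding-site midpoint.

    A site located between two divergently transcribed genes links to both,
    because each gene is tested independently.
    """
    return [g for g in genes
            if any(abs(m - g.tss) <= max_dist for m in site_midpoints)]


def relative_to_tss(position: int, gene: Gene) -> int:
    """Offset from the TSS, oriented so that negative values are upstream."""
    offset = position - gene.tss
    return offset if gene.strand == "+" else -offset
```

Flipping the sign of the offset for minus-strand genes places the promoter on the same side of every gene in the aggregate plot, so a site ~100 bp upstream of the TSS appears near -100 for genes on either strand.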

Repair of 3meA is inhibited by TF binding in vivo and in vitro
Although 3meA is much less abundant than 7meG in MMS-treated cells, 3meA has long been known to be cytotoxic (Fu et [...] Additionally, 3meA is unstable and difficult to synthesize in vitro. As NMP-seq maps both 3meA and 7meG lesions, we extracted A reads to specifically analyze 3meA repair.
Analysis of 3meA lesions in WT cells indicates that the repair was inhibited near the center of Abf1 and 'high-occupancy' Reb1 binding sites, as shown by high levels of remaining 3meA lesions at 2 h (Fig. 5A and 5B). In contrast, 3meA repair was not inhibited at 'low-occupancy' Reb1 binding sites (Fig. 5C). Interestingly, the 3meA peaks appear to be narrower than the 7meG peaks, and no clear 3meA repair inhibition was seen in nucleosomes surrounding the TF binding sites. These differences are consistent with the greater activity of Mag1 and its homologs in removing 3meA than 7meG (Connor et al., 2005), which may lead to less repair inhibition of 3meA lesions by DNA-binding proteins.
A closer examination of 3meA repair at 'high-occupancy' Reb1 binding sites revealed a slow repair spot at the +4 position (Supplemental Fig. S7A). Repair of 7meG was also inhibited at the same location (Supplemental Fig. S7B), suggesting that the +4 position is refractory to BER enzymes. Although the sequence at the +4 position is not conserved in the Reb1 motif, the Reb1-DNA crystal structure (Jaiswal et al., 2016) shows that this position is directly contacted by the DNA binding domain of Reb1 protein (Supplemental Fig. S3C). The strong repair inhibition at the +4 position led us to further investigate BER using an in vitro system. To simulate 3meA repair at the Reb1 binding site, we incorporated a stable 3meA analog, inosine (denoted as I), at the +4 position of the motif strand (Fig. 5D). Inosine can naturally arise from adenine deamination in cells and is repaired by Mag1-mediated BER (Alseth et al., 2014). We found that inosine incorporation did not change Reb1 binding affinity compared to DNA without inosine (Supplemental Fig. S7C). AAG and APE1 enzymes were added to naked DNA or DNA pre-bound with purified Reb1 protein to examine BER activity in vitro.
The AAG/APE1 cleavage product (i.e., the lower band) was analyzed in a time-course experiment to compare BER activity between free DNA and Reb1-bound DNA (Fig. 5E). Quantification of the gel showed significantly reduced repair activity at the binding site in Reb1-bound DNA relative to the naked DNA substrate (Fig. 5F). Reduced BER activity was also observed when inosine was placed on the other strand at the +4 position (Supplemental Fig. S7D and S7E). Hence, these in vitro data, consistent with our cellular damage sequencing data, indicate that BER of 3meA is suppressed by TF binding.
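One common way to express such a gel readout is the fraction of substrate converted to the cleavage product in each lane. The Python sketch below is illustrative only: the text does not give the exact quantification formula, so the definition product / (product + substrate) and the function names are assumptions rather than the procedure used for Fig. 5F.

```python
def fraction_cleaved(product: float, substrate: float) -> float:
    """Fraction of DNA cleaved in one lane: lower band / (lower + upper band).

    Both arguments are background-subtracted band intensities. This
    definition is an assumption made for illustration.
    """
    total = product + substrate
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return product / total


def relative_activity(bound_fraction: float, free_fraction: float) -> float:
    """Cleavage in TF-bound DNA relative to free DNA at the same time point."""
    if free_fraction <= 0:
        raise ValueError("free-DNA cleavage fraction must be positive")
    return bound_fraction / free_fraction
```

Under this definition, a value of `relative_activity` below 1 at a given time point corresponds to slower cleavage of the bound substrate than of the free substrate.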

BER inhibition at TF binding sites is different from NER inhibition
TF binding has been shown to inhibit NER of UV damage (Frigola et [...] and Reb1 binding sites, but elevated in DNA adjacent to the center due to depletion of nucleosomes (Fig. 6C and 6D), similar to the BER pattern (Fig. 2A and 2B). Additionally, GG-NER was also modulated by nucleosomes positioned around the TF binding sites. These analyses indicate that GG-NER is inhibited by both Abf1 and Reb1 at their binding sites.
As the GG-NER pattern at the TF binding sites resembles the BER pattern revealed by our NMP-seq data, we sought to understand if the size of the inhibited DNA region is the same for both repair pathways. A comparison between CPD and 7meG repair indicates that GG-NER was inhibited in a broader DNA region at Abf1 and Reb1 binding sites (Fig. 6E and 6F). While BER (i.e., 7meG repair) was inhibited in ~30-40 bp DNA surrounding the center of the binding motif, inhibition of GG-NER was extended by an additional 10 bp on each side (Fig. 6E and 6F). These high-resolution sequencing data demonstrate the difference between BER and GG-NER at TF binding sites, which is consistent with the different mechanisms underlying NER and BER (see Discussion). [...] MybAD2 and MybR1 and exhibit significantly reduced 7meG formation (Fig. 1E). Positions -1 to -3 (i.e., TAA) are sandwiched between the subsites for MybAD1 and MybAD2, which insert recognition helices into the adjacent DNA major groove. As a result, the minor groove from positions -1 to -3 is strongly compressed in width and increased in depth (Supplemental Fig. S3D). These results suggest that preferential formation of 3meA at position -3 may be facilitated by enhanced minor groove narrowing and DNA curvature by Reb1 binding.
Our data further revealed strong inhibition of BER at Abf1 and Reb1 binding sites.
Compared to NMP damage formation, repair of 7meG lesions was inhibited in a wider DNA region consisting of the core motif and some of the flanking DNA. As mentioned earlier, the conserved nucleotides in the core motif are bound by the TFs and BER enzymes could be sterically hindered from accessing these sites. Even the less conserved nucleotides in the core motif are likely inaccessible to BER enzymes, since the TF protein covers the whole motif region. In addition to the core motif, structural data indicate that some nucleotides in the flanking DNA are bound by the Reb1 protein (Jaiswal et al., 2016). Although an Abf1-DNA complex structure is currently unavailable, it is conceivable that Abf1 also binds to part of the flanking DNA. The strength of protein-DNA interaction in the flanking DNA may not be as high as in the core motif, which still allows damage formation to occur, but it considerably reduces the access of BER enzymes, particularly in DNA immediately adjacent to the core motif. As BER is generally inhibited in TF-bound DNA, damage hotspots induced by TF binding cannot be efficiently repaired and may eventually cause individual mutation hotspots when DNA is replicated. Considering the conserved damage formation and repair mechanisms between yeast and human cells, our findings provide a potential explanation for mutation hotspots at TF binding sites in non-UV exposed tumors.
The comparison between NMP and CPD repair at TF binding sites provides new insights into how TFs affect BER and NER differently. While TF binding inhibits both BER and NER, we found that the affected DNA region is considerably broader in NER compared to BER. NER is inhibited in about 50-60 bp DNA centered on the midpoint of Abf1 or Reb1 binding sites, whereas BER is suppressed in a narrower DNA region (Fig. 6).
The extended inhibition region in NER is consistent with more proteins being involved in NER compared to BER. Moreover, NER requires repair endonucleases to cleave upstream of the 5' side and downstream of the 3' side relative to the lesion, releasing a repair intermediate of ~25 nt (Huang et al., 1992; Schärer, 2013). Although UV damage located outside of the TF binding site may be recognizable by a damage recognition factor such as XPC or yeast Rad4, one of the two repair cleavage sites may still be located within the binding motif and be inaccessible to the repair endonuclease. Hence, the unique 'dual-incision' mechanism of NER is consistent with the broader repair-resistant DNA region around a TF binding site compared to BER.
In summary, we generated high-resolution alkylation damage and BER maps at yeast TF binding sites, which allow us to elucidate how TF binding modulates base damage formation and repair. Considering the potential connection between base damage, BER, and mutations in non-UV exposed tumors, these analyses provide important insights into cancer mutations frequently elevated at TF binding sites. [...]
To damage naked yeast DNA with MMS, genomic DNA was first isolated from WT yeast cells without MMS treatment. All proteins were removed during DNA isolation by using vigorous phenol chloroform extraction, followed by ethanol precipitation. The purified DNA was incubated with MMS for 10 min. After MMS treatment, DNA was purified by phenol chloroform extraction and ethanol precipitation.

NMP-seq library preparation
NMP-seq library preparation was described in our previous study (Mao et al., 2017). Genomic DNA was sonicated to small fragments and ligated to the first adaptor DNA.

TF binding data sets
We used published yeast TF binding data sets in this study. Most analyses were performed using the published ORGANIC binding data (Kasinathan et al., 2014). Binding sites were obtained from experiments using 10 min micrococcal nuclease digestion with 80 mM NaCl, as described in our previous study (Mao et al., 2016). Only binding sites with the canonical Abf1 or Reb1 motif sequence (CGTNNNNNRNKA and TTACCC, respectively) were used for damage and repair analysis. Binding sites that did not match the motif sequences were excluded. Reb1 binding sites were further stratified into 'high-occupancy' (occupancy > 10) and 'low-occupancy' (occupancy ≤ 10) binding sites based on the mapped occupancy levels (Kasinathan et al., 2014).
Some of our repair analyses (e.g., Supplemental Fig. S4B) were confirmed using ChIP-exo TF binding sites. The Abf1, Reb1, and Rap1 binding peaks were determined by mapping genome-wide binding sites in TAP-tagged yeast strains (e.g., Abf1-TAP, Reb1-TAP, and Rap1-TAP) in a recent ChIP-exo study (Rossi et al., 2021). The data were downloaded from the Gene Expression Omnibus, https://www.ncbi.nlm.nih.gov/geo/ (accession number GSE147927).
To identify target genes for Abf1 and Reb1, we searched gene transcription start sites (TSS) to find the closest midpoint of Abf1 or Reb1 binding sites using the ORGANIC datasets. If the TF binding site is located within 300 bp upstream or downstream of the gene TSS, the gene is identified as a putative target gene. Some binding sites are located in the middle of two divergently transcribed genes. In this case, both genes are recognized as target genes.
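The motif filter and occupancy stratification described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and scanning both strands of each site sequence is an assumption. The motifs are written in IUPAC notation, where R is A or G, K is G or T, and N is any base.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}
# Complement table that also handles the ambiguity codes above.
COMPLEMENT = str.maketrans("ACGTRYKMSWN", "TGCAYRMKSWN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence or IUPAC motif."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif such as CGTNNNNNRNKA into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def has_motif(seq: str, motif: str) -> bool:
    """True if the motif occurs on either strand of seq (uppercase A/C/G/T)."""
    forward = motif_regex(motif)
    reverse = motif_regex(reverse_complement(motif))
    return bool(forward.search(seq) or reverse.search(seq))


def occupancy_class(occupancy: float, cutoff: float = 10) -> str:
    """'high' if occupancy > cutoff, otherwise 'low', following the stated thresholds."""
    return "high" if occupancy > cutoff else "low"
```

For example, `has_motif(site_sequence, "CGTNNNNNRNKA")` would keep a candidate Abf1 site, and `occupancy_class` would assign a Reb1 site to the 'high-occupancy' or 'low-occupancy' group.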

NMP-seq data analysis
Analysis of NMP-seq datasets was conducted using our published protocols (Mao et al., 2017). NMP-seq sequencing reads were demultiplexed and aligned to the yeast reference genome (sacCer3) using Bowtie 2 (Langmead and Salzberg, 2012). For each mapped read, we identified the position of its 5' end in the genome using SAMtools (Li et al., 2009) and BEDTools (Quinlan and Hall, 2010). Based on the 5' end position, the single nucleotide immediately upstream of the 5' end was found and the sequence on the opposing strand was identified as the putative NMP lesion. The number of sequencing reads associated with each of the four nucleotides (i.e., A, T, C, and G) was counted to estimate the enrichment of MMS-induced NMP lesions in the sequencing libraries. G reads were typically highly enriched relative to C reads, followed by A reads.
To analyze damage formation and BER at TF binding sites, we extracted G or A reads to analyze 7meG and 3meA lesions, respectively. The number of lesions at each position around the midpoint of Abf1 or Reb1 binding sites was counted using the BEDTools intersect function. For damage formation, the cellular lesion counts were normalized to the naked DNA to account for the impact of DNA sequences on NMP lesion formation. The normalized ratio was scaled to 1.0 and plotted along the TF binding sites (e.g., Fig. 1A to 1C). Plots at single nucleotide resolution (e.g., Fig. 1D to 1E) also show scaled damage ratio between cellular and naked DNA NMP-seq data. [...] nucleotide resolution plots (e.g., Fig. 1D). Alternatively, we analyzed the average damage in a 5-bp non-overlapping moving window to show the average damage and repair in a broader DNA region (e.g., Fig. 1A and 2A).
Some NMP-seq datasets, such as mag1-0 h, WT-1 h and WT-2 h, were downloaded from our published studies (NCBI GEO, accession code GSE98031). New NMP-seq data generated in this study, including NMP data in naked DNA and in anchor-away yeast strains, have been submitted to NCBI GEO (accession code GSE183622). In some of the new samples (e.g., WT-AA, Abf1-AA-rep 2), we added MMS-damaged pUC19 plasmid as a spike-in control to quantify repair efficiency. Hence, the fraction of remaining damage in these samples was normalized by the pUC19 read ratio between 0 h and 2 h.
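Two steps of this analysis can be made concrete with a short Python sketch: locating the putative lesion from the 5' end of a mapped read, and computing the fraction of remaining damage with an optional spike-in correction. The function names and the 0-based, half-open coordinate convention are assumptions. The direction of the spike-in correction (dividing by the 2 h / 0 h pUC19 read ratio) is also an assumption, since the text states only that the pUC19 read ratio was used for normalization.

```python
from typing import Optional, Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def putative_lesion(genome: str, start: int, end: int,
                    strand: str) -> Optional[Tuple[int, str, str]]:
    """Locate the putative NMP lesion for one aligned read.

    start and end are 0-based, half-open reference coordinates of the
    alignment; strand is the strand the read maps to. The lesion is the base
    on the opposing strand at the position immediately upstream of the read
    5' end. Returns (position, lesion_strand, lesion_base), or None when the
    read starts at the edge of the sequence.
    """
    if strand == "+":
        pos = start - 1                # just upstream of a plus-strand 5' end
        if pos < 0:
            return None
        return pos, "-", COMPLEMENT.get(genome[pos], "N")
    pos = end                          # just upstream of a minus-strand 5' end
    if pos >= len(genome):
        return None
    return pos, "+", genome[pos]


def fraction_remaining(after: float, initial: float,
                       spike_after: Optional[float] = None,
                       spike_initial: Optional[float] = None) -> float:
    """Damage after repair divided by initial damage.

    If spike-in read counts are supplied, the fraction is divided by the
    spike-in ratio (after / initial); this correction is an assumed form.
    """
    if initial <= 0:
        raise ValueError("initial damage count must be positive")
    fraction = after / initial
    if spike_after and spike_initial:
        fraction /= spike_after / spike_initial
    return fraction
```

Reads whose `lesion_base` is G or A would then be counted as 7meG or 3meA lesions, respectively, and a higher `fraction_remaining` at a position indicates slower repair there.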

CPD-seq datasets and analysis
Yeast CPD-seq data were downloaded from NCBI GEO (accession code GSE145911). Analysis of CPD repair at Abf1 and Reb1 binding sites was performed using the same method described in NMP-seq data analysis. [...] buffer (see Fig. 5D). The binding reaction was incubated on ice for 40 minutes. Free DNA and DNA bound by Reb1 were loaded onto a 12% native PAGE gel and separated by electrophoresis at 200 V for 30 minutes. The gel was exposed to a phosphor screen and the image was scanned using a Typhoon FLA7000 scanner (GE Healthcare). Gel quantification was performed with the ImageQuant software (GE Healthcare).
For BER assays, equal amounts of naked DNA and DNA bound by Reb1 protein (~5 pmol) were incubated with AAG (10 units) and APE1 (1 unit) (New England Biolabs) in a 20 µL reaction containing 1X Thermopol buffer (20 mM Tris-HCl pH 8.8, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100) at 37°C for 15, 30, 45 and 60 minutes. After BER cleavage, DNA was purified using phenol:chloroform:isoamyl alcohol extraction and precipitated using ethanol. The purified DNA was resuspended in formamide (80%) and denatured at 95°C for 10 minutes. The denatured DNA was analyzed by electrophoresis at 200 V for 30 min using 12% polyacrylamide urea gels. The gel was exposed to a phosphor screen, imaged using a Typhoon FLA7000 scanner, and quantified with ImageQuant.