Quantitation, base identity and distribution of stably incorporated ribonucleotides in nuclear and mitochondrial DNA from murine tissues

Ribonucleotides are estimated to be the most common non-canonical nucleotides transiently incorporated in DNA. Their presence or failure of their removal can affect genome stability and mutations in factors involved in dNTP pool maintenance or ribonucleotide removal can cause Aicardi-Goutières syndrome or promote certain human cancers. Here, we have mapped and quantitated ribonucleotides genome-wide, in nine tissues of wild-type mice. We observed tissue-specific variation in number and base identity of incorporated ribonucleotides and present evidence that a number of genomic features, such as tRNA genes, transcription start sites and G-quadruplexes, can increase the frequency of stably incorporated ribonucleotides in their proximity. Moreover, we present the non-random distribution of incorporated ribonucleotides in mtDNA and identified ribonucleotide hotspots. The study presents a framework to understand the physiological role of ribonucleotides in mammalian DNA.


INTRODUCTION
Since its discovery in 1869, DNA has been of great interest in research and has been found to be the central hereditary molecule 1. Without any known exceptions, the genetic information of all eukaryotic life is stored in the form of DNA 2. Although some genetic diversity is beneficial from an evolutionary point of view in terms of population persistence and individual fitness 3, DNA damage and mutations can have profound effects on the health and aging of an organism. It is therefore of great importance for each cell to maintain the integrity of its genetic material. The integrity of the DNA is, however, constantly challenged by intra- and extracellular factors: DNA damaging agents and irradiation were investigated early on 4, but more recently, knowledge of intracellular mechanisms affecting genome integrity has been emerging 5. Aside from the elucidation of DNA repair mechanisms, as well as error-prone and error-free translesion synthesis, a fundamental realization was the asymmetry of deoxyribonucleotide (dNTP) and ribonucleotide (rNTP) concentrations in the cell and later on in mitochondria 6,7. Already in 1994, Thomas W. Traut compiled a list of nucleotide (NTP) concentrations illustrating the generally great excess of rNTPs compared to dNTPs and the variation of these concentrations between different cell types, tissues and organisms 6.

Eukaryotic DNA is replicated by the three major replicative DNA polymerases α, δ and ε, which discriminate against rNTPs with a so-called "steric gate" residue near the polymerases' active site 8. This discrimination, however, is not perfect, and a direct consequence of the great rNTP excess is the relatively frequent misincorporation of ribonucleotides by replicative DNA polymerases. These events possibly make ribonucleotides the most common non-canonical nucleotides incorporated into the DNA 9.
While ribonucleotide incorporation may temporarily prove to be beneficial by mitigating challenging replication scenarios such as dNTP shortage 10 and seems to fulfill useful functions such as allowing discrimination of parent and nascent strands during mismatch repair 11 or serving as imprints for facilitating mating type switch in one of two S. pombe daughter cells 12, the increased inclination of ribonucleotides to hydrolyze due to their 2´-hydroxyl group 13 can cause nicked DNA and negatively affect genome stability 14, resulting in associated pathologies 15,16.

Since the incorporation of short RNA primers during the synthesis of Okazaki fragments is necessary and misincorporation of ribonucleotides by DNA polymerases happens frequently, mammalian cells possess multiple mechanisms to remove single ribonucleotides or short stretches of ribonucleotides from the DNA, as summarized by Sassa et al. 17. RNA primers are removed either via DNA polymerase δ performing strand-displacement synthesis, displacing the primer from the previous Okazaki fragment and thereby forming a 5´-flap that is subsequently cleaved by flap endonuclease 1 so that the resulting nick can be ligated by DNA ligase I 18, or via RNA:DNA hybrid removal facilitated by the RNases H 19. The removal of single ribonucleotides is mostly achieved by a mechanism called Ribonucleotide Excision Repair (RER) initiated by RNase H2, which cleaves 5´ of embedded ribonucleotides in double-stranded DNA. Recent in vitro findings suggest a possible alternative RER pathway via the human DEAD-box RNA helicase DDX3X, which showed RNase H2-like activity 20.
Topoisomerase 1 (TOP1) is also able to mediate the removal of a ribonucleotide but is more likely to generate short deletions at repetitive sequences or double-strand breaks 21 [...]

[...] -seq method coupled with cleavage by the restriction enzyme SacI to map and quantitate stably incorporated ribonucleotides in the nuclear and mitochondrial DNA from nine different mouse tissues: blood, bone marrow, brain, heart, kidney, liver, lung, muscle and spleen. Since we use a wild-type mouse strain proficient in RER, this study for the first time gives a more comprehensive view of the ribonucleotide landscape within the DNA in vivo. We determined the frequency and base identity of incorporated ribonucleotides and provide evidence that in a wild-type mouse, ribonucleotide removal mechanisms are imperfect, and ribonucleotides remain permanently incorporated in both nDNA and mtDNA. Moreover, we provide evidence that incorporated ribonucleotides are not randomly distributed throughout the nuclear and mitochondrial DNA but are over- or underrepresented near certain genomic features in nDNA. Similarly, we observed a non-random distribution of ribonucleotides in mtDNA with distinct hotspots throughout the molecule.

Based on the knowledge that dNTP and rNTP pools vary between tissues 6 and that their concentrations may affect how many and which ribonucleotides are misincorporated into the DNA 25, we hypothesized that incorporated ribonucleotides vary in number and base identity in different mouse tissues. To test this hypothesis, we used the HydEn-seq method coupled with SacI digestion for the mapping and quantitation of incorporated ribonucleotides in nine tissues (blood, bone marrow, brain, heart, kidney, liver, lung, muscle and spleen) from six male wild-type mice. For quantitation of incorporated ribonucleotides, two HydEn-seq libraries were prepared for each sample: both were cleaved with SacI, then one underwent alkaline hydrolysis with KOH and the other was treated with KCl instead. All reads at 5´-ends were normalized to the mean reads at the SacI cleavage sites, and reads at 5´-ends from KCl-treated libraries were subtracted from the corresponding KOH-treated libraries in order to determine the number of reads corresponding to incorporated ribonucleotides alone. Furthermore, the number of incorporated ribonucleotides was expressed per 1 kb for comparison.

Number of embedded ribonucleotides varies between nDNA and mtDNA and is tissue-dependent
In nDNA (Fig. 1 A), we determined that approximately 1 ribonucleotide per kb was present, and the variation in numbers of permanently incorporated ribonucleotides between tissues was relatively small, with kidney and liver containing the least and spleen, lung and heart containing the most incorporated ribonucleotides. As hypothesized based on the lack of efficient ribonucleotide removal in mtDNA (Fig. 1 B), we observed more variance between tissues, with approximately 1 to 8 stably incorporated ribonucleotides per 1 kb in mtDNA and with lung, spleen and muscle containing markedly more ribonucleotides than the other tissues. It is conceivable that tissue-specific differences in ribonucleotide incorporation exist even in nDNA, but that such differences are masked by the presence of efficient ribonucleotide removal mechanisms in the nucleus. To investigate the variation in the number of incorporated ribonucleotides among tissues, we performed Welch's t-test and reported the p-values in matrices. As evident from Figure 1 C and D, with several p-values below 0.05, the number of embedded ribonucleotides is statistically significantly different between multiple tissues: In nDNA (Fig. 1 C), the number of incorporated ribonucleotides did not vary greatly, but statistically significant differences were found between spleen and all tissues except lung and blood, and among kidney, liver and heart, which all showed little variation within the sample. On both strands of mtDNA (Fig. 1 D) from lung and spleen, the number of incorporated ribonucleotides was statistically significantly higher than in other tissues. The number of incorporated ribonucleotides varies not only between tissues (Fig. 1 A-D) but in the case of mtDNA also between strands (Fig. 1 B), presumably due to the unequal base composition (Fig. 2 C) between the two strands.
We found the light strand to consistently contain about 1.5- to 2-fold more incorporated ribonucleotides than the heavy strand. Taken together, these results show that ribonucleotides are frequently present, with around 1 ribonucleotide per kb in nDNA or approximately 5.2 million ribonucleotides per nuclear mouse genome despite efficient mechanisms of ribonucleotide removal, and that ribonucleotides in mtDNA are more frequent than in nDNA, with about 1 to 8 ribonucleotides per kb or 16 to 128 ribonucleotides per mtDNA molecule depending on the tissue, probably due to the lack of efficient removal mechanisms. The number of permanently incorporated ribonucleotides moreover varies depending on the tissue in both nDNA and mtDNA.

The base identity of incorporated ribonucleotides shows tissue-specific variation
Variation between tissues is not only reflected in the total numbers of incorporated ribonucleotides, but also in the base identity of these ribonucleotides. Figure [...] find rA to be preferentially incorporated, followed by rG, which was overrepresented in kidney and lung as well. In contrast, rC occurred rarely, and the presence of rU was minimal in mtDNA. The high frequency of rA incorporation in mtDNA might also in part explain the higher total of incorporated ribonucleotides observed on the light strand, which contains more adenine than the heavy strand (Fig. 2 C). The difference between nDNA and mtDNA samples may be the result of multiple differences between nucleus and mitochondria: Since nDNA and mtDNA are synthesized by different replicative DNA polymerases, DNA polymerases α, δ and ε in the nucleus and DNA polymerase γ in mitochondria, the architecture of each DNA polymerase affects which types of ribonucleotides are most likely incorporated by it and how frequently 9.
Furthermore, ribonucleotide repair mechanisms may have varying removal efficiency depending on the base identity and sequence context, possibly obscuring biases in incorporation by the nuclear DNA polymerases. As shown earlier in mtDNA, ribonucleotide incorporation is also directly affected by the rNTP and dNTP pools 25. Consistent with earlier findings in mouse liver, heart and brain 16 and with rNTP and dNTP pool measurements in mouse liver mitochondria that showed an abundance of adenosine triphosphate (ATP) 28, we found rA to be most commonly incorporated in mtDNA, with the highest percentage of incorporated rA in muscle mtDNA. Overall, these results suggest that the base identity of stably embedded ribonucleotides has tissue-specific variation in both nDNA and mtDNA and strengthen earlier evidence that the absence of ribonucleotide repair mechanisms 29 and the high ATP concentrations in mitochondria are the main reasons for the frequent presence of rA in mtDNA.

Figure 2. Base identity of incorporated ribonucleotides between tissues and nDNA and mtDNA.
A Base composition of murine nDNA. B Ribonucleotide base identity in nDNA from blood, bone marrow, brain, heart, kidney, liver, lung, muscle and spleen. Error bars represent the standard deviation (SD [...]

[...] predicted intrastrand G-quadruplexes (G4s) by using a custom script searching for the pattern (G3+ N1-25)3 G3+, that is, three repeats of a run of at least three guanines followed by 1 to 25 arbitrary nucleotides, closed by a final run of at least three guanines, in the Mus musculus reference genome (mm10/GRCm38).

We filtered our ribonucleotide mapping data by the positions listed for each region of interest (ROI) and then calculated the mean number of incorporated ribonucleotides in bins before and after the ROI start positions. Enrichment plots (Fig. 3) were generated by dividing the mean number of incorporated ribonucleotides at ROI by the mean number from a random data set subjected to the same normalizations and binning. The random data set was generated by picking 1,000 random positions per chromosome. Display window and bin sizes were adjusted in relation to the ribonucleotide patterns observed at ROIs and spanned either 10 kb, 4 kb, 1 kb or 500 b up- and downstream of the ROI start position with 500 b-, 200 b-, 50 b- or 25 b-bins, respectively. The analysis was performed separately for the same and opposite strand if strand information was available; otherwise reads on both strands were counted.

Our analysis of 863 enhancers (Fig. 3 A) showed a 4- to 5-fold enrichment of ribonucleotides at about 0.8 to 1 kb on the same strand as the enhancers and a nearly 4-fold enrichment at about -2 kb and -1.6 kb on the opposite strand. Investigating 1,048,573 intrastrand G4s (Fig. 3 B [...] downstream, while we did not find any marked increase on the opposite strand. When analyzing 173,204 TSS (Fig. 3 D), we found a 30- to 35-fold enrichment of ribonucleotides near the TSS start point, decreasing to about 5-fold enrichment at around 5 kb up- and downstream, then increasing again in the periphery.
Given that tRNA genes are short, we suspect that the peripheral increase in incorporated ribonucleotides may be a result of neighboring features that also show enrichment of ribonucleotides. We could not discern a clear pattern for the 25,111 promoters (Fig. 3 E), though there might be a slight overrepresentation of ribonucleotides on the opposite strand near the promoter start position. Since CpG islands and microsatellites are genomic features on both strands, we calculated the enrichment of ribonucleotides on both strands. Our analysis of 16,023 CpG islands (Fig. 3 F) [...]

[...] We calculated the mean number of ribonucleotides at each position of the heavy and light strand of the mtDNA. As shown in Fig. 4 A for the heavy strand and Fig. 4 B for the light strand, the distribution of ribonucleotides is heterogeneous, and ribonucleotides are found at distinct hotspots throughout each strand. To best fit the data, the upper bound for the y-axis was set to 0.003 ribonucleotides per heavy strand and 0.005 ribonucleotides per light strand. The quantitation in Fig. 1 B puts the average total number of ribonucleotides on the light strand at around 3.5, which equates to about 0.0002 ribonucleotides at each position, and on the heavy strand at around 1.6 ribonucleotides or circa 0.0001 ribonucleotides at each position. Against this background, it is evident that certain positions are more prone to contain ribonucleotides, as they show 10- to 20-fold higher numbers.
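The per-position expectation quoted above follows from simple arithmetic, sketched here in Python. The mtDNA length of 16,299 bp (the mm10 mitochondrial reference) is an assumption that is not stated in the text, as are the function names; the strand totals are the approximate values given for Fig. 1 B.

```python
# Sketch: expected ribonucleotides per position if they were spread uniformly
# along one mtDNA strand, and the fold increase of a hotspot over that value.
MTDNA_LENGTH_BP = 16_299  # assumption: length of the mm10 mitochondrial reference


def expected_per_position(total_per_strand, length_bp=MTDNA_LENGTH_BP):
    """Uniform expectation: strand total divided by the number of positions."""
    return total_per_strand / length_bp


def fold_over_uniform(observed, total_per_strand, length_bp=MTDNA_LENGTH_BP):
    """How many times higher an observed per-position value is than uniform."""
    return observed / expected_per_position(total_per_strand, length_bp)


light = expected_per_position(3.5)  # light strand, about 3.5 ribonucleotides
heavy = expected_per_position(1.6)  # heavy strand, about 1.6 ribonucleotides
print(f"{light:.4f} {heavy:.4f}")   # prints "0.0002 0.0001"
```

Under these assumptions, a position would have to carry roughly 0.002 to 0.004 ribonucleotides on the light strand to reach the 10- to 20-fold range mentioned in the text.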

Figure 4. Distribution of ribonucleotides in mtDNA.
Genomic positions on the mtDNA are indicated in kb on the outer ring and coding regions ( [...]

[...] Based upon this, we hypothesized that ribonucleotides are underrepresented near the origin of heavy strand replication (OriH) and the origin of light strand replication (OriL). Surprisingly, we observed a distinct ribonucleotide peak at OriH at position 16,032 on the heavy strand ( [...]

[...] Since incorporated ribonucleotides are estimated to be the most common non-canonical nucleotides in eukaryotic DNA 9, are much more unstable than deoxyribonucleotides 13 and have the potential to alter the DNA structure 41,42, they can have significant consequences for genome stability. Determining their distribution in the mammalian genome and in relation to common genomic features is a first step toward discerning precisely what determines their appearance, permanent incorporation, recognition by or protection from removal mechanisms, and the resulting beneficial or detrimental effects of their presence. Here we have for the first time determined the number and identity of incorporated ribonucleotides in wild-type mouse nDNA and mtDNA across nine tissues: blood, bone marrow, brain, heart, kidney, liver, lung, muscle and spleen. The number of incorporated ribonucleotides and their base identity varied between the nDNA of the investigated tissues and to a greater extent also in mtDNA. Our estimate of about 5.2 million stably incorporated ribonucleotides per murine nuclear genome suggests that ribonucleotides are the most common non-canonical nucleotides in DNA. This number is higher than anticipated from RNase H2-deficient murine embryonic fibroblasts (MEFs), where incorporation of 1 ribonucleotide per 7.6 kb, equivalent to about 1.3 million ribonucleotides per genome, was estimated and a near-complete repair was expected 43.
Several factors may contribute to the different estimates of the number of incorporated ribonucleotides in MEFs and in the investigated mouse tissues: differences between tissues (Fig. 1 A and B) and the fact that our tissues were obtained from 7.5- or 30-week-old mice rather than embryos, allowing an accumulation over time, are conceivable contributing factors. Furthermore, the incorporated ribonucleotides may be the result of the cell cycle-dependent regulation of RNase H2, which seems to limit its activity to the G2 phase 44. Ribonucleotide removal efficiency might be further restricted by impeded recognition of incorporated ribonucleotides through masking by other DNA-interacting or -binding proteins, as suggested by our findings. While the differences between nDNA and mtDNA are most likely explained by the lack of efficient ribonucleotide removal in mitochondria 25,29 and differences in sugar selectivity between the replicative DNA polymerases and mtDNA polymerase γ 8, the tissue-specific differences can probably be attributed to factors determined by the specific characteristics of the cell types of each given tissue. [...]

[...] RNase H2 was also shown to be important in the prevention of intestinal tumorigenesis in mice 54. Acetylated SAMHD1, which has increased dNTPase activity, is implicated in promoting cancer cell proliferation in both tumor cells from hepatocarcinoma patients and HeLa cells 55. CRISPR screening identified TOP1-incised embedded ribonucleotides as poly(ADP-ribose) polymerase (PARP) trapping lesions that impede DNA replication and affect PARP inhibitor cytotoxicity 56. This might be exploited to sensitize cells in the treatment of prostate cancer and chronic lymphocytic leukemia, which are frequently deficient in RNASEH2B 57.
The accumulation of incorporated ribonucleotides in the absence of RER can furthermore lead to unrepaired nicks that may contribute to the pathology of ataxia with oculomotor apraxia 1 58. Some pathologies are more specific to the mitochondrial NTP pools and to mtDNA integrity in particular. Several forms of mtDNA depletion syndromes can be caused by disruption of the proteins involved in maintaining mitochondrial dNTP pools, which limits the dNTPs available for mtDNA replication 59. Mutations in RNASEH1, which is responsible for primer removal in mtDNA 37, can cause adult-onset mitochondrial encephalomyopathy 60.

Our meta-analyses of genomic features imply that the non-random distribution of ribonucleotides in the murine nDNA is probably the product of a variety of processes affecting incorporation and removal depending on the genomic feature. Because cleavage of an RNA-DNA junction for ribonucleotide removal requires access of the enzyme's active site 61, it seems reasonable to suspect that DNA-binding proteins or other DNA-interacting proteins, as well as structural distortion in the case of G4s (Fig. 3 B), may mask incorporated ribonucleotides from recognition by RNase H2 or TOP1, but mechanistic experiments are needed to test this hypothesis.

While an accumulation of incorporated ribonucleotides generally seems to have negative effects as described above, a growing number of studies suggest beneficial roles of transiently incorporated ribonucleotides in DNA repair intermediates, illustrating the balancing act between advantageous and detrimental ribonucleotide incorporation. Newly embedded ribonucleotides were found to serve as a strand discrimination signal in eukaryotic mismatch repair, and mismatch repair efficiency is decreased in the absence of RNase H2 11.
In the repair of double-strand breaks by mammalian non-homologous end-joining, ribonucleotides incorporated by DNA polymerase μ or terminal deoxynucleotidyl transferase were shown to facilitate first-strand ligation in as many as 65% of non-homologous end-joining products 62 [...]

[...] Ribonucleotides embedded in the mtDNA seem to be tolerated better than in nDNA, as we observed a higher frequency of ribonucleotides in mtDNA of wild-type mice (Fig. 1 B) and given the fact that mtDNA stability does not seem to be negatively impacted by the age-related accumulation of ribonucleotides in mtDNA due to the lack of efficient ribonucleotide removal 26. It is therefore of interest to understand if their presence contributes to mtDNA instability in other ways. In the mouse model for a mitochondrial DNA depletion syndrome caused by MPV17 deficiency, mtDNA deletions and an increased number of incorporated rG were observed in the brains of MPV17 knockout mice. It was therefore hypothesized that those ribonucleotides may be involved in causing these deletions 16,40. Interestingly, mutational hotspots identified by ultra-deep sequencing in mouse mtDNA only coincided with the ribonucleotide peak we observed near OriL at 8,155 (Fig. 5 D), while none of the ribonucleotide peaks listed in Figure [...]

[...] and three 30-week-old male mice (C57BL6) were used for our experiments. Mice were euthanized by placing them into an induction chamber (Abbott Scandinavia AB) for 10 min with 4% isoflurane (Forene isoflurane, Abbott Scandinavia AB) followed by cervical dislocation. Whole blood was immediately harvested and the heart surgically excised thereafter. 300-600 µL of whole blood were thoroughly mixed with 30 µL 0.1 M EDTA to prevent coagulation, and 200 µL aliquots were transferred to 1.5 mL tubes and frozen at -80 °C.
Tibia and femur bones (for extraction of bone marrow), brain, heart, kidney, liver, lung, thigh muscle and spleen were removed and small sections were flash frozen with liquid nitrogen, then stored at -80 °C until further processing.

DNA extraction
DNA was extracted using the MasterPure Complete DNA and RNA purification kit (MC85201, Lucigen), preceded by adjusted approaches for homogenization and lysis of tissues. DNA from whole blood was extracted using the kit following the manufacturer's instructions for the extraction from whole blood with RBC lysis. DNA from bone marrow originated from tibia and fibula. Bones were freed from any remaining tissue and broken at the knee joints. Placing this opening downward in a perforated 0.5 mL tube set in a 1.5 mL tube allowed the bone marrow to be centrifuged out at 10,000 g for 20 s. 100 µL of RBC lysis buffer were added and mixed, and samples were incubated at RT for 10 min while vortexing every 5 min. Samples were centrifuged at 10,000 g for 25 s and the supernatant was discarded. 300 µL of tissue and cell lysis buffer including proteinase K were added and samples were incubated at 65 °C for 15 min while shaking at 1,500 rpm. Further extraction steps were performed following the kit's instructions. 2-70 mg of brain, heart, kidney, liver, lung, muscle or spleen tissue were added to a 1.5 mL tube containing 5-15 stainless steel homogenizing beads (0.9-2 mm, SSB14B, next advanced Inc.) and 100 µL tissue and cell lysis buffer including proteinase K. [...]

[...] replicates for each tissue came from different mice (n = 6 for blood, brain, heart, kidney, liver and muscle; n = 5 for bone marrow, lung and spleen). All tissue samples were summarized in the analyses for Figures 3 to 5 (n = 51).

Sequence trimming, filtering and alignment
Cutadapt 1.12 68 was used for quality and adaptor sequence trimming. Pairs with one or both reads shorter than 15 nt were discarded. Mate 1 of the remaining pairs was aligned to the list of index primers used to prepare the libraries using Bowtie 1.2 and all matching pairs were removed.
Remaining pairs were aligned to the mm10/GRCm38 Mus musculus reference genome (https://www.ncbi.nlm.nih.gov/grc) using Bowtie (-v 2 -X 2000 --best). Then, single-end alignments were performed for mate 1 of all unaligned pairs (-m 1 -v 2). The count of 5´-ends of all unique paired-end and single-end alignments was determined and shifted one base upstream to represent the location of the original embedded ribonucleotide. For the mtDNA, reads were uniquely aligned only to the mitochondrial chromosome.
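The one-base upstream shift of 5´-ends can be illustrated with a minimal Python sketch. It assumes that alignments are available as tuples of chromosome, 0-based half-open start and end, and strand, and that the ribonucleotide lies on the same strand as the aligned mate 1; the function names and the input layout are hypothetical and not taken from the scripts used in this study.

```python
from collections import Counter


def ribonucleotide_position(start, end, strand):
    """Map an alignment to the base immediately upstream of its 5'-end.

    start, end: 0-based, half-open interval of the aligned read (assumption).
    """
    if strand == "+":
        # The 5'-end is at `start`; upstream on the plus strand is start - 1.
        return start - 1
    if strand == "-":
        # The 5'-end is at `end - 1`; upstream on the minus strand is `end`.
        return end
    raise ValueError(f"unknown strand: {strand!r}")


def count_ribonucleotides(alignments):
    """Count shifted 5'-ends per (chromosome, position, strand)."""
    counts = Counter()
    for chrom, start, end, strand in alignments:
        pos = ribonucleotide_position(start, end, strand)
        counts[(chrom, pos, strand)] += 1
    return counts


example = [("chrM", 100, 150, "+"), ("chrM", 100, 140, "+"), ("chrM", 60, 100, "-")]
counts = count_ribonucleotides(example)
```

In this toy input, the two plus-strand reads both point to position 99 and the minus-strand read points to position 100.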

Quantitation of incorporated ribonucleotides
For comparison between libraries, visual representations and meta-analyses, all end counts were normalized to the mean reads per SacI cleavage site detected in each sample. Normalized 5´-end counts from KCl-treated control samples were subtracted from the corresponding KOH-treated samples to remove counts from free 5´-ends that were not caused by alkaline hydrolysis.
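The normalization and background subtraction described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the scripts used here: end counts are held in dictionaries keyed by position, SacI sites are given as a set of positions, SacI positions themselves are left out of the result, and negative differences are kept as they are; all names are hypothetical.

```python
def normalize_to_saci(end_counts, saci_sites):
    """Divide every 5'-end count by the mean count at SacI cleavage sites."""
    mean_saci = sum(end_counts.get(site, 0) for site in saci_sites) / len(saci_sites)
    if mean_saci == 0:
        raise ValueError("no reads at SacI sites; cannot normalize")
    # Assumption: SacI positions are excluded from the ribonucleotide signal.
    return {pos: n / mean_saci for pos, n in end_counts.items() if pos not in saci_sites}


def ribonucleotide_signal(koh_counts, kcl_counts, saci_sites):
    """Normalized KOH counts minus normalized KCl counts, per position."""
    koh = normalize_to_saci(koh_counts, saci_sites)
    kcl = normalize_to_saci(kcl_counts, saci_sites)
    positions = sorted(set(koh) | set(kcl))
    return {pos: koh.get(pos, 0.0) - kcl.get(pos, 0.0) for pos in positions}


saci = {10, 20}
koh_library = {10: 40, 20: 60, 5: 10, 7: 5}  # toy counts, KOH-treated
kcl_library = {10: 20, 20: 20, 5: 2}         # toy counts, KCl-treated control
signal = ribonucleotide_signal(koh_library, kcl_library, saci)
```

Because both libraries are scaled by their own SacI mean, the resulting values can be read as ribonucleotides per genome copy at each position, which is what allows libraries of different sequencing depth to be compared.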

Analysis of ribonucleotides near genomic features
A list of confirmed mouse TSS was retrieved from the RefTSS data set (http://reftss.clst.riken.jp/reftss/) 36. Positions of intrastrand G4s were predicted by using a custom script searching the mm10 reference genome for the motif (G3+ N1-25)3 G3+. Other genomic positions for ROI were acquired from the UCSC [...]

[...] ribonucleotide enrichment or depletion, we generated a data set of 21,000 random genomic positions by picking 1,000 random positions from each chromosome. Data processing and visualization were performed using custom scripts: All 5´-end counts were normalized to the mean reads per SacI cleavage site, and normalized end counts from KCl-treated samples were subtracted from the corresponding SacI-normalized end counts of KOH-treated samples to obtain only reads at incorporated ribonucleotides. All reads for ribonucleotides were collected near each ROI start position, including a 10 kb, 4 kb, 1 kb or 500 b window up- and downstream of the ROI, and binned in 500 b-, 200 b-, 50 b- or 25 b-bins, respectively. The mean number of ribonucleotides per bin was calculated. The same procedure was performed with the random data set, which was then used to calculate ribonucleotide enrichment.

Similarly, mtDNA visualization (Fig. 4 and 5) [...] are provided in Supplementary Figure 1.
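The G4 motif search and the windowed, binned enrichment calculation for nuclear features can be sketched in Python as follows. This is a minimal illustration, not the custom scripts used in this study: the regular expression is one possible reading of the motif (G3+ N1-25)3 G3+ on a single strand with non-overlapping matches, the ribonucleotide signal is assumed to be a dictionary keyed by position on one chromosome and strand, feature orientation is ignored, and all function names are hypothetical.

```python
import re

# Assumed reading of the motif: three repeats of (>=3 G, then 1-25 of any base),
# closed by a final run of >=3 G. Only the given strand is searched.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGTN]{1,25}){3}G{3,}")


def find_g4_starts(sequence):
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in G4_PATTERN.finditer(sequence.upper())]


def mean_binned_profile(signal, roi_starts, window, bin_size):
    """Mean ribonucleotide signal per bin in [-window, +window) around ROI starts."""
    if (2 * window) % bin_size != 0:
        raise ValueError("window must be a multiple of the bin size")
    totals = [0.0] * ((2 * window) // bin_size)
    for start in roi_starts:
        for offset in range(-window, window):
            totals[(offset + window) // bin_size] += signal.get(start + offset, 0.0)
    return [total / len(roi_starts) for total in totals]


def enrichment(roi_profile, random_profile):
    """Bin-wise ratio of the ROI profile to the random-position profile."""
    return [r / b if b > 0 else float("nan") for r, b in zip(roi_profile, random_profile)]


g4_starts = find_g4_starts("AAGGGTTGGGAAGGGCCGGGTT")
profile = mean_binned_profile({100: 2.0, 105: 1.0}, [100], window=10, bin_size=5)
fold = enrichment(profile, [1.0, 1.0, 1.0, 1.0])
```

With the window and bin sizes given above (for example 10 kb windows with 500 b bins), the same functions would be applied once to the ROI start positions and once to the random positions before taking the ratio; a pure-Python loop like this is only meant to show the logic and would be slow on genome-scale data.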