Transposable element landscape changes are buffered by RNA silencing in aging Drosophila

Genetic mechanisms that repress transposable elements (TEs) in young animals decline during aging, as reflected by increased TE expression in aged animals. Does increased TE expression during aging lead to more genomic TE copies in older animals? To answer this question, we quantified TE Landscapes (TLs) via whole genome sequencing of young and aged Drosophila strains of wild-type and mutant backgrounds. We quantified TLs in whole flies and dissected brains and validated the feasibility of our approach in detecting new TE insertions in aging Drosophila genomes when natural defenses like RNA interference (RNAi) pathways are compromised. By also incorporating droplet digital PCR to validate genomic TE loads, we confirm TL changes can occur in a single lifespan of Drosophila when TEs are not suppressed. We also describe improved sequencing methods to quantify extra-chromosomal DNA circles (eccDNAs) in Drosophila as an additional source of TE copies that accumulate during aging. Lastly, to combat the natural progression of aging-associated TE expression, we show that knocking down PAF1, a conserved transcription elongation factor that antagonizes RNAi pathways, may bolster suppression of TEs during aging and extend lifespan. Our study suggests that RNAi mechanisms generally mitigate genomic TL expansion despite the increase in TE transcripts during aging.


INTRODUCTION
All animal genomes carry the genetic burden of a sizeable reservoir of parasitic 2 elements called transposons or transposable elements (TEs). This TE burden can 3 range from the extreme >70% proportion of the axolotl genome [1, 2] to >50% in the 4 human genome [3] to >10% in the Drosophila melanogaster genome [4,5]. TEs are 5 selfish invaders of animal genomes with some potential for stimulating more rapid gene 6 regulatory innovations like serving as novel enhancers [6], but more frequently are 7 detrimental to animal fitness when they insert into and disrupt expression of important 8 genes [7]. Therefore, conserved chromatin regulation and RNA-interference (RNAi) 9 pathways must silence TEs to ensure fertility and animal health. However, these 10 genomic defense mechanisms also weaken during animal aging concomitant with 11 observable decreases in genomic integrity in aging cells. This phenomenon has been 12 articulated in the hypothesis of TEs impacting aging [8]. histone methyltransferases like Suv39h1 and G9A [35][36][37]. In addition, there are DNA 5 methyltransferases that genetically interact with the piRNA pathway to target TEs for 6 chromatin silencing in mammalian germ cells [38][39][40][41][42][43][44]. In primates and mice, the most 7 active TE is the LINE-L1 which is implicated in somatic genome mosaicism in 8 developing brains and individual neurons [45][46][47][48][49][50][51][52][53]. Lastly, LINE-L1 is linked to 9 deleterious novel mutations in tumors and they are activated in cell culture models of 10 cellular senescence [54][55][56][57][58][59]. Although TE control is clearly important to mammalian 11 health, the large genome sizes and longer lifespans hampers comprehensive 12 assessments of mammalian TLs during impact aging. 13 Therefore, in this study we leveraged Drosophila's rapid aging, its compact 14 genome and powerful genetic tools as significant advantages for testing how TE of genetic factors regulating TE silencing [66,67]. 5 In this study, we demonstrate how WGS and extrachromosomal circular DNA 6 (eccDNA) sequencing of aged and young flies can report changes in TLs during fly 7 aging. Although TE RNA upregulation is a recurring phenotype of aging wild-type flies, 8 we show that major changes in the genomic TLs are generally suppressed by the RNAi 9 pathway because RNAi mutants allow TEs to expand their genomic DNA (gDNA) copy 10 numbers. We also demonstrate that tissue-specific (i.e. fly brain tissues) gDNA 11 sequencing can sensitize the detection of genomic TL changes; and eccDNA 12 accumulation during aging of the ISO1 strain is an additional feature of the hypothesis 13 of TEs impacting animal aging. Lastly, we show that genetically boosting RNAi activity 14 in aged flies via knockdown of PAF1 can suppress TE RNAs and extend longevity. observation for three commonly used WT fly strains that would form the basis of this 1 study. Using our lab's standard rearing conditions, we first determined the aging curves 2 for the ISO1 strain used for the D. melanogaster reference genome sequence [4], an 3 isogenic w1118 strain that is a common background strain in genetic studies [68], and 4 the Oregon-R strain used in a series of functional genomics datasets [69]. Whereas 5 w1118 and OreR displayed lifespans typical of other WT fly strains (Figure 1A) w1118 and OreR (Figure 2A). 22 As expected, each of these WT flies TLs displayed completely distinct  Therefore, we updated TIDAL to map to Drosophila TE families consensus 19 sequence coverage, and added arbitrarily-selected protein coding genes, analogous to 20 the modification to TEMP to track protein-coding genes as Immobile gene elements 21 (IGEs) [60]. We gauged a relatively low average rate (<4%, Table S1) of false positive 22 split reads called by TIDAL in hitting IGEs, whereas these protein coding genes 23 sequencing coverage generally also remained stable between 5-day young and 30-day 24 aged flies. (Figure S3A). Tracking TE consensus sequence coverage has the 1 advantage of accounting for all accumulating TE sequences in both the mappable and 2 unassembled and ambiguous-mapping regions of the genome. With this analysis 3 approach, we could detect a clearer increase of total TE sequence coverage in OreR 4 and w1118 30-day aged flies versus 5-day young flies (Fig. S3A). This overall aging-5 associated increase in TE consensus sequence coverage was represented by many TE 6 families and is controlled against stable protein-coding gene coverage (Fig. S3B). However, both this coverage analysis and the quantile insertions analysis cannot 8 discriminate between a full-length or truncated TE sequence, which we have noted in P-9 elements can have critically variable transposition activities [73].

11
Resolving and validating our approaches measuring TE landscapes with RNAi mutants. 12 This unresolved genomics challenge of using short read WGS data for analyzing 13 TE sequences coverage also extends to some limitations in using droplet digital PCR 14 (ddPCR) to precisely quantify genomic TE copies for only the isoforms covered by the 15 short ddPCR amplicons [45]. Although the ddPCR quantifications of specific TE copies 16 ( Fig. S3C) followed the similar proportional trends of TE families called by TIDAL ( Fig.   17 2B), the aging-associated increases in TE copies measured by ddPCR in the WT fly 18 strains was also not detected. In questioning the accuracy of this ddPCR assay in 19 absolute quantification of TE copies, we compared ddPCR results on P-elements and I- 20 elements versus WGS and TIDAL determinations in two other directly matched gDNA 21 samples ( Figure S4A,B). The ddPCR copy number measurements were very similar to  The ddPCR results above affirm that genomic approaches are capable of 22 detecting TL changes and WGS analysis of RNAi mutants demonstrate that the TL 23 changes can be detected during a single generation of aging flies. Therefore, we 1 conclude that although fly aging may allow TE RNAs to become upregulated [10, 13], 2 the RNAi pathways are still functioning in aging WT flies to mitigate TE RNAs from 3 transposing in genomes. In essence, the RNAi mutants raise the frequency of 4 successful TE mobilization events earlier in development so that these new insertions  Detectable TL changes in fly brains during aging. 10 Perhaps new TE insertions may be better detected in specific tissues were cells 11 that are more permanent and not turned over as frequently, such as the brain. For 12 example, in mammalian neurons, the most active TE LINE-L1 has been implicated in 13 transposing relatively frequently during development to give rise to genomic mosaicism 14 in the brain [45][46][47][48][49][50][51][52][53]. Given the caveats of having to do prior total DNA amplification from 15 limited gDNA from fly neurons [60], we undertook WGS from at least 50 dissected 16 female brains to provide sufficient nucleic acid for RT-PCR confirmation of neuronal 17 gene expression and WGS of brain DNA (Figure 4). 18 We successfully generated libraries directly from brains of WT fly strains and piwi 19 and AGO2 mutants without any prior total DNA amplification, and now we could detect 20 increases in TLs from OreR and w1118 strains (Fig. 4B). Although there may be 21 piRNA-like small RNAs and piwi expression in fly heads [13, 21, 22], we detected 22 increases in TLs in piwi mutants' brains that were similar in magnitude to the WT OreR 23 and w1118 strains (Fig. 4C). Greater and more variable increases in TLs were apparent 1 in the AGO2 mutants' brains ( Fig. 4D). Only the brains from the ISO1 strain were 2 recalcitrant from showing much TE insertion increases except for a >2-fold increase in 3 the Stalker TE (Fig. 4B). Except for ISO1, the other fly strains appear to enable 4 increasing TL changes in the brain during fly aging.  We investigated eccDNAs in Drosophila by optimizing our own method to purify 20 enough eccDNAs to directly generate libraries for deep sequencing without requiring 21 prior total DNA amplification ( Figure 5A). Furthermore, we used spike-ins of cloning- 22 vector plasmid DNAs into gDNA preparations and magnetic beads for improved 23 recovery and quantitation of eccDNAs for comparing between different samples. To 24 confirm that eccDNA was recovered after two rounds of Exo5 and Exo8 exonuclease 1 digestion steps which only degrade linear but not circular DNA, we conducted PCR with 2 standard primers amplifying linear genes and TEs (F1-R1 primer pairs, Table S2), and 3 outward-facing primers that either generate an amplicon from a TE eccDNA or tandem 4 genomic copies of the same TE (P10-P11 primer pairs) (Fig. 5B). Linear gene 5 amplicons were significantly depleted after exonuclease digestions, while the amplicon 6 for the spike-in plasmid was enriched. Linear TE amplicons were also reduced while 7 eccDNA-targeted amplicons for the copia TE was resilient against the exonuclease 8 treatment. Some other TE amplicons with outward-facing primers that were reduced 9 after exonuclease treatment may reflect more tandem copies of these TEs.

10
Since the regular PCR amplicons for the copia eccDNA were readily apparent in 11 WT strains (Fig. 5B), we used qPCR to quantify the changes and show that copia 12 eccDNA copies were increased >~2-fold in 30-day aged flies compared to 5-day young 13 flies ( Fig. 5C). This result motivated us to deeply sequence short read libraries 14 generated directly from those eccDNA-enriched samples which did not undergo any 15 total DNA amplification (Table S3). We first adapted the TIDAL scripts of mapping reads 16 to the TE families consensus sequences to measure sequencing coverage as well as 17 circular junction spanning reads against copia and observed an aging-associated 18 increase in copia eccDNA that was consistent with our qPCR results (Fig. 5D). We also 19 applied this custom eccDNA quantitation pipeline to all the other Drosophila TEs as well 20 as adapting the CIRCLE-Map pipeline previously used to measure mammalian 21 eccDNAs [81] to the Drosophila TEs. We then normalized the ratios of the eccDNA-TE 22 counts between 30-day aged and 5-day young flies (Fig. 5E). Although the CIRCLE-

23
Map pipeline was more sophisticated at providing a significance "circle score" that we 24 set the cutoff to be >50, our custom eccDNA quantitation pipeline's results were notably 1 consistent in showing overall that most eccDNAs as TEs were increasing in the libraries 2 of 30-day aged flies (Fig, 5E, F). However, the additional normalization to the plasmid 3 spike-ins were more informative in moderating eccDNA levels in w1118 while 4 reaffirming the TE eccDNA increases in OreR and ISO1 (Fig, 5E, F). Thus, while ISO1 5 TLs did not change much at the chromosomal level during aging, ISO1 TE copy 6 numbers may instead increase through eccDNA accumulation. Although TE expression still increased in WT aging flies, we hypothesized 10 whether endogenous RNAi pathways that still limit genomic TL increases could also be 11 genetically enhanced to mitigate the aging-associated rise of TE RNAs. To test this 12 hypothesis, we first used a ubiquitous Tubulin-GAL4 driver to overexpress AGO2 in 13 adults, and as expected, multiple TE RNAs had lowered expression relative to the 14 negative control ( Figure 6A). We then used the same driver to overexpress piwi, and 15 although there was likely a silencing limit to prevalent piwi expression in the ovary, the 16 enhancement of piwi expression and TE silencing was much more apparent in the 17 female carcass (Fig. 6B).

18
These data provided a proof of principal that augmenting these RNAi pathways in 19 adults results in improvements in TE silencing. However, inhibiting a factor that 20 normally limits RNAi activity would be preferable from a therapeutic standpoint.  Since we had observed TE landscape activity in the adult fly brain (Fig. 4), we 21 also tested a brain-specific driver, elav-GAL4, that was effective at triggering PAF1  mobilizing into uniquely-mapping sequences and also count the coverage on TE family 20 consensus sequences (Fig. S3). After showing that an orthogonal quantitation method 21 like ddPCR is consistent with TIDAL's quantitation of TE copy numbers from WGS of P- 22 elements and I-elements (Fig. S4), our parsimonious conclusion is that despite aging-  Single-cell WGS is not yet robust enough nor has total DNA amplification approaches 11 been demonstrated to be unhampered by molecule bias, so our study required pools of 12 genomes and non-amplified input DNA samples to reduce the prior concerns. Our study 13 also adds a second dimension to WGS of TLs by incorporating eccDNA as an in vivo 14 cache of accumulating TE DNA sequences (Fig. 5). Intriguingly, the ISO1 strain showed 15 the least chromosomal TL changes yet exhibited the greatest increase in TE-eccDNAs 16 in the whole flies, while the OreR and w1118 strains also showed evidence of TE-17 eccDNAs accumulating in the brain (Figure S5A). 18 In addition to variations in TLs between WT strains, we also observed differences 19 in TLs between other RNAi mutants that we cannot fully explain. For example, we 20 examined aging-associated TLs from two EMS-induced point mutants of Dcr-2 21 (L811fsx) and Dcr-2 (R416X) from [105]), the nuclease acting upstream of AGO2 to 22 generate the siRNAs from TE dsRNAs. However, there was inconsistent and contrary 23 TL differences between young and aged Drosophila in these Dcr-2 mutants whole flies 24 and brains (Fig. S5B, C) as well as in AGO3 mutants (Fig. S5D, [106]). Perhaps these 1 sets of mutants are not as penetrant in the loss of RNAi activity as the piwi, aubergine 2 and AGO2 mutants. Furthermore, the analysis of a partially rescuing AGO2 transgene 3 in the AGO2 (2-5-14) null mutant did lower the initial levels of TE insertion differences 4 noted by TIDAL, but the partial rescue still did not fully prevent aging-associated TE 5 increases (Fig. S5E), suggesting only wild type strength RNAi can buffer aging 6 genomes from accumulating new TE insertions. 7 Therefore, we propose that RNAi activity must be sustained during aging to 8 mitigate negative effects of increased TE expression in aged flies, a phenotype that has 9 also been frequently observed in mammals [34,55,107,108]. To combat TEs' impact  (Fig. S3B). 3 Lastly, during fly aging there are also gross-level changes in histone marks typically 4 associated with chromatin silencing [12,14], which may precede the increase TE 5 expression, so the future extension of this work will be to add epigenetic and chromatin 6 accessibility landscapes to TLs during Drosophila aging.  All flies were raised at 25 o C on standard cornmeal food. For fly aging analyses, newly 3 eclosed female flies were harvested from bottles and mated with males for two days. These 4 females were then divided into ~20 individuals per vial and flipped to new vials every 2-3 days 5 to mitigate crowding stress according to this protocol [112]. Surviving flies were counted at each 6 flip, and the percentage of cumulative survival rate at each time point was plotted against its 7 corresponding age (date of counting subtracted by date of eclosion). 8 The isogenized ISO1 fly strain for the Dm6 reference genome sequence was obtained 9 from Tubulin-Gal80 ts strain we received from the Griffith lab, we also obtained a second Tubulin-20 Gal80 ts stock from the Bloomington Drosophila Stock Center (BDSC#7018) and combined with 21 Tubulin-Gal4 for further experiments. The PAF1 knockdown RNAi line was obtained from the 22 Vienna Drosophila Resource Center (VDRC#108826) and an mCherry shRNA control line was 23 obtained from the Harvard TRiP resource (BDSC#35785). 24 To quantify I-element copies by ddPCR, fly cross schemes from [74] were replicated 25 (Fig. S4C). Parental crosses between the w 1118 strain with active I-elements and w k strain with 26 inactive I-elements were performed reciprocally to generate many virgin F1 females where one 27 strain enables I-element transposition ("invaded" from w k as the maternal parent) versus a 28 control that maintains I-element silencing (w 1118 as the maternal parent). These F1 females 29 were then crossed to sperm-less males that were obtained as F1 male progenies from the 30 parental cross of w 1118 virgin females with XY attached male. F2 oocytes were collected 31 overnight and DNA was extracted for ddPCR against the I-element and Rp49. 32

Fly brain isolation, genomic DNA extraction, WGS library construction and deep 33
sequencing. 34 Fly brains were dissected from at least 50 females per age group, following a procedure 35 laid out in [119]. Eye disks and other tissues were removed from heads with forceps, and brain 36 lobes were dissected into tubes with ice-cold PBS before freezing once at -20 o C. Whole female 37 flies and fly brains were homogenized in a standard DNA digestion buffer (1% SDS, 50 mM 38 Tris-HCl pH 8.0, 100 mM EDTA, 100 mM NaCl, 0.5mg/ml Proteinase K) overnight at 50 o C, and 39 then extracted using standard phenol chloroform extraction, ethanol precipitation, and 40 resuspending gDNA pellets in pure water. 41 42 WGS of whole flies began with the circa 2014 Nextera Tn5 tagmentation kit (Illumina) 1 using an input of 50 ng gDNA and outputs were purified with AMpure XP beads (Beckman 2 Coulter). WGS libraries were quality controlled with the high-sensitivity DNA kit on the 3 Bioanalyzer (Agilent), selecting for size distributions of 300bp to 1kb and concentrations over 1 4 nM. Multiplexed libraries were sequenced on Illumina Nextseq500 high-output flow cells using 5 75 bp paired-end and single end kits. All WGS libraries were sequenced to a minimum depth of 6 35 million reads (Table S1, S2). After determining that some whole fly libraries made using 7 NEBNext Ultra-II DNA library prep kit for Illumina (NEB) were as complete and has better yields 8 than the then discontinued Nextera kit, we completed the fly brain gDNA libraries with the 9 NEBNext kit and sequenced them to similar depths as above. 10

11
RNA extraction, quantitative RT-PCR, digital droplet PCR (dd-PCR) and TE copy number 12 estimation 13 Total RNA was extracted from 5-10 female flies harvested at corresponding age with 14 TRI-reagent (MRC, Inc.). Reverse transcription (RT) was performed using random primers, 15 ProtoScript II (NEB), and 1 μg input of total RNA. Quantitative PCR (qPCR) with the Luna Sybr-16 Green mastermix (NEB) used primer sequences in Table S3 and 2 μL of a 1:10 dilution of the 17 cDNA. Relative changes in gene expression were calculated using the 2^ΔΔCt method with 18 Rp49 as a housekeeping gene for normalization. 19 Droplet digital PCR (ddPCR) was conducted on a QX200 instrument with the Evagreen 20 assay reaction (Biorad). Copy number measurements from specific TE primers (Table S3) were 21 normalized to Rp49 as a diploid gene, starting first at 2 ng of gDNA as input per 20 μL ddPCR 22 for droplet generation for most TEs. For TEs with very high copy numbers that saturate the 23 droplets, input gDNA was diluted further to 2 ng into the ddPCR mix prior to droplet generation. 24 At least 10,000 droplets were required to achieve good statistical estimation of the concentration 25 calculated by Poisson distribution using Quantasoft Analysis Pro (Biorad). TE copy numbers per 26 genome was determined by dividing against half of the measured Rp49 copies. 27

Extracellular circular DNA isolation and sequencing 28
To quantify eccDNAs during fly aging, 30 female flies were harvested from 5-days and 29 30-days post eclosion, and a fixed amount off pre-extraction plasmids was added prior to cell 30 lysis: ~80 pg of ~7kb-pGL3-DmPiwipro1 and ~50 pg of ~11kb-pCas9 prior to cell lysis. About 30 31 ug of total gDNA was recovered from using MasterPure TM Complete DNA and RNA Purification 32 kit (Lucigen), and 0.5ug-1ug gDNA was checked on a 1% agarose gel for integrity and quality. 33 Good gDNA primarily migrated at >10kb and to 20 μg gDNA we added 40ul of a second plasmid 34 cocktail: (1ng/ul of the 2.7kb pUC19, 0.1ng/ul of the 3.5kb pMaxGFP, 0.01ng/ul of the 5.2kb 35 pGSH0 and 0.001ng/ul of the 6.3kb pCENPm3) and split equally to two reactions: were mapped to genome to get the genic and intergenic counts using the genome GTF file. The 42 total number of reads mapped to the Drosophila genome was derived by subtracting the total 43 number of reads not mapped to the Drosophila genome from the total number of 1 reads. The total number of mapped reads was used as the basis for normalization of TE counts 2 and spike-in plasmid counts. 3 Plasmid sequences were treated as linear entries in the FASTA file database similar to 4 the TE family consensus sequences. The raw read counts from TE mapping were further 5 normalized by the total number of reads mapping to the Drosophila Dm6 genome assembly. For 6 spike-in plasmid counting, because several plasmids share the same backbone with different 7 inserts, read frequencies were normalized by the total plasmid mapping sites as well as by the 8 total number of Drosophila genome-mapping reads. 9 To execute the CIRCLE-Map program for repeats [   were detected by TIDAL be at least 1% of total number of TE families (i.e. all the TEs 1 not lumped into the "Others" category of Fig. 2B).    RT-qPCR (middle), which reduces TE RNA expression (right). The genes TFIIs and piwi 8 are controls suggesting that TE RNA reduction is distinct from a concern that PAF1 9 RNAi would simply be causing global reduction in transcription. Examining the effect of 10 TE RNA reduction in the PAF1 knockdown in the ovary (D) and brain (E) of adult 11 Drosophila with TEs and PAF1 in left graph and control genes in the right graph. (F) Life  Figures S1-S5.