Co-evolutionary analysis suggests a role for TLR9 in papillomavirus restriction

Upon infection, DNA viruses can be sensed by pattern recognition receptors (PRRs) leading to the activation of type I and III interferons, aimed at blocking infection. Therefore, viruses must inhibit these signaling pathways, avoid being detected, or both. Papillomavirus virions are trafficked from early endosomes to the Golgi apparatus and wait for the onset of mitosis to complete nuclear entry. This unique subcellular trafficking strategy avoids detection by cytoplasmic PRRs, a property that may contribute to establishment of infection. However, as the capsid uncoats within acidic endosomal compartments, the viral DNA may be exposed to detection by toll-like receptor (TLR) 9. In this study we characterize two new papillomaviruses from bats and use molecular archeology to demonstrate that their genomes altered their nucleotide composition to avoid detection by TLR9, providing evidence that TLR9 acts as a PRR during papillomavirus infection. Furthermore, we demonstrate that TLR9, like other components of the innate immune system, is under evolutionary selection in bats, providing the first direct evidence for co-evolution between papillomaviruses and their hosts.


104
This overall dinucleotide depletion confounds the ability to demonstrate the cause of this 105 depletion.

106
Bats serve as reservoirs for many viruses and and have served as the source of well-documented 107 cross-species transmission events of pathogens responsible for a myriad of epidemics, the most 108 notable being severe acute respiratory syndrome coronavirus 1, middle east respiratory syndrome

123
To address this question, we determined the genomes of two novel bat papillomaviruses from 124 Tadarida brasiliensis (TbraPV2, TbraPV3). Taxonomically, bats are classified into two suborders; 125 Yinpterochiroptera and Yangochiroptera. By comparing the genomes of papillomaviruses 126 associated with bats from either suborder, we demonstrate that TLR9 target motifs are 127 significantly depleted and impact papillomavirus evolution. Furthermore, we extend existing data

128
showing that Yangochiroptera TLR9 is evolving under diversifying selection. This argues that papillomavirus genomes are evolving in response to a host change. This is the first direct evidence 130 of PVs evolving in response to host evolutionary changes, thus providing direct evidence for co-131 evolution. Also, these data argue that TLR9 is a restriction factor for papillomavirus infection.

163
In support of this notion, these novel bat papillomaviruses cluster with other previously described

172
We wanted to compare the evolutionary history of these diverse viruses to their hosts. Due to 173 intra-host divergence and niche adaptation, papillomaviruses infecting the same host can be 174 found in multiple phylogenetic tree clades. To ensure that viruses with a similar tissue tropism 175 and evolutionary history are compared, we extracted a subtree from the maximum likelihood tree 176 (Figure 2) (Smeele et al. 2018). This clade contains the newly identified TbraPV2 and TbraPV3 177 embedded within the largest monophyletic Chiroptera papillomavirus clade (red arrow in Figure   178 1A). We used a tanglegram to address our hypothesis of virus-host co-evolution (Figure 2A)

185
To formally quantify the degree of co-speciation, we focused on two datasets. First, we used the 186 phylogenetic tree shown in Figure 2. Because it was previously shown that members of the

187
Lambdapapillomavirus genus co-evolve with their hosts (Rector et al. 2007), we also tested a 188 smaller subtree to avoid skewing the results (indicated with the red arrow in Figure 2

210
shown in Figure 2. A ratio close to 1 indicates that a dinucleotide is seen in the sequence as often 211 as expected based on each sequence's nucleotide composition. Values lower than 1 suggest that 212 a dinucleotide is depleted. While the ApC, ApT, GpT, TpA, and TpC ratios are significantly lower 213 than 1 (one-sample t-test p-value < 0.001), we observed the most significant decrease in the CpG 214 dinucleotide ratio (Figure 3). The median CpG content for these evolutionarily related viruses is 215 0.46. However, we noticed that the distribution has a long tail towards even lower values,

216
suggesting that some viruses have a further reduced CpG content (Figure 3).

217
When we plotted the CpG ratio for each virus on a phylogenetic tree ( Figure

227
This reduction in the CpG ratio could be due to an overall lower GC content. We compared the 228 CpG ratio to total genomic GC content ( Figure 4C). This analysis demonstrates that there is no 229 correlation between the decreased CpG ratio and the total GC content (linear regression: R 2 = 230 0.008, p-value = 0.27), and the reduced CpG ratio is not simply due to a lower GC content.
The viruses infecting bats in the Yangochiroptera suborder have a reduced CpG content, raising 232 the possibility that the host species is influencing the CpG ratio and evolutionary trajectory of 233 these viruses.

235
The CpG dinucleotide is present in 8 codons coding for five different amino acids. Therefore,

236
reducing CpG dinucleotides is expected to lead to a bias in codon usage or amino acid

246
We determined codon usage tables for the non-overlapping coding sequences for each of the 247 papillomaviruses in the subtree described in Figure 2. These codon usage tables were compared 248 using the Emboss 'codcmp' tool to calculate codon usage differences. The more diverse the 249 codon usage, the larger the differences between both tables. This analysis shows that compared 250 to each other, the Yangochiroptera papillomaviruses codon usage is more similar than when other 251 viruses are compared ( Figure 5A). This suggests that the reduction in CpG leads to a more for episodic diversifying selection on the branches leading to the Yangochiroptera ( Figure 6B).

279
Finally, FEL identified 7 sites under diversifying selection within the Yangochiroptera.

280
We mapped a subset of these residues as well as residues previously identified to be under

289
The Arg at position 76 is much larger than the canonical His at this position, which will presumably   Figure 7A). This initial analysis indicates that ACGT, GCGT, TCGA, and

306
TCGT are diminished in Yangochiroptera. We calculated the average tetramer ratio for each 307 group of viruses to normalize for the differences in overall CpG content between Yangochiroptera 308 and other viruses, as in Figure 4. We compared the proportion of Yang to Yin, Yang to other, and Yin to other ( Figure 7B)

330
However, the reason for this depletion is unclear. Of note, in mammalian genomes, CpGs are 331 rare outside of so-called CpG islands (Illingworth and Bird 2009). This is believed to be mainly 332 due to the observation that methylated CpGs are prone to deamination, resulting in C à T 333 mutations, leading to a depletion of CpG sites in the mammalian genomes over evolutionary time.
Our data in Figure 3 show an increase in TpG and CpA. However, this increase does not appear 335 to be of the same magnitude as the dramatic reduction in CpG seen in the same dataset.

336
The zinc-finger antiviral protein (ZAP) acts as a broad-spectrum antiviral restriction protein that

375
Therefore, it is likely that the reduction in DNMT1 levels will lead to a loss of methylation on the 376 viral genomes destined for packaging and infection of the new tissue.

427
Co-evolution alongside their hosts has been suggested to be an essential factor in the evolution

433
The co-evolution theory would predict that the virus would need to adapt when the host evolves 434 a new skill. Therefore, as TLR9 evolves new functionalities, the virus would need to respond to 435 preserve the balance between virus and host. Indeed, our data suggest that as Yangochiroptera

436
TLR9 is undergoing diversifying selection, papillomavirus genomes infecting these bats further depleted their CpG content, specifically in the context of a known TLR9 PAMP. This is the first 438 direct evidence of co-evolution between this family of viruses and their hosts.

439
In the phylogenetic analysis, the viruses that infect Yinpterochiroptera and Yangochiroptera, 440 respectively, are not monophyletic but rather are present in three mixed clades (Figure 2; EsPV1, 441 EsPV3, and RfPV1; MscP2 and EhPV1; TbraPV1-3). This suggests that these three main clades 442 diverged before the ancestor of Yinpterochiroptera and Yangochiroptera split over 65 million years 443 ago. As these ancestral viruses co-evolved with the Yangochiroptera hosts, they selected for loss 444 of CpG. This occurred at least three separate times in the evolution of Yangochiroptera viruses.

445
This strongly argues against a founder effect but in favor of recurring co-evolutionary interactions.

484
Bats were captured in mist nets set over water sources, extracted from the nets and put in brown 485 paper bags. Bats were held in bags for 20 minutes, removed, and then measured and weighed.

486
Individuals were identified to species in the field using metrics such as forearm length and weight.

487
Feces and urine were collected from the bag or swabbed directly off the bat using a PurFlock 488 0.14" Ultrafine swab (Puritan, Guilford, Maine). All feces and urine samples were put into tubes containing 0.5 mL buffer consisting of 1x PBS and 50% Glycerol. These samples were held on 490 ice until returning to the lab where they were stored in a -80°C freezer. All applicable international,

516
The tetramer content for each genome was calculated as described for the dinucleotides. To

519
To test whether any of the tested tetramers depletions are statistically significant, we randomly 520 shuffled each viral genome. To ensure that each randomly shuffled sequence would maintain the 521 same dinucleotide ratio as the original sequence, we used the Altschul

557
An additional subtree was extracted to minimize the impact of the genus Lamdapapillomavirus.

558
The viral types included in this smaller dataset are underlined in Figure 2.