Continued circulation, recombination and evolution of the ancient subcontinent lineage despite predominance of the recent arctic-like lineage rabies viruses (RABV) in India

Background
Rabies is an emerging and re-emerging lethal encephalitis causing 26,400 to 61,000 human deaths annually. Approximately 20,000 people die of rabies every year in India, which accounts for 36% of the world’s rabies deaths. Rabies is endemic among domestic dogs in India, and there are conflicting reports on the RABV lineages currently circulating in these dogs. Further, movement of humans and animals between Sri Lanka and the southern coastal states of India has been proposed as a source of the emergence of variant RABV in India. For effective prevention and control of rabies in India, it is essential to establish the genetic diversity and evolutionary dynamics of the RABV currently circulating in the country.

Methods
We carried out molecular evolution and recombination analyses of the nucleoprotein (N) and glycoprotein (G) genes of 26 RABV isolates from the southern Indian states of Tamil Nadu and Goa.

Results
We found continued co-circulation of the ancient subcontinent lineage despite the predominance of the recent arctic-like lineage RABVs in southern India. The mean rates of nucleotide substitution in the G and N genes were 1.32 × 10⁻³ and 1.91 × 10⁻⁴ substitutions/site/yr, respectively. The study also found recombination in both the N and G genes, and a higher mean rate of evolutionary change in the G gene among Indian dog RABV isolates than in other lyssaviruses. The Indian subcontinent lineage RABV isolates investigated in this study clustered closely with other subcontinent lineage viruses from Sri Lanka, highlighting the continued incursion and/or circulation of the variant subcontinent lineages of RABVs between India and Sri Lanka.

Conclusion
We report enzootic establishment of two distinct RABV lineages in domestic dogs in India that are evolving at a greater rate than other lyssaviruses.

Author summary
Rabies is a fatal viral disease that has no treatment and can only be prevented by post-exposure vaccination. In many parts of Asia and Africa, rabies continues to be a major public health threat, almost always caused by dog bites. In this study, we investigated the genetic diversity and rate of evolution among rabies viruses isolated from dogs in India. We found that two distinct lineages of rabies viruses (RABVs), namely the ancient subcontinent lineage and a more recent arctic-like lineage, co-circulate among dogs in India. Notably, our study found that the dog rabies viruses in India are undergoing recombination and evolving at a higher rate than other lyssaviruses. Phylogenetic analysis revealed continued incursion and/or circulation of the variant subcontinent lineages of RABVs in India that might have originated from Sri Lanka. Our study indicates that two distinct lineages of RABVs are maintained and currently circulate among the dog population in India.


Introduction
Indian subcontinent lineage RABV in India [13]. However, a more recent study reported that

Table 2: Gene-specific primers used for gene amplification and sequencing.

Primers G2-SYR-R CAAAGGAGAGTTGAGATTGTAGTC

Indian subcontinent, Cosmopolitan and Asian (Fig 1). The mean rate of nucleotide substitution in the N gene, estimated using a Bayesian method, was 1.91 × 10⁻⁴ substitutions/site/yr (95% highest posterior density (HPD) 1.05 × 10⁻⁴ to 2.78 × 10⁻⁴). Based on the N gene, the RABV isolates sequenced in this study clustered into two lineages, Arctic-like (13 isolates) and Indian subcontinent (2 isolates). The tree topology generated using an alignment-free method (Fig 2) agreed with that of the maximum likelihood and Bayesian methods.

Recombination events were detected in isolate MVC-50 (GenBank accession number MH258824) by more than three methods (p-value < 0.0005); this isolate was therefore removed from further phylogenetic analysis.

The 21 glycoprotein sequences of Indian isolates, along with representative RABV sequences from across the globe (totaling 57 sequences), were used for molecular phylogenetic analysis using alignment-based and alignment-free methods. The multiple sequence alignment (MSA) showed ~70% identity and ~84% similarity.
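
As a rough illustration of what a percent-identity figure for an MSA can mean, the sketch below counts the alignment columns in which every sequence carries the same residue. This is only one of several possible definitions and is an assumption here, since the text does not state how identity and similarity were computed; the function name `column_identity` and the toy sequences are hypothetical.

```python
def column_identity(aligned):
    """Percent of alignment columns in which all sequences share one residue.

    `aligned` is a list of equal-length, already aligned sequences.
    Columns made up entirely of gap characters ("-") are not counted as identical.
    This definition is an assumption, not necessarily the one used in the study.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    identical = 0
    for column in zip(*aligned):            # iterate over alignment columns
        residues = set(column)
        if len(residues) == 1 and column[0] != "-":
            identical += 1
    return 100.0 * identical / length


# Toy alignment (hypothetical data): three of the four columns are fully conserved.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
print(column_identity(toy_alignment))
```

A similarity score would additionally need a rule for which non-identical residues count as similar (for example a substitution matrix threshold), which is why it is left out of this sketch.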

Lineages of the various RABV isolates were estimated using the maximum likelihood (ML) method implemented in the PhyML package. The mean rate of nucleotide substitution estimated from the partial glycoprotein sequences by Bayesian analysis was 1.32 × 10⁻³ substitutions/site/yr (95% HPD 6.92 × 10⁻⁴ to 2.05 × 10⁻³).
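
The 95% HPD quoted with the rate is the narrowest interval that contains 95% of the posterior samples. The sketch below shows that idea on a plain list of sampled values, assuming a unimodal posterior. It is not the Bayesian software used in the study, and the function name `hpd_interval` and the sample values are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval (low, high) holding `mass` of the samples.

    Assumes a unimodal posterior, so a single contiguous interval is adequate.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")

    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))      # number of samples the interval must cover

    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best_low, best_high = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        low, high = xs[i], xs[i + k - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Hypothetical posterior draws of a substitution rate (substitutions/site/yr).
draws = [1.1e-3, 1.3e-3, 1.2e-3, 1.4e-3, 9.0e-4, 1.6e-3, 1.3e-3, 2.4e-3]
low, high = hpd_interval(draws, mass=0.75)
print(f"75% HPD: {low:.2e} to {high:.2e}")
```

Unlike an equal-tailed credible interval, the HPD interval is not forced to cut the same probability from each tail, which matters for right-skewed rate posteriors.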

Similar clustering patterns of RABV isolates were observed in the trees generated using alignment-based (ML, Fig 3, and Bayesian) and alignment-free (RTD, Fig 4) methods. Of the Indian RABV isolates, 16 clustered into the Arctic-like lineage and 5 clustered into the Indian subcontinent lineage. The phylogenetic trees of the N and G genes (Fig 1, 2, 3

The mean rate of nucleotide substitution estimated from the N gene sequences was 1.91 × 10⁻⁴ substitutions/site/yr (95% HPD 1.05 × 10⁻⁴ to 2.78 × 10⁻⁴). Interestingly, this estimate is in close agreement with the reported N-gene mean rate of substitution in RABVs from bats and terrestrial mammals [30], suggesting that the dog RABVs in India continue to evolve at the same rate as bat RABVs from around the world. The mean rate of nucleotide substitution estimated from the partial G gene sequences was 1.32 × 10⁻³ substitutions/site/yr (95% HPD 6.92 × 10⁻⁴ to 2.05 × 10⁻³), which is marginally higher than that reported in a previous study, which

evaluated for potential recombination events when analyzed using the N-gene sequences. Interestingly, MVC-50 was also predicted to be a potential recombinant using the G gene partial sequence.

Homologous recombination is known to occur among rabies viruses and could play a role in their diversity and evolution [32]. To understand more completely the role of recombination in the evolution of Indian isolates of RABV, further studies using complete genome data are required.

Molecular phylogenetic analysis of both the N and G genes was performed using alignment-based and alignment-free methods. The RTD-based alignment-free method is based on the frequency of k-mers and the relative order in which the k-mers occur in the sequences.
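
To make the k-mer idea concrete, the sketch below builds a simplified return-time feature vector: for every possible k-mer it records the mean and standard deviation of the gaps between successive occurrences, and it compares two sequences by the Euclidean distance between their vectors. It illustrates the general principle under these assumptions and is not the implementation used in the study. The function names, the default k = 2, the zero fill for k-mers seen fewer than twice, and the toy sequences are all hypothetical choices.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev


def kmer_positions(seq, k):
    """Map each k-mer in `seq` to the list of its start positions."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def rtd_vector(seq, k=2, alphabet="ACGT"):
    """Return [mean gap, std of gaps] for every k-mer, in lexicographic order.

    A "gap" is the distance between successive start positions of the same
    k-mer. K-mers occurring fewer than twice contribute (0.0, 0.0); this
    fill value is an assumption of the sketch.
    """
    positions = kmer_positions(seq.upper(), k)
    vector = []
    for letters in product(alphabet, repeat=k):
        pos = positions.get("".join(letters), [])
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        if gaps:
            vector.extend([mean(gaps), pstdev(gaps)])
        else:
            vector.extend([0.0, 0.0])
    return vector


def rtd_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the return-time vectors of two sequences."""
    va, vb = rtd_vector(seq_a, k), rtd_vector(seq_b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


# Toy sequences (hypothetical data), not RABV isolates.
print(rtd_distance("ACGTACGTACGT", "ACGTTTGTACGA"))
```

A pairwise distance matrix built this way can be passed to a distance-based tree method such as neighbor joining, which is how alignment-free trees are typically obtained without an MSA.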