Influenza B viruses exhibit lower within-host diversity than influenza A viruses in human hosts

Influenza B virus undergoes seasonal antigenic drift more slowly than influenza A, but the reasons for this difference are unclear. While the evolutionary dynamics of influenza viruses play out globally, they are fundamentally driven by mutation, reassortment, drift, and selection within individual hosts. These processes have recently been described for influenza A virus, but little is known about the evolutionary dynamics of influenza B virus (IBV) at the level of individual infections and transmission events. Here we define the within-host evolutionary dynamics of influenza B virus by sequencing virus populations from naturally-infected individuals enrolled in a prospective, community-based cohort over 8176 person-seasons of observation. Through analysis of high depth-of-coverage sequencing data from samples from 91 individuals with influenza B, we find that influenza B virus accumulates lower genetic diversity than previously observed for influenza A virus during acute infections. Consistent with studies of influenza A viruses, the within-host evolution of influenza B viruses is characterized by purifying selection and the general absence of widespread positive selection of within-host variants. Analysis of shared genetic diversity across 15 sequence-validated transmission pairs suggests that IBV experiences a tight transmission bottleneck similar to that of influenza A virus. These patterns of local-scale evolution are consistent with influenza B virus’ slower global evolutionary rate.

Importance

The evolution of influenza virus is a significant public health problem and necessitates the annual evaluation of influenza vaccine formulation to keep pace with viral escape from herd immunity. Influenza B virus is a serious health concern for children, in particular, yet remains understudied compared to influenza A virus. Influenza B virus evolves more slowly than influenza A, but the factors underlying this are not completely understood. We studied how the within-host diversity of influenza B virus relates to its global evolution by sequencing viruses from a community-based cohort. We found that influenza B virus populations have lower within-host genetic diversity than influenza A virus and experience a tight genetic bottleneck during transmission. Our work provides insights into the varying dynamics of influenza viruses in human infection.

All samples exhibited low genetic diversity. The vast majority had no iSNV above the 2% cutoff.

Of the 99 samples with high-quality NGS data, 70 had no minority iSNV, 17 had one iSNV, 7 had two iSNV, and 3 samples had 3 iSNV (median 0, IQR 0-2; Table 2). Two outliers had a large number of iSNV, with 8 and 20 iSNV. These two samples came from the same individual, with one collected at home and the second at the study clinic. Most of the iSNV in these two

Together, these results indicate that our measurements of within-host diversity are robust to several technical aspects of variant identification and are unlikely to account for the lower observed diversity of IBV. Because these data are from the same cohort and were generated using the same sequencing approach and analytic pipeline as our previous IAV datasets, the observed differences likely reflect true biological differences between IAV and IBV.

We compared viral diversity across samples from individuals in the same household to investigate the genetic bottleneck that influenza B viruses experience during natural transmission. Over the seven influenza seasons, thirty-nine households in the HIVE cohort had two or more individuals positive for the same IBV lineage within a 7-day interval (Table 1). This epidemiologic linkage is suggestive of transmission events but does not rule out co-incident community acquired infection (19). We identified 16 putative transmission pairs for which we sequenced at least one sample from each individual. In one of these pairs, the putative recipient was the individual with a mixed infection. The donor did not have evidence of a mixed infection based on number of iSNV, which would imply that the recipient may have been infected twice or that the second virus was lost from the donor by the time of sampling. This pair was excluded from the between-host analysis, leaving 15 putative transmission pairs for which we have high-quality sequencing data on both donor and recipient influenza populations.

We used our sequencing data to determine which of these epidemiologically linked household pairs were actual IBV transmission pairs. We generated maximum likelihood phylogenetic trees for samples from the two IBV lineages using the concatenated coding consensus sequences.

Phylogenetic analysis provided genetic evidence that the 15 epidemiologically-linked pairs were indeed true transmission pairs, as epidemiologically-linked pairs were found nearest each other in each tree (Figure 5A and 5B; vertical bars with household ID). We also validated these transmission pairs by analyzing the genetic distance across viral populations. True transmission pairs should have genetically similar populations exhibiting low genetic distance, while individuals with coincident community acquisition are more likely to have populations with a higher genetic distance. We compared the genetic distance between epidemiologically-linked household pairs and random community pairs from the same season and infected with the same IBV lineage, using L1-norm as measurement of genetic distance (Figure 5C).

Transmission bottlenecks restrict the genetic diversity that is passed between hosts. With a loose transmission bottleneck, many unique genomes will be passed from donor to recipient.
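As an illustration of the L1-norm comparison between household and community pairs described above, the following is a minimal sketch and not the study's pipeline. It assumes each viral population is represented as a mapping from a genome site to per-base frequencies, and that both populations cover the same sites; the names `l1_distance`, `donor`, `recipient`, and `community`, and all frequencies shown, are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical representation (an assumption, not from the paper):
# a population maps (segment, position) -> {base: frequency}.
Site = Tuple[str, int]
Population = Dict[Site, Dict[str, float]]

BASES = "ACGT"


def l1_distance(pop_a: Population, pop_b: Population) -> float:
    """Sum of absolute base-frequency differences over all shared sites."""
    if pop_a.keys() != pop_b.keys():
        raise ValueError("populations must cover the same sites")
    total = 0.0
    for site, freqs_a in pop_a.items():
        freqs_b = pop_b[site]
        for base in BASES:
            # A base that is not listed at a site has frequency 0.
            total += abs(freqs_a.get(base, 0.0) - freqs_b.get(base, 0.0))
    return total


# Made-up example populations over two sites.
donor = {("HA", 100): {"A": 0.9, "G": 0.1}, ("NA", 50): {"C": 1.0}}
recipient = {("HA", 100): {"A": 1.0}, ("NA", 50): {"C": 1.0}}
community = {("HA", 100): {"G": 1.0}, ("NA", 50): {"T": 1.0}}

household_distance = l1_distance(donor, recipient)
community_distance = l1_distance(donor, community)
print(household_distance < community_distance)
```

Under this sketch, a household pair that shares a consensus sequence differs only by the donor's minority variants, so its distance is small relative to a pair with consensus-level differences.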

Because this will allow two variants at a given site to be transmitted, sites that are polymorphic in the donor are more likely to be polymorphic in the recipient. However, in the case of a tight or stringent bottleneck, sites that are polymorphic in the donor will likely be either fixed or absent in the recipient. We have previously demonstrated that influenza A experiences a tight transmission bottleneck of 1-2 unique genomes (19). Across our 15 IBV transmission pairs, we found no sites that were polymorphic in the donor and recipient (Figure 6). Intrahost SNV present in the donor were either fixed (100%) or absent (0%) in the recipient. These data suggest a stringent transmission bottleneck for influenza B, similar to that of influenza A. As there were fewer samples, transmission pairs, and iSNV in our IBV dataset, we were unable to obtain a robust and precise estimate of bottleneck size.

Here we define the within-host genetic diversity of IBV in natural infections by sequencing 106

multiple genes in IBV could result in more limited within-host diversity, perhaps located to certain regions of the genome. However, we found that the distributions of iSNV across IAV and IBV genomes are relatively similar. Furthermore, we have previously shown that the distribution of mutational fitness effects in influenza A/WSN/33/H1N1 matches that of other RNA and ssDNA viruses (38). Given that viruses across families with vastly different genomic architecture have similar mutational robustness, this is unlikely to account for the differences in within-host diversity between IAV and IBV.

can decrease population fitness. However, there are potential evolutionary advantages to stringent bottlenecks, including removal of defective interfering particles (40,41). While we were not able to estimate the size of the transmission bottleneck as precisely as IAV, it is likely that the bottleneck size is comparable across the two viruses given the similarities in their transmission routes and ecology in the human population. Data from many more transmission pairs will be necessary for a more robust estimate.

We amplified viral cDNA from all eight genomic segments using the SuperScript III One-Step

Bowtie2 (32). Duplicate reads were marked and removed with Picard and samtools (33).

Putative variants were identified with the R package deepSNV using data from the clonal plasmid controls of each sequencing run (34). Minority iSNV (<50% frequency) were identified using the following empirically-derived criteria: deepSNV p-value <0.01, average mapping

In our previous work on IAV, we found that there were multiple sites with mutations that were essentially fixed (>0.95) relative to the plasmid control and in which the base in the plasmid control was therefore identified as a minority variant in the sample (19). At these sites, deepSNV is unable to estimate the base-specific error rate and cannot distinguish true minority iSNV; however, we found that we could accurately identify minority variants at these sites at a frequency of 2% or above (19). This frequency threshold was incorporated into the pipeline for iSNV identification at these sites. Therefore, we report intrahost variants from 2-98%; minority iSNV are the subset of these variants with a frequency between 2-50%. Any sites that were monomorphic after applying quality filters were assigned a frequency of 100%.

depth on the y-axis and location within a concatenated influenza B virus genome on the x-axis. The mean coverage for each sample was calculated over a sliding window of size 200 and a

(inferred based on day of symptom onset). Each iSNV is plotted as a point with its frequency in the recipient (y-axis) versus its frequency in the donor (x-axis).
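The reporting thresholds and the donor-versus-recipient comparison described above can be illustrated with a short sketch. This is not the authors' pipeline: the function names, category labels, and example frequencies are hypothetical, and the boundary handling (2% inclusive, 50% exclusive for minority variants, 98% inclusive) is an assumption, since the text gives only the ranges.

```python
from collections import Counter
from typing import Iterable, Tuple

LOWER = 0.02         # 2% lower bound of the reported range (from the text)
UPPER = 0.98         # 98% upper bound of the reported range (from the text)
MINORITY_MAX = 0.50  # minority iSNV are below 50% (from the text)


def classify_frequency(freq: float) -> str:
    """Place a variant frequency into a reporting category (labels assumed)."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if freq < LOWER:
        return "below_range"
    if freq < MINORITY_MAX:
        return "minority"
    if freq <= UPPER:
        return "majority"
    return "above_range"


def recipient_fates(pairs: Iterable[Tuple[float, float]]) -> Counter:
    """Count the fate in the recipient of sites polymorphic in the donor.

    Each pair is (donor_frequency, recipient_frequency) for one site.
    """
    fates: Counter = Counter()
    for donor_freq, recipient_freq in pairs:
        # Only sites inside the reported 2-98% range count as polymorphic.
        if classify_frequency(donor_freq) not in ("minority", "majority"):
            continue
        recipient_class = classify_frequency(recipient_freq)
        if recipient_class == "below_range":
            fates["absent"] += 1
        elif recipient_class == "above_range":
            fates["fixed"] += 1
        else:
            fates["polymorphic"] += 1
    return fates


# Made-up frequencies for one hypothetical transmission pair.
example_pairs = [(0.10, 0.0), (0.35, 1.0), (0.05, 0.0), (1.0, 1.0)]
fates = recipient_fates(example_pairs)
print(dict(fates))
```

In this sketch, a pair in which every donor iSNV is counted as "absent" or "fixed" and none as "polymorphic" matches the pattern the text associates with a stringent bottleneck.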