RT Journal Article
SR Electronic
T1 Transmission Bottleneck Size Estimation from Pathogen Deep-Sequencing Data, with an Application to Human Influenza A Virus
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 101790
DO 10.1101/101790
A1 Ashley Sobel Leonard
A1 Daniel Weissman
A1 Benjamin Greenbaum
A1 Elodie Ghedin
A1 Katia Koelle
YR 2017
UL http://biorxiv.org/content/early/2017/01/19/101790.abstract
AB The bottleneck governing infectious disease transmission describes the size of the pathogen population transferred from a donor to a recipient host. Accurate quantification of the bottleneck size is of particular importance for rapidly evolving pathogens such as influenza virus, as narrow bottlenecks would limit the extent of transferred viral genetic diversity and, thus, have the potential to slow the rate of viral adaptation. Previous studies have estimated the transmission bottleneck size governing viral transmission through statistical analyses of variants identified in pathogen sequencing data. The methods used by these studies, however, did not account for variant calling thresholds and stochastic dynamics of the viral population within recipient hosts. Because these factors can skew bottleneck size estimates, we here introduce a new method for inferring transmission bottleneck sizes that explicitly takes these factors into account. We compare our method, based on beta-binomial sampling, with existing methods in the literature for their ability to recover the transmission bottleneck size of a simulated dataset. This comparison demonstrates that the beta-binomial sampling method is best able to accurately infer the simulated bottleneck size. We then apply our method to a recently published dataset of influenza A H1N1p and H3N2 infections, for which viral deep sequencing data from inferred donor-recipient transmission pairs are available.
Our results indicate that transmission bottleneck sizes are variable across transmission pairs, yet there is no significant difference in the overall bottleneck sizes inferred for H1N1p and H3N2. The mean bottleneck size for influenza virus in this study, considering all transmission pairs, was Nb = 196 (95% confidence interval 66-392) virions. While this estimate is consistent with previous bottleneck size estimates for this dataset, it is considerably higher than the bottleneck sizes estimated for influenza from other datasets.

Author Summary

The transmission bottleneck size describes the size of the pathogen population transferred from the donor to the recipient host at the onset of infection and is a key factor in determining the rate at which a pathogen can adapt within a host population. Recent advances in sequencing technology have enabled the bottleneck size to be estimated from pathogen sequence data, though there is not yet a consensus on the statistical method to use. In this study, we introduce a new approach for inferring the transmission bottleneck size from sequencing data that accounts for the criteria used to identify sequence variants and stochasticity in pathogen replication dynamics. We show that the failure to account for these factors may lead to underestimation of the transmission bottleneck size. We apply this method to a previous dataset of human influenza A infections, showing that transmission is governed by a loose transmission bottleneck and that the bottleneck size is highly variable across transmission events. This work advances our understanding of the bottleneck size governing influenza infection and introduces a method for estimating the bottleneck size that can be applied to other rapidly evolving RNA viruses, such as norovirus and RSV.