Abstract
Background: The southern United States (US) may be vulnerable to outbreaks of Zika virus (ZIKV), given its broad distribution of ZIKV vector species and periodic ZIKV introductions by travelers returning from affected regions. As cases mount within the US, policymakers seek early and accurate indicators of self-sustaining local transmission to inform intervention efforts. However, given ZIKV's low reporting rates and geographic variability in both importations and transmission potential, a small cluster of reported cases may reflect diverse scenarios, ranging from multiple self-limiting but independent introductions to a self-sustaining local outbreak.
Methods and Findings: We developed a stochastic model that captures variation and uncertainty in ZIKV case reporting, importations, and transmission, and applied it to assess county-level risk throughout the state of Texas. For each of the 254 counties, we identified surveillance triggers (i.e., cumulative reported case thresholds) that robustly indicate further epidemic expansion. Regions of greatest risk for sustained ZIKV transmission include 33 Texas counties along the Texas-Mexico border, in the Houston Metro Area, and throughout the I-35 Corridor from San Antonio to Waco. Across this region, variation in reporting rates, ZIKV introductions, and vector habitat suitability drives variation in the recommended surveillance triggers for public health response. For high risk Texas counties, we found that, for a reporting rate of 20%, a trigger of two cumulative reported cases corresponds to a 60% chance of ongoing local transmission.
Conclusions: With reliable estimates of key epidemiological parameters, including reporting rates and vector abundance, this framework can help optimize the timing and spatial allocation of public health resources to fight ZIKV in the US.
Introduction
In February 2016, Zika virus (ZIKV) was declared a public health emergency of international concern [1]. As of 11 May 2016, the World Health Organization (WHO) confirmed mosquito-transmitted cases in 58 countries, with an estimated 290,000 cases in the Americas alone [2]. In the US, one of the primary vectors for ZIKV, Aedes aegypti, is thought to inhabit at least 30 states [3]. Texas ranks among the most vulnerable states for ZIKV transmission due to its suitable climate, international airports, and geographical proximity to affected countries [3-8]. Of the 503 imported ZIKV cases in the US, 35 have occurred in Texas. While these importations have yet to spark autochthonous (local) transmission, Texas has historically sustained several autochthonous outbreaks of another arbovirus vectored by Ae. aegypti, dengue (DENV) [9].
As peak mosquito season approaches in the US and more cases are potentially introduced via international travelers attending the Rio 2016 Olympic Games, public health decision makers will face considerable uncertainty in gauging the severity of the threat and in effectively initiating interventions, given the large fraction of unreported ZIKV cases as well as the shifting economic balance between intervention expenditure and disease burden [10,11]. Depending on the ZIKV symptomatic fraction, reliability and rapidity of diagnostics, importation rate, and transmission rate, the detection of five cases in a single Texas county, for example, may indicate five unrelated importations, a small outbreak from a single importation, or even a large, hidden epidemic underway (Fig 1). These possibilities are illustrated by prior outbreaks. In French Polynesia, a handful of suspected ZIKV cases were reported by October 2013; two months later an estimated 14,000-29,000 individuals had been infected [12,13]. By contrast, in Dominica, despite 18 confirmed cases in early 2016, two months later no sustained epidemic has yet ensued [14].
Here, we develop a model to support real-time ZIKV risk assessment that accounts for uncertainty regarding ZIKV epidemiology, including importation rates, reporting rates, and local vector population density. This framework can be readily updated as our understanding of ZIKV evolves to provide actionable guidance for public health officials, in the form of surveillance triggers that robustly indicate imminent epidemic growth. By simulating ZIKV transmission using a stochastic branching process model [15] based on recent ZIKV data and epidemiological estimates, we derive thresholds of reported cases indicative of ongoing local transmission for each of the 254 counties in Texas. Our results suggest that counties along the Texas-Mexico border, in the Houston Metro Area, and throughout the I-35 Corridor from San Antonio to Waco are at highest risk for sustained outbreaks.
Methods
Model
To transmit ZIKV, a mosquito must bite an infected human, the mosquito must become infected with the virus, and then the infected mosquito must bite a susceptible human. Rather than explicitly model the full transmission cycle, we aggregate the two-part cycle of ZIKV transmission (mosquito-to-human and human-to-mosquito) into a single meta-latent period, and do not explicitly model mosquitoes. For the purposes of this study, we need only ensure that the model produces a realistic human-to-human generation time of ZIKV transmission.
We simulate a Susceptible-Exposed-Infectious-Recovered (SEIR) transmission process stemming from a single ZIKV infection using a Markov branching process model, stratifying for reported and unreported cases. The temporal evolution of the compartments is governed by daily probabilities for infected individuals transitioning between E, I and R states, new ZIKV introductions, and reporting of current infectious cases (Table S7). We assume that infectious cases cause a Poisson distributed number of secondary cases per day (via human-to-mosquito-to-human transmission), and that the reporting rate is low, matching the estimated proportion (~20%) of ZIKV infections that are symptomatic [10]. We make the simplifying assumption that asymptomatic cases transmit ZIKV at the same rate as symptomatic cases, which can be modified if future evidence suggests otherwise.
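The daily update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and parameter names are hypothetical, each period uses a single stage rather than the boxcar stages of Table S7, prevalence is counted as E + I, and a case is flagged as reported at the moment it becomes infectious rather than through a daily reporting probability.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate (Knuth's method; adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_outbreak(beta, p_exit_latent, p_recover, reporting_rate,
                      daily_importations, max_infections=2000,
                      max_days=3650, seed=None):
    """Simulate one outbreak; all names and defaults here are illustrative."""
    rng = random.Random(seed)
    exposed, infectious = 1, 0  # a single initial infection, not yet infectious
    cum_infections, cum_reported, max_prevalence = 1, 0, 1
    for _ in range(max_days):
        # New infections: Poisson secondary cases per infectious person, plus importations.
        local = sum(poisson(beta, rng) for _ in range(infectious))
        imported = poisson(daily_importations, rng)
        # Daily E -> I and I -> R transitions as independent Bernoulli trials.
        onsets = sum(rng.random() < p_exit_latent for _ in range(exposed))
        recoveries = sum(rng.random() < p_recover for _ in range(infectious))
        # Assumption: reporting is decided once, when a case becomes infectious.
        cum_reported += sum(rng.random() < reporting_rate for _ in range(onsets))
        exposed += local + imported - onsets
        infectious += onsets - recoveries
        cum_infections += local + imported
        max_prevalence = max(max_prevalence, exposed + infectious)
        # Stop when the outbreak dies out or the cumulative-infection cap is hit.
        if exposed + infectious == 0 or cum_infections >= max_infections:
            break
    return {
        "cum_infections": cum_infections,
        "cum_reported": cum_reported,
        "max_prevalence": max_prevalence,
        "epidemic": cum_infections >= max_infections and max_prevalence > 50,
    }
```

Calling `simulate_outbreak` repeatedly with different seeds gives an ensemble of outbreaks for one set of epidemiological conditions.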
Serial interval
The ZIKV serial interval, the time from the exposure of a case to the exposure of a secondary case it infects, is estimated to range from 10 to 23 days [16]. Using boxcar models [17] for both the infectious and meta-latent period, we chose model parameters that produce a comparable delay between subsequent human infections rather than fit a more mechanistic model of human and vector infection and latency to ZIKV case data. First, we solved for transition rates that yield a negative binomial distribution of infectious periods with mean duration of 9.88 days (Table S6) [18]. Then, we fit the meta-latent period so that the combined duration of the infectious and meta-latent periods matched the empirical ZIKV serial interval distribution [16], yielding a mean meta-latent period of 10.4 days (95% CI 6-17) and a mean serial interval of 15.3 days (95% CI 9.5-23.5). Given that the meta-latent period includes human and mosquito incubation periods and mosquito biting rates, this range is consistent with the estimated 5.9 day human ZIKV incubation period [18]. This flexible framework can be extended and updated readily as we learn more about ZIKV.
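To illustrate how a boxcar model produces a negative binomial period, the sketch below splits a period into identical stages and advances one stage per day with a fixed probability. The number of stages (3) is a hypothetical value chosen only for illustration; the fitted values are those in Table S6, and the function name is ours.

```python
import random


def boxcar_duration(n_stages, mean_days, rng):
    """Days spent in a period split into n_stages identical stages.

    Each day the individual advances one stage with probability p, so the
    total duration is a sum of geometric waiting times (a negative binomial).
    Requires mean_days >= n_stages so that p <= 1.
    """
    p = n_stages / mean_days  # per-day advance probability that gives the target mean
    days = 0
    for _ in range(n_stages):
        days += 1                  # every stage lasts at least one day
        while rng.random() >= p:   # stay in the stage until the advance succeeds
            days += 1
    return days


rng = random.Random(0)
# Hypothetical 3-stage infectious period with the 9.88-day mean from the text.
samples = [boxcar_duration(3, 9.88, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

More stages with the same mean give a narrower distribution, which is how the shape of each period can be tuned separately from its mean.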
Importation rate
Our analysis assumes that any ZIKV outbreaks in Texas will originate with infected travelers returning from regions with high ZIKV activity. During the first quarter of 2016, 27 travel-associated cases of ZIKV were reported in Texas, with 11 occurring in Houston's Harris County [19], yielding a baseline first quarter estimate of 0.3 imported cases per day throughout Texas. Given the geographic and biological overlap between ZIKV, DENV and chikungunya (CHIKV), we use historical DENV and CHIKV importation data to inform our importation model, while recognizing that future ZIKV importations may be fueled by large epidemic waves in neighboring regions and travel from the 2016 Olympics, and thus far exceed recent DENV and CHIKV importations [20]. In 2014 and 2015, arbovirus introductions into Texas were threefold higher during the third quarter than the first quarter of each year, perhaps driven by seasonal increases in arbovirus activity in endemic regions and the approximately 40% increase in international travel to the US [21]. Taking this as a baseline (lower bound) scenario, we project a corresponding increase in ZIKV importations to 0.9 cases per day (statewide) for the third quarter. We also consider an elevated importation scenario, in which the first quarter cases (27) in Texas represent only the symptomatic (20%) imported cases, corresponding to a projected third quarter importation rate of 4.5 cases per day.
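The statewide rates quoted above follow from simple arithmetic, reproduced here as a check. The 91-day length of the first quarter of 2016 (a leap year) is our assumption about how the daily rate was computed; variable names are illustrative.

```python
Q1_REPORTED_IMPORTS = 27   # travel-associated cases reported in Texas, Q1 2016
Q1_DAYS = 31 + 29 + 31     # January-March 2016 (assumed denominator)
Q3_MULTIPLIER = 3          # third quarter introductions ~3x the first quarter
SYMPTOMATIC_FRACTION = 0.2 # elevated scenario: only 20% of imports observed

baseline_q1 = Q1_REPORTED_IMPORTS / Q1_DAYS        # ~0.3 cases per day
baseline_q3 = baseline_q1 * Q3_MULTIPLIER          # ~0.9 cases per day
elevated_q3 = baseline_q3 / SYMPTOMATIC_FRACTION   # ~4.5 cases per day
```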
To estimate the importation rate for a specific county, we multiply the statewide rate by the county importation probability, which is given by a maximum entropy model [22] that we fit to the 183 DENV, 38 CHIKV, and 31 ZIKV reported Texas importations from 2002 to 2016. DENV, CHIKV, and ZIKV importation patterns differ most noticeably along the Texas-Mexico border. Endemic DENV transmission and sporadic CHIKV outbreaks in Mexico regularly spill over into neighboring Texas counties. In contrast, ZIKV is not yet as widespread in Mexico as it is in Central and South America, with no reported ZIKV importations along the border to date. We included DENV and CHIKV importation data in the model fitting so as to consider potential future importation pressure from Mexico, as ZIKV continues its northward expansion. We analyzed 72 socio-economic, environmental, and travel variables, and used both representative variable selection [23] and predictive variable selection [24] to identify the most informative variables. Near-duplicate variables and those that contributed least to model performance, based on out-of-sample cross validation, were discarded, reducing the original set of 72 variables to 10 (Supplement §1).
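As a worked example of the county-level calculation, the sketch below scales a statewide rate by a county importation probability. The probabilities used are the Harris (27%) and Travis (10%) estimates reported in the Results; the function name is hypothetical.

```python
def county_importation_rate(statewide_rate, county_probability):
    """Expected importations per day in one county."""
    return statewide_rate * county_probability


# Harris County under the elevated third quarter scenario (4.5 cases per day statewide).
harris_elevated = county_importation_rate(4.5, 0.27)   # ~1.2 per day
# Travis County under the baseline third quarter scenario (0.9 cases per day statewide).
travis_baseline = county_importation_rate(0.9, 0.10)   # ~0.09 per day
```

The Harris value matches the roughly 1.2 importations per day cited for that county under the elevated scenario.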
Estimating local transmission rates
Upon ZIKV importation by an infected traveler, the risk of ZIKV emergence will depend on the likelihood of mosquito-borne transmission. For each Texas county, we used the Ross-Macdonald formulation to estimate the ZIKV reproduction number (R0), which is the average number of secondary infections caused by the introduction of a single infectious individual into a fully susceptible population (Supplement §2) [25]. To parameterize the model, we use mosquito life history estimates from a combination of DENV and ZIKV studies and estimated Aedes aegypti abundance for each county [4]. For parameters that are sensitive to temperature (i.e., mosquito mortality and the extrinsic incubation period), we adjusted the estimates using average reported temperatures for the month of August [26]. We then obtained the county-level ZIKV transmission rate, using β = R0 × γ.
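The conversion from R0 to a daily transmission rate can be sketched as follows. The text does not define γ; here we assume it is the reciprocal of the 9.88-day mean infectious period, which is the usual reading in SEIR-type models but remains our assumption. The function name is hypothetical.

```python
MEAN_INFECTIOUS_DAYS = 9.88  # mean infectious period from the serial interval section


def transmission_rate(r0, mean_infectious_days=MEAN_INFECTIOUS_DAYS):
    """Daily transmission rate beta = R0 * gamma, with gamma = 1 / mean infectious period (assumed)."""
    gamma = 1.0 / mean_infectious_days
    return r0 * gamma


# Median R0 values reported for the high risk and unknown risk scenarios.
beta_high_risk = transmission_rate(1.1)   # ~0.11 expected secondary cases per day
beta_unknown = transmission_rate(0.5)     # ~0.05 expected secondary cases per day
```

Under this reading, β multiplied by the mean infectious period recovers R0, so the expected number of secondary cases over a whole infection equals the reproduction number.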
Identifying Surveillance Triggers
Policymakers must often make decisions in the face of uncertainty, such as when and where to initiate ZIKV interventions. Our stochastic framework allows us to address a simple but important question: at what point (after how many reported cases) has the probability of ongoing local transmission reached a critical threshold? The choice of critical threshold will depend on the risk tolerance of the policymaker. A policymaker wishing to trigger interventions early, upon even a low probability of epidemic spread, has a low tolerance for failing to intervene (false negative); a policymaker willing to wait longer has a higher risk tolerance.
For each scenario, we ran 10,000 stochastic simulations, which terminate when the outbreak ceases or cumulative infections reach 2,000. We classify simulations as either epidemics or self-limiting outbreaks; epidemics are those reaching 2,000 cumulative infections with a maximum prevalence above 50 (Fig S3). We define prevalence as the number of current unreported and reported infections. To identify surveillance triggers, we solved for the minimum number of cumulative reported cases (c) that indicates future epidemic expansion with a specified probability (p). That is, we find c such that among all simulations that eventually reach c, at least a fraction p subsequently progress into an epidemic. For example, if we set our threshold probability (risk tolerance) to p=0.5 and derive an epidemic trigger of c=10 reported cases, then we expect outbreaks that reach 10 reported cases to have a 50% chance of future epidemic expansion. In supplementary analysis, we also consider surveillance triggers for detecting when the current prevalence has reached a specified threshold (Fig S6).
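The trigger search described above can be sketched as a scan over candidate values of c. This is an illustrative reading of the procedure, not the authors' code: each simulation is summarized by a hypothetical pair (cumulative reported cases at termination, epidemic flag), and the search is capped at a user-chosen `max_trigger` because epidemic simulations stop at the cumulative-infection cap.

```python
def find_trigger(simulations, p, max_trigger=100):
    """Smallest c such that at least a fraction p of simulations reaching
    c cumulative reported cases are classified as epidemics.

    simulations: list of (final_cumulative_reported, is_epidemic) pairs.
    Returns None if no c up to max_trigger meets the threshold.
    """
    for c in range(1, max_trigger + 1):
        reached = [is_epidemic for reported, is_epidemic in simulations
                   if reported >= c]
        if not reached:
            return None  # no simulation ever reports this many cases
        if sum(reached) / len(reached) >= p:
            return c
    return None


# Toy ensemble: six outbreaks, two of which become epidemics.
toy = [(1, False), (1, False), (2, False), (2, False), (3, True), (5, True)]
trigger_50 = find_trigger(toy, 0.5)   # 2: half of the outbreaks reaching 2 cases are epidemics
trigger_70 = find_trigger(toy, 0.7)   # 3: every outbreak reaching 3 cases is an epidemic
```

A lower threshold p (a more risk averse policymaker) can only yield the same or an earlier trigger, since any c satisfying a higher threshold also satisfies a lower one.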
Uncertainty Analysis
Given the considerable uncertainty regarding ZIKV epidemiology, we solve for surveillance triggers assuming that both R0 and importation rate are unknown, but lie within a plausible range for Texas. We consider two scenarios: a high risk scenario, where R0 is thought to exceed one, and a completely unknown risk scenario (Fig S5). For each case, we derive triggers by randomly drawing 10,000 simulations from all combinations of Texas (baseline) importation and transmission rates from either the high risk Texas counties (R0 ≥1) or all counties.
Results
To develop surveillance triggers for Texas, we first project county-level ZIKV importation and transmission rates for August 2016. ZIKV importation risk within Texas is predicted by variables reflecting urbanization, mobility patterns, and socioeconomic status (Table S4), and is concentrated in metropolitan counties of Texas (Fig 2A). The highest risk counties, Harris (with an estimated 27% chance of receiving the next imported Texas case) and Travis (10%), which includes Austin, both contain international airports. Other high risk regions include Brazos County, the Dallas and San Antonio metropolitan areas, and several counties along the Texas-Mexico border.
We also estimate county-level risks of autochthonous ZIKV transmission (Fig 2B), and find that the majority of Texas counties (87%) have an estimated R0<1, and thus are unlikely to sustain local epidemics. The Southeast region of Texas has the highest estimated transmission risk, driven primarily by high mosquito habitat suitability. These estimates are sensitive to uncertainty in several parameters (Fig S1-S2), and should be updated as we learn more about ZIKV. Without perfect estimates for county R0, we develop plausible distributions for a county of high or unknown risk for local transmission using our 254 county estimates, resulting in a median high risk R0 of 1.1 (95% CI: 1.0-1.8) and median unknown risk R0 of 0.5 (95% CI: 0.0-1.2).
Under a single set of epidemiological conditions, a wide range of outbreaks is possible (Fig 3A). The relationship between what policymakers can observe (cumulative reported cases) and what they wish to know (current prevalence) can be obscured by such uncertainty, and will depend critically on both the transmission and reporting rates (Fig 3B). If key drivers, such as R0, can be estimated with confidence, then the breadth of possibilities narrows, enabling more precise surveillance. For example, under a known moderate R0 scenario, ten cumulative reported cases correspond to an expected prevalence of 6 with a 95% CI of 1-15; under an unknown high R0 scenario, the same number of cases corresponds to an expected prevalence of 10 with a much wider 95% CI of 2-33 (Fig 3B).
We apply our model to address the question: at what point (after how many reported cases) has the probability of ongoing local transmission reached a critical threshold? Under both a known moderate risk and unknown high risk scenario, we track the probability of epidemic expansion following each additional reported case (Fig 3C). Across the full range of reported cases, the probability of epidemic spread will always be higher under the high risk scenario, with the moderate risk scenario showing more sensitivity to the reporting rate. These curves can support both real-time risk assessment as cases accumulate and the identification of surveillance triggers indicating when risk exceeds a specified threshold. For example, suppose a policymaker wishes to initiate an intervention when the chance of sustained transmission exceeds 50%. In the high risk scenario, they should act immediately following the 2nd or 3rd reported case; in the moderate risk scenario, the corresponding trigger ranges from 3 to 10 reported cases, depending on the reporting rate. As the policymaker's threshold (risk tolerance) increases, the recommended surveillance triggers can increase by orders of magnitude.
We determine county-level surveillance triggers throughout Texas, assuming that a policymaker would act when the probability of sustained local transmission reaches 70%, under two importation scenarios: (1) a baseline importation rate extrapolated from recent importations to August 2016 (Fig 4A), and (2) an elevated importation scenario assuming that only one fifth of ZIKV importations (the symptomatic proportion) have been observed (Fig 4B). Under baseline importations, only 21 of the 254 counties in Texas are expected to attain the trigger conditions (Fig 4A), and have triggers ranging from 1 (Starr County) to 71 (Bastrop County) reported cases, with a mean of 15. The San Antonio metropolitan region appears to be most at risk, with almost every county capable of epidemics and triggers ranging from 2 to 5 reported cases. The greater Houston metropolitan area also is at high risk, with surveillance triggers of 2 (Brazoria) and 3 (Fort Bend) cases.
Under the elevated importation rate scenario, the mean surveillance triggers decrease by 4 reported cases (Fig 4B) and the size of Texas' population at risk for sustained ZIKV transmission is expected to increase from ~14% to ~30%. If the importation rate is sufficiently high, even lower risk regions (R0 just below one) can experience sustained outbreaks. For example, under the elevated importation scenario, Harris County, with an estimated R0=0.8, is likely to suffer persistent transmission, fueled by an expected 1.2 importations per day. Notably, the Dallas metropolitan area is projected to be at minimal risk for sustained transmission under both scenarios.
To avoid confusion and facilitate implementation, policymakers may wish to issue common guidelines for all counties, such as a common surveillance trigger that provides robust warning across the plausible range of local conditions. Assuming a 20% reporting rate, we consider the implications of a statewide trigger of two reported cases (Fig 4C). Upon seeing two cases, the epidemic risk varies widely, with most counties having near zero probabilities of a sustained outbreak and a few counties far exceeding 50%. For example, two reported cases in Starr County, along the Texas-Mexico border, correspond to a 98% chance of ongoing transmission.
Discussion
US public health authorities are responding to ZIKV importations and preparing for the possibility of ZIKV outbreaks in vulnerable regions. A key challenge is knowing when and where to initiate interventions based on potentially sparse and biased ZIKV case reports. Our simple model is designed to address this challenge by developing robust surveillance triggers for epidemic events, such as reaching a threshold of current ZIKV infections indicative of future epidemic expansion. We demonstrate its application across the 254 ecologically and demographically diverse counties of Texas, a high risk state [5,6,8]. Based on county-level estimates for ZIKV importation and transmission rates (Fig 2), we expect that most Texas counties are not at risk for a sustained ZIKV epidemic (Fig 4A). However, 30% of Texas' population resides in vulnerable regions, including the cities of Austin, San Antonio, and Waco along the I-35 corridor, Houston, and the Rio Grande Valley. The higher the ZIKV importation rate in these locations, the higher the chance of sustained transmission (Fig 4B). However, even in the highest risk regions of Texas, we expect far more limited ZIKV transmission than observed in Central America, South America, and Puerto Rico, where R0 has been estimated to be as high as 11 [25,27,28]. This is consistent with recent introductions of DENV and CHIKV into Texas, which have failed to spark large epidemics.
As an example, we suppose that a policymaker wishes to initiate interventions upon the epidemic probability reaching 70%. The recommended triggers range from 1 to 71 reported cases, across the 21 Texas counties capable of sustaining such outbreaks. The model can also be used to assess universal triggers intended to robustly detect sustained transmission across a range of local conditions. A risk averse policymaker will likely select a very low trigger, ensuring early detection in the riskiest sites at the cost of false alarms in low risk locations (Fig 4C). Note that our triggers include all imported and locally transmitted cases; trigger specificity may be improved by rapidly and accurately excluding imported cases from the count and restarting the count upon sufficient temporal separation of cases.
In Texas' vulnerable regions, where the estimated reproduction numbers lie above one, frequent importations can compound the epidemic risk. Additionally, importations in counties where estimated reproduction numbers lie below one, like Harris, can spark substantial transmission (Fig 4B). These findings apply only to the early, pre-epidemic phase of ZIKV in Texas, when travel from affected regions outside the contiguous US is the primary importation source. If self-sustaining outbreaks emerge within Texas, there would likely be county-to-county importations, which are not yet included in the model and could increase risk in counties surrounding high risk regions.
Importantly, our analyses rest on the very recent and limited scientific investigations of ZIKV's biology and epidemiology, and should be continually updated as our understanding matures. Furthermore, while we demonstrated triggers for a few plausible scenarios, the design of a trigger--both the event to be detected and the probability threshold upon which to take action--requires extensive public health expertise and deliberation. Nonetheless, this simple framework offers a flexible means for bringing current data and expert knowledge to assist critical public health decision making.
The reporting rate dictates the relationship between observed cases and the underlying outbreak, and its magnitude impacts the timeliness and precision of detection. If only a small fraction of cases are reported, the first few reported cases may correspond to a wide range of underlying epidemiological conditions, from isolated introductions to a growing epidemic. In contrast, if most cases are reported, policymakers can wait longer (in terms of the number of detected cases) to trigger interventions and have more confidence in their epidemiological assessments. ZIKV reporting rates are expected to remain quite low, because an estimated 80% of infections are asymptomatic, and DENV reporting rates have historically matched its asymptomatic proportion [10,29]. Obtaining a realistic estimate of the ZIKV reporting rate is arguably as important as increasing the rate itself, with respect to reliable situational awareness and forecasting. An estimated 8-22% of ZIKV infections were reported during the 2013-2014 outbreak in French Polynesia [28]; similarly, an estimated 10% have been reported during the ongoing epidemic in Colombia [27]. While these provide a baseline estimate for the US, there are many factors that could increase (or decrease) the detection rate, such as ZIKV awareness among both the public and health-care practitioners. Thus, rapid estimation of the reporting rate should be a high priority. While some methods require extensive epidemiological data not typically available early in an outbreak [30], a new method exploiting early outbreak viral sequence data was introduced during the recent West African Ebola epidemic [31]. However, as of May 2016, there are no US ZIKV sequences available on GenBank and few available from other regions.
Our flexible framework can help policymakers make sound risk assessments from noisy and biased surveillance data. It forces analysts to be explicit about risk tolerance, that is, the certainty needed before sounding an alarm. For example, should ZIKV-related pregnancy advisories be issued when there is only 5% chance of an impending epidemic? 10% chance? 80%? A policymaker has to weigh the costs of false positives--resulting in unnecessary fear and/or intervention--and false negatives--resulting in suboptimal disease control and prevention--complicated by the difficulty inherent in distinguishing a false positive from a successful intervention. The more risk averse the policymaker (with respect to false negatives), the earlier the trigger should be; low reporting rates, high importation rates, and high ZIKV transmission potential further argue for earlier triggers. In ZIKV prone regions with low reporting rates, even risk tolerant policymakers should act quickly upon seeing initial cases; in lower risk regions, longer waiting periods may be prudent.
Acknowledgments
We acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing high performance computing resources that have contributed to the research results reported within this paper. URL: http://www.tacc.utexas.edu.