Abstract
Background Lyme disease is the most common vector-borne disease in temperate zones and a growing public health threat in the US. Tick life cycles and disease transmission are highly sensitive to climatic conditions but determining the impact of climate change on Lyme disease burden has been challenging due to the complex ecology of the disease and the presence of multiple, interacting drivers of transmission.
Objectives We estimated the impact of prior temperature and precipitation conditions on US Lyme disease incidence and predicted the effect of future climate change on disease.
Methods We incorporated 17 years of annual, county-level Lyme disease case data in a panel data statistical modeling approach to investigate prior effects of climate change on disease while controlling for other putative drivers. We then used these climate-disease relationships to forecast Lyme disease cases using CMIP5 global climate models and two potential climate scenarios (RCP 4.5 and RCP 8.5).
Results We find that climate is a key driver of Lyme disease incidence across the US, but the relevant climate variables and their effect sizes vary strongly between regions, with larger effects apparent in the Northeast and Midwest where Lyme disease incidence has recently increased most substantially. In both of these regions, key climate predictors included winter temperatures, spring precipitation, dry summer weather, and temperature variability. Further, we predict that total US Lyme disease incidence will increase significantly by 2100 under a moderate emissions scenario, with nearly all of the additional cases occurring in the Northeast and Midwest.
Conclusions Our results demonstrate a regionally-variable and nuanced relationship between climate change and Lyme disease and highlight the need for improved preparedness and public health interventions in endemic regions to minimize the impact of further climate change-induced increases in Lyme disease burden.
Introduction
Arthropod-transmitted pathogens and the diseases they cause pose a severe and growing threat to global public health (World Health Organization 2014). Because vector life cycles and disease transmission are highly sensitive to abiotic conditions (Mattingly 1969; Sonenshine and Roe 2013), climate change is expected to alter the magnitude and geographic distribution of vector-borne diseases (Kilpatrick and Randolph 2012; World Health Organization 2014). Climatic changes, in particular warming temperatures, have already facilitated expansion of several vector species (e.g., Purse et al. 2005; González et al. 2010; Roiz et al. 2011; Clow et al. 2017a), and have been associated with increased vector-borne disease incidence (e.g., Loevinsohn 1994; Subak 2003; Hii et al. 2009). Identifying areas of high risk for current and future vector-borne disease transmission under climate change is critical for mitigating disease burden. However, the presence of interacting drivers of disease transmission such as land use change and globalization, and the complex ecology of vector-borne disease make this effort challenging (Lafferty and Mordecai 2016; Mills et al. 2010; Ostfeld and Brunner 2015; Rogers and Randolph 2006; Tabachnick 2010).
This challenge is particularly apparent in the case of Lyme disease, the most common vector-borne disease in temperate zones (Kurtenbach et al. 2006; Rizzoli et al. 2011; Rosenberg et al. 2018), because transmission depends on a complex sequence of biotic interactions between vector and numerous host species that may respond differently to environmental change (Ostfeld 1997). In the US, Lyme disease is caused by the bacterium Borrelia burgdorferi and is vectored by two tick species: Ixodes scapularis in the eastern and midwestern US and Ixodes pacificus in the western US. After hatching from eggs, both tick species pass through three developmental stages (larva, nymph, and adult), during each of which they take a single blood meal from a wide range of vertebrate hosts before transitioning to the next developmental stage or reproducing (Sonenshine and Roe 2013). This life cycle takes 2-3 years to complete, 95% of which is spent at or below the ground surface in diapause, seeking a host, digesting a blood meal, or molting (Ostfeld and Brunner 2015; Sonenshine and Roe 2013).
Given their long life spans, inability to regulate their body temperature, and high degree of interaction with the physical environment, ticks are highly sensitive to changes in climatic and weather conditions (Sonenshine and Roe 2013). Prior research has demonstrated that temperature and moisture conditions at the ground surface strongly influence tick mortality, development, and host-seeking abilities (Ostfeld and Brunner 2015). In particular, high temperatures and low humidity decrease I. scapularis and I. pacificus survival (Bertrand and Wilson 1996; Nieto et al. 2010; Stafford 1994) and host-seeking activity (Lane et al. 1995; MacDonald et al. 2019b; Schulze et al. 2001; Vail and Smith 1998), while cold temperature extremes cause significant mortality (Lindsay et al. 1995; Vandyk et al. 1996). Accordingly, temperature and precipitation are important predictors of these tick species’ latitudinal and altitudinal range limits (Berger et al. 2014a; Brownstein et al. 2003; Estrada-Peña 2002; Leighton et al. 2012; McEnroe 1977; Ogden et al. 2005), and changes in climatic conditions have been associated with northward range shifts of I. scapularis (Clow et al. 2017b, 2017a; Ogden et al. 2014a).
While the movement of vector species to higher latitudes suggests an associated impending increase in Lyme disease with further climate warming, the direct impacts of climate on Lyme disease cases are difficult to measure given the influence of many non-climate-related factors (Kilpatrick et al. 2017). As a result, the few studies that have attempted to determine the impact of climate conditions on Lyme disease incidence have yielded conflicting results. For example, studies have found positive associations between incidence and each of the following: average spring precipitation (McCabe and Bunnell 2004), June moisture index in the region two years prior (Subak 2003), fewer dry summer days (Burtis et al. 2016), warmer winter temperatures in the prior year (Subak 2003), and increasing average annual temperature (Dumic and Severnini 2018; Robinson et al. 2015). However, other studies failed to detect an effect of temperature on incidence (McCabe and Bunnell 2004; Schauber et al. 2005) or found the timing of climatic changes to be inconsistent with the timing of variation in Lyme disease cases (Randolph 2010). Many of these studies were also limited in geographic scope (Burtis et al. 2016; Dumic and Severnini 2018; McCabe and Bunnell 2004; Robinson et al. 2015; Subak 2003) and/or used modeling techniques that did not account for confounding variables that might influence disease incidence (Subak 2003; McCabe and Bunnell 2004). Further, while the rise in Lyme disease cases in the US has occurred concurrently with climatic changes promoting tick suitability, demonstrating causal relationships is challenging (Ostfeld and Brunner 2015). This has led some to argue that climate change is merely the backdrop for rising tick-borne disease incidence (Randolph 2010), while other factors such as increasing physician awareness are the true drivers of increased disease burden (Morshed et al. 2006; Scott and Scott 2018).
Nonetheless, a recent CDC study on vector-borne disease burden in the US showed a dramatic rise in Lyme disease (Rosenberg et al. 2018), and much of the extensive media coverage of this report asserted the role of climate change. Despite this media attention, as well as strong known relationships between climate conditions and key features of vector ecology, the evidence for climate change as a driver of increasing Lyme disease incidence in the US remains equivocal.
In this study, we investigate the role of past climatic conditions on Lyme disease incidence across the US using a 17-year, county-level Lyme disease case reporting dataset and explicitly controlling for other drivers of disease burden. Specifically, we ask: How has interannual variation in climate conditions contributed to changes in Lyme disease incidence? We include climate variables capturing changes in temperature and precipitation conditions and investigate how relationships between climate and Lyme disease outcomes vary across different regions of the US. To avoid drawing spurious conclusions about the effects of climate, we analyze the effects of other known and potential drivers of disease incidence such as changing forest cover, public awareness of tick-borne disease, and health-seeking behavior, and use a statistical approach that explicitly accounts for unobserved heterogeneity in disease incidence between counties and years. We then use these modeled, regionally-specific relationships between climate and Lyme disease burden to ask: How is US Lyme disease incidence expected to change under future climate scenarios? We report the predicted change in Lyme disease incidence for individual US regions in 2040 – 2050 and 2090 – 2100 relative to hindcasted 2010 – 2020 levels under two potential climate scenarios: RCP 8.5, which reflects the upper range of the literature on emissions, and RCP 4.5, which reflects a moderate mitigation scenario (Hayhoe et al. 2017).
Methods
Lyme disease case data
We obtained annual, county-level reports of Lyme disease cases spanning 2000 to 2017 from the US Centers for Disease Control and Prevention (CDC) (Supplementary Methods). These disease case data provide the most spatially-resolved, publicly available surveillance data in the US. Raw case counts were converted to incidence—the number of cases per 100,000 people—for each year using annual county population sizes from the US Census Bureau (USCB).
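As a minimal illustration of the incidence conversion described above, the following Python sketch computes cases per 100,000 people for a single county-year. The function name and the example numbers are hypothetical and are not taken from the CDC or USCB data.

```python
def incidence_per_100k(cases: int, population: int) -> float:
    """Return reported cases per 100,000 residents for one county-year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Hypothetical county-year: 42 reported cases among 350,000 residents.
example_incidence = incidence_per_100k(42, 350_000)  # about 12 cases per 100,000
```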
Climate data
We calculated the following variables to capture climate conditions relevant for tick-borne disease transmission: average winter temperature lagged one year; average spring precipitation; the number of hot, dry days in May – July; cumulative average temperature; cumulative daily precipitation; temperature variance; and precipitation variance (Table 1). Details about how these variables were calculated and their biological relevance are listed in Table 1. For past climate conditions, we obtained daily, county-level average temperature and total precipitation data from the National Oceanic and Atmospheric Administration (NOAA) weather stations accessed via the CDC’s Wide-ranging Online Data for Epidemiological Research (WONDER) database.
To estimate future climate variables, we used CMIP5 modeled temperature and precipitation data available from NASA Goddard Institute for Space Studies global climate models (Schmidt et al. 2014). Specifically, we obtained estimates of daily near-surface air temperature and precipitation through 2100 under the upper climate change scenario (RCP 8.5) and a moderate climate change scenario (RCP 4.5) (Taylor et al. 2012; van Vuuren et al. 2011). These climate scenarios are relatively similar in the radiative forcing levels assumed through 2050 but diverge substantially in the latter half of the century. Climate estimates from these two scenarios are provided at a 2° x 2.5° resolution; values were then ascribed to counties based on county latitude and longitude (Supplementary Methods). Mean values for hindcasted and forecasted climate variables for each region are listed in Supplementary Table 1.
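One simple way to ascribe gridded climate values to counties is a nearest-cell lookup on the county centroid, sketched below in Python. This is an assumption made for illustration only: the actual assignment procedure is described in the Supplementary Methods, and the grid origin (`lat0`, `lon0`), the 0-360 longitude convention, and the function name are hypothetical.

```python
def nearest_grid_index(lat, lon, lat0=-90.0, lon0=0.0, dlat=2.0, dlon=2.5):
    """Return (row, col) of the grid cell whose center is nearest to a point.

    Assumes cell centers lie at lat0 + row * dlat and lon0 + col * dlon,
    with longitudes expressed on a 0-360 degree scale (hypothetical layout).
    """
    lon = lon % 360.0                      # convert e.g. -72.9 to 287.1
    n_cols = int(360 / dlon)               # 144 columns at 2.5 degree spacing
    row = round((lat - lat0) / dlat)
    col = round((lon - lon0) / dlon) % n_cols
    return row, col


# Hypothetical county centroid at 41.3 N, 72.9 W.
cell = nearest_grid_index(41.3, -72.9)
```

Every county whose centroid maps to the same cell would then receive that cell's modeled temperature and precipitation series.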
Awareness data
We controlled for variation in public awareness of ticks and Lyme disease using data from Google Trends on the frequency of “ticks” as a search term. We obtained data on “ticks” search frequency, normalized for a given location and year, for 2004 (the first year the data were available) to 2017. We also initially used “tick bite” and “Lyme disease” as search terms, but found that these generated nearly identical coefficient estimates; we therefore proceeded using only the “ticks” search term as a predictor. Search frequency data were aggregated at the designated market area (DMA), the smallest spatial scale available. Search frequency values for a given DMA, which contained an average of 14 counties, were thus applied equally to all counties therein. We also calculated a 1-year lagged version of the tick search variable, as awareness of tick-borne disease is likely endogenous to disease reporting, and using predetermined values reduces endogeneity concerns (Bascle 2008).
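The broadcasting of DMA-level search values to counties, and the 1-year lag, can be sketched as follows. The DMA names, county names, and index values are invented for illustration; only the logic (same value for every county in a DMA, previous year's value for the lagged variable) follows the description above.

```python
# Hypothetical DMA-level search index, keyed by (DMA, year).
search_index = {
    ("DMA_A", 2004): 40, ("DMA_A", 2005): 55,
    ("DMA_B", 2004): 10, ("DMA_B", 2005): 12,
}

# Hypothetical county-to-DMA lookup; each county belongs to exactly one DMA.
county_to_dma = {"county_1": "DMA_A", "county_2": "DMA_A", "county_3": "DMA_B"}


def county_search(county, year, lag=0):
    """Search index assigned to a county for `year`, lagged by `lag` years.

    Returns None when the (possibly lagged) year is not in the data,
    e.g. the lagged value for the first available year.
    """
    return search_index.get((county_to_dma[county], year - lag))
```

For example, `county_search("county_2", 2005, lag=1)` returns the 2004 value of the DMA containing that county.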
Health-seeking behavior data
We explicitly controlled for variation in health-seeking behavior, previously posited as a driver of Lyme disease reporting (Armstrong et al. 2001; Wilking and Stark 2014), by including the following three variables: diabetes incidence, health insurance coverage, and poverty. Diabetes was selected as a healthcare-seeking proxy as the behavioral drivers of healthcare seeking that drive diabetes reporting are likely to be similar to those of Lyme disease. Namely, the early symptoms of diabetes are often vague (Harris and Eastman 2000) and an individual’s ability and decision to seek healthcare plays a large role in whether their case is recorded, as reflected in the substantial underreporting of this disease (Anwar et al. 2011; Doshi et al. 2010; Harris and Eastman 2000). We obtained annual, county-level data for 2004 to 2015 on the percentage of adults aged 20+ years diagnosed with type 1 or type 2 diabetes from the CDC’s US Diabetes Surveillance System. To capture variation in healthcare access, we included the annual percentage of county residents with any form of health insurance coverage using data for 2005 to 2017 provided by the USCB’s Small Area Health Insurance Estimates (SAHIE) program. Lastly, we used data from the USCB to include the percentage of county residents living in poverty as a predictor, as poverty has been significantly negatively associated with healthcare-seeking behavior (Bourne 2009; Kirby and Kaneda 2005).
Land cover data
We included two land cover variables putatively associated with higher tick-borne disease risk: the percent forest in a given county and year, and the percent mixed development (Brownstein et al. 2005b; Dister and Fish 1997; Frank et al. 1998; Glass et al. 1995; Killilea et al. 2008; MacDonald et al. 2019a). We calculated these variables using 30-m resolution land cover data from the US Geological Survey (USGS) National Land Cover Database (NLCD) (Yang et al. 2018). Percent forest included any deciduous, evergreen, or mixed forest. Mixed development was defined as areas with a mixture of constructed materials and vegetation, including lawn grasses, parks, golf courses, and vegetation planted in developed settings. We calculated county-level values of these land cover variables for 2001, 2004, 2006, 2008, 2011, 2013, and 2016 as these are the only years the NLCD dataset is currently available.
To estimate future land cover variables, we used USGS land cover projections available through 2100 (Sohl et al. 2014). We used modeled land cover data from two land-use change scenarios corresponding to the Intergovernmental Panel on Climate Change (IPCC) Special Report on Emission Scenarios (SRES). We used scenario B1, which reflects lower urban development, to align with the moderate climate change scenario (RCP 4.5), and scenario A1B, which reflects higher urban development and forest clearing, to align with the upper climate change scenario (RCP 8.5) (Nakicenovic et al. 2000; Rogelj et al. 2012; Sohl et al. 2014). Using these data, we again calculated annual, county-level values of percent forest cover and mixed development. However, as the ‘mixed development’ land cover class was not included in the projected data, we instead used the ‘mechanically disturbed’ public or private land cover class (Supplementary Methods).
Regional divisions
Given the large variation in climatic conditions across the US, as well as variation in ecological dynamics of tick-borne disease such as tick species identity, tick densities, tick questing behavior, and host community composition (Eisen et al. 2016; Kilpatrick et al. 2017; Ostfeld 1997; Salkeld and Lane 2010), we examined regional differences in climate-disease relationships. We used the US Fish & Wildlife Service regional boundaries to divide the US into the following seven regions for analysis: Northeast, Midwest, Mountain Prairie, Pacific, Pacific Southwest, Southwest, and Southeast (Figure 1). These regional divisions were selected as they roughly correspond to genetic structuring of I. scapularis and I. pacificus (Humphrey et al. 2010; Kain et al. 1997, 1999) and are likely distinct in environmental conditions and resources (Ricketts et al. 1999; Smith et al. 2018). Further, each region contains only one vector species: I. scapularis in the Northeast, Midwest, Southeast, and Southwest, and I. pacificus in the Pacific and Pacific Southwest (Dennis et al. 1998). As neither species has an established presence in the Mountain Prairie, this region was removed from the analysis. Regional descriptions, including the population size (as of 2017), the number of counties, and the average climate conditions, are provided in Supplementary Table 2.
Statistical approach
We used a least squares dummy variable (termed “fixed-effects” in econometrics) regression approach to estimate changes in Lyme disease incidence using repeated observations of the same groups (counties) from 2000 – 2017 (Larsen et al. 2019). We included ‘county’ and ‘year’ as dummy variables to control for any unobserved heterogeneity that may influence reported Lyme disease burden in a particular county across all years (e.g., number of health care providers), or influence Lyme disease in all counties in a given year (e.g., changes in disease case definition). All counties (n = 2,232) for which there was complete data on Lyme disease cases, climate, and other predictors were included.
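The least squares dummy variable idea can be illustrated with a short NumPy sketch: the predictors are augmented with one indicator column per county and per year (dropping one year to avoid perfect collinearity), and ordinary least squares is run on the combined design matrix. The function name, the synthetic data, and the single predictor are assumptions for illustration and do not reproduce the models fitted in this study.

```python
import numpy as np


def lsdv_fit(y, X, county, year):
    """OLS with county and year dummies; returns coefficients on X only."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    _, c_idx = np.unique(county, return_inverse=True)
    _, t_idx = np.unique(year, return_inverse=True)
    C = np.eye(c_idx.max() + 1)[c_idx]          # one dummy per county
    T = np.eye(t_idx.max() + 1)[t_idx][:, 1:]   # drop first year (reference)
    design = np.hstack([X, C, T])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[: X.shape[1]]


# Synthetic, noise-free panel: 5 counties observed over 6 years.
rng = np.random.default_rng(0)
counties = np.repeat(np.arange(5), 6)
years = np.tile(np.arange(2000, 2006), 5)
winter_temp = rng.normal(size=30)
county_effect = np.array([3.0, -1.0, 0.5, 2.0, 0.0])[counties]
year_effect = 0.2 * (years - 2000)
incidence = 1.5 * winter_temp + county_effect + year_effect

coef = lsdv_fit(incidence, winter_temp, counties, years)
```

Because the county and year effects are absorbed by the dummies, the estimated slope on the climate predictor recovers the value used to generate the synthetic data (1.5), even though county baselines differ widely.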
To account for regional variation in the predictors of tick-borne disease incidence (Raghavan et al. 2014; Wimberly et al. 2008), we ran separate models for each US region (see Methods: Regional divisions). We used stepwise variable selection, in which variables were added if they reduced model Akaike information criterion (AIC) by 2 or more, to identify the climate, land cover, and non-ecological predictors that best explained Lyme disease incidence in each region (Yamashita et al. 2007; Zhang 2016). We assessed the multicollinearity of these models by calculating the variance inflation factor (VIF). No predictors had VIF values greater than 10 after the stepwise variable selection procedure, thus we did not remove any variables from the final models due to high collinearity (Hair et al. 2014).
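A forward version of the AIC-based selection rule, together with a variance inflation factor check, might look like the sketch below. It is a simplified illustration under stated assumptions: selection starts from an intercept-only model, county and year dummies are omitted for brevity, the AIC is the Gaussian form up to an additive constant, and all function and variable names are hypothetical.

```python
import numpy as np


def aic(y, X):
    """Gaussian AIC (up to an additive constant) for an OLS fit of y on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * X.shape[1]


def forward_select(y, candidates, min_drop=2.0):
    """Repeatedly add the candidate that lowers AIC most; stop when the
    best available reduction is smaller than `min_drop`."""
    X = np.ones((len(y), 1))                 # intercept-only starting model
    best, chosen = aic(y, X), []
    while len(chosen) < len(candidates):
        trials = {name: aic(y, np.column_stack([X, col]))
                  for name, col in candidates.items() if name not in chosen}
        name = min(trials, key=trials.get)
        if best - trials[name] < min_drop:
            break
        chosen.append(name)
        X = np.column_stack([X, candidates[name]])
        best = trials[name]
    return chosen


def vif(target, others):
    """Variance inflation factor, 1 / (1 - R^2), from regressing one
    predictor on the remaining predictors plus an intercept."""
    Z = np.column_stack([np.ones(len(target))] + list(others))
    beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
    rss = np.sum((target - Z @ beta) ** 2)
    tss = np.sum((target - target.mean()) ** 2)
    return tss / rss                          # since 1 - R^2 = rss / tss


# Toy data: y depends on x1 only; x2 is orthogonal to both x1 and y.
x1 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
x2 = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
noise = 0.5 * np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
y = 3 * x1 + noise

selected = forward_select(y, {"x1": x1, "x2": x2})
vif_x1 = vif(x1, [x2])
```

In this toy example only `x1` is retained, because adding `x2` raises the AIC by the 2-point penalty without reducing the residual sum of squares, and the VIF for `x1` is 1 because the two predictors are uncorrelated.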
We accounted for spatial autocorrelation of observations by using cluster-robust standard errors. This nonparametric approach accounts for arbitrary forms of autocorrelation within a defined “cluster” to avoid misleadingly small standard errors and test statistics (Cameron and Miller 2015). We specified clusters as US Agricultural Statistics Districts (ASDs) as these districts contain contiguous counties grouped by similarities in soil type, terrain, and climate. When reporting on the significance of a predictor, we use standard errors and p-values calculated using this correction.
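The cluster-robust ("sandwich") variance estimator can be sketched in a few lines of NumPy. Residuals are summed within each cluster before being squared, so arbitrary correlation among observations in the same cluster is allowed. This sketch omits the finite-sample corrections that statistical packages commonly apply, and the function name and toy data are hypothetical.

```python
import numpy as np


def cluster_robust_se(X, y, clusters):
    """OLS coefficients and cluster-robust standard errors (no small-sample
    correction): V = (X'X)^-1 [sum_g X_g' u_g u_g' X_g] (X'X)^-1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        mask = clusters == g
        score = X[mask].T @ resid[mask]      # summed score within cluster g
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Toy intercept-only example: four observations in two clusters.
X_toy = np.ones((4, 1))
y_toy = np.array([1.0, 3.0, 2.0, 6.0])
beta_toy, se_toy = cluster_robust_se(X_toy, y_toy, [0, 0, 1, 1])
```

In the toy example the estimated mean is 3, the within-cluster residual sums are -2 and 2, and the resulting standard error is the square root of 8/16 = 0.5.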
Lyme disease forecasting
We forecasted Lyme disease incidence using the climate and land cover variables included in the best model for each region as well as a county dummy variable. Non-ecological predictors were not included as projections for these variables are unavailable. Using these models, we obtained regional estimates for Lyme disease incidence under the upper and moderate climate change scenarios (RCP 8.5 and RCP 4.5) for 2040 – 2050 and 2090 – 2100. We calculated county-level changes in Lyme disease incidence by subtracting modeled incidence for 2010 – 2020 from forecasted incidence generated using the same modeled climate and land cover data sources. We converted predicted Lyme disease incidence to cases by assuming county population sizes remained the same as those in 2017. As the USCB projects a 75% increase in US population size by 2100 (under the most likely scenario regarding fertility, mortality, immigration, and emigration rates) (U.S. Census Bureau 2000), our estimates of the number of additional Lyme disease cases are conservative. To generate rough predictions of Lyme disease case counts under population growth, we provide estimates that assume a 75% increase in population size relative to 2017 within each county. We report point estimates and 95% prediction intervals when discussing predicted changes in Lyme disease case counts.
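The conversion from a predicted change in incidence to a change in case counts, with and without the assumed 75% population growth, amounts to the arithmetic sketched below. The function name and the example county values are hypothetical.

```python
def additional_cases(hindcast_incidence, forecast_incidence,
                     population_2017, growth=0.0):
    """Change in annual cases implied by a change in incidence.

    Incidences are in cases per 100,000 people. `growth` is the assumed
    fractional population increase relative to 2017 (0.75 for +75%).
    """
    delta = forecast_incidence - hindcast_incidence
    return delta / 100_000 * population_2017 * (1 + growth)


# Hypothetical county: incidence rises from 20 to 26 per 100,000,
# with a 2017 population of 500,000.
static_pop = additional_cases(20.0, 26.0, 500_000)               # about 30 cases
with_growth = additional_cases(20.0, 26.0, 500_000, growth=0.75)  # about 52.5 cases
```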
Model validation
We assessed predictive model accuracy by comparing hindcasted Lyme disease incidence under both emissions scenarios to observed values for 2008 – 2017 (Clark et al. 2001; Judge et al. 1985). We also compared model accuracy under varying model specifications. In the first specification, each regional model contained the predictors (climate, land cover, and non-ecological) determined through variable selection (see Methods: Statistical approach) as well as county and year dummy variables. In the second specification, each regional model contained all available predictors (7 climate predictors, 2 land cover predictors, and 4 non-ecological predictors) and the county and year dummy variables. Under the third specification, regional models contained all available predictors but no dummy variables. Under each of these specifications, we created models of Lyme disease incidence on a training dataset containing a randomly selected 75% subset of counties and years and used the withheld 25% of observations for validation (Caldwell et al. 2016; Hijmans 2012). To evaluate the performance of each model specification, we calculated the root-mean-square error and correlation coefficient between predicted and actual Lyme disease incidence for 2006 – 2013 (the years with complete data for all predictors) for each regional model.
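The hold-out evaluation described above relies on two standard metrics, which can be sketched in plain Python along with a random 75/25 split. The helper names and the fixed seed are illustrative assumptions, not the procedure used to produce the reported results.

```python
import math
import random


def rmse(pred, obs):
    """Root-mean-square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson_r(pred, obs):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(obs)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    return cov / (sd_p * sd_o)


def train_test_split(rows, train_frac=0.75, seed=0):
    """Shuffle county-year rows and split them into training and validation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]


# Hypothetical usage with 100 county-year identifiers.
train_rows, test_rows = train_test_split(range(100))
```

A model would be fitted on `train_rows` and then scored with `rmse` and `pearson_r` on its predictions for `test_rows`.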
To capture any non-linear relationships between climate predictors and Lyme disease incidence, we also generated models using quadratic versions of the climate predictors where applicable. Specifically, we used the stepwise variable selection approach starting with quadratic and linear versions of each climate variable to again determine the best model for each region. We then used these models to forecast Lyme disease incidence in 2090 - 2100 under both the upper and moderate climate change scenarios.
Results
Climate and Lyme disease incidence
At least one climate variable was included in the best model of Lyme disease incidence for all US regions with vector species present (Table 2). However, the specific climate variables included in the model varied between regions. Variables capturing precipitation conditions, such as cumulative precipitation or average spring precipitation, were included in models of Lyme disease incidence in the Southwest, Southeast, and Pacific regions. Conversely, only average winter temperature was predictive of Lyme disease incidence in the Pacific Southwest. In the Northeast and Midwest, multiple temperature and precipitation variables such as the number of hot dry days, average spring precipitation, average winter temperature, and temperature variance were included. Further, cumulative temperature was included in the Northeast model while cumulative precipitation was included in the Midwest. Where included, average winter temperature and cumulative temperature were positive predictors of Lyme disease incidence, while average spring precipitation and precipitation variance were negative predictors. The effects of cumulative precipitation, temperature variance, and the number of hot, dry days varied between regions.
Non-climate predictors and Lyme disease incidence
For all regions, the best model of Lyme disease incidence included tick awareness, diabetes incidence, and a land cover variable (Table 2). Specifically, the 1-year lagged tick search frequency was included rather than the contemporary equivalent as it led to greater reductions in model AIC. This tick awareness variable was a positive predictor in all regions. County-level diabetes incidence was a negative predictor in the Northeast, Midwest, and Southeast, and a positive predictor in the Pacific, Pacific Southwest, and Southwest. The percent land cover classified as mixed development was included in the best model for the Northeast (negative predictor), and for the Pacific Southwest and Southwest (positive predictor), while the percent forest cover was included in the Midwest and Pacific (negative predictor), and in the Southeast (positive predictor). The other available non-climate predictors—county-level poverty and health insurance coverage—did not meet the criteria for inclusion in any regional models (see Methods: Statistical approach).
The above predictors were included in each regional model of incidence along with county and year dummy variables. A large portion of the variance in incidence for each region was explained by the county dummy variable (Table 2), indicating that unobserved county-level heterogeneity is a large driver of variable Lyme disease incidence.
Model Validation
Hindcasted Lyme disease incidence matched the observed values with reasonable accuracy overall, with greater correlation between estimated and observed values in higher incidence regions (Northeast and Midwest) than in lower incidence regions (Pacific, Pacific Southwest, Southwest, and Southeast) (Table 3 and Supplementary Figure 1). For all regions, total estimated Lyme disease incidence was within 8.9% of the observed total incidence. Further, the correlations between estimated Lyme disease incidence for a particular county and year and the observed values were 0.86 and 0.90 for the Northeast and Midwest, respectively. In the lower incidence regions, the correlation coefficients were 0.51, 0.34, 0.34, and 0.49 for the Pacific, Pacific Southwest, Southwest, and Southeast, respectively. While the point estimates for hindcasted Lyme disease incidence tended to closely match the observed values, the prediction intervals around these estimates were large, particularly for the lower incidence regions.
Predictive accuracy also varied across the three model specifications evaluated here. As expected, the model specification without county and year dummy variables had higher root-mean-square error or lower correlation coefficients for nearly all regions, indicating lower predictive accuracy (Supplementary Table 3). However, the two model specifications with county and year dummy variables—the main model specification in which predictors were determined through variable selection, and the alternative model specification containing all possible predictors—were very similar in their predictive accuracy. The simpler, variable selection-based model specification was thus selected for the remaining analysis to minimize overfitting and decrease transferability concerns (Allen and Fildes 2001; Wenger et al. 2011; Wenger and Olden 2012), but forecasting results from both model specifications are shown in Supplementary Table 4. Forecasting results from the alternative model specification with all ecological predictors suggest smaller changes and higher uncertainty in Lyme disease incidence for each region, compared to the main model specification.
Several regional models were improved through replacing linear climate predictors with quadratic climate predictors. Specifically, after repeating the variable selection approach including quadratic and linear climate terms, the Northeast incidence model now included quadratic terms for average spring precipitation and cumulative temperature; the Southwest models included quadratic terms for cumulative precipitation, average spring precipitation and precipitation variance; and the Midwest models included quadratic terms for hot dry days, average winter temperature, average spring precipitation, cumulative precipitation, and temperature variance. The Pacific, Pacific Southwest, and Southeast incidence models were not improved through the inclusion of quadratic climate predictors. Forecasting results from models including these non-linear climate variables are similar to those with linear predictors under the moderate climate change scenario, although with smaller predicted changes in incidence (Supplementary Table 5). Forecasting results differ more substantially under the upper climate change scenario, with non-significant decreases predicted for the Northeast and Southeast when quadratic climate predictors are included, but significant increases predicted for these regions under the original model. As the climate predictors used in this study were drawn from the prior literature on climate and Lyme disease cases (see Table 1), in which linear versions of climate predictors were used, we use output from the linear models when presenting forecasting results (but see Supplementary Table 5).
Forecasted Lyme disease incidence
Under the upper climate change scenario (RCP 8.5), the total number of Lyme disease cases in the US is predicted to increase by 17,672 [-13322, 48666] by 2040 – 2050 and 27,630 [-6468, 61727] by 2090 – 2100 (Figure 2, Table 4). These case changes are relative to hindcasted 2010 – 2020 case counts and are based on 2017 population sizes. For the moderate climate change scenario (RCP 4.5), the predicted increases in cases for 2040 – 2050 and 2090 – 2100 were 15,395 [-15493, 46284] and 34,183 [1124, 67243], respectively. These results indicate that substantial future increases in US Lyme disease burden are likely, although the prediction intervals around these estimates are large, and overlap zero except under the moderate climate change scenario for 2090 – 2100. Further, the expected change in incidence varies strongly by region (Figures 2–3). Significant increases in cases are predicted in the Northeast by 2090 – 2100 under both climate change scenarios (29,813 [8311, 51315] under RCP 8.5 and 25,565 [4697, 46434] under RCP 4.5) and for the Southeast under the upper climate change scenario only (1,248 [252, 2244]). Modest, non-significant increases or decreases are predicted for the Pacific, Pacific Southwest and Southwest under both scenarios. For the Midwest, an increase in cases is predicted under the moderate climate change scenario (8,872 [-66, 17810]) while a decrease is predicted under the upper scenario (−3,432 [-12688, 5823]). While neither of these predictions was statistically distinguishable from zero, these results suggest there may be nonlinear effects of climate change in this region.
These predicted changes in Lyme disease case counts are likely conservative as estimates are based on 2017 population sizes. By assuming equal population growth across the US, at levels predicted by the USCB, we find the total number of Lyme disease cases in the US may increase by 48,545 [-11365, 108455] by 2100 under the upper climate change scenario and 60,020 [1974, 118146] under the moderate scenario (Supplementary Table 6). However, as the degree of population growth is highly uncertain, and population growth will vary in magnitude and direction by county, this analysis was largely exploratory. Further, as with the predictions assuming no population growth, the large prediction intervals around the point estimates here indicate the future effects of climate change on Lyme disease incidence are highly uncertain.
Discussion
Vector-borne diseases are inherently sensitive to climatic conditions, making accurately estimating effects of climate change on disease burden a public health priority. We found that climate was a key predictor of Lyme disease incidence in all US regions with established vector species (Northeast, Midwest, Pacific, Pacific Southwest, Southwest, and Southeast) in the past 17 years. However, the specific climate variable(s) predictive of Lyme disease incidence varied between regions. In general, the climate variables predictive of disease incidence for a given region tended to reflect climate conditions within the region and known relationships between tick life cycles and climate (reviewed in Eisen et al. 2016). For instance, in the Southeast and Southwest regions, which have the warmest and driest conditions during the tick questing period (Supplementary Tables 1-2), climate variables capturing precipitation conditions (e.g., cumulative precipitation) were key predictors of Lyme disease incidence. In the colder and more thermally variable Northeast and Midwest regions, climate variables capturing limiting temperatures (e.g., average winter temperatures and temperature variance) were predictive of Lyme disease.
These regionally-specific climate and Lyme disease relationships are consistent with a large body of literature on the physiology and ecology of the US vectors of Lyme disease, I. scapularis and I. pacificus. In particular, many prior studies have demonstrated substantial decreases in tick survival and questing activity under low moisture conditions (Berger et al. 2014a, 2014b; Jones and Kitron 2000; Knülle and Rudolph 1982; Needham and Teel 1991; Rodgers et al. 2007; Stafford 1994). Thus, variation in precipitation may have a greater impact on Lyme disease incidence in drier regions, as observed in this study, through changes in tick abundance and tick-human contact rates. Also consistent with the results of this study, extensive prior research indicates that cold winter and annual temperatures are associated with longer development periods and/or higher tick mortality (Brownstein et al. 2003; Estrada-Peña 2002; Leighton et al. 2012; McEnroe 1977; Ogden et al. 2004), reduced host-seeking abilities of the adult life stage (Carroll and Kramer 2003; Clark 1995; Duffy and Campbell 1994), and reduced abundance of the white-footed mouse, a key reservoir host species (Wolff 1996). Similarly, studies have found that warming temperatures at high latitudes contribute to faster tick development, increased survival, and range expansion (Brownstein et al. 2003; Clow et al. 2017a; Leighton et al. 2012; Lindsay et al. 1995; Ogden et al. 2004; Rand et al. 2004). These studies suggest that milder winters would be associated with increasing Lyme disease incidence, with the largest effects observed in cooler regions, as detected in this study.
In addition to being consistent with prior literature on climate and tick ecology, the effects of climate conditions on Lyme disease incidence were detected while controlling for non-climate predictors of disease. In particular, our modeling approach explicitly controlled for variation in human awareness of ticks, land use, health-seeking behavior (via a proxy), and other unobserved heterogeneity between US counties and years. Increasing tick awareness, as determined by the frequency of tick-related Google searches, was generally positively associated with Lyme disease incidence, while land cover and health-seeking behavior predictors had regionally-variable relationships. By controlling for these effects, we provide strong evidence that the positive effect of warming temperatures on Lyme disease in colder regions is not simply driven by increasing human awareness of tick-borne disease, temporal trends, or other concurrent changes, as has been previously suggested (Morshed et al. 2006; Randolph 2010; Scott and Scott 2018).
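As a minimal illustration of why controlling for unobserved county-level heterogeneity matters, the sketch below applies a within-county (fixed-effects) transformation to a toy panel. All county names and values are hypothetical and are not drawn from the study data. Only county effects are removed here, whereas the models described above also account for year effects and additional covariates.

```python
from statistics import mean

def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def within_slope(panel):
    """Fixed-effects (within) slope: demean x and y inside each county first."""
    xs, ys = [], []
    for observations in panel.values():
        x_bar = mean(x for x, _ in observations)
        y_bar = mean(y for _, y in observations)
        xs += [x - x_bar for x, _ in observations]
        ys += [y - y_bar for _, y in observations]
    return ols_slope(xs, ys)

# Hypothetical toy panel: incidence = county baseline + TRUE_EFFECT * temperature.
# Colder counties are given higher baselines, so county identity confounds
# the pooled temperature-incidence relationship.
TRUE_EFFECT = 2.0
county_baselines = {"county_a": 50.0, "county_b": 20.0, "county_c": 5.0}
county_temps = {
    "county_a": [1.0, 2.0, 3.0],
    "county_b": [4.0, 5.0, 6.0],
    "county_c": [7.0, 8.0, 9.0],
}
panel = {
    county: [(t, county_baselines[county] + TRUE_EFFECT * t) for t in temps]
    for county, temps in county_temps.items()
}

# Pooled regression ignores county identity; the within estimator does not.
pooled_xs = [x for obs in panel.values() for x, _ in obs]
pooled_ys = [y for obs in panel.values() for _, y in obs]
pooled = ols_slope(pooled_xs, pooled_ys)
within = within_slope(panel)

print(f"pooled slope: {pooled:.2f}")
print(f"within-county slope: {within:.2f}")
```

In this toy example the pooled slope has the wrong sign because between-county differences dominate, while the within-county slope recovers the effect built into the data. This is the kind of bias that county-level controls are intended to remove.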
While our statistical models included both climate and non-climate predictors of Lyme disease incidence, model accuracy varied widely between regions. Most notably, model accuracy was substantially greater for endemic regions (Northeast and Midwest) than for low-incidence regions (Pacific, Pacific Southwest, Southwest, and Southeast) (Ciesielski et al. 1988). The relatively poor predictive accuracy in non-endemic regions may be due to higher misdiagnosis rates and/or higher travel-associated Lyme disease transmission (Eldin and Parola 2018; Parola and Paddock 2018) decoupling the relationship between local conditions and disease. However, evidence suggests that most Lyme disease transmission occurs in the peri-domestic environment, in which the county of transmission and reporting are likely to be the same (Connally et al. 2009; Falco and Fish 1988; Jackson et al. 2006; Maupin et al. 1991). The lower predictive accuracy in these regions therefore more likely reflects a lack of sufficient annual variation in Lyme disease incidence needed to detect effects of climate, and/or weaker effects of climate conditions on Lyme disease transmission relative to confounding drivers not included in our model, such as host movement and community composition. In contrast, the largest effect of climate on disease transmission is expected at the edges of climatic suitability for transmission (Githeko et al. 2000). As the Northeast and Midwest are near the I. scapularis northern range limit, the higher model accuracy here likely indicates stronger climate-Lyme disease relationships. Supporting this assertion, more climate variables were included as predictors after variable selection in these regions than in low-incidence regions.
Our Lyme disease forecasts, made using regionally-specific incidence models and projected climate and land cover data, suggest that climate change may lead to substantial increases in incidence in coming decades, but that the magnitude of these effects is highly uncertain and depends on assumptions about the functional form of climate-disease relationships. Across the US, an estimated additional 34,183 cases [95% PI: 1124, 67243] are predicted by 2100 under a moderate climate change scenario (RCP 4.5), representing a 92% increase in Lyme disease burden relative to 2010–2020 levels. These estimates are likely to be conservative as they relied on 2017 county population sizes. Applying predicted US population growth rates to all counties equally increases this estimate to 60,020 [95% PI: 1974, 118146] additional cases by 2100 under the moderate scenario. The overwhelming majority of this increase would be experienced in the Northeast and Midwest, while minimal changes are expected elsewhere. Under the upper climate change scenario (RCP 8.5), Lyme disease incidence is predicted to increase in the Northeast and Southeast by 2100, while changes are not statistically distinguishable from zero in other regions and for the US as a whole. However, the large prediction intervals suggest high uncertainty in future Lyme disease incidence, which could include either increases or decreases that could be regionally-specific. Further, the forecasting results differ, particularly for the upper climate change scenario, when generated assuming non-linear climate-disease relationships. These results indicate that climate change will very likely impact future Lyme disease incidence, but that effects will vary strongly between regions, and will depend on the degree of climate change.
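The arithmetic relating these figures can be checked directly. The sketch below uses only values reported in this paper; the baseline case count and the population growth multiplier it derives are implied quantities (assuming the 92% figure is the ratio of additional cases to baseline cases), not values stated in the text, and the variable and function names are illustrative.

```python
# Reported point estimates and 95% prediction intervals (additional cases by 2100).
rcp45_no_growth = {"estimate": 34183, "pi": (1124, 67243)}
rcp45_with_growth = {"estimate": 60020, "pi": (1974, 118146)}
rcp85_with_growth = {"estimate": 48545, "pi": (-11365, 108455)}

# Reported relative increase under RCP 4.5 with 2017 population sizes.
relative_increase = 0.92

# Implied baseline: additional cases divided by the relative increase.
# This is a derived quantity, not a value stated in the text.
implied_baseline = rcp45_no_growth["estimate"] / relative_increase

# Implied uniform population multiplier: ratio of the two RCP 4.5 estimates.
implied_multiplier = rcp45_with_growth["estimate"] / rcp45_no_growth["estimate"]

def excludes_zero(interval):
    """True when a prediction interval lies entirely above or below zero."""
    lower, upper = interval
    return lower > 0 or upper < 0

print(f"implied baseline cases: {implied_baseline:,.0f}")
print(f"implied population multiplier: {implied_multiplier:.3f}")
for label, forecast in [
    ("RCP 4.5, 2017 population", rcp45_no_growth),
    ("RCP 4.5, with growth", rcp45_with_growth),
    ("RCP 8.5, with growth", rcp85_with_growth),
]:
    status = "excludes" if excludes_zero(forecast["pi"]) else "includes"
    print(f"{label}: interval {status} zero")
```

Scaling the point estimate and both interval bounds by roughly the same multiplier is what would be expected if population growth were applied equally to all counties. An interval that includes zero corresponds to a change that is not statistically distinguishable from zero.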
Our prediction of climate change-induced increases in Lyme disease burden, particularly at higher latitudes, is consistent with prior studies predicting or observing increasing I. scapularis habitat suitability and range expansion under climate warming (McPherson et al. 2017; Ogden et al. 2008, 2014b). Similar range expansions have also been predicted and observed for Ixodes ricinus, the European Lyme disease vector, under climate warming (Gray et al. 2009; Jaenson and Lindgren 2011; Lindgren et al. 2000; Porretta et al. 2013). Further, our finding that the predicted changes in incidence depend on the degree of future warming is also consistent with prior work. I. scapularis range expansion and population growth, and the proportion of Eastern Canadians at risk for Lyme disease, are predicted to be higher under upper climate change scenarios than under mitigation scenarios (Leighton et al. 2012; McPherson et al. 2017). These results suggest that vector range expansions and future Lyme disease burdens depend in part on climate policy actions.
More generally, our results are consistent with expectations from vector thermal biology that suggest that warming temperatures generally increase transmission near the cold edge of a vector’s range limit, but may decrease or have variable effects elsewhere (Lafferty and Mordecai 2016; Martens et al. 1995; Mordecai et al. 2019; Ogden and Lindsay 2016). For tick-borne disease, as for other vector-borne diseases, multiple temperature-sensitive traits combine to influence transmission, including survival, development rates, and host-seeking (questing) (Ogden et al. 2004; Ogden 2017; Randolph 2004; Randolph et al. 2002). Nonlinear effects of temperature on these traits typically lead to vector-borne disease transmission peaking at intermediate temperatures and declining to zero outside of lower and upper thermal limits (Mordecai et al. 2019). This suggests that climate warming would most strongly increase transmission near the lower thermal limits, such as in the Northeast and Midwest regions, as was observed here. This further suggests the effects of climate warming would differ in magnitude and direction depending on the extent of warming, as seen in the Midwest region where increases in incidence were predicted under moderate warming (RCP 4.5) and decreases in incidence were predicted with more severe warming (RCP 8.5). The theoretical expectations of nonlinear thermal responses therefore help to explain some of the context-dependent effects of temperature found empirically in this study.
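The qualitative argument above can be illustrated with a simple unimodal thermal response. The sketch below assumes a concave-down quadratic curve with hypothetical lower and upper thermal limits of 5 and 35 °C; the functional form, the limits, the baseline temperature, and the warming amounts are all illustrative assumptions, not estimates for Ixodes ticks or values from this study.

```python
def quadratic_response(temp_c, t_min=5.0, t_max=35.0):
    """Relative transmission: peaks midway between the limits, zero outside them."""
    if temp_c <= t_min or temp_c >= t_max:
        return 0.0
    return (temp_c - t_min) * (t_max - temp_c)

def change_under_warming(baseline_c, warming_c):
    """Change in relative transmission when the baseline warms by warming_c."""
    warmed = quadratic_response(baseline_c + warming_c)
    return warmed - quadratic_response(baseline_c)

# Hypothetical baseline below the thermal optimum (20 C under these limits).
baseline_c = 16.0
moderate_change = change_under_warming(baseline_c, 3.0)   # stays below the optimum
severe_change = change_under_warming(baseline_c, 10.0)    # overshoots the optimum

print(f"moderate warming: {moderate_change:+.1f}")
print(f"severe warming: {severe_change:+.1f}")
```

Under these assumed values, moderate warming moves the baseline toward the optimum and raises relative transmission, while severe warming overshoots the optimum and lowers it. The same mechanism could produce the opposite-signed changes under the two scenarios described above for the Midwest.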
While our results match expectations from empirical and theoretical vector-borne disease biology, our Lyme disease forecasts should be interpreted with caution. The large prediction intervals around our point estimates indicate a wide range of potential disease outcomes under climate change. While significant increases were predicted for some regions, many other factors contribute to Lyme disease transmission, including host movement, community composition, and human avoidance behaviors (Berry et al. 2018; Brinkerhoff et al. 2011; Brownstein et al. 2005b; Larsen et al. 2014; MacDonald et al. 2019a; Ogden et al. 2008; Ostfeld 1997). Accordingly, we found that unobserved county-level heterogeneity, which would encompass these factors, was a predominant driver of incidence in each of our regional models. Further, while we examined the effects of two potential climate scenarios, uncertainty in these climate change projections was not incorporated into our predictive models and would add further uncertainty to our Lyme disease predictions. Lastly, as our forecasting models extrapolate from climate and disease relationships observed in the previous 17 years, we assume that these relationships can be extended to climate conditions not yet experienced. That is, we assume the relationship between cumulative temperature, for example, and Lyme disease incidence in a given region will remain the same even as cumulative temperatures exceed prior values. This could generate inaccurate predictions for regions near current tick upper thermal limits, such as the Southeast and Southwest, as further warming and drought here may reduce tick survival and host-seeking abilities (Berger et al. 2014a; Randolph and Storey 1999; Schulze et al. 2001; Vail and Smith 1998). Generating more accurate predictions for these regions would require experiments investigating effects of future temperatures on aspects of tick-borne disease transmission.
Despite these limitations, our results are consistent with a growing body of evidence linking increased Lyme disease risk with climate warming (Brownstein et al. 2005a; Burtis et al. 2016; Clow et al. 2017b; Dumic and Severnini 2018; Kilpatrick et al. 2017; Leighton et al. 2012; Ogden et al. 2008, 2014b; Robinson et al. 2015; Subak 2003; Tuite et al. 2013). We demonstrate that climate is a key driver of Lyme disease incidence across the US, independently of other drivers of disease risk. We predict that future climate change could substantially increase Lyme disease burden, but the predicted effects are highly uncertain and regionally-specific. The largest changes in incidence are likely to be experienced in the Northeast and Midwest, where current climate-disease relationships are strongest and Lyme disease incidence has recently increased most substantially (Rosenberg 2018). Our predictions provide an essential first step in determining broad patterns of Lyme disease risk under climate change, but ongoing surveillance efforts and mechanistic studies linking changes in vector ecology under climate change to human disease incidence should be conducted to refine these risk assessments.
Acknowledgements
All data used in this study are free, publicly available, and can be accessed here: https://github.com/lcouper/LymeDiseaseClimateChange. We are grateful to the CDC Division of Vector-Borne Diseases for supplying Lyme disease case data, to Mohammad Alhamdan from NASA for supplying climate data, and to Iain Caldwell, Jamie Caldwell, Marissa Childs, Johannah Farner, Elizabeth Hadly, Morgan Kain, Devin Kirk, Giulio de Leo, Nicole Nova, and Marta Shocket for providing helpful feedback on the manuscript. LIC was funded by the Stanford Graduate Fellowship. EAM was funded by an NSF Ecology and Evolution of Infectious Diseases grant (DEB-1518681), the Terman Award, and the NIH NIGMS Maximizing Investigators’ Research Award (1R35GM133439-01). AJM was funded by a UC Santa Barbara Faculty Research Grant.
Footnotes
The authors declare they have no actual or potential competing financial interests.