Abstract
Lyme disease is the most common vector-borne disease in temperate zones and a growing public health threat in the United States (US). The life cycles of the tick vectors and spirochete pathogen are highly sensitive to climate, but determining the impact of climate change on Lyme disease burden has been challenging due to the complex ecology of the disease and the presence of multiple, interacting drivers of transmission. Here we incorporated 18 years of annual, county-level Lyme disease case data in a panel data statistical model to investigate prior effects of climate variation on disease incidence while controlling for other putative drivers. We then used these climate-disease relationships to project Lyme disease cases using CMIP5 global climate models and two potential climate scenarios (RCP4.5 and RCP8.5). We find that interannual variation in Lyme disease incidence is associated with climate variation in all US regions encompassing the range of the primary vector species. In all regions, the climate predictors explained less of the variation in Lyme disease incidence than unobserved county-level heterogeneity, but the strongest climate-disease association detected was between warming annual temperatures and increasing incidence in the Northeast. Lyme disease projections indicate that cases in the Northeast will increase significantly by 2050 (23,619 ± 21,607 additional cases), but only under RCP8.5, and with large uncertainty around this projected increase. Significant case changes are not projected for any other region under either climate scenario. The results demonstrate a regionally variable and nuanced relationship between climate change and Lyme disease, indicating possible nonlinear responses of vector ticks and transmission dynamics to projected climate change. 
Moreover, our results highlight the need for improved preparedness and public health interventions in endemic regions to minimize the impact of further climate change-induced increases in Lyme disease burden.
Introduction
Arthropod-transmitted pathogens pose a severe and growing threat to global public health (World Health Organization 2014). Because vector life cycles and disease transmission are highly sensitive to abiotic conditions (Mattingly 1969, Sonenshine and Roe 2013), climate change is expected to alter the magnitude and geographic distribution of vector-borne diseases (Kilpatrick and Randolph 2012, World Health Organization 2014). Climatic changes, in particular warming temperatures, have already facilitated expansion of several vector species (e.g., Purse et al. 2005, González et al. 2010, Roiz et al. 2011, Clow et al. 2017a), and have been associated with increased vector-borne disease incidence (e.g., Loevinsohn 1994, Subak 2003, Hii et al. 2009). Identifying areas of high risk for current and future vector-borne disease transmission under climate change is critical for mitigating disease burden. However, the presence of interacting drivers of disease transmission such as land use change and globalization, and the complex ecology of vector-borne diseases make the effort to measure and predict effects of climate on vector-borne disease incidence challenging (Rogers and Randolph 2006, Tabachnick 2010, Mills et al. 2010, Ostfeld and Brunner 2015, Lafferty and Mordecai 2016).
This challenge is particularly apparent in the case of Lyme disease, the most common vector-borne disease in temperate zones (Kurtenbach et al. 2006, Rizzoli et al. 2011, Rosenberg et al. 2018), because transmission depends on a complex sequence of biotic interactions between vector and numerous host species that may respond differently to environmental change (Ostfeld 1997). In the United States (US), Lyme disease is caused by the bacterium Borrelia burgdorferi, and is vectored by two tick species: Ixodes scapularis in the eastern and midwestern US and Ixodes pacificus in the western US. After hatching from eggs, both tick species have three developmental stages (larva, nymph, and adult), during each of which they take a single blood meal from a wide range of vertebrate hosts before transitioning to the next developmental stage or reproducing (Sonenshine and Roe 2013). This life cycle takes 2-3 years to complete, 95% of which is spent at or below the ground surface in diapause, seeking a host, digesting a blood meal, or molting (i.e., off the host) (Sonenshine and Roe 2013, Ostfeld and Brunner 2015).
Given their long life spans, ectothermic physiology, and high degree of interaction with the physical environment, tick life cycles are sensitive to changes in climate and weather conditions (Sonenshine and Roe 2013). Prior research has demonstrated that temperature and moisture strongly influence tick mortality, development, and host-seeking abilities (reviewed in Ostfeld and Brunner 2015, Ogden and Lindsay 2016). In particular, both low and high temperatures decrease I. scapularis and I. pacificus survival and host-seeking activity (Lindsay et al. 1995, Vandyk et al. 1996, Padgett and Lane 2001). Further, cool temperatures prolong tick development and increase generation times, leading to greater proportional mortality before reproduction (Peavey and Lane 1996, Ogden et al. 2004, 2006). Rainfall and moisture availability also influence host-seeking activity in nonlinear ways. Low humidity exposure substantially increases tick mortality and inhibits host-seeking activity (Stafford 1994, Lane et al. 1995, Vail and Smith 1998, Schulze et al. 2001, Rodgers et al. 2007, Nieto et al. 2010, Ginsberg et al. 2017, MacDonald et al. 2019b). To avoid desiccating conditions, ixodid ticks often modify their questing behavior to remain closer to the moist vegetative surface, or return frequently to rehydrate, both of which decrease the probability of obtaining a blood meal and thereby limit survival and reproduction (Randolph and Storey 1999, Prusinski et al. 2006, Sonenshine and Roe 2013, Arsnoe et al. 2015, McClure and Diuk-Wasser 2019). However, heavy rainfall may also directly impede tick host-seeking (Randolph 1997). Given these physiological relationships, temperature and precipitation are important predictors of these tick species’ latitudinal and altitudinal range limits (McEnroe 1977, Estrada-Peña 2002, Brownstein et al. 2003, Ogden et al. 2005, Leighton et al. 2012, Berger et al. 2014, Eisen et al. 2016, Hahn et al. 2016), and northward range expansion of I. scapularis has been associated with warming temperature (Ogden et al. 2014b, Clow et al. 2017b, 2017a).
Yet despite well-known physiological relationships between specific climate variables and aspects of tick biology, and strong evidence of relationships between climate and tick range limits, it remains unclear how these effects translate into Lyme disease incidence (the outcome of interest to public health) and how broadly they apply across biogeographically distinct US regions. Associations between climate and Lyme disease incidence are difficult to measure given the influence of many non-climate factors such as changing physician awareness, host movement, and human behavior (Morshed et al. 2006, Randolph 2010, Ostfeld and Brunner 2015, Kilpatrick et al. 2017, Scott and Scott 2018). A handful of prior studies have attempted to isolate the effect of climate on incidence, but have been limited in geographic or temporal scope and/or have not controlled for confounding drivers of incidence, leading to conflicting results about the role of climate change in transmission (Subak 2003, McCabe and Bunnell 2004, Schauber et al. 2005, Burtis et al. 2016, Dumic and Severnini 2018). As a result, our ability to predict effects of future climate change on Lyme disease incidence remains limited.
Here, we leverage an 18-year county-level Lyme disease case reporting dataset and explicitly control for other drivers of disease burden to ask: How has interannual variation in climate conditions contributed to past changes in Lyme disease incidence across distinct US regions? We include climate variables capturing changes in temperature and precipitation conditions and investigate how relationships between climate and Lyme disease outcomes vary across different regions of the US (i.e., the Northeast, Midwest, Southeast, Southwest, Pacific Southwest, and Pacific). We hypothesize that: a) warmer temperatures in northern regions and b) spring precipitation in all regions promote tick survival and therefore increase Lyme disease incidence, while c) hot, dry conditions during the questing period decrease tick host-seeking activity, survival and disease incidence. To avoid drawing spurious conclusions about the effects of climate, we analyze the effects of other known and potential drivers of disease incidence such as changing forest cover, public awareness of tick-borne disease, and health-seeking behavior, and use a statistical approach that explicitly accounts for unobserved heterogeneity in disease incidence between counties and years. We then use these modeled, regionally-specific relationships between climate and Lyme disease burden to investigate projected changes in US Lyme disease incidence under future climate scenarios. We report the projected change in Lyme disease incidence for individual US regions in 2040 – 2050 and 2090 – 2100 relative to hindcasted 2010 – 2020 levels under two potential climate scenarios: RCP8.5, which reflects the upper range of the literature on emissions, and RCP4.5, which reflects a moderate mitigation scenario (Hayhoe et al. 2017).
Materials and Methods
Lyme disease case data
We obtained annual, county-level reports of Lyme disease cases spanning from 2000 to 2017 from the US Centers for Disease Control and Prevention (CDC) (see Supporting Information). These disease case data provide the most spatially-resolved, publicly available surveillance data in the US. Raw case counts were converted to incidence using annual county population sizes from the US Census Bureau (USCB) and were expressed in cases per 100,000 people.
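The conversion from raw counts to incidence described above is simple arithmetic. A minimal Python sketch follows (the original analysis was run in R; the function name and the example county are hypothetical and only illustrate the calculation):

```python
def incidence_per_100k(cases, population):
    """Convert an annual county case count to cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical county-year: 42 reported cases among 150,000 residents.
example_incidence = incidence_per_100k(42, 150_000)
print(example_incidence)
```

Expressing the outcome as a rate rather than a count keeps counties with very different population sizes comparable within the same regional model.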
Climate data
An overwhelming number of climate variables, such as the mean, range, and maximum or minimum temperature or precipitation at different time scales, could conceivably affect Lyme disease transmission. To reduce the probability of identifying significant but spurious relationships between climate and incidence, we limited the variables considered here to: average winter temperature lagged 1.5 years; average spring precipitation; the number of hot, dry days in May – July (the nymphal tick questing period); cumulative average temperature; total annual precipitation; daily temperature variability; and daily precipitation variability (Table 1). These variables have either been previously associated with variation in Lyme disease incidence, tick range limits or abundance, or, in the case of daily temperature and precipitation variability, are grounded in physiological relationships between climate and tick life history but have not been previously tested. In particular, interannual variation in Lyme disease incidence in endemic regions has been positively associated with lagged average winter temperature (Subak 2003), average spring precipitation (McCabe and Bunnell 2004), and negatively associated with the number of hot, dry days in May – July (Burtis et al. 2016). A measure of cumulative annual temperature (degree days > 0°C) has been associated with I. scapularis population establishment and abundance (Jones and Kitron 2000, Ogden et al. 2004, 2006, Clow et al. 2017b), and cumulative annual precipitation has been associated with larval tick abundance (Jones and Kitron 2000). Frequent variation in temperature can decrease tick survival due to the energetic costs of adapting to changing conditions (Gigon 1985, Herrmann and Gern 2013), thus daily temperature and precipitation variability were included here to explore whether this effect scaled to affect transmission risk. 
Details about how these variables were calculated and further justification for their biological relevance are listed in Table 1.
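As an illustration of how two of these predictors could be derived from daily records, the Python sketch below computes cumulative degree days above 0°C and counts hot, dry days. The exact definitions used in the study are those of Table 1; the threshold values here (25°C and 0 mm) are placeholder assumptions, not the study's values, and the function names are hypothetical.

```python
def degree_days_above_zero(daily_mean_temps_c):
    """Sum of daily mean temperatures above 0 degrees C (one common definition)."""
    return sum(t for t in daily_mean_temps_c if t > 0)


def count_hot_dry_days(daily_temps_c, daily_precip_mm,
                       hot_threshold_c=25.0, dry_threshold_mm=0.0):
    """Count days that are both hot and dry.

    The default thresholds are illustrative placeholders only.
    """
    if len(daily_temps_c) != len(daily_precip_mm):
        raise ValueError("temperature and precipitation series differ in length")
    return sum(
        1
        for temp, precip in zip(daily_temps_c, daily_precip_mm)
        if temp > hot_threshold_c and precip <= dry_threshold_mm
    )


# Toy four-day records (made-up values).
print(degree_days_above_zero([-5.0, 0.0, 3.5, 10.0]))
print(count_hot_dry_days([26.0, 30.0, 20.0, 27.0], [0.0, 2.0, 0.0, 0.0]))
```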
For past climate conditions, we obtained daily, county-level average temperature and total precipitation data from the National Oceanic and Atmospheric Administration (NOAA) weather stations accessed via the CDC’s Wide-ranging Online Data for Epidemiological Research (WONDER) database. To estimate future climate variables, we used NASA Goddard Institute for Space Studies CMIP5 data on modeled temperature and precipitation (Schmidt et al. 2014). Specifically, we obtained estimates of daily near-surface air temperature and precipitation through 2100 under the upper climate change scenario (RCP8.5) and a moderate climate change scenario (RCP4.5) (van Vuuren et al. 2011, Taylor et al. 2012). These climate scenarios are relatively similar in the radiative forcing levels assumed through 2050 but diverge substantially in the latter half of the century. Climate estimates from these two scenarios are provided at a 2° x 2.5° resolution; values were then ascribed to counties based on county latitude and longitude (see Figure S1). Mean values for hindcasted and projected climate variables for each region are listed in Table S1.
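One simple way to ascribe gridded values to counties is to snap each county's coordinates to the nearest grid-cell center. The Python sketch below shows that idea for a 2° x 2.5° grid. It is an assumption about the matching rule, not a description of the procedure actually used, and the grid origin and the function name are hypothetical.

```python
def nearest_grid_cell(lat, lon, lat_origin=-90.0, lon_origin=0.0,
                      dlat=2.0, dlon=2.5):
    """Return (row, col, center_lat, center_lon) of the nearest grid cell.

    Assumes cell centers at lat_origin + row * dlat and
    lon_origin + col * dlon, with longitudes expressed in [0, 360).
    The origin values are hypothetical placeholders.
    """
    n_cols = round(360.0 / dlon)
    lon = lon % 360.0  # convert e.g. -73.2 to 286.8
    row = round((lat - lat_origin) / dlat)
    col = round((lon - lon_origin) / dlon) % n_cols
    return row, col, lat_origin + row * dlat, lon_origin + col * dlon


# Hypothetical county centroid in the northeastern US.
print(nearest_grid_cell(41.3, -73.2))
```

Because each cell spans many counties at this resolution, neighboring counties will often share identical projected climate values under this kind of matching.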
Awareness data
We controlled for variation in public awareness of ticks and Lyme disease using data from Google Trends on the frequency of “ticks” as a search term. We obtained data on “ticks” search frequency, normalized for a given location and year, for 2004 (the first year the data were available) to 2017. We also initially used “tick bite” and “Lyme disease” as search terms, but found that these generated nearly identical coefficient estimates; we therefore used only the “ticks” search term as a predictor. Search frequency data were aggregated at the designated market area (DMA), the smallest spatial scale available. Search frequency values for a given DMA, which contained an average of 14 counties, were applied equally to all counties therein. We used a 1-year lagged version of the tick search variable, as awareness of tick-borne disease is likely endogenous to incidence (i.e., higher Lyme disease incidence likely contributes to higher tick search frequency and awareness) and using predetermined values reduces endogeneity concerns (Bascle 2008).
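The two data-handling steps in this paragraph, applying a DMA-level value to every county in the DMA and lagging the series by one year, can be sketched in Python as follows. The dictionaries, identifiers, and values are hypothetical.

```python
def lag_by_one_year(series_by_year):
    """Map each year to the previous year's value.

    Years without a predecessor in the series (e.g., the first year) are dropped.
    """
    return {
        year: series_by_year[year - 1]
        for year in series_by_year
        if year - 1 in series_by_year
    }


def assign_dma_values_to_counties(dma_values, county_to_dma):
    """Give every county the value of the DMA that contains it."""
    return {county: dma_values[dma] for county, dma in county_to_dma.items()}


# Hypothetical normalized search frequencies for one DMA.
searches = {2004: 10, 2005: 20, 2006: 15}
print(lag_by_one_year(searches))

# Hypothetical county-to-DMA lookup.
print(assign_dma_values_to_counties({"dma_1": 20}, {"county_a": "dma_1",
                                                    "county_b": "dma_1"}))
```

With a one-year lag, the 2004 search data first enter the model as the predictor for 2005.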
Health-seeking behavior data
We explicitly controlled for variation in health-seeking behavior, previously posited as a driver of Lyme disease reporting (Armstrong et al. 2001, Wilking and Stark 2014), by including health insurance coverage and poverty as potential predictors. Given the logistical and financial challenges in obtaining a Lyme disease diagnosis and treatment (Johnson et al. 2011, Adrion et al. 2015), access to health care services may play a role in whether a Lyme disease case is identified and reported. We obtained data on health insurance coverage, defined as the percent of county residents with any form of health insurance coverage in a given year, for 2005 to 2017 from USCB’s Small Area Health Insurance Estimates (SAHIE) program. We obtained data on poverty, defined as the percent of county residents living in poverty in a given year, for 2000 to 2017 from the USCB.
Land cover data
We included two land cover variables putatively associated with higher tick-borne disease risk: the percent forest in a given county and year, and the percent mixed development (Brownstein et al. 2005b, Dister and Fish 1997, Frank et al. 1998, Glass et al. 1995, Killilea et al. 2008, MacDonald et al. 2019a). We calculated these variables using 30-m resolution land cover data from the US Geological Survey (USGS) National Land Cover Database (NLCD) (Yang et al. 2018). Percent forest included any deciduous, evergreen, or mixed forest. Mixed development was defined as areas with a mixture of constructed materials and vegetation, including lawn grasses, parks, golf courses, and vegetation planted in developed settings. We calculated county-level values of these land cover variables for 2001, 2004, 2006, 2008, 2011, 2013, and 2016 as these are the only years the NLCD dataset is currently available.
To estimate future land cover variables, we used land cover projections generated by the USGS Earth Resources Observation and Science Center (EROS) using the IPCC Special Report on Emissions Scenarios (SRES) (Sohl et al. 2014). Although newer socioeconomic pathways have recently been developed (i.e., the “Shared Socioeconomic Pathways”), these scenarios have not yet been incorporated into US land cover projections (Sohl 2019). We used modeled land cover data under SRES B1, which reflects lower urban development, to align with the moderate climate change scenario (RCP4.5), and SRES A1B, which reflects higher urban development and conversion of natural lands, to align with the upper climate change scenario (RCP8.5) (Nakicenovic et al. 2000, Rogelj et al. 2012, Sohl et al. 2014). Using these data, we again calculated annual, county-level values of percent forest cover and mixed development for 2040 – 2050 and 2090 – 2100. However, as the ‘mixed development’ land cover class was not included in the projected data, we instead used the ‘mechanically disturbed’ public or private land cover class (see Supporting Information).
Regional divisions
Given the large variation in climatic conditions across the US, as well as variation in ecological dynamics of tick-borne diseases such as tick species identity, tick densities, tick questing behavior, and host community composition (Eisen et al. 2016, Kilpatrick et al. 2017, Ostfeld 1997, Salkeld and Lane 2010), we examined regional differences in climate-disease relationships. We used the US Fish & Wildlife Service regional boundaries to divide the US into the following seven regions for analysis: Northeast, Midwest, Mountain Prairie, Pacific, Pacific Southwest, Southwest, and Southeast (Figure 1). These regional divisions were selected as they roughly correspond to genetic structuring of I. scapularis and I. pacificus (Kain et al. 1997, 1999, Humphrey et al. 2010) and are likely distinct in environmental conditions and resources (Ricketts et al. 1999, Smith et al. 2018). These regional divisions are also similar to the nine ‘climatically consistent’ regions within the contiguous US identified by NOAA (Karl and Kloss 1984) but preserve larger regions in the South and Midwest to obtain higher power in the analysis. Further, each region contains only one vector species: I. scapularis in the Northeast, Midwest, Southeast, and Southwest, and I. pacificus in the Pacific and Pacific Southwest (Dennis et al. 1998). As neither species has an established presence in the Mountain Prairie, this region was removed from the analysis. Regional descriptions, including the population size (as of 2017), the number of counties, and the average climate conditions, are provided in Table S2.
Statistical analysis
We used a least squares dummy variable (termed “fixed-effects” in econometrics) regression approach to estimate changes in Lyme disease incidence using repeated observations of the same groups (counties) from 2000 – 2017 (Larsen et al. 2019). This class of statistical approaches has been developed to isolate potential causal relationships in the absence of randomized experiments where such experiments are not feasible (Larsen et al. 2019, MacDonald and Mordecai 2019). We included ‘county’ and ‘year’ dummy variables to control for any unobserved heterogeneity that may influence reported Lyme disease incidence in a particular county across all years (e.g., geographic features, number of health care providers), or influence Lyme disease in all counties in a given year (e.g., changes in disease case definition), respectively. All counties (n = 2,232) for which there were complete data on Lyme disease cases, climate, and other predictors were included.
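To make the role of the county and year dummy variables concrete: for a balanced panel, a regression with both sets of dummies gives the same slope as ordinary least squares on data from which county means and year means have been removed (the "within" transformation). The Python sketch below illustrates this for a single predictor. It is a simplified illustration under the balanced-panel assumption, not the authors' R code, and all names and data are hypothetical.

```python
from collections import defaultdict


def two_way_demean(panel):
    """Remove county and year means from {(county, year): value}.

    Equivalent to including county and year dummies only when the
    panel is balanced (every county observed in every year).
    """
    by_county, by_year = defaultdict(list), defaultdict(list)
    for (county, year), value in panel.items():
        by_county[county].append(value)
        by_year[year].append(value)
    county_mean = {c: sum(v) / len(v) for c, v in by_county.items()}
    year_mean = {y: sum(v) / len(v) for y, v in by_year.items()}
    grand_mean = sum(panel.values()) / len(panel)
    return {
        (c, y): v - county_mean[c] - year_mean[y] + grand_mean
        for (c, y), v in panel.items()
    }


def fixed_effects_slope(x_panel, y_panel):
    """Slope of y on x after removing county and year effects."""
    x_d, y_d = two_way_demean(x_panel), two_way_demean(y_panel)
    sxx = sum(v * v for v in x_d.values())
    sxy = sum(x_d[key] * y_d[key] for key in x_d)
    return sxy / sxx


# Toy balanced panel: incidence = county effect + year effect + 2 * climate.
climate = {("A", 1): 1.0, ("A", 2): 2.0, ("A", 3): 4.0,
           ("B", 1): 3.0, ("B", 2): 3.0, ("B", 3): 8.0}
county_effect = {"A": 10.0, "B": -5.0}
year_effect = {1: 0.0, 2: 1.0, 3: 2.0}
incidence = {(c, y): county_effect[c] + year_effect[y] + 2.0 * v
             for (c, y), v in climate.items()}
print(fixed_effects_slope(climate, incidence))
```

The slope is identified only from deviations of a county from its own average in years that deviate from the all-county average, which is why time-invariant county traits cannot bias it.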
To account for regional variation in the predictors of tick-borne disease incidence (Wimberly et al. 2008, Raghavan et al. 2014), we ran separate models for each US region (see Methods: Regional divisions). We used stepwise variable selection, in which variables were added if they reduced model Akaike information criterion (AIC) by two or more, to identify the climate, land cover, and non-ecological predictors that best explained Lyme disease incidence in each region (Yamashita et al. 2007, Zhang 2016). We assessed the multicollinearity of these models by calculating the variance inflation factor (VIF). No predictors had VIF values greater than 10 after the stepwise variable selection procedure, thus we did not remove any variables from the final models due to high collinearity (Hair et al. 2014).
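The selection rule described above (add a variable only if it lowers AIC by at least two) can be sketched as a forward stepwise loop. In the Python sketch below, `aic_of` is a hypothetical callable that returns the AIC of a model fitted with a given tuple of predictors; how the model is fitted is deliberately left out, and the toy AIC values are made up.

```python
def forward_select(candidates, aic_of, min_improvement=2.0):
    """Greedy forward selection using an AIC-improvement threshold.

    At each step, the candidate giving the lowest AIC is added, provided
    it lowers AIC by at least `min_improvement`; otherwise selection stops.
    """
    selected = []
    remaining = list(candidates)
    current_aic = aic_of(tuple(selected))
    while remaining:
        best_aic, best_var = min(
            (aic_of(tuple(selected + [var])), var) for var in remaining
        )
        if current_aic - best_aic < min_improvement:
            break
        selected.append(best_var)
        remaining.remove(best_var)
        current_aic = best_aic
    return selected, current_aic


# Toy AIC function: each variable lowers AIC by a fixed, made-up amount.
gains = {"cumulative_temp": 10.0, "spring_precip": 3.0, "noise": 0.5}


def toy_aic(variables):
    return 100.0 - sum(gains[v] for v in variables)


print(forward_select(gains.keys(), toy_aic))
```

In this toy case the weakest variable is left out because it improves AIC by less than two, which is the same parsimony criterion the text describes.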
We accounted for spatial and temporal autocorrelation of model errors by using cluster-robust standard errors. This nonparametric approach accounts for arbitrary forms of autocorrelation within a defined “cluster” to avoid misleadingly small standard errors and correspondingly inflated test statistics (Cameron and Miller 2015). We specified clusters as US Agricultural Statistics Districts (ASDs), which contain on average 9.9 ± 5.2 counties. These districts contain contiguous counties grouped by similarities in soil type, terrain, and climate such that each district is more homogeneous with respect to these characteristics than the state as a whole (USDA 2018). Accounting for spatial and temporal correlation in this way may help to account for ecological similarities between neighboring counties not captured in the climate and land cover predictors. Along these lines, ASDs have previously been used to account for spatial autocorrelation when investigating relationships between forest fragmentation and Lyme disease incidence at the county level (MacDonald et al. 2019a). When reporting on the significance of a predictor, we use standard errors and p-values calculated using this correction. To ensure our results were robust to cluster specification, we repeated the model runs using county as the cluster unit (Table S3). All analyses were conducted in R version 3.6 (R Core Team 2017).
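The idea behind cluster-robust standard errors can be illustrated for the simplest case of one (already demeaned) predictor and no intercept: residual-weighted scores are summed within each cluster before being squared, so correlated errors inside a cluster are not treated as independent information. The Python sketch below implements this basic sandwich formula without any finite-sample correction. It is an illustration under those simplifying assumptions rather than the estimator as configured in the study, and the data are made up.

```python
import math
from collections import defaultdict


def cluster_robust_se(x, y, clusters):
    """Slope and cluster-robust standard error for y = b * x + error.

    Variance = sum over clusters of (sum of x_i * e_i within cluster)^2,
    divided by (sum of x_i^2)^2. No small-sample adjustment is applied.
    """
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    cluster_scores = defaultdict(float)
    for xi, yi, cluster in zip(x, y, clusters):
        cluster_scores[cluster] += xi * (yi - slope * xi)
    meat = sum(score * score for score in cluster_scores.values())
    return slope, math.sqrt(meat) / sxx


# Toy data: four county-years grouped into two hypothetical districts.
x_vals = [1.0, 2.0, 3.0, 4.0]
y_vals = [1.1, 1.9, 3.2, 3.8]
districts = ["asd_1", "asd_1", "asd_2", "asd_2"]
print(cluster_robust_se(x_vals, y_vals, districts))
```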
To capture any nonlinear relationships between climate predictors and Lyme disease incidence, we generated models using linear and quadratic versions of the climate variables as potential predictors. Specifically, we used the stepwise variable selection approach starting with linear and quadratic versions of each climate variable to determine the best fit model for each region. We compare model accuracy and the output of these models to those using only linear versions of climate predictors to assess the sensitivity of our results to the functional form of climate-disease relationships (see Methods: Model validation).
Lyme disease projections
We projected Lyme disease incidence using the climate and land cover variables included in the best fit model for each region as well as a county dummy variable. Tick search frequency, poverty, and health insurance coverage were not included because annual, county-level projections for these variables are unavailable. Using these models, we obtained regional estimates for Lyme disease incidence under the upper and moderate climate change scenarios (RCP8.5 and RCP4.5) for 2040 – 2050 and 2090 – 2100. We calculated county-level changes in Lyme disease incidence by subtracting modeled incidence for 2010 – 2020 from projected incidence. Using modeled incidence for 2010 – 2020, rather than true case data for the years it was available, allowed for more direct comparisons between prior and projected cases because these estimates were made using the same climate and land cover data.
We converted projected Lyme disease incidence to cases under two differing assumptions about county population sizes. In the first calculation, we account for projected population growth by using county-level population projections under the Shared Socioeconomic Pathway “Middle of the Road” scenario (SSP2) as generated by Hauer (2019) (Samir and Lutz 2017). In the second, we assume that county population sizes remain the same as those in 2017, the last year of available county-level Lyme disease case reports. We focus our results and discussion on the projections made using population size projections, but compare results from these two approaches to ensure that changes in projected Lyme disease case counts resulted from predicted changes in incidence rather than projected population growth or decline. We report point estimates and 95% prediction intervals when discussing projected changes in Lyme disease case counts.
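The effect of the two population assumptions can be seen in a small Python sketch. It reads the conversion as "change in incidence (per 100,000) times population", which is one plausible interpretation of the procedure rather than a confirmed description of it. The function name and all numbers are hypothetical.

```python
def projected_case_change(projected_incidence, baseline_incidence, population):
    """Change in annual cases implied by a change in incidence per 100,000."""
    return (projected_incidence - baseline_incidence) * population / 100_000


# Hypothetical county: modeled baseline of 50 and projection of 60 cases per 100,000.
with_growth = projected_case_change(60.0, 50.0, population=220_000)    # projected population
constant_pop = projected_case_change(60.0, 50.0, population=200_000)   # 2017 population held fixed
print(with_growth, constant_pop)
```

Comparing the two results separates the part of a projected case change that comes from incidence from the part that comes from population growth or decline.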
Model validation
To evaluate predictive model accuracy, we compared hindcasted Lyme disease incidence under both emissions scenarios to observed values for 2008 – 2017 (Judge et al. 1985, Clark et al. 2001). We compared model accuracy under varying model specifications to check the robustness of the climate-disease relationships. In the first specification, each regional model contained the predictors (climate, land cover, and non-ecological variables) determined through variable selection (see Methods: Statistical analysis) as well as county and year dummy variables. In the second specification, each regional model contained the same predictors as in the first specification, but only linear versions of the climate predictors were included. This is to assess the sensitivity of our results to the functional form of climate-disease relationships. Under the third specification, regional models contained the same climate and non-climate predictors as in the first specification but no dummy variables. Under the fourth specification, regional models contained all possible climate and non-climate variables, and the county and year dummy variables. Using each of these specifications, we created models of Lyme disease incidence on a training dataset containing a randomly selected 75% subset of counties and years and used the withheld 25% of observations for validation (Hijmans 2012, Caldwell et al. 2016). To evaluate the performance of each model specification, we calculated the root-mean-square error (RMSE) and correlation coefficient between projected and observed Lyme disease incidence for a given county and year between 2008 – 2017 (the years with complete data for all predictors) for each regional model. We also compared estimated average annual incidence to observed average annual incidence for each model specification and each region. We used the modeled climate and land cover data when hindcasting as these datasets were used for Lyme disease projections.
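The validation workflow described here (a random 75%/25% split, then RMSE and a correlation coefficient on the withheld observations) can be sketched in Python as below. Model fitting is omitted; the function names, the seed, and the use of Pearson's r as the correlation coefficient are assumptions made for illustration.

```python
import math
import random


def train_test_split(observations, train_fraction=0.75, seed=0):
    """Randomly split observations into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(observations)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two paired series."""
    n = len(predicted)
    mean_p, mean_o = sum(predicted) / n, sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Toy county-year identifiers and made-up incidence values.
train, held_out = train_test_split(range(8))
print(len(train), len(held_out))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

RMSE is on the scale of incidence itself, so it is naturally larger in high-incidence regions; the correlation coefficient gives a scale-free complement.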
Results
Climate and Lyme disease incidence
At least one climate variable was included in the best fit model of Lyme disease incidence for all US regions with vector species present (Table 2). However, the specific climate variable(s) included in the model varied between regions and were often not significant predictors of incidence. As hypothesized, cumulative temperature was a significant, positive predictor in the Northeast, while the number of hot, dry days in May - July was a significant, negative predictor in this region (Table 2). The number of hot, dry days was also a significant, negative predictor in the Midwest. In the Southeast, daily temperature variability was a significant, positive predictor of incidence. In all other regions, the temperature and/or precipitation variables included in the best fit models were not statistically significant predictors. Further, for all regions, the climate predictors explained relatively little of the variation in Lyme disease incidence compared to the county dummy variables (Table 2). In many cases, quadratic versions of climate predictors were included in the best fit model for a particular region, indicating nonlinearity in climate-disease relationships (Table 2). For example, the number of hot, dry days, total annual precipitation, and temperature variability were all nonlinear predictors in the best fit model for the Northeast.
Non-climate predictors and Lyme disease incidence
For all regions, the best fit model of Lyme disease incidence included the 1-year lagged tick search frequency as well as a health-seeking predictor and/or a land cover variable (Table 2). Lagged tick search frequency was a significant, positive predictor in the Northeast, and had regionally variable and non-significant effects in other regions. Poverty was negatively associated with Lyme disease incidence in the Northeast, and positively associated with incidence in the Midwest and Southwest, but was not a significant predictor in any of these models. Health insurance coverage was a non-significant, negative predictor of Lyme disease in the Southeast. Forest cover was included in all regional models except the Southwest, but had regionally variable effects and was only a significant predictor in the Pacific. Mixed development cover was a positive predictor in the Southeast and Southwest, but only significant in the Southeast. The above non-climate predictors were included in each regional model of incidence along with county and year dummy variables. The majority of the variation in incidence for each region was explained by the county dummy variable (Table 2), indicating that there was a great deal of unobserved county-level heterogeneity driving Lyme disease incidence that was captured by the dummy variables. However, the estimated effect sizes of the predictors are the marginal effects of deviations from county- and year-means, meaning the total effect of a given variable, such as forest cover, may be larger if much of the variation is captured by the county fixed effects.
Model Validation
Under the main model specification, hindcasted Lyme disease incidence matched the observed values with reasonable accuracy in the high incidence regions (Table 3 and Figure S1). In the Northeast and Midwest, the correlations between estimated Lyme disease incidence for a given county and year and the observed incidence were 0.85 and 0.90, respectively. Model accuracy was lower in the Pacific, Pacific Southwest, Southwest, and Southeast, where incidence is much lower (r = 0.40, 0.26, 0.07, 0.32, respectively). However, the estimated annual average Lyme disease incidence (i.e., average incidence for a given region between 2008 – 2017) closely matched the observed annual average for all regions (Table 3). For each region, the estimated incidence was within 13% of the observed incidence, and was within 5% for the Northeast specifically.
Model accuracy also varied across the four model specifications (Table 3). In particular, model specifications with dummy variables outperformed (i.e., lower RMSE, higher correlation coefficients) those without. Models including only linear versions of climate predictors (i.e., model specification two) along with non-climate and dummy variables performed similarly to the main model specification but with slightly lower correlation coefficients and higher RMSE in the Northeast and Midwest, where the majority of cases occur. Coefficient estimates and Lyme disease projections using this model specification are shown in Tables S4 and S5. Models including all potential climate and non-climate predictors along with dummy variables had similar accuracy to the main model specification and model specification two (Table 3). The simpler, variable selection-based model specification using nonlinear climate predictors where selected was thus used for the remaining analysis to minimize overfitting and decrease transferability concerns (Allen and Fildes 2001, Wenger et al. 2011, Wenger and Olden 2012), and to achieve the greatest accuracy in high Lyme disease incidence regions.
Projected Lyme disease incidence
Under the upper climate change scenario (RCP8.5), the number of Lyme disease cases in the Northeast is projected to increase by 23,619 ± 21,607 by 2040 – 2050 and 61,776 ± 27,578 by 2090 – 2100 (Figures 2 and 3, Table 4). Non-significant decreases in the Midwest and increases in the Southeast were also projected under this scenario, and minimal, non-significant changes were projected for other regions (Table 4). By contrast, under the moderate climate change scenario (RCP4.5), no regions were projected to significantly increase or decrease. Non-significant increases in the Midwest, and non-significant increases or decreases, depending on the decade, were projected for the Northeast, with minimal changes elsewhere. Given the regionally variable projections and the large prediction intervals around all point estimates, total US Lyme disease incidence is not projected to change significantly under either climate scenario by 2040 – 2050 or 2090 – 2100 (Table 4). These results indicate that future changes in US Lyme disease burden are highly uncertain, vary strongly by region, and will depend on the degree of future climate change.
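A quick way to read these projections is to check whether each reported interval excludes zero. The Python sketch below does this for the two Northeast RCP8.5 estimates quoted above, assuming that the value after "±" is the half-width of the 95% prediction interval (an interpretation based on the Methods, not something stated explicitly alongside the numbers). The function name is hypothetical.

```python
def interval_excludes_zero(point_estimate, half_width):
    """Return True if [estimate - half_width, estimate + half_width] excludes zero."""
    lower = point_estimate - half_width
    upper = point_estimate + half_width
    return lower > 0 or upper < 0


# Northeast, RCP8.5, as reported in the text (additional cases).
northeast_2040_2050 = (23_619, 21_607)   # interval: 2,012 to 45,226
northeast_2090_2100 = (61_776, 27_578)   # interval: 34,198 to 89,354

print(interval_excludes_zero(*northeast_2040_2050))
print(interval_excludes_zero(*northeast_2090_2100))
```

Under this reading, the 2040 – 2050 increase is significant only narrowly: the lower bound of roughly 2,000 additional cases sits close to zero, consistent with the large uncertainty emphasized in the text.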
These Lyme disease projections were qualitatively similar to those generated using only linear versions of the climate variables (Table S5). Under this model specification (model specification two, see Methods: Model validation), the number of Lyme disease cases in the Northeast is projected to increase under the upper climate change scenario (21,467 ± 21,354 by 2040 – 2050 and 42,538 ± 24,129 by 2090 – 2100), but not under the moderate climate scenario. Non-significant decreases and increases in the Midwest were projected for the upper and moderate climate scenario, respectively, and non-significant changes in the US as a whole were projected under both scenarios and time periods. These results are all consistent with those generated under the main model specification, indicating that our projections are generally robust to the functional form of climate-disease relationships specified in the model. The one qualitative difference in results is the significant increase in cases in the Southeast under the upper climate change scenario (1,522 ± 1,213 by 2040 – 2050 and 3,460 ± 1,736 by 2090 – 2100) under model specification two, which was marginally non-significant under the main model specification.
Lyme disease case projections made using county-level population size projections were similar to those using constant (i.e., 2017) population sizes. In particular, large but uncertain increases in Lyme disease cases were still projected for the Northeast under the upper climate change scenario (18,885 ± 19,509 by 2040 – 2050 and 40,320 ± 21,886 by 2090 – 2100) when assuming constant population sizes. This indicates that our results are generally robust to population size assumptions and are not solely driven by projected changes in human demography. However, because population growth is projected for the Northeast (Hauer et al. 2019; Table S7), projections made assuming constant population sizes are smaller (but not significantly) than those using projected population sizes.
Discussion
Given the increasing rate of vector-borne disease emergence and re-emergence in recent decades, including Zika in Central and South America and tick-borne encephalitis in Europe, identifying the environmental drivers of vector-borne disease transmission has been a major research theme (Rogers and Randolph 2006, Kilpatrick and Randolph 2012, Lafferty and Mordecai 2016, Swei et al. 2019). Extensive prior research indicates that temperature and moisture conditions can impact vector life cycles, activity patterns, abundance, and range limits (reviewed in Ogden and Lindsay 2016). Yet despite clear relationships between specific features of climate and aspects of vector life cycles and biology, identifying how these relationships translate to affect disease incidence has remained challenging. Here we use 18 years of disease and climate data in a panel data statistical modeling approach to identify the impacts of climate change on human Lyme disease incidence across biogeographically distinct US regions. We find that climate was a predictor of interannual variation in Lyme disease incidence in all US regions with established vector species (Northeast, Midwest, Pacific, Pacific Southwest, Southwest, and Southeast), even after controlling for potentially confounding factors and for spurious spatial and temporal relationships. However, the specific climate variable(s) that best predicted burdens varied between regions and had highly variable effect sizes and often nonlinear relationships with incidence. While these results underscore the complexity of climate-Lyme disease relationships, the specific associations observed here tended to reflect known relationships between climate and the life histories of the US vectors of Lyme disease, I. scapularis and I. pacificus.
The strongest climate-disease association detected was between warming annual temperatures and increasing Lyme disease incidence in the Northeast. Previous studies have found that warming year-round temperatures at high latitudes contribute to more rapid tick development rates, increased survival, and I. scapularis range expansion (Clow et al. 2017a, Leighton et al. 2012, Lindsay et al. 1995, Ogden et al. 2004, Rand et al. 2004). This suggests warmer temperatures near the ticks’ northern range limit would promote Lyme disease transmission – an expectation empirically supported in this study. We also found a significant negative association between hot, dry conditions during the nymphal questing period (May – July) and incidence in the Northeast and Midwest. Prior studies indicate that desiccating conditions reduce tick questing activity, which can lead to decreased contact rates with larger vertebrate hosts, including humans (Randolph and Storey 1999, Prusinski et al. 2006, Sonenshine and Roe 2013). Further, Burtis et al. (2016) found the number of hot, dry days during this period was significantly negatively associated with I. scapularis questing density as well as Lyme disease incidence in the Hudson Valley, Southern New England, and northern New Jersey. Our work thus provides evidence that these prior relationships between desiccating conditions and tick questing behavior scale to incidence across the Northeast and Midwest. That this relationship was not observed or significant in the Southeast or Southwest is also consistent with prior evidence of differing questing behavior in northern and southern I. scapularis nymphs. Northern I. scapularis nymphs are much more likely to quest above the leaf litter, while southern I. scapularis nymphs primarily use habitats below the vegetative surface (Arsnoe et al. 2015). As this different questing behavior buffers southern I. scapularis from desiccating conditions, variation in the number of hot, dry days is less likely to impact tick-host contact rates and disease transmission here. Similar differences in questing behavior have been demonstrated between northern and southern populations of I. pacificus (Lane et al. 2013, MacDonald and Briggs 2016), but we find no significant relationship between hot, dry days and incidence in the Pacific, potentially because low Lyme disease incidence in this region reduces the power to detect effects of variation in climate on incidence. Although we did find the expected negative relationship between hot, dry days and incidence in the Northeast and Midwest, we did not detect the hypothesized positive relationship between spring precipitation and Lyme disease incidence in any region. We did find a positive association in the Northeast and Pacific Southwest, but the association was not significant, and it was negative (but non-significant) in the Midwest and Southwest. This may be due to counteracting effects of precipitation on human behavior leading to reduced tick-human contact rates (Jaenson et al. 2012), independent of effects of precipitation on tick host-seeking suitability.
The associations between climate conditions and Lyme disease incidence found here were detected while rigorously controlling for non-climate predictors of disease as well as unobserved predictors that covary with climate at the county and year levels. In particular, we explicitly controlled for variation in human awareness of ticks, land use and land cover characteristics, proxies for health-seeking behavior, and other unobserved heterogeneity between US counties and years in our modeling approach. Increasing tick awareness, as determined by the frequency of tick-related Google searches, was generally positively associated with Lyme disease incidence, while land cover and health-seeking behavior predictors had regionally variable relationships. By controlling for these effects, we provide strong evidence that the positive association between warming temperatures and Lyme disease incidence in the Northeast found in this study is not simply driven by increasing human awareness of tick-borne disease, temporal trends, or other concurrent changes as has been previously suggested (Morshed et al. 2006, Randolph 2010, Scott and Scott 2018). Further, the total effects of climate and land use predictors may be larger than those estimated here, because these ecological predictors may underlie some of the variation included in the county and year dummy variables.
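To illustrate why county and year dummy variables matter for isolating a climate effect, the following is a minimal sketch on a synthetic balanced panel with a single predictor. It uses the two-way within transformation (subtracting county and year means), which for a balanced panel gives the same slope as least squares with county and year dummies. All values and names are hypothetical, and this is not the study's model, which includes multiple climate and non-climate predictors.

```python
def pooled_slope(x, y):
    """Least-squares slope of y on x, ignoring county and year structure."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return num / sum((a - mx) ** 2 for a in xs)

def two_way_within_slope(x, y):
    """Slope of y on x after removing county and year means (balanced panel).

    x[i][t] and y[i][t] hold the value for county i in year t.
    """
    n_c, n_t = len(x), len(x[0])

    def demean(m):
        grand = sum(sum(row) for row in m) / (n_c * n_t)
        county_mean = [sum(row) / n_t for row in m]
        year_mean = [sum(m[i][t] for i in range(n_c)) / n_c for t in range(n_t)]
        return [[m[i][t] - county_mean[i] - year_mean[t] + grand
                 for t in range(n_t)] for i in range(n_c)]

    xd, yd = demean(x), demean(y)
    num = sum(xd[i][t] * yd[i][t] for i in range(n_c) for t in range(n_t))
    den = sum(xd[i][t] ** 2 for i in range(n_c) for t in range(n_t))
    return num / den

# Synthetic panel: 4 counties x 3 years (hypothetical values, not study data).
TRUE_SLOPE = 2.0
county_effect = [0.0, 50.0, -20.0, 10.0]   # unobserved county heterogeneity
year_effect = [0.0, 5.0, 10.0]             # unobserved year-level shocks
x = [[float((i + 1) * (t + 1)) for t in range(3)] for i in range(4)]
y = [[TRUE_SLOPE * x[i][t] + county_effect[i] + year_effect[t]
      for t in range(3)] for i in range(4)]

print("pooled slope:", pooled_slope(x, y))
print("within slope:", two_way_within_slope(x, y))
```

In this synthetic example the pooled estimate is pulled away from the true slope because the county and year effects covary with the predictor, whereas the within estimate recovers it. The same logic underlies the caveat above: variation absorbed by the dummy variables cannot be attributed to climate or land use, even where those predictors partly drive it.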
While our statistical models included both climate and non-climate predictors of Lyme disease incidence, model accuracy varied widely between regions. Most notably, model accuracy was substantially greater for endemic regions (Northeast and Midwest), compared to low incidence (non-endemic) regions (Pacific, Pacific Southwest, Southwest, and Southeast) (Ciesielski et al. 1988). The relatively poor predictive accuracy in non-endemic regions could be due to higher misdiagnosis rates and/or higher travel-associated Lyme disease transmission (Eldin and Parola 2018, Parola and Paddock 2018) decoupling the relationship between local conditions and disease. However, evidence suggests that most Lyme disease transmission occurs in the peri-domestic environment, in which the county of transmission and reporting are likely to be the same (Falco and Fish 1988, Maupin et al. 1991, Jackson et al. 2006, Connally et al. 2009). The lower predictive accuracy in these regions more likely reflects a lack of sufficient annual variation in Lyme disease incidence needed to detect effects of climate in these regions above and beyond the county and year fixed effects, and/or weaker effects of climate conditions on Lyme disease transmission relative to confounding drivers not included in our model such as host movement and community composition. In contrast, the largest effect of climate on disease transmission is expected at the edges of the climate suitability for transmission (Githeko et al. 2000). As portions of the Northeast and Midwest are near the I. scapularis northern range limit, the higher model accuracy here likely indicates stronger climate – Lyme disease relationships. Supporting this assertion, the climate predictors explained a relatively larger proportion of the variation in incidence in these regions.
Our Lyme disease projections, made using regionally-specific incidence models and projected climate and land cover data, suggest that climate change may lead to substantial increases in incidence in coming decades, but that these increases are largely concentrated in the Northeast, are highly uncertain, and depend upon the magnitude of climate change. In particular, under the upper climate change scenario (RCP8.5), Lyme disease cases in the Northeast are projected to increase by 23,619 ± 21,607 by 2040 – 2050 and 61,776 ± 27,578 by 2090 – 2100 (Table 4). However, increases are not projected in the Northeast under the moderate climate change scenario (RCP4.5), nor for any other region under either scenario. Large increases in the Midwest under less severe warming are possible, as are large increases in total US cases under more severe warming, but these projections are non-significant. While the significant increase in Lyme disease cases projected for the Northeast under RCP8.5 was robust to alternative model specifications and assumptions about county-level population growth, the large prediction intervals around our point estimates for this region and all others indicate a wide range of potential disease outcomes under climate change.
These results indicate that climate change will likely contribute to increasing Lyme disease incidence in the Northeast, but the specific numerical projections should be interpreted with caution. While significant increases were projected in the Northeast, many other factors contribute to Lyme disease transmission, including host movement, host community composition, and human avoidance behaviors (Ostfeld 1997, Brownstein et al. 2005b, Ogden et al. 2008, Brinkerhoff et al. 2011, Larsen et al. 2014, Berry et al. 2018, MacDonald et al. 2019a). Accordingly, we found that unobserved county-level heterogeneity, which would encompass these factors, was a predominant driver of incidence in each of our regional models. Thus, while climate may contribute to increasing Lyme disease incidence in northern regions, it may not be the dominant driver of future changes in Lyme disease. Further, while we examined the effects of two potential climate scenarios, uncertainty in these climate change projections was not incorporated into our predictive models and would contribute additional uncertainty in Lyme disease projections. Lastly, the projection models extrapolate from climate and disease relationships observed in the previous 18 years, assuming that these relationships can be extended to climate conditions not yet experienced. That is, we assume that the relationship between cumulative temperature, for example, and Lyme disease incidence in a given region will remain the same even as cumulative temperatures exceed prior values. This could generate inaccurate projections for regions near current tick upper thermal limits such as the Southeast and Southwest as further warming and drought here may reduce tick survival and host-seeking suitability (Vail and Smith 1998, Randolph and Storey 1999, Schulze et al. 2001, Berger et al. 2014, MacDonald et al. 2020). Generating more accurate projections for these regions would require experiments investigating effects of future temperatures on aspects of tick-borne disease transmission.
Despite these limitations and the large uncertainty in our Lyme disease projections, our results are consistent with a growing body of evidence linking increased Lyme disease risk with climate warming (Brownstein et al. 2005a, Burtis et al. 2016, Clow et al. 2017b, Dumic and Severnini 2018, Kilpatrick et al. 2017, Leighton et al. 2012, Ogden et al. 2008, 2014b, Robinson et al. 2015, Subak 2003, Tuite et al. 2013). Specifically, our finding of climate change-induced increases in Lyme disease burden at higher latitudes is consistent with prior studies projecting or observing increasing I. scapularis habitat suitability and range expansion under climate warming (Ogden et al. 2008, 2014a, McPherson et al. 2017). Similar range expansions have also been projected and observed for Ixodes ricinus, the European Lyme disease vector, under climate warming (Gray et al. 2009, Jaenson and Lindgren 2011, Lindgren et al. 2000, Porretta et al. 2013). Further, our finding that the projected changes in incidence depend on the degree of future warming is also consistent with prior work. I. scapularis range expansion and population growth, and the proportion of Eastern Canadians at risk for Lyme disease, are projected to be higher under upper climate change scenarios than under mitigation scenarios (Leighton et al. 2012, McPherson et al. 2017). These results suggest that vector range expansions and future Lyme disease burdens depend in part on climate policy actions.
More generally, our results are consistent with expectations from vector thermal biology that suggest that warming temperatures generally increase transmission near the cold edge of a vector’s range limit, but may decrease or have variable effects elsewhere (Martens et al. 1995, Ogden and Lindsay 2016, Lafferty and Mordecai 2016, Mordecai et al. 2019). For tick-borne diseases, as for other vector-borne diseases, multiple temperature-sensitive traits combine to influence transmission, including survival, development rates, and host-seeking (Randolph et al. 2002, Ogden et al. 2004, Randolph 2004, Ogden and Lindsay 2016, Ogden 2017). Nonlinear effects of temperature on these traits typically lead to vector-borne disease transmission peaking at intermediate temperatures and declining as temperatures approach lower and upper thermal limits (Mordecai et al. 2019). This suggests that climate warming would most strongly increase transmission near the lower thermal limits, such as in the Northeast, as was observed here. This further suggests the effects of climate warming would differ in magnitude and direction depending on the extent of warming, as seen in the Midwest region where non-significant increases were projected under the moderate climate change scenario while decreases were projected under the upper scenario. The theoretical expectations of nonlinear thermal responses therefore help to explain some of the context-dependent effects of temperature found empirically in this study.
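The qualitative logic of a unimodal thermal response can be sketched with a simple concave-down quadratic curve, one of several functional forms used for such responses. The thermal limits (10 °C and 35 °C), the baseline temperatures, and the warming amounts below are hypothetical values chosen only for illustration; they are not estimates for Ixodes ticks or outputs of our models.

```python
def thermal_response(temp_c, t_min=10.0, t_max=35.0):
    """Relative transmission (arbitrary units) from a quadratic thermal curve.

    Zero at or beyond the hypothetical thermal limits, with a peak midway
    between them.
    """
    if temp_c <= t_min or temp_c >= t_max:
        return 0.0
    return (temp_c - t_min) * (t_max - temp_c)

def change_with_warming(baseline_c, warming_c):
    """Change in relative transmission when a site warms by warming_c degrees."""
    return thermal_response(baseline_c + warming_c) - thermal_response(baseline_c)

# Hypothetical sites: near the cold limit, near the optimum, near the hot limit.
for label, baseline in [("cool", 14.0), ("intermediate", 21.0), ("warm", 30.0)]:
    moderate = change_with_warming(baseline, 2.0)   # smaller warming
    severe = change_with_warming(baseline, 5.0)     # larger warming
    print(f"{label} site ({baseline} C): +2 C -> {moderate:+.0f}, "
          f"+5 C -> {severe:+.0f}")
```

With these illustrative parameters, the cool site gains under both amounts of warming, the warm site loses under both, and the intermediate site gains slightly under the smaller warming but loses under the larger, which mirrors the sign change described above for the Midwest.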
Conclusions
We demonstrate that interannual variation in Lyme disease incidence is associated with climate in all US regions with established vector species, independent of other drivers of disease risk and excluding potentially spurious relationships with county- and year-specific variation. The specific climate variable(s) associated with incidence and their effect sizes varied by region, but the strongest climate-disease association observed was between warming temperatures and increasing incidence in the Northeast. However, in all regions, climate explained less variation in incidence than unobserved county-specific heterogeneity, highlighting that climate is one of many factors influencing Lyme disease transmission. We project that future climate change could substantially increase Lyme disease burden in the Northeast in coming decades under an upper climate change scenario. Cases in the Northeast were not projected to increase under a moderate climate change scenario, highlighting the potential for climate change mitigation to protect human health by preventing further increases in Lyme disease incidence. However, the projected effects in this region and all others are highly uncertain, indicating a wide range of potential disease outcomes under climate change. Our projections provide an essential first step in determining broad patterns of Lyme disease risk under climate change, but ongoing surveillance efforts and mechanistic studies linking changes in vector ecology under climate change to human disease incidence should be conducted to refine these risk assessments.
Author Contributions
LIC and EAM conceived of the project. All authors designed the analyses. LIC gathered the data and performed the analyses. LIC drafted the manuscript. AJM and EAM revised the manuscript. All authors read and approved the final manuscript. LIC was funded by the Stanford Graduate Fellowship. AJM was funded by a UC Santa Barbara Faculty Research Grant. EAM was funded by an NSF Ecology and Evolution of Infectious Diseases grant (DEB-1518681), the Terman Award, and the NIH NIGMS Maximizing Investigators’ Research Award (R35GM133439).
Data Accessibility
All datasets used in this study are free and publicly available. These datasets can be found here: https://github.com/lcouper/LymeDiseaseClimateChange, along with information about where and when they were originally accessed.
Acknowledgements
We are grateful to the CDC Division of Vector-Borne Diseases for supplying Lyme disease case data, Mohammad Alhamdan from NASA for supplying climate data, and to Iain Caldwell, Jamie Caldwell, Marissa Childs, Johannah Farner, Elizabeth Hadly, Morgan Kain, Devin Kirk, Giulio de Leo, Nicole Nova, and Marta Shocket for providing helpful feedback on the manuscript.
Footnotes
The authors declare they have no actual or potential competing financial interests.
The projected Lyme disease cases including population growth projections are now included in the main text while those assuming no population growth are in the supplementals. Similarly, the model specification including nonlinear climate variables has been moved to the main text while the specification including only linear climate variables is now in the supplementals. Diabetes incidence has now been removed from the list of potential non-ecological predictors of Lyme disease transmission. The discussion has been updated to emphasize that climate variables collectively explain less variation in incidence than unobserved county-level heterogeneity.