Abstract
We developed a dynamic forecasting model for Zika virus (ZIKV) based on real-time online search data from Google Trends (GTs). The model was designed to support Zika virus disease (ZVD) surveillance by giving health departments early warning and predictions of the number of infection cases, allowing them sufficient time to implement interventions. We used ZIKV epidemic data and Zika-related online search data from GTs between 12 February and 25 August 2016 to construct an autoregressive integrated moving average (ARIMA) (0, 1, 3) model for the dynamic estimation of ZIKV outbreaks. The online search data acted as an external regressor in the forecasting model and were used together with the historical ZVD epidemic data to improve the quality of the outbreak predictions. Our results showed a strong correlation between Zika-related GTs and the cumulative numbers of reported cases, both confirmed and suspected (both p<0.001; Pearson product-moment correlation analysis). The predicted cumulative numbers of confirmed and suspected cases increased steadily to reach 148,510 (95% CI: 126,826-170,195) and 602,721 (95% CI: 582,753-622,689), respectively, by 21 October 2016. Integer-valued autoregression provides a useful base predictive model for ZVD cases. This is enhanced by the incorporation of GTs data, confirming the prognostic utility of search-query-based surveillance. This accessible and flexible dynamic forecast model could be used in the monitoring of ZVD to provide advance warning of future ZIKV outbreaks.
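For readers unfamiliar with the notation, an ARIMA (0, 1, 3) model with an external regressor can be written in one common form as a regression with ARIMA errors. The sketch below is an assumed formulation for illustration, not necessarily the exact parameterization fitted in this study. The symbols are introduced here and do not appear in the abstract: y_t is the cumulative number of reported cases at time t, x_t is the Zika-related GTs search index, B is the backshift operator, and \varepsilon_t is white noise.

```latex
\[
\begin{aligned}
y_t &= \beta x_t + n_t, \\
(1 - B)\, n_t &= \left(1 + \theta_1 B + \theta_2 B^2 + \theta_3 B^3\right) \varepsilon_t .
\end{aligned}
\]
% Equivalent differenced form (d = 1, no autoregressive terms, three MA terms):
\[
\Delta y_t = \beta\, \Delta x_t + \varepsilon_t
  + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3},
\qquad \Delta y_t = y_t - y_{t-1}.
\]
```

Under this formulation, the order (0, 1, 3) means no autoregressive terms, one round of differencing, and three moving-average terms, while \beta captures how changes in search volume shift the expected change in case counts.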