Applying particle filtering in complex compartmental models of pre-vaccination pertussis

Particle filtering is a modern state inference and identification methodology that allows filtering of general non-Gaussian and non-linear models in light of time series of empirical observations. Several previous lines of research have demonstrated the capacity to effectively apply particle filtering to low-dimensional compartmental transmission models. We demonstrate here implementation and evaluation of particle filtering to more complex compartmental transmission pertussis models – including application for aggregate, two-age-groups and 32-age-groups population structure with two different contact matrix, respectively – using over 35 years of monthly and annual pre-vaccination provincial data from the mid-western Canadian province. Such filtering supports estimation of (via sampling from) the entire state of the dynamic model, both latent and observable, for each point in time, thereby supporting estimation of proportion of susceptible individuals, and those at varying levels of immune protection associated within waning of pertussis immunity. Estimation of the latent state also supports capacity for model projection. Following evaluation of the predictive accuracy of these four particle filtering models, we then performed prediction and intervention experiments based on the most accurate model. Using that model, we contribute the first full-paper description of particle filter-informed intervention evaluation. We conclude that applying particle filtering with relatively high-dimensional pertussis transmission models, and incorporating time series of reported counts, is a valuable technique to assist public health authorities in estimating and predicting pertussis outbreaks in the context of incoming empirical data. Within this use, the particle filtering models can moreover perform counterfactual analysis of interventions to assist the public health authorities in intervention planning. Author summary Pertussis is a common and highly contagious childhood disease. In contrast to some other prevalent childhood diseases, immunity conferred by natural exposure or vaccination to pertussis is widely believed to wane relatively rapidly. Such waning and associated uncertainties raises challenges for accurate pertussis modelling and simulation, with the speed and current degree of such waning in the population remaining a notable point of uncertainty. In recent years, the sequential monte carlo method of particle filtering has been employed for incorporating empirical time series data so as to estimate underlying model state of simpler communicable disease transmission models, particularly for influenza and measles. This paper characterizes the first time in which such methods have been applied to the more complex and higher dimensional models of pertussis, using over 35 years of monthly and yearly empirical time series from the pre-vaccination era in a mid-western Canadian province. In addition, we contribute the first full-paper description of particle filter-informed intervention evaluation for transmission models. Our findings suggest that applying particle filtering with relatively high-dimensional pertussis transmission models can serve as a valuable technique to assist public health authorities in estimating and predicting pertussis outbreaks in the context of incoming empirical data. Application of particle filtering can moreover support evaluation of intervention outcomes to assist public health authorities in intervention planning.

the first time in which such methods have been applied to the more complex and higher dimensional models of pertussis, using over 35 years of monthly and yearly empirical time series from the pre-vaccination era in a mid-western Canadian province. In addition, we contribute the first full-paper description of particle filter-informed intervention evaluation for transmission models. Our findings suggest that applying particle filtering with relatively high-dimensional pertussis transmission models can serve as a valuable technique to assist public health authorities in estimating and predicting pertussis outbreaks in the context of incoming empirical data. Application of particle filtering can moreover support evaluation of intervention outcomes to assist public health authorities in intervention planning.
United States [5]. The most recent peak year of the reported cases of pertussis in the 23 United States is 2012, when the Centers for Disease Control and Prevention (CDC) 24 reported 48,277 cases, but many more are believed to go undiagnosed and 25 unreported [5]. Research aimed at estimating the level of population susceptibility and 26 predicting the transmission dynamics of pertussis could aid outbreak prevention and 27 control efforts by health agencies, such as performing intervention before the predicted 28 next outbreak, and in targeted outbreak response immunization campaigns [6]. 29 Dynamic modelling has long served as an important tool for understanding the 30 spread of the infectious diseases in population [16], including pertussis, and for 31 evaluating the impacts of interventions such as immunization and quarantine. In recent 32 years, particle filtering as a machine learning method has been employed for 33 incorporating empirical time series data (such as surveillance [7] and online 34 communicational behavior data [8]) to ground the hypothesis as to the underlying 35 model state models in some previous researches [9][10][11][12], especially for the infectious 36 diseases of influenza [13,14] and measles [7]. In this paper, we apply the particle 37 filtering algorithm in a more complex and widely used compartmental model [15] of 38 pertussis by incorporating the reported pertussis cases in Saskatchewan during the 39 pre-vaccination era. Particle filtering for pertussis is different than for other pathogens 40 April 1, 2019 2/17 on account of the need for state estimation to estimate the population segments at 41 varying levels of immunity. Another need concerns extends from the heterogenous 42 nature of the mixing and incidence burden between different age groups. For this 43 reason, age-structured models are examined here. Specifically, we have examined two 44 categories of age-structured particle filtering models -with 2 age groups and with 32 45 age groups. Moreover, we have proposed and explored three methods for calculating the 46 contact matrix, so as to reduce the degrees of freedom associated with characterization 47 of the contact matrix. This contribution compares the results obtained from all the 48 particle filtering models by incorporating the empirical data across the whole timeframe 49 evaluating the predictive accuracy of the models. Finally, using the minimum 50 discrepancy particle filtering model, we demonstrate how we can evaluate intervention 51 effects in a fashion that leverages the capacity of particle filtering to perform state 52 estimation. As noted above, the infectious dynamics of pertussis is more complex than the infectious 56 diseases that confer lifelong immunity, such as measles, due temporary character of the 57 immunity acquired by Bordetella pertussis infection. As the time since the most recent 58 pertussis infection increases, the immunity of a person wanes [16]. People with lower 59 immunity generally tend to be more easily infected, and exhibit a higher risk of 60 transmitting the infection once infected.

61
In this paper, we have employed the structure of the popular pertussis mathematical 62 model in [15]. To capture the characteristics of pertussis in waning of immunity and the 63 different level of infectiousness and susceptibility involved with infection in light of 64 pre-existing immunity, the compartmental model in [15] further divides the infectious 65 population into three groups: weakly infectious (I w ), mild infectious (I m ), and fully 66 infectious (I), and divided the recovered population into four groups: R 1 , R 2 , R 3 and 67 R 4 of successively increasing immune system strength. Moreover, in reflection of the 68 focus on the pre-vaccination era (pertussis reported cases in Saskatchewan from 1921 to 69 1956), the stocks related to vaccination (V 1 , V 2 , V 3 and V 4 ) in the original 70 compartmental model [15] are not included in this research.  [15] . In this compartmental model of pertussis in the pre-vaccination era, 73 the total population is divided into 8 distinct epidemiological classes. Newborns enter symptoms. When individuals recover from the state I of infectives, they achieve full 79 immunity and enter state R 4 . In this state, they are fully protected and can not be 80 infected by pertussis. However, as time goes by, their immunity wanes and they enter 81 into a less strong immunity class of R 3 . When individuals in class R 3 are exposed to an 82 infective, they are assumed to return to the highest immunity class of R 4 without (re-)exposed to an infective for transmission to occur, the infected individual enters the 86 I w state with weak infectivity. Individuals in the I w class have the weakest infective 87 capability to infect a susceptible. After they recover, the individuals in class of I w then 88 secure the highest immunity, (re-) entering the class of R 4 from which they originally 89 waned. By contrast, if people in the class of R 2 are not re-exposed to the infectives, 90 their immunity continues waning, and they enter the minimally immune class of R 1 .

91
Similarly, if a person in class R 1 is re-exposed to an infective, this person gets infected 92 with mild infectivity and enters the class of I m . Individuals in the class of I m have a 93 higher infectious capability compared with those in the class of the weak infective (I w ), 94 but exhibit a lower infectious capability compared to the fully infective individuals in I. 95 When recovered, the individuals in class I m enter the class R 4 again. If the individuals 96 in the class of R 1 are not re-exposed, they eventually lose all of their immunity and 97 move back to the class of S whence they originated at birth. Given the presence of 98 multiple infection states (I, I w , I m ) as well as multiple levels of immunity (R 1 , R 2 , R 3 , 99 R 4 ), three invariants bears noting. Firstly, regardless of the pre-existing level of 100 immunity, following recovery from an infection, that individual always returns to the 101 full level of natural immunity (R 4 ). Secondly, in any of the recovered states (R 1 , R 2 , 102 R 3 , R 4 ), immunity continues to wane absent re-exposure. Thirdly, as the level of 103 immunity is reduced, the severity of resulting infectiousness rises, with no infectiousness 104 being possible at all from exposure in states R 3 and R 4 .

105
It is notable that the model of Hethcote (1997) [15] makes use of a formulation in 106 which each state variable is of unit dimension, representing a fraction of the population 107 in different age groups of the same class. However, for the sake of comparison against 108 empirical data, the model in this paper is represented in a re-dimensionalized fashion, 109 with the state variables representing counts of persons. Based on this structure in Fig 1, 110 four models are considered in this research. Using n to denote the total number of age 111 groups in the models, we consider models of the aggregate population (n = 1), of two 112 age groups (n = 2), 32 age groups model (n = 32) with the contact matrix introduced in 113 the paper of Hethcote (1997) [15] and a further 32 age groups (n = 32) model with a 114 re-balanced contact matrix. In the two age group model, the population is divided into 115 child and adult categories, where the child age group includes individuals in their first 5 116 years of life, while the adult age group includes individuals aged 5 and above. In the 32 117 age group models, we follow Hethcote in assuming age groups as 0-1 month, 2-3 month, 118 4-5 month, 6- year, 80-89 year, 90 years and above [15]. It bears noting that while more detailed age 122 structure can better capture both the effects of population aging and inter-group 123 heterogeneity, in terms of particle filtering, it will require estimation of a larger 124 underlying model state -potentially adversely affecting the accuracy of that estimation. 125 In addition to examining the effects of assuming different age structures, we have 126 explored 3 different methods in formalizing the contact matrix that characterizes 127 contacts between different age groups [7,15] in transmission models. In the two age 128 group model (n = 2), the contact matrix is consistent with the contact matrix 129 introduced in a previously contributed application of particle filtering to measles [7].

130
This contact matrix used in [7] is powerful and effective in mathematical models with a 131 small number of age groups. However, this method encounters problems with extension 132 to larger number of age groups, given that the number of free parameters rises as the 133 square of the total number of age groups in the model (readers interested in the 134 mathematical proof are referred to S4 Appendix). Thus, in the 32-age-groups models, 135 we have explored two alternative contact matrices exhibiting a reduced number of free 136 parameters. In the first 32-age-groups model, we employed an identical contact matrix 137 to that employed in [15]. Specifically, to generate the contact matrix in [15], this 138 research has assumed that the average number of people contacted a person in age 139 group i per unit time are distributed among people in age group j in proportion to the 140 fraction of all contacts per unit time received by people in the age group j [15].

141
However, this contact matrix used in [15] is un-balanced, which means the total 142 contacts of age group i to age group j is not equal to the total contacts of age group j 143 to age group i (see S1 Appendix). Thus, we have explored another contact matrix in 144 the 32-age-groups model where the contact matrix is balanced, according to the 145 methodology established in [17]. As a result of the balancing, the total contacts in age 146 group i to age group j is taken as equal to the total contacts of age group j to age Particle filtering is a modern state inference and identification methodology that allows 153 filtering of general non-Gaussian and non-linear models in light of time series of 154 empirical observations [7,9,10,14,18,19]. At a given time, each state in the distribution 155 (represented by "particles" according to the principles of sequential importance 156 sampling) can be seen as representing a competing hypothesis concerning the underlying 157 state of the system at that time. The particle filtering method can be viewed as 158 undertaking a "survival of the fittest" of these hypotheses, with fitness of a given 159 particle being determined by the consistency between the expectations of the hypothesis 160 associated with that particle and the empirical observations.

161
Particle filtering offers value in the context of underlying state equation models 162 exhibiting stochastic variability. In the implementation of the particle filtering 163 algorithm, we extended the pertussis compartment model by adding "system noise" by 164 letting some parameters evolve with time. Readers interested in additional details with 165 the state space models implementation of the pertussis particle filtering models are 166 referred to S2 Appendix. In performing the particle filtering, we followed several past 167 contributions [7,9,11,13]  represents the simplest and widely used variant of particle filtering algorithm. We 172 further followed [7,9,11,13,21] in assuming that the likelihood function in the particle 173 filtering algorithm is based around the negative binomial distribution. Specifically, for 174 the particle filtering model with an aggregate population, the likelihood function simply 175 applies the negative binomial distribution to link the result of the model and 176 observation data via monthly reported pertussis cases. By contrast, for the particle 177 filtering models with an age-structured population, the likelihood function is given by 178 the product of negative binomial density functions applied successively to the 179 observation and corresponding model quantity for each empirical dataset [7]. It bears 180 noting that the value of the dispersion parameter r [7,9] associated with the negative 181 distribution in all scenarios in this paper is chosen to be 10. pertussis reported cases, yearly pertussis reported cases for (separately) child and adult 196 age groups. It bears remarking that this structure mirrors that used in a previous 197 contribution using particle filtering for a measles model [7]. For assessing accuracy of 198 the two age-grouop model, the discrepancy includes the monthly RMSE and the sum of 199 yearly RMSE of each age group divided by 12 (so as to convert the unit to from year to 200 month).

201
In the pertussis age-structured particle filtering models involving 32 age groups, 7 202 empirical datasets are employed -monthly population-wide reported pertussis cases, 203 and yearly pertussis reported cases for each of six age groups, including less than 1 year, 204 from 1 to 4 years, from 5 to 9 years, from 10 to 14 years, from 15 to 19 years and 20 205 years and older. The update weight rule is similar to that for the age structured 206 pertussis model of two age groups, except that at the close of each year it is the product 207 of likelihood functions applied for each of the 7 empirical datasets. When assessing the 208 predictive accuracy of the model, the discrepancy also includes the monthly RMSE and 209 the sum of yearly RMSE of each age group divided by 12 (so as to convert the unit from 210 year to month).

211
Particle Filtering Algorithm

212
The particle filter algorithm that we employed in this paper is given as 213 follows [7,9,18, 23]: 214 1. At time k=0: (2) Compute a weight for each particle w (1) Advance the sampled state by sampling X (2) Update the weights (by the Condensation Algorithm) to reflect the probabilistic 222 and state update models as follows: ).

224
Normalize the weights W  Recall that to explore the predictive performance of particle filtering in different 233 compartmental pertussis models, four particle filtering models have been built in this 234 research -the aggregate particle filtering pertussis model (denoted as P F aggregate ), the 235 age-structured particle filtering model of 2 age groups (denoted as P F age 2 ), the 236 age-structured particle filtering model of 32 age groups with the original Hethcote 237 contact matrix (denoted as P F age 32 Hethcote ), and the age-structured particle filtering 238 model of 32 age groups with the re-balanced contact matrix (denoted as 239 P F age 32 rebalanced ). In all of the four particle filtering models, the total number of 240 particles in the particle filtering algorithm is 3000; for clarity in exposition, we sampled 241 the same number when generating the plots of the 2D histogram and for calculating the 242 discrepancy. To compare the accuracy of a particle filtered model against that of a Each of the five particle filtering models was run 5 times (the random seed generated from the same set). Shown here are the average and 95% confidence intervals (in parentheses) of the mean discrepancy for each model variant.
By comparing the discrepancy of these models, we sought to identify the model 252 offering the greatest predictive validity. We then use the most favorable model to 253 perform prediction and intervention analysis. To assess model results, each of the four 254 particle filtering models was run 5 times with random seeds generated from the same set. 255 We then calculate the average and 95% confidence intervals of the mean discrepancy. observations are stratified by age, but no age stratification is implemented in these two 263 models. Table 1 indicates that the particle filtering models are significantly more 264 Fig 2. Boxplot of monthly and yearly discrepancy of all models at monthly observation points, considering empirical data across all observation points. "Calibrate" indicates the calibration model with aggregate population structure; "PF a1" indicates the particle filtering model with aggregate population structure; "PF a2" indicates the particle filtering model with 2 age groups; "PF a32H" indicates the particle filtering model with 32 age groups and the contact matrix introduced in [15]; "PF a32R" indicates the particle filtering model with 32 age groups and the re-balanced contact matrix. " M" indicates the discrepancy of the models compares with the monthly empirical data -the pertussis reported cases among all population; " Y" indicates the sum of discrepancy (of each age group) of the models compares with the yearly empirical data -the pertussis reported cases divided in age groups -and adjust the unit to Month by dividing 12. It is also notable that the dot in the boxplot indicates the mean value, while the horizontal line indicates the median value.

256
accurate compared to the calibration model -the average discrepancies of the particle 265 filtering models are significantly lower than the average discrepancy of the calibrated 266 deterministic model. Moreover, although the monthly average discrepancies among the 267 four particle filtering models with different population structure and contact matrix 268 structure are quite close, the particle filtering models P F age 2 and P F age 32 rebalanced 269 exhibit smaller average discrepancies. With respect to the yearly average discrepancies, 270 table 1 shows that the age-structured model with two age groups offers better predictive 271 performance than the model with 32 age groups; as noted, the aggregate model lacks 272 the age stratification required to calculate yearly discrepancy. It is notable that the 273 total number of the yearly empirical datasets against which the calibration is assessed is 274 different between the age-structured models with 2 age groups (which is compared with 275 2 yearly empirical datasets) and that with with 32 age groups (which is compared with 276 6 empirical yearly datasets). The yearly average discrepancies listed in table 1 are the 277 sum of the average discrepancy with across each empirical dataset. Thus, this difference 278 may contribute to the result that the yearly average discrepancies of the model with 32 279 age groups are greater than the model with 2 age groups. However, we also employ the 280 particle filtering model with 2 age groups as the minimum average discrepancy model to 281 explore the performance of the outbreak prediction of pertussis below.  This boxplot also indicates that when considered over time, the discrepancies of all the 291 particle faltering models tend to be smaller than for the calibrated model, although 292 there are similar median discrepancy values. More notable yet is the fact that the 293 discrepancies associated with the calibrated model are significantly more variable than 294 those for the particle filtered models. This suggests that particle filtering improves the 295 consistency of the model's match against empirical data, when compared to a 296 traditional deterministic model with calibrated parameters. Finally, it bears note that, 297 the datasets of the discrepancy of the model P F age 2 has a particularly narrow 298 distribution, especially for the dataset of the yearly discrepancy.     spread in a wider range. This difference in dispersion indicates that the weight update 332 process of particle filtering algorithm in this paper has the capability to combine the 333 empirical data to the particle filtering model to constrain the particles in a tighter range 334 as suggested by the empirical data.  Fig 4, Fig 6, Fig 7 (for posterior distribution) suggest that 365 all four pertussis particle filtering models are capable of tracking and estimating the 366 pertussis outbreaks, in the interest of brevity of exposition, we employ the minimum 367 discrepancy model -the age-structured particle filtering pertussis model of 2 age groups 368 -to perform the prediction and intervention analysis below.

369
The pertussis particle filtering models described here are particularly notable by 370 virtue of their capacity to perform ongoing estimation of entire model state -including 371 the latent and observable state of the dynamic models during the period of 372 incorporating the empirical observations. In the investigation of pertussis considered 373 here, the empirical data -monthly population-wide pertussis reported cases and yearly 374 reported pertussis cases of different age groups -are only related to an aggregation of 375 the latent states of weakly infectious (I w ), medium infectious (I m ), and fully infectious 376 (I). However, by applying particle filtering to the compartmental models of pertussis 377 and the empirical data, the pertussis particle filtering models are capable of estimating 378 the entire model state across the whole model time horizon during which empirical  Figure 8 indicates that most of 383 the fully infectious individuals are estimated to be located amongst the children (less 384 than 5 years of age) age group, while most of the weak infectious and medium infectious 385 individuals are located in the "adult" (equal and greater than 5 years) age group. Most 386 of the recovered individuals are also located in the adult age group. Such estimates 387 accord with empirical observations with respect to pertussis transmission and build 388 confidence in the capacity of the model to meaningfully estimate the latent state. As  5 and up). The latent states in plots both (a) and (b) are "S", "I", "I m ", "R 1 ", "R 2 ", "R 3 ", "I w ", "R 4 " in order.

395
To assess the predictive capacity of the pertussis particle filtering models in anticipating 396 outbreaks, we performed out-of-sample prediction experiments. Informally, each such 397 experiments examines the capacity of the model to project results into the future, having 398 considered data only to some "current" time. That is, the model is particle filtered so as 399 to incorporate data only to up to but not including a "Prediction Start Time" (T * ), and 400 then begins projecting (predicting) forward, starting at T * . More specifically, in this 401 process, the weights of particles will cease updating in response to observations at time 402 T * , following that point, all of the particles run without new empirical data being archetypal types of prediction challenges. It is notable that the minimum discrepancy 407 model -the age structured model with 2 age groups where the child age group is up to 408 5 year and incorporating both the monthly and yearly empirical datasets, which is 409 identified in the previous section -is employed to perform all of these experiments.

410
(1) Prediction started from the first or second time points of an outbreak.

412
(3) Prediction started from the peak of an outbreak.

413
(4) Prediction started from the end of an outbreak.    plots of Figs 9-12, the empirical data having been considered in the particle filtering 417 process (i.e., incorporated in training the models) are shown in red, while the empirical 418 data not having been considered in the particle filtering process (and only displayed to 419 compare with model results) are shown in black. The vertical straight line labels the 420 "Prediction Start Time" (T * ) of each experiment.

421
These prediction results suggest that the pertussis particle filter model offers the 422 capacity to probabilistically anticipate pertussis dynamics with a fair degree of accuracy 423 over a year or so. From the 2D histogram plots, empirical data lying after the 424 prediction start time -and thus not considered by the particle filtering machinery -425 mostly lie within the high-density range of the particles. While not formally evaluated 426 here, it is notabe that in such examples the particle filter model appears to be able to 427 accurately anticipate a high likelihood of a coming outbreak and non-outbreak. Such an 428 ability could offer substantial value for informing the public health agencies with 429 accurate predictions of the anticipated evolution of pertussis over the coming months. Based on the previous discussion, the capacity to accurately estimate (sample from) the 432 latent state of a model allows particle filtering used with pertussis models examined 433 here to be capable of estimating the entire latent state and projecting outbreak 434 occurrence and progression. The capacity for such state estimation also support particle 435 filtering models in simulating intervention strategies.

436
In this section, we have implemented several experiments to simulate stylized public 437 health intervention policies, based on the minimum discrepancy particle filtering 438 pertussis model identified above. The stylized intervention strategies are characterized 439 in an abstract way, and are typically performed before or at the very beginning of an 440 outbreak. For simplicity, we examine them as a historical counterfactual -taking place 441 at a certain historic context. Moreover, to support easy comparison with the baseline 442 prediction results of the minimum discrepancy model absent any interventions, all of the 443 intervention strategies are simulated starting at the start month of an outbreak (month 444 269) in this project. Moreover, we assume here that month 269 is the "current time" in 445 the scenario -that we wish to examine the effects of that intervention using all data 446 and only data available up to but not including month 269, and simulate the results of 447 the intervention forward from that point. The baseline prediction result of the minimum 448 discrepancy model without any interventions is shown in Fig 9 (b). We examine below 449 the impact of two stylized intervention policies -quarantine and vaccination.   14. 2D histogram of simulating quarantine during a pertussis outbreak. This is realized by decreasing the contact rate by 50%. Fig 15. 2D histogram of simulating an immunization campaign during an outbreak. This is realized by characterizing a stylized vaccine-induced protection level among 20% of the population .   Fig 16. 2D histogram of simulating an immunization campaign during an outbreak. This is realized by characterizing a stylized vaccine-induced protection level among 50% of the population.  [24] whose effects are characterized as decreasing the contact rate parameter 452 to be 20% and 50% less than its pre-intervention value, respectively. Similarly to the 2D 453 histogram plot of the prediction result without any intervention shown in Fig 9 (b), the 454 red dots represent the empirical data incorporated in the particle filtering model (here, 455 up to just prior to the point of intervention), while the black dots represent the 456 empirical data not incorporated in the model, but presented for comparison purposes. It 457 is notable that because the interventions being characterized are counterfactual in 458 character -i.e., did not in fact take place historically -the empirical data shown in black 459 reflect the situation of absent any intervention (that is, in a baseline context that lacked 460 an intervention of the sort simulated here). By comparing the quarantine intervention 461 results (see Fig 13 and Fig 14) with the model result without intervention shown in 462 Fig 9 (b) and the empirical data during the intervention period (that laying after the 463 triggering of the intervention, and not incorporated in the particle filtering model), we 464 can see that, although the interventions are implemented in a stylized fashion, by virtue 465 of its ability to estimate the underlying epidemiological state at the point of 466 intervention, the pertussis particle filtering model is capable of using the estimated state 467 to serve as a tool for probabilistically evaluating pertussis related intervention policies. 468 To simulate an immunization intervention during a pertussis outbreak, a vaccination 469 parameter is incorporated to represent the fraction of the population moved with respect 470 to their immunity status. Specifically, recall that the pertussis model characterizes a   The models examined demonstrated a notable degree of accuracy in predicting pertussis 486 dynamics over multi-month timeframes -the 2D histogram plots comparing the 487 empirical data and samples from the posterior distributions of the particle filtering 488 models' output data (the monthly and yearly reported cases of pertussis) indicates that 489 the high probability density region of the model's prediction of empirical data 490 encompasses or lies near those data. Moreover, the discrepancy of the pertussis particle 491 filtering model's predictions vs. observed data is reduced by approximately 60% when 492 compared with a traditional calibration model. This reduction indicates that the 493 particle filtering algorithm is capable of enhancing model predictions when compared 494 with predictions resulting from the traditional calibration process. Additionally, it bears 495 emphasis that the calibrated deterministic model encounters marked difficulties in 496 tracking the fluctuation of the outbreak pattern of measles, while the particle filtering 497 model is capable to tracking the oscillation of the outbreak of pertussis.

498
Moreover, it is of great significance that the pertussis particle filtering models could 499 support effective estimation of the entire state of the pertussis transmission models 500 during the time incorporating the empirical datasets -including latent states of strong 501 interest, including states associated with waning of natural immunity and differing 502 levels of infection severity. Combined with the capability of performing outbreak 503 projections, such particle filtering models can serve as powerful tools for understanding 504 the current epidemiology of pertussis in the population, for projecting forward evolution 505 of pertussis spread -including occurrence of outbreaks.

506
Beyond that, in a further contribution, this research further marks the first instance 507 of research demonstrating the capacity to perform public health intervention 508 experiments using particle filtering models.

509
Despite the strengths of these contributions, there remain a number of limitations of 510 this work, and priorities for future research. We briefly comment on several.

511
Within this work, four particle filtering models have been researched, including an 512 aggregate population model, a two age group-stratified population model, and 32 age 513 group population models with both a contact matrix derived from Hethcote (1997) [15] 514 and (separately) using a re-balanced contact matrix. Although the results of all these 515 four particle filtering models could match the empirical data quite well, the minimum 516 discrepancy model emerged from the 2 age group age-stratified particle filtering model, 517 where the individuals in the child age group represent children in the first 5 years of life, 518 and incorporating both monthly and yearly empirical datasets. In this regard, it is 519 notable that according to the mathematical deduction of the age structured population 520 model introduced in [7] -adapted to pertussis in this research -with more age groups 521 considered in the age-structured model, the model can simulate the aging rate (c i ) more 522 accurately. However, in this paper, the discrepancies associated with both of the 32 age 523 group particle filtering models do not show evident decrease compared with the two age 524 group particle filtered models. We provide here some possible reasons are listed as 525 follows. Firstly, the stochastic processes considered in both the 32 age group 526 age-structured model and the two age-groups age-structured model are different, 527 especially in characterizing the stochastic evolution of the contact rate. Secondly, the 528 likelihood functions employed in this project -which are captured as the product of 529 negative binomial density functions across all empirical datasets and sharing a common 530 dispersion parameter -may be too naïve to capture the difference between the age 531 groups within the empirical datasets. Thirdly, as the number of age groups increase, the 532 dimensions of the state space of the particle filtering models increase dramatically. This 533 latter issue must be considered in light of the limitations of the particle filtering 534 algorithm, particularly the fact that the particle filtering method employing a 535 condensation algorithm may encounter problems in high-dimensional systems. In such 536 systems, the probability density functions would be more complex, which may require 537 high dispersion for representation using the likelihood functions employed, because it is 538 difficult to represent the details of the multivariate likelihood function using the product 539 of simple probability density functions. Research is needed into more effective 540 multivariate likelihood function design. The relationship between the the nominal state 541 space dimensionality and the number of particles required for effective particle filtering 542 also merits additional research, particularly in late of observed limitations in the benefit 543 of particle filtering for high dimensional systems [12]. Finally, when comparing the 544 discrepancy for distinct models, our lack of normalization for the count of datasets used 545 may lead to artificially stacking the comparison against the 32 age group model; while 546 the 32 age group does not exhibit markedly better discrepancy against the monthly 547 aggregate observations than does the 2 age group model, this consideration may suggest 548 that it is stronger than the yearly discrepancy numbers would suggest.

549
It also bears emphasizing the critical role of the stochastic process noise within the 550 state space models plays within successful particle filtering, and the challenges 551 associated with managing such noise. The stochastics associated with these factors 552 represents a composite of two factors. Firstly, there is expected to both be stochastic 553 variability in the measles infection processes and some evolution in the underlying 554 transmission dynamics in terms of changes in reporting rate, and changes in mixing.

555
Secondly, such stochastic variability allows characterization of uncertainty associated 556 with respect to model dynamics -reflecting the fact that both the observations and the 557 model dynamics share a high degree of fallability. Such stochastics impact the particle 558 filtering model in distinct ways during the estimation and prediction period, but results 559 in both periods are sensitive to the degree of stochastics involved. It was important that 560 the noise in the particle filtering models in this paper should be controlled in a proper 561 range, by tuning the parameters of diffusion coefficients in the stochastic processes 562 related to the Brownian motion. The need to characterize and tune stochastic noise 563 effectively can impose challenges in the speed with which particle filtering models can 564 be prepared for a new area of application.

565
The initial values of the age-structured population models in this paper are 566 estimated both manually and by the particle filtering algorithm. Specifically, the 567 population distribution among the different age groups are tuned manually, while the 568 population distribution among different stocks within a given age group is estimated by 569 the particle filtering algorithm by setting the initial values of stocks in a proper range 570 following a uniform distribution, but maintaining a total number of individuals for that 571 age group across the stocks. Especially in building the 32-age-groups particle filtering 572 models, much time and efforts was been used in estimating the population distribution 573 among the different age groups. 574 Particularly in light of the growing risk of pertussis outbreaks triggered by 575 combinations of vaccine hesitancy and waning dynamics from earlier generations of 576 MMR vaccine, a keen need for future work involves application the models presented 577 here to data in the vaccination era. While vaccination elements of the models discussed 578 here were only glancingly tapped by this research (in the context of demonstrating 579 capacity to reason about the effects of an immunization intervention), because of their 580 incorporation into the existing model structure, extension of this work to vaccine-era 581 dynamics should require only very limited changes to the models involved. 582 While application of particle filtering to pertussis dynamics is not without its 583 challenges, the approach examined here demonstrates great promise for creating models 584 that are automatically kept abreast of the latest evidence, for understanding the 585 underlying epidemiology of pertussis in the population -including the balance of the 586 population at varying levels of immunity -for projecting forward pertussis dynamics 587 and outbreak prediction over a year's time, and for evaluation of counter-factual 588 interventions. The results of this paper -which represents both the first application of 589 particle filtering to pertussis and the first to use particle filtering to assess the tradeoff 590 between public health applications -suggest that particle filtering may represent an 591 important element in the arsenal of public health tools to address the increasingly 592 difficult challenge of controlling pertussis in the context of vaccine hesitancy and waning 593 of both natural-and vaccine induced-immunity. 594 Supporting information 595 S1 Appendix. The introduction of mathematical models.