Stochastic Simulation Algorithm for effective spreading dynamics on Time-evolving Adaptive NetworX (SSATAN-X)

Modelling and simulating the dynamics of pathogen spreading has been proven crucial to inform public heath decisions, containment strategies, as well as cost-effectiveness calculations. Pathogen spreading is often modelled as a stochastic process that is driven by pathogen exposure on time-evolving contact networks. In adaptive networks, the spreading process depends not only on the dynamics of a contact network, but vice versa, infection dynamics may alter risk behaviour and thus feed back onto contact dynamics, leading to emergent complex dynamics. However, stochastic simulation of pathogen spreading processes on adaptive networks is currently computationally prohibitive. In this manuscript, we propose SSATAN-X, a new algorithm for the accurate stochastic simulation of pathogen spreading on adaptive networks. The key idea of SSATAN-X is to only capture the contact dynamics that are relevant to the spreading process. We show that SSATAN-X captures the contact dynamics and consequently the spreading dynamics accurately. The algorithm achieves up to 100 fold speed-up over the state-of-art stochastic simulation algorithm (SSA). The speed-up with SSATAN-X further increases when the contact dynamics are fast in relation to the spreading process, i.e. if contacts are short-lived and per-exposure infection risks are small, as applicable to most infectious diseases. We envision that SSATAN-X may extend the scope of analysis of pathogen spreading on adaptive networks. Moreover, it may serve to create benchmark data sets to validate novel numerical approaches for simulation, or for the data-driven analysis of the spreading dynamics on adaptive networks. A C++ implementation of the algorithm is available at https://github.com/nmalysheva/SSATAN-X. Author summary Modelling and simulating of infectious disease spreading supports public heath decisions, such as prevention and containment strategies and allows to perform cost-effectiveness calculations. Detailed modelling approaches consider stochastic pathogen spreading on time-evolving contact networks. In adaptive networks, the spreading process depends not only on the dynamics of a contact network, but vice versa, infection dynamics may alter risk behaviour and thus feed back onto contact dynamics. Stochastic simulation of these complex dynamics is currently computationally prohibitive. We propose a new algorithm (SSATAN-X) that can significantly speed up stochastic simulations on adaptive networks, while being accurate at the same time. Our algorithm achieves this speed-up by only considering the contact dynamics that are relevant to the spreading process. The benefit of algorithm is particularly pronounced when contacts are short-lived and per-exposure infection risks are small, which is applicable to most infectious diseases. We envision that SSATAN-X may allow simulation and analysis of pathogen spreading on more complex adaptive networks than previously possible. Moreover, data sets may be created with SSATAN-X that are useful for benchmarking novel numerical schemes and analytic approaches.

Modelling and simulation of contagion spreading to forecast disease prevalence and to 2 assess different public health interventions has a long history in mathematical 3 biology [1]: Traditional approaches involve compartmental models, such as 4 susceptible-infectious-recovered (SIR) or susceptible-infected-susceptible (SIS) [2]. 5 However, traditional compartmental models do not capture relevant spreading paths 6 and may even provide incorrect predictions [3]. Network models incorporate more 7 realism, explicitly considering interactions (=edges) between individuals and locations 8 (=nodes). Within this context, mostly static networks have been studied in the 9 past [4,5]. However, in static networks, the extent of linkage of nodes is determined only 10 after integrating information derived from observing the contact (or spreading) process 11 over a period of time. Several examples highlight that analysis of static networks lacks 12 important temporal information about causal paths that underlie the spreading process, 13 consequently yielding false conclusions for the control of the spreading process [6]. In 14 order to capture the temporal causality of the underlying system, different 15 time-evolving network models have been introduced recently [7,8]. A particularly 16 relevant class are adaptive networks. In this class of network models, the network 17 structure itself changes dynamically in response to the dynamics of the spreading 18 process [9,10]. Examples are the concurrency of sexual partnerships which is thought to 19 be important for HIV spread [11][12][13] or measures of self-isolation (quarantine) for 20 individuals diagnosed with SARS-CoV2. This creates a feedback loop between the 21 spreading dynamics on the network and the dynamics of the network itself, leading to 22 emergent complex behaviour. Epidemic spreading has been extensively studied on these 23 types of networks to understand the influence of social contact structure on disease 24 prevalence [14][15][16][17]. However, the dynamics of contagion spreading on adaptive networks 25 are usually complex, and analytical results can only be obtained in special cases [17,18]. 26 This makes numerical studies based on stochastic simulations indispensable. time and then perform parallel updates of the contact structure and the contagion 41 spreading (akin to a tau-leaping procedure in systems biology [32,33]). Time 42 discretization and synchronous updates are implemented in EpiModel [34] and 43 NepidemiX (http://nepidemix.irmacs.sfu.ca/). (ii) The method proposed by 44 Vestergaard [35] borrows ideas from the 'integral method' originally proposed and 45 applied in the Systems Biology field [36]. It assumes deterministic contact dynamics, e.g. 46 in Vestergaard [35], a temporal set of static contact configurations is used. If the 47 contact dynamics would in fact be deterministic for the given time intervals, the method 48 then estimates the exact time to the next spreading event. However, in Vestergaard's 49 original implementation the underlying contact network is not adaptive any longer; i.e., 50 the network configurations affect the spreading dynamics, but not vice versa. Therefore, 51 this approach is limited to particular scenarios, in which human behaviour is not 52 adapted as the pandemic unfolds, in contrast to what is currently observed for the 53 SARS-CoV2 outbreak. 54 In this article be develop an efficient and exact stochastic simulation method that 55 allows to predict contagion spreading in the case when both the contact dynamics, as 56 well as the spreading dynamics are governed by inter-dependent stochastic processes, i.e. 57 human behaviour is adapted as a function of the epidemic process. The algorithm is 58 written in C++, available through https://github.com/nmalysheva/SSATAN-X 59 Materials and methods 60 Model: spreading process on an adaptive contact network 61 In the following, we introduce the proposed simulation algorithm using an example of a 62 simple spreading process that evolves on a time-dependent, adaptive contact network.

71
The contacts change over time. To describe the contact dynamics of the model, each 72 individual is assigned rates of gaining and loosing contacts over time. 73 Moreover, in case the individual is diagnosed with the infection, he/she may become 74 aware of his/her status, which affects the individual's behaviour. In case of our 75 modelling exercise, the individual cuts all of his/her contacts, and the individual rate of 76 establishing new contact drops to 30% of its pre-diagnosis value. Thus, this simple 77 model describes adaptive dynamics, where the contact dynamics change depending on 78 the epidemic state, Fig. 1C.

79
The presented model can be described as an undirected weighted temporal Contact 80  Each node v i ∈ V has the following individual characteristics: (i) a rate at which 84 a new contact is made λ i and (ii) a rate of loosing an existing contact θ i .

85
• The set of edges is denoted and v k are in contact. Parameter γ represents the transmission rate from j to k, (v j , v k ), j = k which is not connected by an edge, the rate of assembling an 95 edge is defined by λ (j,k) = λ j · λ k i.e. product of the assembling rates of the 96 two nodes.

97
• Disassembling of an existing contact. For each pair of nodes (v j , v k ), j = k 98 that are connected by an edge, the rate of disassembling is defined as 99 θ (j,k) = θ j · θ k i.e. the product of the disassembling rates of the two nodes. In many applications, we are only interested in the epidemic dynamics, e.g. to quantify 113 the effect of particular epidemic control measures, such as contact restrictions or other 114 behavioral changes, on the infection incidence. However, the transmission dynamics 115 directly depend on the dynamics of the contact process. Moreover, an important 116 characteristic of any adaptive network is that not only the contact dynamics affect the 117 spreading process, but also vice versa (Fig. 1C). I.e. in the example above, the rate λ i 118 and number of contacts of the individual depends on the actual state of node i. In our 119 example, we sample λ i (t) ∼ U(0.5, 2.5) (1/day) and set λ i (t) = λ i (t) · 0.3 in case of a 120 diagnosis.

121
Note that any time-dependent function is in principle possible (see Discussion).

122
The presented model can now be evolved using the stochastic simulation algorithm 123 (SSA), Fig. 1D. The SSA would sample the time to the next event, which could either 124 be an event that relates to the contact dynamics, or an event that relates to the 125 epidemic dynamics as outlined in Algorithm 1. Algorithm 1: SSA initialize : Network G(t), reaction rate functions r ∈ R, a ∈ A, final time T F , t = 0 1 while t < T F do 2 Compute reaction propensities r (·,·) and a (·) based on the current contact network G(t) 3 Compute the sum of the reactions propensities Choose the next reaction, e.g. a contact dynamics reaction with probability: or an epidemic dynamics with probability:  In the following, we will present an algorithm that is able to precisely sample the 132 time to the next infection event, while performing bulk updates on the contact dynamics. 133 Notably, the bulk updates preserve all statistical properties of the contact dynamics that 134 are relevant to the spreading process (epidemic dynamics). Hence, the algorithm allows 135 to study the relevant dynamics at full resolution, while reducing computational burden. 136 Simulation of effective spreading dynamics with SSATAN-X

137
In the proposed algorithm, we are splitting the above described stochastic process into a 138 process Y comprising all reactions R belonging to the contact dynamics and a process 139 Z that is driven by reactions A (epidemic dynamics). We then leap-forward the 140 background process (Fig. 1E) , until a reaction of the epidemic process happens (Alg. 2). 141 Algorithm 2 implements the core idea of SSATAN-X and is an adaptation of the 142 EXTRANDE method proposed by Voliotis in the context of Systems Biology [37]. bound will result in many "thinning" reactions, which can make the algorithm 153 inefficient. In S 1 Appendix, we provide a method to calculate B T L . With respect to 154 the upper bound B T L we can propose the time to the next reaction a ∈ A, ∆t (line 4). 155 Notably, since B T L ≥ a 0 for network G(s) ; s ∈ [t ; T L ], the proposed time to the next 156 reaction might be shorter than the actual time to the next reaction event. This 157 apparent under-estimation is "corrected" by the "thinning" step (lines [16][17], such 158 that the statistics of the process are being preserved (for details refer to [37]). When the 159 proposed time step ∆t lies within the look-ahead horizon T L , time t is updated by ∆t 160 (line 9). At this point, the actual propensity functions a need to be updated. In order 161 to do so, the contact dynamics (edges) of the network G(t) are bulk-updated using 162 Algorithm 3 (tau-leaping)(line 8) and the next reaction to update (or "thin" ) the state 163 of the nodes (main process) is chosen (lines 12-18). Here, the adaptivity step (line The bulk-updating algorithm is based on the τ -leaping algorithm, which was originally 170 proposed by Cao and Gillespie [32] and was later modified by Anderson [38] by 171 incorporating post-leap checks to guarantee accuracy. 172 We use the bulk-update algorithm only for reactions R of the background process S 173 (contact dynamics). A key adaptation is to consider all reaction of one type ξ ∈ {+, −} 174 (addition or deletion of edges) simultaneously. In the network example, only two types 175 of reactions change the contact structure of the network: In a finite population, the amount of edges in the network is limited by the number 180 of edges in a full undirected graph N (N − 1)/2. Therefore, the number of edges 181 available for deletion during one iteration is limited by the number of edges existing in 182 the network |E(t)| at time t, such that the number of edges available for addition during 183 one iteration is limited by N (N − 1)/2 − |E(t)|. 184 We denote a network G = {E, V }, as well as its complement network G = {E , V }. 185 To accurately capture the contact dynamics, we store both these networks. Both of the 186 networks co-evolve synchronously (i.e. a deletion of an edge in the contact graph is 187 accompanied by the addition of the same edge in the complement graph). This allows 188 to apply less complex search algorithms to correctly simulate the contact dynamics.

189
Adding and edge to G will be complemented by deleting this edge from G and vice   Based on the initial state of the system, the time step of the leap τ is calculated 197 (line 2) based on the following formulas: where |E|, |E | denote the number of edges in the contact network and its complement 199 graph. The expected net change to the network graph, as well as the complement graph 200 is given by: where r +/− 0 denote the sum of reaction propensities for addition and deletion 202 respectively at the current network state G(t). The variance of the net change is given 203 by because both addition and deletion reactions are first-order reactions in the contact 205 network and its complement graph.

206
Algorithm 3: τ -leaping for bulk-updating the contact network initialize : Network G(t), contact dynamics reaction functions r ∈ R, final time T F , t = 0, 0 < p < p * < 1 and 0 < q < 1. q * > 1 1 Calculate propensities r ξ (·,·) and for each reaction type ξ = {+, −} based on the current network state G(t). Compute r ξ 0 = r ξ (j,k) , as well as if leap conditions eq. (4) holds then 21 for ξ do 22 Delete all rows from S ξ less than or equal to row ξ If the chosen τ is less than value of 10 r 0 , we abandon the leaping algorithm and 208 execute a fixed amount of steps using the SSA algorithm (usually taken to be 100 [33] Find first y such as: remove edge e (j,k) corresponding to index y from the network and remove the corresponding element from cs − 8 add new element r + (j,k) + cs + cs + .length to the end of cs + 9 else if addition then 10 r + 0 = cs + cs + .length

11
Find first y such as: add edge e (j,k) corresponding to index y to the network and remove the element from cs + 13 add new element r − (j,k) + cs − cs − .length to the end of cs − 14 end 218 Implementation and availability 219 We implemented the algorithm in C++ using an object-oriented approach. The contact 220 network was implemented with the LEMON Graph library In order to analyse the accuracy and performance of SSATAN-X, we first evaluated 226 simulations with pure contact dynamics.

227
Whereas the exact stochastic simulation algorithm (Alg. 1) conducts changes to the 228 contact dynamics network one-by-one, as illustrated in Fig. 2 (left), SSATAN-X uses 229 bulk updates of the contact network, as described in Alg. 3 and shown in Fig. 2 (right). 230 We remember that the main principle of SSATAN-X is that it accurately captures the 231 statistics of the contact dynamics that is relevant to modelling the epidemic dynamics. 232 Therefore, the statistics of the contact dynamics only need to be accurate at the time 233 point when a bulk update was performed, but not in between bulk updates (Fig. 2,   234 right).  To further quantify the differences between the evolving contact network over time, 252 we computed the statistics of the Kolmogorov-Smirnov test, as shown in Fig. 4A points, further highlighting that differences in the degree distribution are insignificant. 258 We hence conclude that the bulk-updating (Alg. 3) of the contact network dynamics, 259 as implemented in SSATAN-X, accurately captures the statistics of the time-evolving 260 contact network at the discrete time steps.

261
Next, we evaluate whether the SSATAN-X also accurately captures epidemic 262 dynamics that unfold on an adaptive contact network.  In order to test the accuracy of the proposed algorithm, we conducted 10 3 278 simulations using the SSA (Alg. 1), as well as SSATAN-X (Alg. [2][3][4]. Foremost, we were 279 interested to see whether the statistics of the times to the next epidemic event were 280 correct, as proposed in Fig. 1D-E. In Fig. 5, we show a histogram of the time-steps to 281 the next epidemic event (infection, diagnosis, death) for the SSA (orange bars), as well 282 as SSATAN-X (blue bars). As can be seen, both algorithms yield indistinguishable 283 statistics regarding the epidemic inter-event times. This reaffirms the correctness of the 284 proposed algorithm in being able to modelling the net effect of contact dynamics on the 285 spreading process. Fig. 6 shows the corresponding simulation results for the epidemic  simulation results presented in Fig. 8A show that the run times increase with increasing 335 population size. However, for the utilized parameter setting, the runtime of SSATAN-X 336 is about 100-fold lower than the run time of SSA.

337
Next, we wanted to assess the runtime performance as a function of the utilized 338 parameters. To that end, we chose homogeneous contact dynamics rates λ i , θ i across 339 the population (all individuals having the same rates). This allows to assess how the 340 run time changes as a function of the probability that transmission occurs on a contact 341 edge before it dissolves. This probability is given by γ γ + θ 2 , where γ is the transmission 342 rate on a contact edge and θ 2 is the rate of loosing an existing edge. In the analysis, we 343 increased λ and θ.

344
The run time of both SSA and SSATAN-X for performing 10 3 simulations for each 345 parameter set is shown in Fig. 8B. As can be seen, when the relation γ γ + θ 2 is close to 346 1, i.e. transmission is always bound to happen before an edge dissolves, SSA may be 347 slightly faster than SSATAN-X. The reason is that the time-interval between two 348 epidemic events becomes so small that the bulk update of the contact dynamics reduces 349 to the SSA. Noteworthy, such models would actually not require to model separate 350 contact dynamics altogether, making this example a bit artificial.

351
When the relation γ γ + θ 2 decreases, i.e. the contact dynamics being much faster 352 than the epidemic dynamics, the run time of SSATAN-X becomes superior over SSA.  In summary, SSATAN-X is computationally more efficient than SSA. The run time 358 of both algorithms scales with the population size and the "boost-factor" of SSATAN-X 359 is preserved across different population sizes. However, the computational "boost-factor' 360 of SSATAN-X increases, when the contact dynamics are fast in comparison to the 361 epidemic dynamics.

363
Spreading of infectious disease can occur when individuals encounter in contacts that 364 are relevant for the route of transmission of a pathogen. These contacts can be sexual 365 contacts for sexually transmitted disease, or spatial proximity for airborne infections. In 366 order to more realistically model spreading of infectious diseases, different network 367 models have been proposed in the past [10]. Adaptive networks encompass a class of 368 models, where a contact network structure changes dynamically in response to the 369 dynamics of the spreading process [9]. These types of models may therefore allow to 370 realistically consider behavioural changes that occur when individuals become aware of 371 their infection and self-isolate [39], or experience debilitating symptoms.

372
In this article, we introduce SSATAN-X, an efficient and exact algorithm to simulate 373 effective spreading dynamics on adaptive networks.

374
SSATAN-X takes advantage of the fact that contact dynamics must be faster than 375 epidemic dynamics. Instead of performing detailed simulations of the underlying contact 376 dynamics, the core idea of SSATAN-X is to only consider the effect that the underlying 377 contact dynamics have on the epidemic dynamics. For this purpose, SSATAN-X uses 378 bulk updates of the contact structure between epidemic events, such as disease 379 transmission, diagnosis or death (Fig. 2). 380 In Figure 3, we show that the proposed bulk-updating conserves the statistical 381 properties of the underlying contact network at times that are relevant to the epidemic 382 dynamics. Because the statistics of the underlying contact network at times that are 383 relevant to the epidemic dynamics are preserved, the statistics related to the firing of 384 the next epidemic event (Fig. 5) are preserved, too. Consequently, the resulting 385 epidemic dynamics are statistically correct, Fig. 6-7. However, the bulk updating avoids 386 computational overheads from updating contact edges one-by-one. Therefore, depending 387 on the choice of parameters, SSATAN-X can speed up computation time quite with SSATAN-X will be considerable. On the other hand, if transmission will always 393 occur on a contact edge before it dissolves, the computational speed up will only be 394 moderate or even absent. In Fig. 8B, we expressed this dependency as γ/(γ + θ 2 ), 395 where θ 2 denotes the rate of loosing an edge and γ is the transmission rate along a 396 contact edge.

397
Note that the transmission rate γ denotes a lumped parameter, that can be further 398 decomposed to allow for more sophisticated modelling, realistic parameterization and 399 model extensions: For example, if exposures on a contact edge occur with rate r e and if 400 the exposures are of the same type, then the transmission rate is just a scaled variant of 401 the exposure rate, i.e. γ = r e · p tr|e , where p tr|e is the probability that a transmission 402 occurs for a single exposure event e. Notably, for many infectious diseases and types of 403 exposures p tr|e is readily known [25][26][27] and the effects of pharmaceutic [30,31] or  Taken together, the lumped transmission rate γ is a function of the transmission 408 probability per exposure event p tr|e , as well as the frequency of exposures r e when 409 individuals are in contact. For many infectious diseases and types of contacts it may be 410 argued that γ/(γ + θ 2 ) << 1; I.e., most transient contacts may not result in pathogen 411 transmission. Behavioral changes, such as distancing, isolation or quarantine, may 412 further reduce the transmission rate by reducing r e , whereas pharmaceutic (vaccination, 413 prophylaxis, treatment) [42][43][44] or non-pharmaceutic interventions (e.g. mask wearing 414 for airborne-, and condom usage for sexually transmitted infections) [45,46] may lower 415 p tr|e in a static or time-dependent manner.

416
In our examples, we considered reaction rates (for R and A) to be constants.

417
However, it is also possible to consider time-dependent rates: For example, the 418 transmission rate γ may be a function of time, due to some time-dependent 419 pharmacokinetic effect of a prophylaxis [28,29,47], or depending on how long an 420 individual is being infected or contagious [40]. These time-dependent effects may be 421 incorporated when determining the upper bound B in Algorithm 2 (S 1 Appendix).

422
While this modelling may further reduce the value of B it may even boost the overall 423 computational performance by increasing the time to the next epidemiological event.

424
In our paper, we use the Anderson tau-leap algorithm (Alg. 3) [36] to compute the 425 bulk updates of the contact dynamics between two epidemiological events. It is also 426 possible to use other tau-leap based approaches, as summarized in [48], however, we 427 used the Anderson tau-leap algorithm, because it does guarantee accuracy through step 428 size adaptation.

429
While τ −leaping may be considered as a stochastic simulation algorithm that has 430 some analogies to explicit (first order) Euler methods for solving systems of ordinary 431 differential equations (ODE) [49], another possibility for the bulk updating of the 432 contact dynamics is to adapt higher-order schemes with adaptive step sizes that have 433 been proposed in the context of ODEs [50]. For example, the Runge-Kutta-Fehlberg 434 method [51] could be used to adapt step sizes for the bulk updating. A clear advantage 435 of higher-order schemes in the context of bulk-updating the contact network is the fact 436 that larger step sized may be chosen that preserve accuracy [52]. In the future, we will 437 further improve SSATAN-X by incorporating these higher-order approximation schemes. 438 In our paper we consider finite populations. In finite populations, the maximum graph' approach, making the algorithm more efficient.

448
Some authors have also proposed more restrictive models for the number of 449 contacts [11]. Some of these concepts may be inspired by the popular Dunbar 450 number [53,54], which is rooted in the presumption that individuals may not maintain 451 stable social relationships with more than 150 other individuals. Notably, Dunbar's 452 presumption has not been without critique [55].   Methods for simulating spreading processes on networks. We consider two processes Y |Z : E → E and Z|Y : V → V . The first process Y |Z affects the edges E of the network (topology, contact dynamics), while the second process Z|Y affects the state of the nodes V (epidemic dynamics). A-C. Different approaches for modelling spreading processes on networks. A. In static network approaches, the network topology E, which determines the linkage and thus the spreading process Z, remains fixed. B. In temporal networks, the network topology evolves according to some contact dynamics Y : E → E, which is independent of the epidemic process Z|Y . C. Adaptive networks denote a class of co-evolving stochastic processes, where both the contact structure affects the spreading process Z|Y , and the spreading process alters the contact dynamics Y |Z. D. Exact stochastic simulation [19] of spreading processes on adaptive networks. In the stochastic simulation algorithm, every change to the network is explicitly modelled, including changes in the edges/topology, as well as changes to the state of the nodes (empty circles vs. filled red circles). E. Proposed method: The proposed method only considers the net effect of topological changes on the epidemic dynamics. An upper bound B L is computed such that B L ≥ j∈Z a j (V, E), and the network topology is bulk updated for the time step ∆(B L ) ∼ exp (B L ) using τ -leaping. A change to the state of the nodes (epidemic process) is conducted if r ≤ a 0,Z /B L ; r ∼ U(0, 1) and a 0,Z = j∈Z a j (V, E).    The relation γ γ + θ 2 , with γ being the transmission rate and θ being the rate of edge disassembling, can be interpreted as the probability of transmitting an infection before the contact is dissolving.