## Abstract

A fundamental question in ecology is the generation and maintenance of biodiversity. Classical approaches consider multi-species communities with Lotka-Volterra ODE models where inter- vs. intra-species interactions are key. Typically in high-dimensional systems, analysis is hard and model reduction is needed. Here, we describe and study a new system of multi-type interactions that arise in co-colonization, and develop a useful model reduction framework based on slow-fast dynamics under quasi-neutrality. We show that in a multi-type, Susceptible-Infected-Susceptible (SIS) system with co-colonization, neutral coexistence dynamics between *N* closely related strains occurs on a fast timescale, based on mean trait values, whereas the non-neutral selective forces act on a slow timescale, driven by the variance and asymmetries of the co-colonization interaction matrix. The explicit *N* equations for relative type frequencies on the slow manifold enable efficient computation of complex multi-strain dynamics, and provide analytical insight into high-dimensional community ecology, with potential applications to other multi-body systems.

## Introduction

A fundamental question in ecology and evolutionary biology relates to how biodiversity is generated and maintained [1, 2, 3, 4]. Many factors have been shown to be important, including inter- vs. intra-species interactions [5], population size [6], number and functional links with resources [7], and movement in space [8]. Most research in this field has considered multi-species interactions with classical Lotka-Volterra models or with evolutionary game theoretic approaches.

In this Letter, we describe and study a new system of multi-type interactions that arise in the epidemiological dynamics of co-colonizing strains of the same microbial species. An example of such a system could be polymorphic *Streptococcus pneumonia* bacteria. We generalize a previously-introduced Susceptible-Infected-Susceptible (SIS) framework for 2 circulating strains with co-colonization [16, 9], to the context of *N* strains. We show that an explicit reduced system emerges from a time scale separation, expressing the total dynamics as a fast plus a slow component, related to broken symmetries in co-colonization interactions.

In our system, microbial strains can infect a host simultaneously, and here we concentrate only on the case of up to 2 strains co-colonizing a host, whereby single colonization by a resident strain alters the susceptibility to incoming strains, (increasing or decreasing it) by a factor *K*_{ij}, relative to uninfected hosts, without acquired immunity. The transmission and clearance rates of all strains are equal, except for the interactions in co-colonization given by an *N × N* matrix. In earlier theoretical studies such interaction has been studied in a slightly different context in models considering arbitrary infection multiplicity [10]. Later evolutionary frameworks have also considered vulnerability to co-infection [11]. In the experimental literature, for example in microbiota studies, the phenomenon of colonization resistance has been reported [12]. In pneumococcal bacteria, more relevant to our model, carriage of one serotype has been shown to alter the acquisition rate of a second serotype [13]. Some studies have shown how this trait at the host-pathogen interface impacts disease persistence [14], coexistence and vaccination effects [15, 16], and contributes to diversity in other traits, e.g. virulence [17] and antibiotic resistance [18].

Yet, robust mathematical frameworks to analyze the full spectrum of eco-epidemiological patterns emerging exclusively from co-colonization interactions among a high number of strains remain undeveloped. Here, we fill this gap. We model an arbitrary number of closely-related strains, and show that in the *N*-strain system with altered susceptibilities in co-colonization, neutral coexistence dynamics between types occurs on a fast timescale, whereas the non-neutral selective forces act on a slow timescale, driven explicitly by trait variation in the co-colonization interaction space. These dynamics can be very complex, and are nonlinearly modulated by global parameters such as overall transmission intensity *R*_{0}, and mean interaction coefficient between strains. We derive a closed analytic solution for relative type frequencies over long time-scales in a changing fitness landscape.

### N-strain SIS model with co-colonization

We consider a multi-type infectious agent, transmitted via direct contact, following susceptible-infected-susceptible (SIS) epidemiological dynamics, with the possibility of co-colonization. The number of potentially co-circulating strains is *N*. With a set of ordinary differential equations, we describe the proportion of hosts in several compartments: susceptibles, *S*, hosts colonized by one type *I*_{i}, and co-colonized hosts *I*_{ij}, with two types of each combination, independent of the order of their acquisition. We have:
where gives the force of infection of strain *i*. Remark that here we assume that hosts colonized with a mixture of two subtypes *i* and *j* transmit either with equal probability. For more information and interpretation of parameters see Table 1 (Supplementary materials 1). Notice that *S* = 1 − Σ(*I*_{i} + *I*_{ij}), thus the dimension of the system is indeed *N* + *N*^{2}. In practice, *I*_{ij} and *I*_{ji} cannot be distinguished so the dimension is effectively *N* + *N*(*N* − 1)/2. For the special case of *N* = 2, this model has been described and analyzed elsewhere [16, 9]. We study pathogen diversity only in relation to how strains interact with each other upon co-colonization (*K*_{ij}), assuming equivalence in transmission *β* and clearance rate *γ* (although extension to further asymmetries in these traits is ongoing). The coefficients *K*_{ij} can be above or below 1, indicating facilitation or competition between any two types in co-colonization. In the above notation *m* = γ + *r*, encapsulating both clearance rate *γ* of colonization episodes and recruitment rate of susceptible hosts *r*, equal to the mortality rate from all compartments (*r* = *d*). For a model summary diagram see Fig.1a-b

## Results

### Re-writing the system in aggregated form

In order to derive a reduced model for such a general *N* strain system, displaying complicated dynamics (Fig. 1b), we use the following notation:
for the prevalence of strain *i* in the population, and the total prevalence of all strains, respectively. Total prevalence satisfies *T* = Σ_{i}*J*_{i} and the forces of infection are: *F*_{i} = *βJ*_{i}. With these notations, the system (1) reads:

This system of 2 + *N* + *N*^{2} equations admit a triangular structure: i) First we describe block of 2 equations (*S*, *T*). ii) Next, we study the block of 2*N* equations (*I*_{i}, *J*_{i}), which is the most complicated. iii) Lastly, we deal with the block of the *N*^{2} equations of *I*_{ij}, which is simple once the dynamics of *I*_{i} and *J*_{i} are known. In the following we describe briefly these three blocks.

#### The 1st part (*S*, *T*)

Clearly, if the basic reproduction number , then . Hence, without loss of generality, we consider that .

#### The 2nd block of equations (*I*_{i}, *J*_{i})

If we reduce the system to the invariant manifold , we obtain the following sub-system of 2*N* differential equations:
together with: . The dynamics of (4) are complex and we use a slow-fast approach to analyze it.

#### The co-colonization block *I*_{ij}

This is the simple system

Once it is known that , then (5) implies . Thus, once the dynamics (4)-(5) are explicit, so are the dynamics of (1).

#### Slow-fast representation of (4)

Since we model *N* closely-related strains, we can write each co-colonization coefficient as: *K*_{ij} = *k*+*εα*_{ij}, where 0 ≤ *ε* << 1. Replacing these in (4), and re-arranging, we obtain:
where 1 ≤ *i* ≤ *N*, and *I* = Σ*I*_{j}.

#### Neutral system

If *ε* = 0, then from above, we obtain the *Neutral model*

The first equation gives . Co-colonization prevalence is simply derived as: **D** = **T** − **I**. Fixing *I* = **I**, yields the (degenerate) linear system:

Matrix *A*_{0} has the two eigenvalues 0 and −*ξ* = *tr*(*A*_{0}) < 0, and eigenvectors , . We define *H*_{i} and *z*_{i} as:

We have

Thus
tends to zero, and
remains constant in a neutrally stable manner. *H*_{i} measures the difference between the part of strain *i* on the single infected versus the part of the strain *i* on the total number of infected. Thus, *H*_{i} → 0 means that the proportion of strain *i* in single colonization, tends to be the same as the proportion of strain *i* in the total mass of all colonization. This implies it also tends to be equal to the proportion of strain *i* in co-colonization. Because on the slow manifold, we have *H*_{i} = 0, we can infer that during fast dynamics *z*_{i} tends to:
whereby on the slow manifold *z*_{i} describes exactly the frequency of strain *i* among all hosts that are colonized, with Σ*z*_{i} = 1.

#### The slow-fast dynamics

In (6) *I* plays a role in the order 0 term of the last two equations. Thus, we use the ansatz: *I*(*t*) = **I** + *εX*(*t*) + *O*(*ε*), and rewrite (6) (for details see Supplementary Materials 2). After some algebraic manipulations, using newly defined variables we obtain the explicit slow-fast system:
for 1 ≤ *i* ≤ *N*, where is a constant.

If *ε* → 0 then *z*_{i} remains constant and (*X*(*t*), **H**(*t*)) → (*φ*(**z**), 0) (shown in detail in S2). However, on the slow time scale *τ* = *εt* in (11), *z*_{i} is not constant, and letting *ε* → 0, one obtains the slow dynamics on *z*_{i}, given by the *N*-dimensional system:
which reads explicitly
for 1 ≤ *i* ≤ *N*, where the constants Θ, *μ* > 0 are:
and the term *q*(*z*) is given by:

The constant Θ is the rate that sets the tempo of multi-strain "motion" on the slow manifold towards an equilibrium. Notice that Θ depends specifically on the absolute transmission rate of the pathogen *β*, but also on mean traits *R*_{0} and *k*, via the conserved aggregated quantities **T**, **I**, **D**. The quantity *µ* on the other hand represents the ratio between single colonization and co-colonization prevalence from the neutral system, a factor that amplifies the importance of type asymmetry.

The function *q*(*z*) represents the time-changing impact of all the strains on their environment, which in turn modifies the fitness landscape for each in an equal manner. A more explicit way to see *q*(*z*) is in terms of relative change in ‘effective’ mean interaction coefficient between all extant types in the system, which if negative, indicates a global trend toward more competition in co-colonization, and if positive, indicates a global trend toward more facilitation. At this stage, we can apply quasistationarity methods (see [19, 20, 21, 22]) to show that the solution of (11) tends to the solution of (12) as *ε* → 0 locally uniformly in time on [*t*_{0}, +∞[for some *t*_{0} > 0.

#### Link with pairwise invasion fitnesses

Next, we provide another equivalent very useful representation of the model by using the notion of pairwise invasion fitness [23]. Let be the exponential growth rate of the strain *i* evaluated when introduced at the trivial endemic equilibrium of the strain *j* alone. If the fitness , strain *i* will invade, and viceversa. By considering the rate of growth of strain *i* in an equilibrium set by *j* (*dz*_{i}/*dt* (12) in the special case where all *z*_{k} = 0 for *k* ≠ *j*), we find the exact formulation of pairwise invasion fitness is given by:

We can thus recast the system (12) using only these fitness notations (for details see Supplementary Material 3):

The above can be made more compact by denoting the pairwise invasion fitness matrix, and using vector notation:

It is this matrix Λ that defines all ‘edges’ of the rescaled interaction network between *N* strains. For *N* = 2, similar to the classical Lotka-Volterra model, in our model, as already shown [9], there are only four possible outcomes between 2 strains (edge linking 1 and 2): **i)** : stable co-existence of 1 and 2; **ii)** : bistability of 1-only and 2-only; **iii)** : 1-only competitive exclusion; **iv)** : 2-only competitive exclusion. Knowing all (15) we can reconstitute the ultimate dynamics of the full system with *N* types and co-colonization.

#### Going back to the original *N* + *N*(*N* − 1)/2 *variables*

Recall the original system with *N* strains is given by the SIS model with co-colonization interactions (1). While the decomposition *K*_{ij} = *k* + *εα*_{ij} is mathematically non-unique, and can be applied with respect to any reference *k*, provided that the resulting *ε* is small, a possible convenient choice is to define *k* as the average of the original interaction matrix entries *K*_{ij}:
and to define deviation from neutrality, *ε*, as the root mean square distance of the *K*_{ij} from the mean interaction coefficient:
thus representing the standard deviation of the *K*_{ij} traits. The direction of deviation from neutrality (bias) for the interaction between strain *i* and *j* is then obtained as:

Thus, the matrix is the *standardized interaction matrix*, with . This matrix *A*, and the ratio *µ*, determine the pairwise invasion fitness matrix (13), which contains nearly all the qualitative information about the non-neutral dynamics (15). So far, we have shown that provided the deviation from neutrality, *ε*, is small, the behavior of (12) describes very well the long term dynamics of (1). To recover the original variables, after solving for strain frequencies *z*_{i} on the slow manifold, as the dynamics of (1) are well approached by those of (12), we can use the relations:
to obtain the total prevalence of susceptibles *S*, total prevalence of colonized individuals *T*, and prevalence of single colonization *I*. The prevalence of co-colonization is simply **D** = **T** − **I**. Further, to recover strain-specific single colonization, and cocolonization prevalences, we apply:
where the slow time scale is *τ* = *εt*, and the slow variables **z**(*τ*) = (*z*_{i}(*εt*))_{i}verify Σ_{i}*z*_{i}(*εt*) = 1 and follow dynamics (15). In Figure 2 and in Supplementary Movie 1, we provide illustrations of the modeling framework and some dynamics that are possible for an arbitrary number of strains. Some further illustrations for other values of parameters are given in Supplementary Figure 1 and Supplementary Movie 2.

#### Accuracy of the approximation and simulation speed

The slow-fast model being a good approximation of the original system, more precisely, means that there exists *t*_{0} > 0 such that for any *T* > *t*_{0} and any *ε* > 0 small enough, there is *M* > 0 such that for any *τ* ∈ (*t*_{0},*T*), the error between the two is small: . We confirm this with numerical simulations in Figure 3a)–b), where the neutral model is shown to be a good approximation of the original system in a fast time-scale, and the slow-dynamics reduction a good approximation on the longer time-scale. Furthermore, we verify that the slow-fast approximation enables exploration of dynamics in a much more computationally efficient manner over a very large number of strains *N* (Fig.4), which is difficult with the original system.

## Discussion

In this study, we investigated how co-colonization interaction networks drive coexistence in a system with *N* closely related types. We derived a slow-fast decomposition for the dynamics, which links explicitly variation in the interaction matrix with the slow time scale under which multi-type selective dynamics unfold. Model reduction for small deviations from neutrality allowed us to express strain frequencies on the slow manifold via only *N* equations (instead of *N* + *N*(*N* − 1)/2 equations in the full system). These can be easily re-mapped to the more complex epidemiological variables of the original system. The slow variables *z*_{i} (1 ≤ *i* ≤ *N*), describing relative strain frequencies in the host population, necessarily equal in single and co-colonization, could be used as a practical test for quasineutrality, when strain prevalence data are available.

Our focus here was on presenting the timescale decomposition method for this system. The wider and more complete ecological picture, as well as the analysis of diversity-stability relationships in coexistence will be the focus of another paper [24]. We believe this work makes a crucial step towards the full characterization of conservative multi-type SIS dynamics.

The slow-fast framework allows to reduce the complexity of multi-strain systems, and understand transient and long-term coexistence on conserved manifolds near neutrality. We expect our results to stimulate further progress for the investigation of coexistence and evolution in multi-strain epidemiological systems such as the one of pneuomococcal bacteria [25], where explicit mathematical results linking neutral and niche mechanisms, for *N* large, are thusfar missing. Although motivated by such a pathogen with altered susceptibilities in co-colonization, the global transmission dynamics captured here have broader implications. As an example, information on the resilience of a group of strains may be derived from the sign of *q*(*z*) in our system (12): if *q*(*z*) > 0 then each strain is less competitive within the group but the overall community is more resistant to invasion by a new strain. Our model’s generality and closed analytical form for relative type dynamics provide a new bridge between epidemiology and community ecology, invite comparison with the classical Lotka-Volterra model, and offer extension to other multi-body problems in physics and genetics.

We acknowledge support by the University of Tours and EURAXESS for an invited researcher stay of Erida Gjini at IDP, in Tours during 2019.

## SUPPLEMENTARY MATERIAL

### 1. Table of illustrative model parameters for a biological system (e.g. pneumococcus bacteria transmission)

### 2. Derivation details for the slow-fast dynamics

Using the ansatz: *I*(*t*) = **I** + *εX*(*t*) + *O*(*ε*), system (6) reads
wherein we have denoted:

The final slow fast form is given by using the new variables *H*_{i} and *z*_{i} defined in (9). Hence, define
and let us introduce the **G**_{i}= (*g*_{i,1}, *g*_{i,2})^{T} as ^{1}
and
we have

Finally, using these new unknows, one obtains the expression of the slow fast system (11):
where is a positive constant. If *ε* → 0 then *z*_{i} remains constant and (*X*(*t*), **H**(*t*)) → (*φ*(**z**), 0) where:

On the slow time scale *τ* = *εt* in (11), and letting *ε* → 0, one obtains the slow dynamics

Taking (*X*, **H**) = (*φ*(**z**), 0) yields the slow dynamics reducted to the slow manifold
where
*φ*(**z**) is given by (18) and verifies

Denoting *µ* = **I**/**D** and , the explicit slow dynamics is then given by system (12) which may be rewritten as

Remark that we have: which implies that the simplexe is invariant under (12).

### 3. Characterization of the model via invasion fitnesses

We focus on the trivial solutions of (12), which are the extreme competitive exclusion equilibria *E*_{i} = (*δ*_{ik})_{1≤k≤n} where *δ*_{ik} is the Kronecker delta:

In particular, we note the invasion fitness of the strain *i* with respect to the trivial solution *E*_{j}. More precisely, is the per-capita growth rate of the strain *i* evaluated at *E*_{j}. Since the *i*-th line of the system reads , we have the explicit formula:
that is

We have that the line *i* of (21) is

With the notations introduced above, this can be rewritten as:

From , we infer finally, the fact that yields

Plugging this into (23), we obtain a new version of system (21), enterely parametrized by the fitnesses :
or more shortly by noting and the *pairwise invasion fitness matrix* :