## Abstract

Tuberculosis (**TB**), the disease caused by *Mycobacterium tuberculosis* (**Mtb**), remains a major health problem with 10.6 million cases of the disease and 1.6 million deaths in 2021. It is well understood that pulmonary TB is due to replication of Mtb in the lung but quantitative details of Mtb replication and death in lungs of patients and how these rates are related to the degree of lung pathology are unknown. We performed experiments with rabbits infected with a novel, virulent clinical Mtb isolate of the Beijing lineage, HN878, carrying an unstable plasmid pBP10. In our in vitro experiments we found that pBP10 is more stable in HN878 strain than in a more commonly used laboratory-adapted Mtb strain H37Rv (the segregation coefficient being *s* = 0.10 in HN878 vs. *s* = 0.18 in H37Rv). Interestingly, the kinetics of plasmid-bearing bacteria in lungs of Mtb-infected rabbits did not follow an expected monotonic decline; the percent of plasmid-bearing cells increased between 28 and 56 days post-infection and remained stable between 84 and 112 days post-infection despite a large increase in bacterial numbers in the lung at late time points. Mathematical modeling suggested that such a non-monotonic change in the percent of plasmid-bearing cells can be explained if the lung Mtb population consists of several (at least 2) sub-populations with different replication/death kinetics: one major population expanding early and being controlled/eliminated, while another, a smaller population expanding at later times causing a counterintuitive increase in the percent of plasmid-bearing cells. Given that HN878 forms well circumscribed granulomas in rabbits, our results suggest independent bacterial dynamics in subsets of such granulomas. Our model predictions can be tested in future experiments in which HN878-pBP10 dynamics in individual granulomas is followed over time.

## Introduction

Tuberculosis (**TB**), caused by *Mycobacterium tuberculosis* (**Mtb**), remains a global health threat, killing 1.6 million people in 2021 despite available chemotherapy [1]. Infection with Mtb results in heterogeneous disease outcomes such as bacterial clearance, latent TB, and active TB with or without cavitary disease [2, 3]. Several studies suggested that between a quarter and a third of the world’s population has evidence of latent TB [4]. While most latently infected individuals are not at risk of developing active disease, those who do progress to active TB seem to do so within 1-2 years after exposure [5, 6].

Studying TB progression in humans is difficult because most exposed individuals do not develop the disease, and thus it requires dedicated long-term cohort studies (e.g., [5, 7]). Most of what we know about within-host Mtb replication early after infection comes from animal studies [8–11]. Yet, despite decades of research, the pathophysiology of TB and the replication dynamics of Mtb during the acute and chronic stages of infection remain poorly understood. Earlier studies reported that Mtb can persist in a non-replicating state in tissues, especially under hypoxic conditions, which may help it resist antibiotic-mediated killing [12]. Studies in B6 mice, infected via aerosol with a conventional dose of Mtb (about 100 CFU), have shown that bacterial numbers in the lung increase during the acute phase of infection but become almost static during the chronic phase [13–16]. This condition was defined as “static equilibrium”, where the bacteria remain viable but do not divide [13]. Another study reported no marked difference between the number of colony-forming units (**CFUs**) and the number of chromosomal equivalents (**CEQs**), supporting the hypothesis of a static equilibrium between the pathogen’s replication and death during the chronic phase of infection [17]. In contrast, other studies conducted in zebrafish and mice reported that Mtb remains in a dynamic state and replicates in the host during the chronic stage of infection [18, 19]. In addition, isoniazid has been successfully used to treat latent TB [20]. Because isoniazid is effective only against actively replicating bacteria, the success of isoniazid-based treatment suggests that during latent disease bacteria are not dormant but do replicate, and that the apparent constancy of bacterial numbers in latency/chronic infection is due to a delicate balance between Mtb replication and death rates.

Gill *et al*. [16] elegantly addressed this question of static equilibrium in a mouse model of TB. They used an unstable plasmid pBP10, dubbed a “replication clock plasmid”, that is lost at a constant rate with each cell division. By combining the data on the total number of bacteria and the percent of plasmid-bearing cells in the population with mathematical modeling, they estimated the rates of Mtb division and death during the first 110 days of Mtb infection in lungs of B6 mice. They found that Mtb replicates at substantial rates during the chronic phase of infection (about 4-fold lower than in the acute phase), thus challenging the conventional concept that Mtb enters a static equilibrium during the chronic stage of TB [16]. We have recently shown how the plasmid segregation probability influences estimates of Mtb replication and death rates, and quantified the fraction of non-replicating bacteria that may be formed in these experiments [21]. More recent work used the ratio of abundance of short- vs. long-lived ribosomal RNA (RS ratio) to further confirm that Mtb may indeed be replicating during the chronic stage of infection in murine lungs [22].

While infection of B6 mice with Mtb is a widely used animal model of TB, it has some limitations. Typically, mice are infected with conventional doses (∼100 bacteria) that likely exceed the infectious dose in humans [23, 24]. At such doses, given the small size of the murine lung, nearly all areas of the lung show signs of inflammation, which is not typically observed in humans [5]. In contrast, rabbits or monkeys exposed to similar or lower Mtb doses develop localized lesions that may progress to well-circumscribed granulomas [25–28]. Interestingly, the number of bacteria recovered from individual lesions/granulomas from the same animal can vary by orders of magnitude, but whether such variability arises due to differences in the replication rate or death rate (or both) of bacteria in different lesions remains largely unknown [26, 29, 30]. Furthermore, in the same animal some lesions may heal and some may become more active (determined, for example, by using PET/CT technology [29]), but the reason for such discordant dynamics (e.g., is this due to changes in replication or death rates?) remains unknown.

In this study we used a clinical Mtb isolate HN878, transformed with the replication clock plasmid pBP10, to infect rabbits via aerosol challenge. We followed the total number of bacteria and the percent of plasmid-bearing cells in rabbit lungs over time. Interestingly, the percent of plasmid-bearing cells did not decline monotonically as has been previously observed in B6 mice, and our previous mathematical model failed to accurately fit the observed data on Mtb CFUs in the lung. Rather, a model in which Mtb population in the lung consists of at least 2 somewhat independent populations fitted the data better. Stochastic simulations of Mtb dynamics suggest that it is possible to use replication clock plasmid (and HN878-pBP10 specifically) to estimate the rates of Mtb turnover in individual lesions, paving the way to rigorously understand why individual granulomas in a given animal vary dramatically in the number of viable bacteria.

## Materials and methods

### Data

#### In vitro experiments

We used a clinical Mtb strain HN878 carrying the pBP10 plasmid. The plasmid was transformed into HN878 by D. Sherman’s group (University of Washington) using a methodology described previously [16]. To estimate how unstable the plasmid is in this specific Mtb strain, we performed experiments with the HN878-pBP10 strain grown in vitro in different conditions. Experiments were performed as described by Gill *et al*. [16] with some modifications. Specifically, Mtb HN878-pBP10 was grown in vitro starting with 2 × 10^{7} cells/mL in 10 mL for 3 days in 2 different media: complete Middlebrook 7H9 media (“7H9”) and 1:3 7H9:PBS diluted media (“1:3 7H9”). After 3 days, 1 mL of the 10 mL culture was inoculated into 9 mL of fresh media. The transfer was repeated 7 times. The experiment was performed with 3 independent cultures for each medium.

#### In vivo experiments

We have previously described a protocol of a rabbit model of pulmonary TB, using the clinical Mtb isolates such as HN878 or CDC1551 to infect animals via the respiratory route [31–33]. In the present study, rabbits were infected with about 500 colony forming units (CFU) of HN878-pBP10 by aerosol. Specific pathogen-free, female, New Zealand white rabbits (*Oryctolagus cuniculus*), weighing 2.2 to 2.6 kg, were used (*n* = 23) for aerosol infection by Mtb HN878-pBP10 in four separate experiments (*n* = 2 to 4 per time point per experiment). Rabbits were exposed to Mtb-containing aerosol using a nose only delivery system. At 3 hours after exposure, a group (*n* = 4) of rabbits was euthanized, and serial dilutions of the lung homogenates were cultured on Middlebrook 7H11 (Difco BD, Franklin Lakes, NJ) agar plates to enumerate the number of initial (time 0) bacterial CFUs implanted in the lungs. At 28, 56, 84, and 112 days (or 4, 8, 12, and 16 weeks) post infection (p.i.), groups of rabbits (*n* = 2 to 4) were euthanized and left and right lungs were harvested for CFU assay by plating lung homogenates on 7H10 plates without and with kanamycin. Plasmid-bearing Mtb can grow on kanamycin-containing plates while the plasmid-free cells cannot [16].

### Mathematical models

#### Linear regressions

There are two alternative methods of estimating the rates of Mtb replication and death from experimental data on the growth of bacteria bearing the replication clock plasmid [21]. One method, proposed by Gill *et al*. [16], uses linear regressions of the log-transformed numbers of bacteria found in the lungs at different times after the infection; the other, proposed by McDaniel *et al*. [21], fits the mathematical model to the whole dataset assuming that model parameters are time-dependent.

In the linear regression analysis of Gill *et al*. [16], for every time interval *i* in the data (e.g., in **Supplemental Figure S2** the first time interval is 0-28 days post infection), three slopes can be calculated: *slopeN*_{i} (the rate of exponential change in the total number of bacteria in the time interval *i*), *slopeP*_{i} (the rate of exponential change in the number of plasmid-bearing bacteria), and *slopeF*_{i} (the rate of exponential decline in the fraction of plasmid-bearing bacteria in the population). By using two of these slopes, the rates of Mtb replication and death in the *i*^{th} interval can be estimated as

$$
r_i = \frac{\mathit{slopeF}_i}{s}, \tag{1}
$$

$$
\delta_i = \frac{\mathit{slopeF}_i}{s} - \mathit{slopeN}_i, \tag{2}
$$

where *slopeN*_{i} and *slopeF*_{i} are slopes estimated by fitting linear regression lines through plots of the *i*^{th} intervals of ln[*N*(*t*)] and − ln[*f*(*t*)] versus *t*, respectively [16]. Note that because in the one population model the fraction of plasmid-bearing cells declines with time (**eqn. (6)**), *slopeF* is positive and denotes the decline rate *rs* in *f*(*t*). Alternatively, one can use slopes *slopeN*_{i} and *slopeP*_{i} to estimate the rates of Mtb replication and death from **eqns. (1)–(2)** by substituting *slopeF*_{i} = *slopeN*_{i} − *slopeP*_{i} [21].
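The regression estimate can be sketched in a few lines of Python. This is a minimal illustration, assuming the one population relations *slopeN* = *r* − *δ*, *slopeP* = *r*(1 − *s*) − *δ*, and *slopeF* = *rs*; the function names and the example counts are hypothetical and are not taken from the experimental data.

```python
import math

def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx

def rates_from_n_and_f(slope_n, slope_f, s):
    """Replication and death rates from slopeN = r - delta and slopeF = r * s."""
    r = slope_f / s
    return r, r - slope_n

def rates_from_n_and_p(slope_n, slope_p, s):
    """Same estimate using slopeP, because slopeN - slopeP = r * s."""
    r = (slope_n - slope_p) / s
    return r, r - slope_n

# Made-up measurements for one interval (two animals per time point).
days = [0.0, 0.0, 28.0, 28.0]
total_cfu = [4.0e2, 6.0e2, 2.0e7, 4.0e7]
fraction_plasmid = [0.95, 0.90, 0.06, 0.05]

slope_n = fit_slope(days, [math.log(v) for v in total_cfu])
slope_f = fit_slope(days, [-math.log(v) for v in fraction_plasmid])
r_est, delta_est = rates_from_n_and_f(slope_n, slope_f, s=0.10)
```

With only two time points per interval, the regression slope is simply the difference of the mean log values divided by the interval length, which is why each estimate depends on a single pair of sequential measurements.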

#### Estimating segregation parameter *s* in Mtb strain HN878

To estimate the probability of plasmid loss per division *s*, we performed in vitro experiments with HN878 carrying the replication clock plasmid pBP10. In vitro we expect that bacteria do not die (i.e., *δ* = 0), which allows us to estimate the segregation parameter *s* using **eqn. (1)**. To do this, we first calculated the total bacterial density in the cultures over time by accounting for the 1:10 dilution at each transfer. Then, for every growth condition/medium, we calculated the growth rate of the population *r* by performing linear regression of log(*N*) vs. time, i.e., by estimating *slopeN*. Next, we calculated the percent of plasmid-bearing cells in the population over time and estimated its decline rate, *slopeF*, by linear regression of −log(*f*) vs. time. Using **eqn. (2)** with *δ* = 0, we then estimated the plasmid loss rate as *s* = *slopeF*/*slopeN* (for each of the two media conditions separately and by pooling all the data together).
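The calculation can be illustrated with synthetic serial-passage data. The sketch below assumes a 1:10 dilution every 3 days, no bacterial death, and a known "true" segregation probability used only to generate the data; all numbers and variable names are hypothetical, not the measurements reported here.

```python
import math

def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx

# Synthetic data: 8 passages sampled at the end of each 3-day passage.
days = [3.0 * (k + 1) for k in range(8)]
true_s = 0.10                          # value used to generate the data
growth = math.log(10.0) / 3.0          # 10-fold regrowth per passage, per day
measured_density = [1.0e8] * 8         # cells/mL before each transfer
fraction = [0.9 * math.exp(-true_s * growth * t) for t in days]

# Undo the dilutions: the culture sampled in passage k was diluted k times.
cumulative = [d * 10.0 ** k for k, d in enumerate(measured_density)]

slope_n = fit_slope(days, [math.log(c) for c in cumulative])
slope_f = fit_slope(days, [-math.log(f) for f in fraction])
s_hat = slope_f / slope_n              # valid only when delta = 0

# Number of generations elapsed at each sampling time, g = r * t / ln(2).
generations = [slope_n * t / math.log(2.0) for t in days]
```

Because the synthetic data follow the model exactly, `s_hat` recovers the generating value; with real data the estimate carries the uncertainty of both regressions.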

#### One population model

In order to quantify the rate of Mtb replication in mice *in vivo*, Gill *et al*. [16] proposed a mathematical model that described the change in the number of plasmid-carrying and plasmid-free bacteria with respect to time. One important assumption of the model is that all bacteria in the lung divide and die at the same rates, representing, thus, one homogeneous population (see **Figure 1**A). The rates of Mtb replication and death could vary with time since infection. Plasmid-free and plasmid-bearing strains were assumed to have the same growth and death rates; dynamics of the cell populations are then given by the system of equations [21]:

$$
\frac{dP(t)}{dt} = r(t)(1-s)P(t) - \delta(t)P(t), \tag{3}
$$

$$
\frac{dF(t)}{dt} = r(t)\,s\,P(t) + r(t)F(t) - \delta(t)F(t), \tag{4}
$$

$$
\frac{dN(t)}{dt} = \left[r(t) - \delta(t)\right]N(t), \tag{5}
$$

$$
\frac{df(t)}{dt} = -r(t)\,s\,f(t), \tag{6}
$$

where *N*(*t*) = *P*(*t*) + *F*(*t*) is the total number of bacteria in the lung, *P*(*t*) is the number of plasmid-carrying bacteria, *F*(*t*) is the number of plasmid-free bacteria, *f*(*t*) = *P*(*t*)/*N*(*t*) is the fraction of plasmid-bearing bacteria in the population, *r*(*t*) and *δ*(*t*) are the rates of Mtb replication and death, respectively, *t* is time in days, and the segregation probability *s* denotes the probability that a plasmid-free cell will be produced from a division of a plasmid-bearing cell. To solve the model one also needs to define the initial number of plasmid-bearing and plasmid-free bacteria, *P*(0) and *F*(0), respectively.
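For intuition, when the rates are constant over an interval the one population model has a closed-form solution that links it directly to the regression slopes. The derivation below is a sketch under that constant-rate assumption, taking plasmid-bearing cells to change at the net rate *r*(1 − *s*) − *δ* and all cells at the net rate *r* − *δ*; the symbols *P*_0, *N*_0 and *f*_0 (values at the start of the interval) are introduced here only for illustration.

```latex
\begin{aligned}
P(t) &= P_0\, e^{[r(1-s)-\delta]\,t}, \\
N(t) &= N_0\, e^{(r-\delta)\,t}, \\
f(t) &= \frac{P(t)}{N(t)} = f_0\, e^{-rs\,t}, \\
\frac{d\ln N}{dt} &= r-\delta, \qquad
\frac{d\ln P}{dt} = r(1-s)-\delta, \qquad
-\frac{d\ln f}{dt} = rs .
\end{aligned}
```

The last line shows why, in a single homogeneous population with non-negative *r*, the fraction of plasmid-bearing cells can only stay constant or decline.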

In general, the rates of Mtb replication and death may vary over time since infection, but in the absence of additional information, the simplest assumption is that the rates are constant between consecutive times at which Mtb counts were measured but may vary between different time intervals [16, 21]. In our data, bacterial counts were measured in the lungs of rabbits at 0, 28, 56, 84, and 112 days since infection (**Supplemental Figure S2**); therefore, for the one population model the rates of replication and death are defined as follows:

$$
r(t) = r_i, \qquad \delta(t) = \delta_i \qquad \text{for } t_{i-1} \le t < t_i, \quad i = 1, \dots, 4, \tag{7}
$$

with (*t*_{0}, *t*_{1}, *t*_{2}, *t*_{3}, *t*_{4}) = (0, 28, 56, 84, 112) days.

Model parameters *r*_{i} and *δ*_{i} can be estimated by fitting the mathematical model (**eqns. (3)–(4)**) to data assuming a fixed value for the segregation coefficient *s* [21]. In previous work the probability of plasmid segregation in Mtb strain H37Rv was determined *in vitro* as *s* = 0.18 [16]. Our in vitro experiments suggested that the pBP10 plasmid is more stable in HN878, with the segregation coefficient being *s* = 0.10. However, we have confirmed that most of our conclusions remain valid for a higher *s* = 0.18 even though the estimates of the model parameters do depend on *s* (see Results section).
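A piecewise-constant version of the one population model can be evaluated without a numerical ODE solver, because each interval has an exponential solution. The sketch below assumes that form; apart from the first-interval rates and the inoculum of about 500 CFU, which are quoted from this study, the rate values are placeholders chosen for illustration, and the function name is hypothetical.

```python
import math

BREAKS = [0.0, 28.0, 56.0, 84.0, 112.0]  # measurement days

def one_population(rates, s, p0, free0):
    """Propagate plasmid-bearing (P) and total (N) counts through intervals with
    constant (r, delta); returns [(day, N, fraction of plasmid-bearing cells)]."""
    p, n = p0, p0 + free0
    out = [(BREAKS[0], n, p / n)]
    for (r, delta), t0, t1 in zip(rates, BREAKS[:-1], BREAKS[1:]):
        dt = t1 - t0
        p *= math.exp((r * (1.0 - s) - delta) * dt)  # net growth of plasmid-bearing cells
        n *= math.exp((r - delta) * dt)              # net growth of all cells
        out.append((t1, n, p / n))
    return out

# First pair uses r_1 = 1.04/day and delta_1 = 0.65/day; the rest are placeholders.
rates = [(1.04, 0.65), (0.10, 0.30), (0.50, 0.60), (0.40, 0.20)]
trajectory = one_population(rates, s=0.10, p0=500.0, free0=0.0)
fractions = [row[2] for row in trajectory]
```

Whatever non-negative rates are chosen, `fractions` never increases, which is the property that the rabbit data violate.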

#### Multiple sub-populations model

As we show in the Results, the “one population model” is not able to accurately describe the dynamics of plasmid loss from the bacterial population in rabbits over time. We propose that a non-monotonic change in the percent of plasmid-bearing cells in the bacterial population may arise due to asynchronous dynamics of several independent sub-populations of bacteria (**Figure 1**B). The basic mathematical model given in **eqns. (3)–(4)** can be easily extended by assuming *n* independent sub-populations, each starting from an initial number of plasmid-bearing (*P*^{(j)}(0)) and plasmid-free (*F*^{(j)}(0)) bacteria, and each replicating and dying at rates *r*^{(j)} and *δ*^{(j)}, respectively, with index *j* denoting the sub-population (*j* = 1 … *n*, **Figure 1**B and **eqn. (7)**).
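To see how pooling independent sub-populations can raise the overall percent of plasmid-bearing cells, consider the sketch below. It assumes each sub-population follows the constant-rate one population solution within 28-day intervals and that every founding cell carries the plasmid; all parameter values are invented for illustration and are not the fitted estimates.

```python
import math

S = 0.10  # segregation probability

def propagate(p, n, r, delta, dt, s=S):
    """Advance plasmid-bearing (p) and total (n) counts over dt days at constant rates."""
    return p * math.exp((r * (1.0 - s) - delta) * dt), n * math.exp((r - delta) * dt)

def run(p0, schedule, dt=28.0):
    """Return [(P, N), ...] at day 0 and after each interval of a (r, delta) schedule."""
    states = [(p0, p0)]  # assumption: all founders are plasmid-bearing
    for r, delta in schedule:
        p, n = states[-1]
        states.append(propagate(p, n, r, delta, dt))
    return states

# Sub-population 1 expands fast and is then controlled; sub-population 2 grows slowly.
pop1 = run(450.0, [(1.0, 0.6), (0.2, 0.5)])
pop2 = run(50.0, [(0.1, 0.0), (0.1, 0.0)])

def pooled_fraction(k):
    """Fraction of plasmid-bearing cells in the combined population at step k."""
    p = pop1[k][0] + pop2[k][0]
    n = pop1[k][1] + pop2[k][1]
    return p / n

pooled = [pooled_fraction(k) for k in range(3)]  # days 0, 28, 56
```

In this example the plasmid-bearing fraction falls within each sub-population, yet the pooled fraction rises between days 28 and 56 because the slowly dividing sub-population, which has lost few plasmids, comes to dominate as the first one contracts.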

#### Fitting models to data

An alternative method to estimate the rates of Mtb replication and death from the data is to numerically solve the mathematical model given in **eqns. (3)–(4)** with replication and death rates defined as in **eqn. (7)** and fit the solution to the data [21]. For example, in our previous work we fitted the solution of the one population model to data on total number of bacteria and the number of plasmid-bearing bacteria in lungs of B6 mice [21]. We found, however, that while the model can reasonably well describe the dynamics of total number of bacteria and the number of plasmid-bearing bacteria, the model does not fit well the fraction of plasmid-bearing cells (e.g., see Fig 4C in McDaniel *et al*. [21]). Therefore, here we propose to use a maximum likelihood method to fit the model to the data on the total number of bacteria *N* and the fraction of plasmid-bearing cells in the population *f* [34, 35].

Because measurement errors in calculating the total number of bacteria in the lung and the percent of plasmid-bearing cells in the population may be different, we assume that the errors for the two datasets are normally distributed but have different variances, *σ*_{N}^{2} (total cell numbers data) and *σ*_{F}^{2} (fraction of plasmid-bearing cells data). For our model and the data, the general likelihood of observing the data given the model prediction on the number of bacteria found in the lung at time *t*_{i}, *N*(*t*_{i}), and the fraction of plasmid-bearing cells, *f*(*t*_{i}), is

$$
L(\text{data} \mid \boldsymbol{\theta}) = \prod_{i=1}^{n} \prod_{j=1}^{k_i} \frac{1}{2\pi\sigma_N\sigma_F} \exp\left[-\frac{\left(N_{ij} - N(t_i)\right)^2}{2\sigma_N^2} - \frac{\left(f_{ij} - f(t_i)\right)^2}{2\sigma_F^2}\right], \tag{8}
$$

where **θ** is the vector of all model parameters, *n* is the number of time points at which measurements were taken (*n* = 5 for our data), *k*_{i} is the number of animals with measured number of bacteria at the *i*^{th} time point (generally, *k*_{i} = 4), and *N*_{ij} and *f*_{ij} are experimental measurements at time point *i* and animal *j*. Because the likelihood is in general a very small number, it is convenient to rewrite **eqn. (8)** in terms of the negative log-likelihood ℒ = − log(*L*):

$$
\mathcal{L} = \sum_{i=1}^{n} \sum_{j=1}^{k_i} \left[\frac{\left(N_{ij} - N(t_i)\right)^2}{2\sigma_N^2} + \frac{\left(f_{ij} - f(t_i)\right)^2}{2\sigma_F^2}\right] + \left(\sum_{i=1}^{n} k_i\right) \log\left(\sigma_N \sigma_F\right), \tag{9}
$$

where we omitted constant terms. Model parameters were estimated by minimizing the negative log-likelihood ℒ (**eqn. (9)**). Model comparison was done using the F-test for nested models [36] or the Akaike Information Criterion, AIC [37].
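A negative log-likelihood of this kind is straightforward to code. The sketch below assumes independent normal errors with separate standard deviations for the two data types and drops additive constants; it also assumes that measurements and predictions are passed in on whatever scale the errors are taken to be normal (for example, log-transformed CFU, which is an assumption made here rather than something stated in the text). Function and argument names are hypothetical.

```python
import math

def negative_log_likelihood(obs_n, obs_f, pred_n, pred_f, sigma_n, sigma_f):
    """obs_n[i] and obs_f[i] hold per-animal measurements at time point i;
    pred_n[i] and pred_f[i] are the model predictions at that time point."""
    nll = 0.0
    for i in range(len(pred_n)):
        for n_ij, f_ij in zip(obs_n[i], obs_f[i]):
            nll += (n_ij - pred_n[i]) ** 2 / (2.0 * sigma_n ** 2)
            nll += (f_ij - pred_f[i]) ** 2 / (2.0 * sigma_f ** 2)
            nll += math.log(sigma_n) + math.log(sigma_f)
    return nll

def aic(nll, n_params):
    """Akaike Information Criterion from a negative log-likelihood."""
    return 2.0 * n_params + 2.0 * nll

# Toy example: two time points, two animals each (numbers are made up).
obs_n = [[2.7, 2.6], [7.5, 7.9]]
obs_f = [[0.95, 0.90], [0.06, 0.05]]
example = negative_log_likelihood(obs_n, obs_f, [2.65, 7.7], [0.92, 0.055], 0.3, 0.05)
```

In practice this function would be handed to a numerical minimizer together with the model that produces `pred_n` and `pred_f`; because the same constants are dropped for every model fitted to the same data, differences in AIC between models are unaffected.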

#### Stochastic simulations

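We simulated formation of lesions/granulomas in rabbits assuming that the infection starts with a single plasmid-bearing bacterium, dividing and dying at time-invariant rates *r* and *δ*, respectively, in accord with **eqns. (3)–(4)**. We used the `GillespieSSA2` library in R with the following rates to simulate the dynamics stochastically using a Gillespie algorithm:

$$
P \xrightarrow{\,r(1-s)\,} 2P, \qquad P \xrightarrow{\,rs\,} P + F, \qquad F \xrightarrow{\,r\,} 2F, \qquad P \xrightarrow{\,\delta\,} \varnothing, \qquad F \xrightarrow{\,\delta\,} \varnothing,
$$

where the label on each arrow is the per-cell rate of the corresponding event.

We used the tau-leaping algorithm with *τ* = 0.1 [38] and typically performed 1,000 simulations per parameter set.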
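For readers who want to experiment without R, the same event structure can be simulated in plain Python. The sketch below is not the `GillespieSSA2` setup used in the study: it swaps tau-leaping for the exact (direct-method) Gillespie algorithm, caps the population size to bound run time, and uses hypothetical function and parameter names. It assumes five events consistent with the deterministic model: division of a plasmid-bearing cell with or without plasmid loss, division of a plasmid-free cell, and death of either cell type.

```python
import random

def simulate_lesion(r, delta, s, t_end, rng, max_cells=100_000):
    """Exact Gillespie simulation from one plasmid-bearing cell.
    Returns (P, F) at t_end, at extinction, or when max_cells is reached."""
    p, f, t = 1, 0, 0.0
    while 0 < p + f < max_cells:
        candidates = [
            (r * (1.0 - s) * p, 1, 0),   # P -> 2P (plasmid retained)
            (r * s * p, 0, 1),           # P -> P + F (plasmid lost at division)
            (r * f, 0, 1),               # F -> 2F
            (delta * p, -1, 0),          # death of a plasmid-bearing cell
            (delta * f, 0, -1),          # death of a plasmid-free cell
        ]
        events = [e for e in candidates if e[0] > 0.0]
        if not events:
            break
        total = sum(e[0] for e in events)
        t += rng.expovariate(total)      # waiting time to the next event
        if t > t_end:
            break
        _, dp, df = rng.choices(events, weights=[e[0] for e in events])[0]
        p += dp
        f += df
    return p, f

rng = random.Random(2023)
# 200 replicate lesions with r and delta set to the first-interval regression estimates.
runs = [simulate_lesion(1.04, 0.65, 0.10, 14.0, rng) for _ in range(200)]
extinct = sum(1 for p, f in runs if p + f == 0)
```

Starting from a single cell with a death rate that is a sizeable fraction of the replication rate, many simulated lesions go extinct early, while the survivors differ widely in size and in the percent of plasmid-bearing cells.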

## Results

### Estimating segregation probability of pBP10 plasmid in Mtb strain HN878 in vitro

To estimate the stability of the pBP10 plasmid in Mtb strain H37Rv, Gill *et al*. [16] grew the H37Rv-pBP10 strain in vitro in different media and calculated the rate of decline in the percent of plasmid-bearing cells (*slopeF*) and the rate of exponential increase in total cell numbers (*slopeN*) in culture over time (see **eqns. (1)–(2)**). Because the bacteria are not expected to die in culture in vitro within the experimental time frame [16], with *δ* = 0 the segregation coefficient can then be calculated, for example, using **eqn. (2)** as *s* = *slopeF*/*slopeN*. Clinical Mtb isolate HN878 was transformed with the replication clock plasmid pBP10 and was provided for our experiments by D. Sherman’s group (University of Washington).

We grew Mtb strain HN878-pBP10 in complete 7H9 or 1:3 (7H9:PBS) diluted media in serial passage experiments (**Figure 2**A and see Materials and Methods for more detail). In the experiment at each transfer we determined the total number of bacteria and the number of plasmid-bearing bacteria by plating the samples on 7H11 or 7H11+kanamycin plates and calculated the bacterial numbers over time by taking into consideration 1:10 dilutions during the serial passage experiments.

As expected, the number of all bacteria and of plasmid-bearing cells increased over time, reflecting the impact of serial transfers (**Figure 2**B-C). Indeed, when Mtb was cultured in complete media, the total bacterial concentration at day 3 before each transfer remained similar across passages, whereas the concentration of plasmid-bearing cells did not (**Supplemental Figure S1**A&C). However, we observed non-exponential growth of Mtb in the diluted media: initially the concentration of all cells or of plasmid-bearing cells increased slowly, but after two passages the rate of growth matched that observed for cells in complete media (**Figure 2**B-C). This was because at the end of the day 3 culture before the transfer, the concentration of bacteria declined for two transfers (6 days) but then stabilized at slightly lower than 10^{7} cells/mL, while the concentration of plasmid-bearing bacteria continued to decline, albeit at a slower rate (**Supplemental Figure S1**B&D). We do not know why the dynamics of bacterial growth changed after 6 days of culture; it may be related to the adaptation of the natural Mtb isolate HN878 to growth in culture. These dynamics are not consistent with the continuous monotonic increase in the concentration of H37Rv-pBP10 in 7H9 media of different dilutions (e.g., Figure 1d in Gill *et al*. [16]).

As expected for an unstable plasmid pBP10, the percent of plasmid-bearing cells declined with time (evaluated in days or when re-calculated as the number of generations or cell divisions, **Figure 2**D-E) and at day 3 before each transfer (**Supplemental Figure S1**E&F). We calculated the number of generations for the bacteria in the two media (full-strength 7H9 or diluted 7H9) using the relationship *g* = *rt*/ln(2) and the data in **Figure 2**A. Interestingly, the decline in the percent of plasmid-bearing cells per generation did not collapse into one curve as was observed previously for H37Rv-pBP10 (Figure 1F in Gill *et al*. [16]). This suggests that the kinetics of pBP10 loss from the HN878 strain may be growth rate-dependent.

By using **eqns. (1)–(2)** we estimated the segregation coefficient to be *s* = 0.12 and *s* = 0.08 for complete and diluted 7H9 media, respectively, with the average *s* = 0.10. When pooling all data together for the decline in the percent of plasmid-bearing cells with the number of generations, the segregation parameter per generation with 95% confidence intervals was *s* = 0.106 (0.086 − 0.125).

This is a smaller value than that estimated for the pBP10 plasmid in the H37Rv strain (*s* = 0.18, [16]), suggesting that the rate at which this unstable plasmid is lost depends on the host strain. In the following analyses we used the average estimate *s* = 0.10 for HN878-pBP10.

### Non-monotonic loss of plasmid pBP10 from Mtb strain HN878 in rabbits

To study the dynamics of HN878-pBP10 in vivo, we infected rabbits with aerosolized bacteria (∼500 CFU) and measured the number of all bacteria or of plasmid-bearing cells in the lungs at different time points after the infection (**Figure 3**A). Importantly, we found similar numbers of bacteria in the left and right lungs even though there was a tendency to observe higher CFU numbers in the right lung (**Supplemental Figure S2**); this is similar to our recent observation in B6 mice [24].

Mtb dynamics in the lungs of rabbits were not monotonic (**Figure 3**B): there was a large increase in the total number of bacteria from about 500 to nearly 10^{8} in the first 28 days, then the number of bacteria declined for the next 56 days, and then it either exploded to about 10^{10} CFU/lungs in some rabbits or was maintained at 10^{7} in others (**Figure 3**B). Interestingly, changes in the total number of plasmid-bearing bacteria followed those of total bacteria, with an initial increase, a decline, and again an increase in the last 28 days of infection. The change in the percent of plasmid-bearing cells was also non-monotonic, with a decline between days 0 and 28 and between days 56 and 84 but an increase between days 28 and 56 and stable levels between days 84 and 112 (**Figure 3**C); the latter stable frequency of plasmid-bearing cells contrasted with the increase in the total number of bacteria in the last 28 days of the experiment (**Figure 3**B). This non-monotonic dynamics of the percent of plasmid-bearing cells was unexpected because we observed a monotonic decline in the percent of plasmid-bearing cells in vitro (**Figure 2**D) and previous studies with H37Rv-pBP10 in B6 mice also documented a monotonic loss of the plasmid with time [16, 21]. By using the regression method of Gill *et al*. [16] (**eqns. (1)–(2)**) with the estimated *s* = 0.10, we calculated the rates of HN878 replication and death in the 4 time periods (0-28, 28-56, 56-84, and 84-112 days, **Figure 3**B-C). The analysis predicted rapid Mtb replication in the first 4 weeks of infection, with *r*_{1} = 1.04/day and a relatively high death rate *δ*_{1} = 0.65/day, but predicted negative rates in the second and fourth time periods. This was driven by a high increase in the percent of plasmid-bearing cells (or their somewhat stable level) in these time intervals.
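The origin of the negative estimates is easy to reproduce numerically. The sketch below assumes the one population relations *slopeF* = *rs* and *slopeN* = *r* − *δ*; the first calculation uses the reported *r*_{1} and *δ*_{1}, whereas the counts and fractions in the second calculation are invented solely to mimic an interval in which the plasmid-bearing fraction rises.

```python
import math

S = 0.10  # segregation probability estimated for HN878-pBP10

def regression_rates(f_start, f_end, n_start, n_end, dt, s=S):
    """Two-point regression estimate of (r, delta) for one interval."""
    slope_f = -(math.log(f_end) - math.log(f_start)) / dt
    slope_n = (math.log(n_end) - math.log(n_start)) / dt
    r = slope_f / s
    return r, r - slope_n

# Slopes implied by the reported first-interval estimates.
implied_slope_n = 1.04 - 0.65   # net growth rate of total bacteria, per day
implied_slope_f = 1.04 * S      # decline rate of the plasmid-bearing fraction, per day

# Hypothetical interval: the fraction doubles while total counts fall 10-fold.
r_neg, delta_neg = regression_rates(0.05, 0.10, 1.0e8, 1.0e7, 28.0)
```

Any interval in which the plasmid-bearing fraction increases gives a negative *slopeF* and therefore a negative replication rate, which no choice of *s* can repair.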

### Mathematical models including two or more subpopulations of Mtb with different replication kinetics are required to explain experimental data in rabbit lungs

One of the limitations of the linear regression method of Gill *et al*. [16] is that to estimate the rates of Mtb replication and death one must use a pair of sequential data points. When the percent of plasmid-bearing cells increases with time, the method provides negative estimates for the Mtb replication and/or death rate, which is biologically unrealistic. We have previously developed an approach to fit the model of Mtb replication and death to the data for the whole time period while assuming that the model parameters depend on the time period [21, see **eqns. (3)–(6)** and **eqn. (7)**]. In this approach we can restrict the parameters of the model to be non-negative, and the quality of the model fit to the data can be evaluated. There are different ways in which this model (**eqns. (3)–(4)**) can be fitted to data, and we have previously fitted the model predictions to the number of total bacteria and plasmid-bearing bacteria in murine lungs [21]. An alternative approach is to fit the model to the data on the total number of bacteria and the percent of plasmid-bearing cells in the population using a likelihood approach (**eqns. (8)–(9)**). In our following analyses we chose this latter, likelihood-based approach as it allows us to penalize models that do not accurately describe changes in the percent of plasmid-bearing cells.

We then fitted the basic model of Mtb replication and death with time-dependent parameters to the data while constraining the parameters to be non-negative. Interestingly, the model could relatively well describe the dynamics of the total number of bacteria and of plasmid-bearing bacteria (**Figure 4**A); however, the model did not accurately match the change in the percent of plasmid bearing cells in two time periods (**Figure 4**A) and predicted no Mtb replication between 28 and 56 days post infection. Taken together, two alternative approaches that we and others have used to accurately describe the dynamics of pBP10-containing H37Rv strain of Mtb in B6 mice could not explain the data on HN878-pBP10 growth dynamics in rabbit lungs.

There may be several potential reasons for an increase in the percent of plasmid-bearing cells between 28 and 56 days post-infection. One possibility is an experimental error in measuring the percent of plasmid-bearing cells. However, because measurements of the percent of plasmid-bearing cells have relatively low variance (with the exception of data for 84 days post-infection), we believe this explanation is unlikely. Another possibility is that plasmid-free cells may acquire the plasmid from the environment by transformation. However, given the difficulty of transforming plasmids into Mtb strains, we also do not believe this is a viable explanation. Finally, it is possible that our mathematical modeling approach makes assumptions that break down for Mtb growth dynamics in rabbit lungs; specifically, the assumption that all bacterial cells replicate and die at the same rates within the host. Indeed, it has been previously observed that Mtb infection of rabbits results in the formation of heterogeneous granulomas, special structures of bacteria and various host cells [25, 28, 31, 39–41]. It is possible that the Mtb dynamics may not be fully synchronized in different granulomas, and indeed, PET/CT imaging has suggested that in monkeys, different Mtb granulomas can have asynchronous growth dynamics [29].

Therefore, we extended our initial model to include two or three bacterial sub-populations with different replication and death rates (**Figure 1**) and fitted different versions of such a model to the data on HN878-pBP10 dynamics in rabbits. Modeling each additional sub-population requires at most 10 parameters (2 parameters for the initial condition and 8 parameters for replication and death rates for four time intervals), so it is easy to overfit the model to the data. We therefore considered the model with two sub-populations while constraining the parameters for the two sub-populations to be the same as much as possible. Interestingly, such a model could describe well the experimental data on the dynamics of all bacteria and plasmid-bearing bacteria, and it could reasonably well predict the increase in the percent of plasmid-bearing cells between 28 and 56, and 84 and 112 days post infection in rabbit lungs (**Figure 4**B,D) with only 13 parameters (including 2 parameters for the errors *σ*_{N} and *σ*_{F}). According to the model fit, the Mtb population consists of two sub-populations of bacteria. One sub-population rapidly replicates early during the infection ( and ) and thus few bacteria in this sub-population have the plasmid. This sub-population is then controlled by immunity and contracts while still dividing ( and ) and further losing the plasmid over time (**Figure 4**F). Another sub-population replicates more slowly early in infection, and many bacteria still have the plasmid by 56 days post-infection (**Figure 4**F). Between 56 and 84 days, this second sub-population replicates rapidly ( and day), resulting in the loss of the plasmid in the population, but because total numbers of bacteria still decline, rapid replication is balanced by a relatively high death rate. Finally, in the last time interval, continued growth of the second sub-population ( and ) results in an increase in the number of bacteria in the lung while slightly increasing the percent of plasmid-bearing cells in the population. Therefore, the fit of the model with two sub-populations is significantly better than that with one kinetically homogeneous population ( and ΔAIC ≫ 10).

There are clear reasons why at least two sub-populations with different replication kinetics are required to accurately describe the change in the number of bacteria and the percent of plasmid-bearing cells. First, the model must explain the large decline in the percent of plasmid-bearing cells and the large increase in the total number of bacteria in the lung during the first 28 days (**Figure 4**B&D). One population of bacteria can explain such data (**Figure 4**A&C). Then, to explain the increase in the percent of plasmid-bearing cells while the total number of bacteria declines between 28 and 56 days post infection, one must assume that the initial dominant population contracts while another population with a higher frequency of plasmid-bearing cells (due to fewer divisions) becomes more dominant. A large decline in the percent of plasmid-bearing cells together with a decline in the total number of cells between 56 and 84 days post-infection implies rapid cell division that must be counterbalanced by a high death rate (**Figure 4**B,D,F). Finally, a large increase in the number of bacteria with a relatively constant percent of plasmid-bearing cells between 84 and 112 days post-infection suggests expansion of previously poorly replicating/subdominant sub-populations with a high initial frequency of plasmid-bearing cells.

The model with two sub-populations can describe the Mtb growth dynamics data in rabbit lungs relatively well. However, the model's predictions of the dynamics of plasmid-bearing cells seem puzzling, with a sharp peak at day 56 and a sharp nadir at day 84, suggesting that the model may not be fully biologically realistic (**Figure 4**D). We therefore investigated whether a model with three sub-populations may describe the data more reasonably. This was a challenging exercise since we started with a model with 30+ parameters that clearly overfitted the data. By constraining many of the rates to be the same between different time periods (e.g., as we have done previously [21]) as well as by assuming the same initial numbers of cells in different sub-populations, we could find sets of parameters that fitted the data better than all our previous models but with a relatively small number of parameters (e.g., in **Supplemental Figure S3** we used 13 parameters). In this model, the third population, which was declining until day 84, was needed to explain a large increase in the total number of plasmid-bearing bacteria while maintaining a relatively high frequency of plasmid-bearing cells by 112 days post-infection (**Supplemental Figure S3**C). Interestingly, while the three-population model fitted the data statistically better than either a single-population model (, *p* ≪ 0.001) or a two-population model (ΔAIC > 10), it still predicted non-monotonic dynamics of plasmid-bearing cells after 56 days post-infection (**Supplemental Figure S3**A&B). This result suggests that a smoother model prediction of the dynamics of plasmid-bearing cells may require more sub-populations and/or more parameters.

### Alternative analyses to check robustness of our conclusions

When fitting models with two or three sub-populations, we typically started with many parameters and fitting such a model to data often resulted in overfitting. Obviously, there are many different ways to reduce the number of model parameters and we could come up with several different parameter sets that could reasonably well describe the data (results not shown). This exercise suggested that the specific parameters inferred by fitting the model to data need to be interpreted with caution.

We noted that at the last experimental time point (112 days post-infection) Mtb growth dynamics in rabbit lungs followed a divergent pattern: in two animals total CFU numbers remained stable at about 10^{8} bacteria, and in two animals they exploded to about 10^{10} bacteria per lung (**Figure 4**A-B). We reasoned that describing such dynamics using average parameters may not be fully appropriate. Therefore, we repeated our analyses for the data including only measurements up to 84 days. Importantly, we found that a model with a single/homogeneous Mtb population cannot explain Mtb dynamics accurately and that including at least 2 sub-populations with different replication/death rates is required to adequately describe the data (**Supplemental Figure S4**).

In our experiments we found that the plasmid segregation coefficient (*s* = 0.10) is lower than that measured in H37Rv in a previous study (*s* = 0.18). Previously we have shown that the value of the segregation coefficient is critical in determining the actual values of Mtb replication and death rates [21]. We therefore investigated whether a higher segregation coefficient is consistent with our experimental data on Mtb growth dynamics in rabbits. Interestingly, the model with three sub-populations and *s* = 0.18 was able to explain the Mtb growth dynamics with the same quality as the model with *s* = 0.10 (**Supplemental Figure S5**). Importantly, however, an increase in the segregation coefficient resulted in lower estimated Mtb replication and death rates (e.g., *r* = 0.58/day and *δ* = 0.15/day for the first sub-population in the first 28 days post-infection), as expected [21]. Therefore, it is critical to make sure that estimates of the segregation coefficient *s* are as precise as possible.
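
A back-of-the-envelope check shows why the estimated replication rate falls when *s* rises. The sketch assumes that the plasmid-bearing fraction depends on *r* and *s* only through the product *s* × *r* (for example, if it decays as exp(−*s* *r* *t*)), so the same plasmid data imply a replication rate inversely proportional to *s*. It also assumes that the *r* = 1.04/day quoted elsewhere in the text is the *s* = 0.10 estimate for the same sub-population and time interval. The function name is hypothetical, and the published estimates come from refitting the full model, not from this rescaling.

```python
def rescale_replication_rate(r_old, s_old, s_new):
    """Replication rate giving the same plasmid-loss rate s * r under a new s.

    Assumption: the plasmid-bearing fraction constrains only the product s * r.
    """
    return r_old * s_old / s_new


r_new = rescale_replication_rate(r_old=1.04, s_old=0.10, s_new=0.18)
print(f"r at s = 0.18: {r_new:.2f}/day")  # 0.58/day
```

The rescaled value agrees with the *r* = 0.58/day reported above for *s* = 0.18, which is consistent with the product *s* × *r* being the quantity that the plasmid data constrain.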

### Identifying conditions to inform replication history of Mtb in individual granulomas of rabbits

Our finding that experimental data on Mtb dynamics in rabbits can be better explained by models of heterogeneous sub-populations suggests that tracking Mtb dynamics in individual lesions may help explain why the number of bacteria recovered from different lesions varies by orders of magnitude [29]. However, because it is currently believed that individual lesions are likely to be formed by single bacteria (e.g., [24, 29, 42]), only lesions established by plasmid-bearing cells would inform about Mtb replication and death rates in the lesion. To investigate whether typical sampling of Mtb in individual lesions in rabbits 3-4 weeks post-infection (21-28 days) is informative, we ran Gillespie simulations of Mtb dynamics starting with one plasmid-bearing cell and varying the rate of Mtb replication *r* and death *δ* (**Figure 5** and **Supplemental Figure S6**).

Consistent with our recent results [24], starting infection with a single bacterium often results in extinction. For example, with *r* = 1.04/day and *δ* = 0.64/day, only 405/1000 ≈ 40% of runs resulted in lesions with Mtb counts above 0, and only 30% of the runs resulted in lesions containing plasmid-bearing cells above the limit of detection (**Figure 5**A-B). Most extinctions occurred when Mtb counts were low, so they would not be detectable experimentally. The dynamics of plasmid-bearing cells was also highly variable between individual runs, with about 25% of runs (at *r* = 1.04/day and *δ* = 0.64/day) resulting in the loss of the plasmid from the population within 28 days (**Figure 5**C). Importantly, while the total number of bacteria remained relatively constant for these parameter combinations (*r* − *δ* = 0.4/day), the percent of plasmid-bearing cells was progressively lower for a higher Mtb replication rate, consistent with the prediction of the deterministic model (**Figure 5**D and **eqn. (6)**). Furthermore, simulations with an increased bacterial replication rate and a constant death rate resulted in a progressively lower percent of plasmid-bearing cells (**Supplemental Figure S6**A&C); however, keeping the bacterial replication rate constant while varying the death rate predicted a similar percent of plasmid-bearing cells (**Supplemental Figure S6**B&D). This result suggests that measuring the percent of plasmid-bearing cells in individual granulomas of rabbits should allow estimation of the rates of Mtb replication and death, and thus should provide information on why some lesions contain many bacteria and others few. Measuring the lesion-localized T cell response may also help clarify how immunity influences Mtb replication and death rates.
Additionally, by measuring the frequency of granulomas containing plasmid-bearing cells (and comparing this number with the percent of plasmid-bearing cells shortly after aerosol infection), we should be able to estimate the average rate of Mtb replication in the whole animal.
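
The simulation setup described above can be sketched as follows. This is a plain-Python illustration, not the R/GillespieSSA2 code used for the paper. The reaction scheme is an assumption, since the model equations are not restated in this section: every cell divides at rate *r* and dies at rate *δ*, and a dividing plasmid-bearing cell yields one plasmid-free daughter with probability *s*. The function names are hypothetical, and the time horizon is shortened to 10 days so that the example runs quickly.

```python
import random


def gillespie_lesion(r, delta, s, t_end, rng):
    """Simulate one lesion founded by a single plasmid-bearing cell.

    Returns (P, F): plasmid-bearing and plasmid-free cell counts at t_end.
    Assumed reactions: every cell divides at rate r and dies at rate delta;
    a dividing plasmid-bearing cell gives one plasmid-free daughter with
    probability s.
    """
    p, f, t = 1, 0, 0.0
    while p + f > 0:
        t += rng.expovariate((r + delta) * (p + f))  # time to the next event
        if t > t_end:
            break
        is_plasmid_cell = rng.random() < p / (p + f)  # which cell type reacts
        divides = rng.random() < r / (r + delta)      # division or death
        if divides:
            if is_plasmid_cell and rng.random() >= s:
                p += 1  # both daughters keep the plasmid
            else:
                f += 1  # plasmid-free parent, or a segregation event
        elif is_plasmid_cell:
            p -= 1
        else:
            f -= 1
    return p, f


def survival_fraction(r, delta, s, t_end, n_runs, seed=1):
    """Fraction of runs with at least one cell alive at t_end, plus all runs."""
    rng = random.Random(seed)
    runs = [gillespie_lesion(r, delta, s, t_end, rng) for _ in range(n_runs)]
    return sum(1 for p, f in runs if p + f > 0) / n_runs, runs


if __name__ == "__main__":
    frac, _ = survival_fraction(r=1.04, delta=0.64, s=0.10, t_end=10.0, n_runs=200)
    print(f"fraction of non-extinct lesions at day 10: {frac:.2f}")
```

For a linear birth-death process started from one cell, the long-run survival probability is 1 − *δ*/*r*, which is about 0.38 for *r* = 1.04/day and *δ* = 0.64/day, close to the ≈ 40% of non-extinct runs reported above.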

## Discussion

In this study we followed the kinetics of replication of a pathogenic Mtb strain HN878 containing an unstable, “replication clock” plasmid pBP10 in rabbit lungs. We determined that this plasmid is lost from HN878 in vitro with a segregation coefficient *s* = 0.10 (**Figure 2**), suggesting that the plasmid is more stable in this strain than in H37Rv [16]. Interestingly, the percent of plasmid-bearing cells in rabbit lungs did not decline monotonically, as would be expected from previous studies [16, 21], but instead increased between 28 and 56 days post-infection (**Figure 3**). We showed that a previously suggested model assuming a single, kinetically homogeneous Mtb population is not consistent with these data, and that including two or three sub-populations with different replication/death kinetics fits these data more accurately (**Figure 4** and **Supplemental Figure S3**). Stochastic simulations suggested that measuring the percent of plasmid-bearing cells in individual granulomas of rabbits should allow estimation of Mtb replication and death rates in the granuloma (**Figure 5** and **Supplemental Figure S6**), and thus should provide a quantitative explanation of why different granulomas have dramatically different numbers of viable bacteria.

As far as we are aware, our study is the first to provide quantitative estimates of how quickly the pathogenic Mtb strain HN878 replicates in rabbit lungs after aerosol infection. It is difficult to directly compare our estimates with those published previously (e.g., [16, 21]) given that we found that previous models do not adequately describe our experimental data. The highest replication rate we estimated was *r* = 1.04/day, higher than that for H37Rv in mice (*r* = 0.72/day in McDaniel *et al*. [21]). Other methods have been used to evaluate how quickly Mtb replicates in vivo, including cell chromosome equivalents or the RS ratio [17, 22, 29]; however, these methods remain qualitative and do not provide rigorous estimates of Mtb replication and death rates, particularly in heterogeneous in vivo granulomas. Future studies may need to compare Mtb replication kinetics using these different methods to identify which methods are most accurate.

Our work has several limitations. The estimated segregation coefficient *s* of the plasmid pBP10 in the HN878 strain differed slightly between the two culturing conditions, and the plasmid loss kinetics changed over time (**Figure 2**). This could be due to the inherent nutritional composition of the growth media used (i.e., full-strength versus diluted 7H9 media). Additional experiments to measure the instability of this plasmid in more diverse conditions in vitro would be useful.

We detected an increase in the percent of plasmid-bearing cells between 28 and 56 days post-infection and interpreted it as being due to heterogeneous dynamics of Mtb in rabbits. However, other possibilities exist. For example, measurements of the percent of plasmid-bearing cells have experimental errors, and the increase could arise from such errors. However, given that for each sample the percent of plasmid-bearing cells is measured twice and that we found a consistent percent of plasmid-bearing cells between different animals, we believe that this explanation is unlikely. Alternatively, plasmid pBP10 may theoretically be taken up by plasmid-free cells that would then convert into plasmid-bearing cells. However, given that clinical isolates do not carry plasmids and given the difficulty of transforming plasmids into Mtb, this is also an unlikely hypothesis.

To have a large population of plasmid-bearing cells, we infected rabbits with a relatively high dose of Mtb that likely exceeds the dose that humans are typically exposed to when acquiring a stable infection [24]. How the dynamics would proceed at lower doses is uncertain, although we made model-driven predictions on how Mtb replication and death rates may influence the size and composition of the Mtb population replicating in individual granulomas of rabbits (or other animal species) starting with a single plasmid-bearing cell. While we did use stochastic simulations, we did not derive analytical predictions of how Mtb replication/death rates would impact the distribution of the total number and the percent of plasmid-bearing cells in heterogeneous granulomas. This remains an important direction of our research.

Our work opens avenues for future research on Mtb growth dynamics in TB granulomas. Transforming the pBP10 plasmid into other clinical Mtb isolates would allow comparison of how these strains replicate in various animal models. The percent of plasmid-bearing cells indicates the cumulative replication history of bacteria, while other metrics such as the RS ratio are instantaneous measures of Mtb replication [22]. Establishing a quantitative link between these two metrics would allow a better understanding of the processes regulating the size and complexity of individual granulomas in vivo. By measuring the dynamics of plasmid loss in individual granulomas in rabbit lungs we should be able to determine whether the presence of more rapidly dividing bacteria in a granuloma is related to the location of that granuloma in the lung; in humans, TB patients typically have cavities in the upper and back parts of the lung, and the importance of lesion location for TB in rabbit lungs has been highlighted before [43–45]. Considering that the rabbit model of pulmonary Mtb infection can produce cavitary granulomas in the lungs, correlating Mtb replication kinetics in individual granulomas in control and vaccinated animals can help identify immune components that control Mtb growth, helping with development of the next generation of TB vaccines.

## Data sources

The data from the paper along with the code are available on GitHub: https://github.com/vganusov/rabbits_replication_clock/tree/main.

## Code sources

All analyses were primarily performed in Mathematica (ver 12). Stochastic (Gillespie) simulations were done in R using the GillespieSSA2 library.

## Ethics statement

Specific pathogen-free, female New Zealand White rabbits of 2.3 to 2.6 kg body weight were purchased from Covance (Envigo, Indianapolis, IN, USA). Each rabbit was individually housed with food and water consumption ad libitum. Animals were handled humanely according to the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC) and the United States Department of Agriculture (USDA) guidelines. All animal procedures involving Mtb, including infection and necropsy, were performed in Biosafety Level-3 (BSL-3) facilities according to the protocols approved by the Institutional Animal Care and Use Committee (IACUC) of Rutgers University.

## Author contributions

VVG and SS conceived the overall concept of the study. In vitro experiments were performed by AK under supervision of SS. In vivo experiments were performed by SS. Data analysis and mathematical modeling were done by VVG. VVG wrote the first draft of the paper and all authors read, edited, and agreed on the final version.

## Supplemental Information

## Acknowledgments

We would like to thank David Sherman for providing Mtb strain HN878-pBP10 for our experiments. This work was supported in part by the NIH/NIAID grants R01AI158963 to VVG and R01AI127844 to SS.

## Abbreviations

- TB: tuberculosis
- Mtb: *Mycobacterium tuberculosis*
- LOD: limit of detection
- CFU: colony forming units
- CEQ: chromosomal equivalents