Estimating the probable cause of recurrence in Plasmodium vivax malaria: relapse, reinfection or recrudescence?

Background Relapse of Plasmodium vivax infection is the main cause of vivax malaria in many parts of Asia. However at the individual patient level, recurrence of a blood stage infection following treatment within the endemic area can be either a relapse (from dormant liver-stage parasites), a recrudescence (blood-stage treatment failure), or a reinfection (following a new mosquito inoculation). Each requires a different prevention strategy, but previously they could not be distinguished reliably. Time-of-event and genetic data provide complimentary information about the cause of P. vivax recurrence, but the optimum approach to genotyping and analysis remains uncertain. Methods Individual-level data from two large drug trials in acute vivax malaria patients (Vivax History: VHX; Best Primaquine Dose: BPD) conducted on the Thailand-Myanmar border with follow-up of one year were pooled (n=1299). A total of 710 isolates from both acute and recurrent P. vivax episodes were genotyped using 3-9 highly polymorphic microsatellite markers. These pooled data were analyzed using a novel population statistical model incorporating an assessment of genetic relatedness, treatment drug administered, and the time-to-recurrence. Results 99% of genotyped recurrences in individuals who did not receive primaquine (n=365) were estimated to be relapses. In comparison, 14% of genotyped recurrences (n=121) were estimated to be relapses following high-dose supervised primaquine. By comparing episodes across individuals (90194 comparisons), the false-positive rate of relapse identification using genetic data alone was estimated to be 2.2%. We estimated the true failure rate after high-dose primaquine (7mg/kg total dose) to be 2.6% in this epidemiological context, substantially lower the reinfection unadjusted estimate of 12%. Simulation studies show that 9 highly polymorphic microsatellite markers suffice to discriminate between recurrence states. Drug exposures reflected by plasma carboxy-primaquine concentrations were not predictive of treatment failure, but did identify non-adherence. Conclusion Using this novel statistical model, relapse of P. vivax malaria could be distinguished reliably from reinfection. This showed that in this population supervised high-dose primaquine could avert up to 99% of relapses. In low transmission settings, microsatellite genotyping combined with time-to-event data can accurately discriminate between the different causes of recurrent P. vivax malaria. Author summary One hundred years ago, Plasmodium vivax, the most globally diverse cause of human malaria, was present across most of the old-world, throughout tropical and temperate climes. Its main evolutionary advantage over Plasmodium falciparum, responsible for the most deadly type of human malaria, is its ability to stay dormant in the liver, emerging weeks to years later causing recurring illness and continuing transmission. The dormant liver-stage parasites are called hypnozoites. A recurrent infection can either be hypnozoite-derived (a relapse), a blood-stage treatment failure (recrudescence), or from a new mosquito bite (a reinfection). At a population level, each requires a different preventative strategy but no widely applicable methodology existed to discriminate between the three possible states: relapse, recrudescence and reinfection. Parasite genetic data alone cannot resolve the different possibilities. We developed a novel probabilistic framework which uses both epidemiological and genetic data to determine the most likely cause of vivax recurrence at the individual level. We applied this method to the largest available pooled clinical trial data from Southeast Asia incorporating information from highly polymorphic microsatellite markers, drug treatments received and time-to-recurrent illness. This analysis provides a tool for estimating probabilities of relapse, recrudescence and reinfection and, importantly, shows that hypnozoiticidal high-dose primaquine radical cure is much more effective than previously thought.

Plasmodium vivax is the most geographically widespread Plasmodium spp. that causes 3 human malaria, with an estimated 2.5 billion people at risk of infection [1]. According 4 to the 2017 WHO World Malaria Report, there were an estimated 6-11 million cases of information which complements the genetic data. 67 The inability to distinguish between relapses and reinfections has several 68 consequences for the understanding of the epidemiology of P. vivax malaria and the 69 assessment of treatment efficacy. First, routinely collected case data cannot be used to 70 estimate the force of infection, which is an important measure of transmission intensity 71 and a key metric for disease surveillance. In particular, the relationship between force of 72 infection and the number of P. vivax cases is non-linear because of relapses [27]. Second, 73 the inability to distinguish between relapses and reinfections means that in endemic 74 populations, estimates of the efficacy of radical curative drugs (i.e. primaquine, the only 75 radical cure currently available, and tafenoquine, approved in some countries but yet to 76 be deployed in endemic areas) in populations at continued risk of infection will be 77 biased downwards by the background transmission rate. Finally, the inability to 78 distinguish between relapse and recrudescence (recurrence of the blood-stage infection 79 as result of blood-stage treatment failure) means that it is difficult to estimate the 80 curative efficacy of a blood-stage treatment in endemic areas [28]. This contrasts with 81 Plasmodium falciparum malaria where standardised genotyping is used effectively to 82 distinguish recrudescence from reinfection, allowing for the adjustment of efficacy 83 estimates in populations at continued risk of reinfection [29][30][31]. 84 In this study, we combined genetic information with longitudinal epidemiological 85 data from two clinical trials of antimalarials in patients infected with P. vivax. This was 86 used to test a novel methodological framework that can discriminate between the causes 87 of recurring P. vivax infections. We estimated the probabilities of each of these causes: 88 relapse, reinfection and recrudescence, using a generative Bayesian model that combined 89 data on the time-to-recurrence, the drug treatment (taking into account the varying 90 pharmacokinetics and pharmacodynamics of antimalarial drugs), and parasite genetics 91 as characterized by genotyping 3 to 9 highly polymorphic microsatellite markers (repeat 92 length polymorphisms). This analysis pooled data from over 1200 patients collected in 93 two randomized controlled trials on the Thailand-Myanmar border, comparing 94 schizontocidal and hypnozoiticidal drugs for the treatment of P. vivax malaria, both 95 with one year follow-up of patients. We showed that by combining P. vivax genotyping 96 and time-to-event information relapses could be distinguished reliably from reinfections 97 in this low transmission setting.

99
In the VHX study patients of all ages were randomised to receive either artesunate, 100 chloroquine or chloroquine and primaquine [4]. In the BPD study patients were 101 randomised first to receive either chloroquine or dihydroartemisinin-piperaquine and 102 second to receive either 14 days of primaquine 0.5mg/kg/day or 7 days of primaquine 103 1mg/kg/day [32]. All doses were supervised in both studies. These individual 104 patient-level pooled data contained a total of 2708 time-to-recurrence intervals for 1299 105 patients (more than 1000 patient-years of combined follow-up time including right  Table 1. Mean (95% credible intervals) posterior probability estimates of overall recurrence states in the different treatment arms. All estimates have been rounded up to one decimal place. Blood stage drug: chloroquine or dihydroartemisinin-piperaquine. 114 Because of the periodicity of relapse, and the rarity of recrudescence emerging more 115 than two months after administration of efficacious treatments, the interval between 116 successive episodes of P. vivax was shown to be highly informative. We applied a 117 Bayesian population time-to-event mixture model to the 2708 time-to-recurrence 118 intervals. Mean posterior probabilities of the recurrence states (recrudescence, relapse, 119 and reinfection) generated under this model varied over 3 orders of magnitude as a 120 function of time since last episode and treatment drug (Fig 1). Following high-dose 121 primaquine and partner drug, a recurrence in the first few months had a high likelihood 122 of being a relapse, and subsequent recurring infections were almost entirely caused by 123 reinfections (Fig 1). 124 The posterior uncertainty intervals for the individual probabilities of relapse for each 125 recurrence are shown in Fig 2. For approximately 75% of the recurrences observed after 126 treatment without high-dose primaquine, the posterior distributions were extremely 127 narrow with the probabilities of relapse very close to 1 (Fig 2, top left). The remaining 128 25% all had relapse posterior probabilities greater than 0.3 but with wide credible 129 intervals. For the recurrences observed after high-dose primaquine, approximately 15% 130 had mean probabilities greater than 0.1 of being relapses and the remaining 85% had 131 mean probabilities less than 0.1 of being relapses (Fig 2, top right panel). In both cases, 132 the time-to-recurrence is correlated with the posterior uncertainty (Fig 2, bottom 133 panels). The least uncertainty was observed around the peak expected timing of relapse 134 following treatment. This is dependent on whether a slowly eliminated blood-stage drug 135 was administered (chloroquine or dihydroartemisin-piperaquine) or a rapidly eliminated 136 (artesunate monotherapy), irrespective of whether high-dose primaquine was added (Fig 137  2, D). 138 The mean estimates of the recurrence states averaged over all observed recurrences 139 in the pooled time-to-event analysis are given in Table 1. Overall, after supervised 140 high-dose primaquine in this epidemiological context, the model estimated that ≥ 90% 141 of the recurrences are reinfections, as compared to less than 4% when radical cure is not 142 given. There was little evidence of recrudescence for all treatments considered. These 143 results are consistent with previous modelling results from the same area [33]. 144 This pooled analysis confirms the high periodicity of relapse for the frequent relapse 145 phenotype present in Southeast Asia [6]. Some previous statistical models of vivax  (dots). Colours correspond to the treatment used in the previous episode, where Primaquine+ refers to primaquine with partner drug. D: population mean posterior probabilities (normal scale) for the three recurrences types as a function of time-to-recurrence following high-dose primaquine with partner drug (PMQ+) and no primaquine but a slow acting blood-stage drug such as chloroquine (No PMQ).

Fig 2.
Individual probabilities of relapse based from time-to-event pooled analysis of the VHX and BPD studies. Top panels: per-episode mean probability of relapse along with 95% credible intervals (shaded grey) for 1309 recurrences after no primaquine (left) and for 130 recurrences after high-dose primaquine (right). The recurrences are ranked by their mean probabilities of relapse state. A zone of 'uncertainty' (same as in Fig 3) is highlighted in pink. The upper and lower bounds are arbitrary. Bottom panels: the relationship between time since last episode and treatment, and the uncertainty of the posterior estimates (width of the 95% credible interval on the log 10 scale) after no primaquine (no PMQ, left) and after high-dose primaquine plus partner drug (PMQ+, right). The black lines represent fitted LOESS curves to highlight trends. mixture of distributions would fit an 'activation' hypothesis, as given by [6], where 152 relapses could generally emerge at a constant rate with a superimposed illness induced 153 feedback (i.e. fever itself activates hypnozoites), thus increasing the probability of 154 activation and creating periodicity.

155
Combining time-to-event and genetic information 156 In order to obtain more informed individual-level estimates of each recurrence state we 157 incorporated information from highly polymorphic microsatellite marker data. We  primaquine groups, respectively. Overall, the vast majority of recurrent episodes for 177 which we had genetic data had low uncertainty in the probabilities of their recurrence 178 state (the uncertainty is shown by the vertical lines in Fig 3). We note that trial 179 summaries based on probabilities of the individuals who did not receive high-dose 180 primaquine (all were in the VHX study and the majority received chloroquine 181 monotherapy) are biased by selective genotyping of individuals with the highest number 182 of recurrences. These are presumably the individuals with the largest number of liver 183 hypnozoites and so will not be representative of the general enrollment population. In 184 particular, there are likely competing risks between relapsing infections and reinfections. 185 Relapsing infections will usually reach patency before reinfections, and if they occur 186 simultaneously then the genetic signature could point to relapse. Thus frequent relapse 187 likely hides reinfection. Under the genetic model, we made the simplifying assumption 188 that recurrence states are mutually exclusive.

189
Of particular interest are two recurrences in two separate patients which were 190 classified with high certainty as relapses. Both occurred after a 300 day infection free 191 interval (Fig 3, bottom panel). One individual had received high-dose primaquine and 192 the other had not (Fig 4).

194
We estimated the false-positive discovery rate of the genetic model of recurrence by 195 calculating the probabilities of the relapse state when comparing isolates from episodes 196 in separate individuals. This resulted in 90194 pairwise comparisons with data truly 197 generated under the null distribution: i.e. the parasites in the pairwise comparisons are 198 known to derive from different people and thus cannot be relapses, and genetic data are 199 drawn from the true population distribution. This gave an estimated false positive rate 200 (defined as the probability of relapse greater than the upper bound of an arbitrarily  defined 'uncertainty zone' from 0.3 to 0.7) of 2.2%. Not only does this highlight the 202 discriminatory power of our panel of 9 microsatellites, but it also highlights the 203 considerable population diversity within a small geographic location with low seasonal 204 transmission.

205
Radical cure efficacy of primaquine on the Thailand-Myanmar 206 border 207 We estimated the reinfection adjusted failure rate of high-dose (total dose of 7mg/kg) 208 supervised primaquine to be 2.6% (80% CI: 2.0-3.5), using the genetic and time-to-event 209 model. This was derived from the final model estimate that 15.8% of recurrences 210 following high-dose primaquine in the BPD study were relapses. If we consider the 211 background transmission to be constant over the 4 years studied, this translates into 212 99% of all relapses were averted, decreasing the relapse rate from 3.32 per year to 0.03 213 per year (Table 1). This estimate of number of relapses averted is a function of the 214 individual hypnozoite loads and therefore of the background transmission. These 215 estimates of the radical curative efficacy of high-dose primaquine used all available data 216 from the 655 individuals enrolled in the BPD study, which had requisite genetic data for 217 all but 5 of 92 recurrences. This reinfection adjusted estimate is a considerable 218 reduction from the original reinfection unadjusted estimate of the failure rate at 12% 219 (80% CI: [10][11][12][13][14] [32].

220
To investigate these results further, we assessed the contribution of individual patient 221 drug exposures by examining the relationship between the day 7 trough concentrations 222 of carboxy-primaquine (the slowly eliminated inactive metabolite of primaquine), and 223 treatment failure (defined as a probability of relapse or recrudescence greater than 0.5), 224 adjusted for primaquine regimen administered (either 14 daily doses of 0.5mg/kg or 7 225 daily doses of 1mg/kg). A statistically significant trend was observed, but this was 226 driven by a few outliers, defined as episodes in which the plasma carboxy-primaquine 227 trough concentrations were more than 3 standard deviations below the mean 228 (supplementary Fig 9). Concentrations this far below expected values are likely to reflect 229 incomplete drug absorption resulting from protocol deviations (e.g. non adherence, 230 vomiting). After adjusting for these outliers, there was no statistically significant 231 relationship between drug exposure and radical cure failure. This result suggests that it 232 is possible to discriminate between drug failures due to biological mechanisms (e.g. high 233 hypnozoite load, cytochrome P450 2D6 polymorphisms, intrinsic drug resistance, etc.) 234 and drug failures because of vomiting the medication or non-adherence. This is   Posterior probabilities of recurrence states as the number of microsatellites typed is increased Each plot corresponds to a different relationship simulation scenario. Coloured bars show the median posterior probabilities with error bars extending ± one standard deviation. The effective cardinality of each marker was set equal to 13 in these simulations. The COIs in the first and second infections were 1 and 1 respectively. The prior probability of each recurrence state was 1 /3. As the number of microsatellites genotyped is increased, the following are expected. 1) Under the clone scenario, the probability of recrudescence should converge a probability greater than the prior and the complement of that of relapse; while the probability of reinfection should converge to 0. 2) Under the sibling scenario, the probabilities should converge to 1 for relapse and 0 otherwise. 3) Under the strange scenario, the probability of reinfection should converge to a probability greater than the prior and the complement of that of relapse; while the probability of recrudescence should converge to 0.
an exact clone, a sibling, or a stranger with respect to the primary infection. probabilities. When the simulated data had sibling relationships (exclusive evidence of 254 relapse) nine markers or more were needed to obtain a median posterior probability of 255 relapse close to one. These simple simulations suggest that reliable posterior estimates 256 of the unknown recurrence state can be obtained with approximately 9 markers of 257 effective cardinality equal to 13.

259
Distinguishing recrudescence, relapse and reinfection in recurrent vivax malaria is 260 necessary for optimum planning of malaria control and elimination interventions as each 261 recurrence state has a different implication for case management. To date, there are no 262 available host or parasite biomarkers of the different recurrence states. Our approach is 263 to our knowledge the first principled attempt to use treatment information along with 264 time-to-event and genetic data to estimate individual probabilities of the possible 265 recurrence states. The most important operationally relevant result from this analysis is 266 a re-evaluation of the estimated radical cure failure rate following high-dose primaquine. 267 We re-estimated the true failure rate in the BPD study to be 2.6% as compared to the 268 published reinfection unadjusted estimate of 12% [32]. The radical curative efficacy of 269 primaquine is not a fixed property -even when reinfections are discounted, it depends 270 on hypnozoite burden and thus the background level of transmission [6,34] -so these 271 results also probably reflect recent improvements in malaria control in the area [36].

272
Our analysis reinforces the value of high-dose primaquine radical cure in this setting. 273 We note that the estimated false positive discovery rate for relapse was extremely 274 low (2.2%) when the genetic model alone was tested on isolates from separate patients 275 (null data). This guards against underestimating the efficacy of high-dose primaquine. 276 Additionally, the estimated number of relapses in non-primaquine treated patients was 277 very high (exceeds 98%). This means that the false negative rate for relapse 278 identification in non-primaquine treated patients cannot be more than around 2% and 279 tentatively suggests a similarly low rate for relapses in patients who do receive 280 primaquine, which guards against overestimating the efficacy of high-dose primaquine. 281 Much remains unknown regarding the biology of relapse and the mode of action of Thailand-Myanmar border, where P. vivax exhibits the short-latency phenotype [6,7], 285 we recovered an approximate 60:40 split between early-periodic relapse and constant 286 rate relapse. Whether this truly captures the biology of relapse activation or whether a 287 more complex system operates is uncertain. We also observed two late recurrences (300 288 days since last episode) with high probability (and low estimate uncertainty) of relapse. 289 Our results thus suggest that short-latency hypnozoites can remain dormant in the liver 290 for up to year, confirming previous reports with presumably similar P. vivax phenotypes 291 (the Chesson strain) [5,13]. These late relapsing hypnozoites likely awaken via a 292 different mechanism to that of the highly periodic 'long-latency' P. vivax [6]; the most 293 parsimonious explanation would be that they awake at 'random' (constant rate of 294 relapse).

295
Various statistical and mathematical models of P. vivax exist, ranging from models 296 of within host dynamics to global geostatistical descriptions of The main limitations are poor ability to infer a recrudescent state, and computational 320 complexity. In general, correct classification of recrudescent infections is difficult as at 321 low levels of resistance they will reach patency at similar times as relapsing infections 322 and the genetic signature is the same as a homologous (i.e. isogenic) hypnozoite. Our 323 genetic model is brittle with respect to recrudescence as we do not account for imperfect 324 detection or genotyping errors. Given there is very little evidence of recrudescence in 325 this area from other indicators (i.e. slowing of parasite clearance rates and rising 326 recurrence rates supported by reduced in-vitro susceptibility), this is unlikely to impact 327 on our results, but it would necessitate modification before application to data from a 328 region where P. vivax antimalarial resistance is suspected. The computational 329 complexity of the model is exponential as a function of the number of recurrences and 330 the complexities of the episodes. Its scope is therefore limited to data from a maximum 331 of three episodes (a primary episode and two recurrences) in which the maximum 332 cumulative COI is 6, with approximately 3-6 heteroallelic genotype calls per episode.

333
The majority of P. vivax endemic areas are consistent with the low transmission 334 settings studied here on the Thailand-Myanmar border [1], where over 99% of the single 335 recurrences fit these stipulations (the median COI in both VHX and BPD is 1).

336
However, the method would require modification before application to data from high 337 transmission settings such as Papua New Guinea, where the estimated complexity of a 338 single infection is often greater than 6 [37, 55]. inference whereby genetic complexity is sufficiently low and diversity sufficiently high. 346 A simple simulation study estimated that genotyping 9 highly polymorphic infections are unrelated. Alone, each model is useful but sub-optimal. In combination 370 they provide more informed probabilistic estimates. In this work we combined the two 371 models informally, using the posterior of the time-to-event model as a discrete prior in 372 the genetic model. Using this approach we determined that the radical curative efficacy 373 of supervised high-dose primaquine is considerably higher than previously thought in border.

377
Time-to-recurrence combined with highly polymorphic microsatellite genotyping data 378 provide valuable complimentary information for the inference of relapse, recrudescence 379 and reinfection states of recurrent vivax infections. Karen ethnicity [68]. During the time these studies were conducted, primaquine radical 388 cure treatment was not routinely given to patients in this area.

389
Ethical Approval

390
The BPD study was approved by both the Mahidol University Faculty of Tropical recurrence in order to assess the risks and benefits of radical cure.

407
Subjects were followed daily for supervised drug treatment. Follow-up continued 408 weekly for eight weeks and then every four weeks for a total of one year. Patients with 409 microscopy confirmed P. vivax infections were retreated with the same study drug as in 410 the original allocation. Patients in the artesunate monotherapy or chloroquine 411 monotherapy groups who experienced more than 9 recurrences were given radical 412 curative treatment with the standard primaquine regimen (0.5 mg base/kg/day for 14 413 days).

414
Best Primaquine Dose trial (BPD) 415 Between February 2012 and July 2014, 680 patients older than 6 months were enrolled 416 in a four-way randomized controlled trial simultaneously comparing two regimens of 417 primaquine (0.5 mg/kg/day for 14 days or 1 mg/kg/day for 7 days) combined with one 418 of two blood-stage treatments: chloroquine (25 mg base/kg) or 419 dihydroartemisinin-piperaquine (dihydroartemisinin 7mg/kg and piperaquine 55mg/kg). 420 The aim was to characterize the efficacy and tolerability of a shorter course (7 days 421 rather than 14) of higher dose primaquine, with a second randomization to chloroquine 422 or dihydroartemisinin-piperaquine. All doses were supervised.

423
The inclusion and exclusion criteria for this study were the same as for the VHX 424 study, except for the following: patients were excluded if they were G6PD deficient by 425 the fluorescent spot test, had a haematocrit less than 25%, had received a blood 426 transfusion within 3 months, or could not comply with the study requirements.

427
Follow-up visits occurred on weeks 2 and 4, and then every 4 weeks for a total of one 428 year. Any recurrent P. vivax infections detected by microscopy (same criteria as for 429 VHX) were treated with a standard regimen of chloroquine (25 mg base/kg over 3 days) 430 and primaquine (0.5 mg base/kg/day for 14 days).

431
In both studies, recurrent episodes were detected actively at the scheduled visits by 432 microscopy (lower limit of detection is circa 20 parasites per µL). Patients were 433 encouraged to come to the clinics between scheduled visits when unwell and so some 434 recurrences were detected passively (less than 5%).

435
Microsatellite genotyping 436 Whole blood for complete blood count was collected by venipuncture in a 2mL EDTA 437 tube. Remaining whole blood was frozen at -80C. Plasmodium vivax genomic DNA was 438 extracted from 1 mL of venous blood using automated DNA extraction system 439 QIAsymphony SP (QIAGEN, Germany) and QIAsymphony DSP DNA mini kit 440 (QIAGEN, Germany) according to the manufacturer's instructions. In order to compare 441 primary infections and recurrences genotypic patterns, we genotyped initially using 442 three polymorphic microsatellite loci that provided very clean amplification: no stutter 443 peaks, and on the basis of amplification (i.e. reliability of PCR amplification at the low 444 parasite densities usually found in recurrent infections). These loci were PV 3.27, PV 445 3.502, and PV ms8. PCR amplification was performed following previously described 446 protocols [12,69]. The genotypes of pre-and post-treatment samples were subsequently 447 assigned a crude classification: 'related' versus 'different'. 'Related' genotypes were 448 defined as identical alleles observed at at least two loci out of the three typed. If all 449 alleles at all loci were different, or identical alleles were only observed at one loci, the 450 samples were classified as 'different'. If the paired samples were classified as 'related', six 451 more microsatellite markers were genotyped (PV 1.501, PV ms1, PV ms5, PV ms 6, PV 452 ms7, and PV ms16).

453
For allele calling on the microsatellites, the lengths of the PCR products were 454 measured in comparison to internal size standards (Genescan 500 LIZ) on an ABI 3100 455 Genetic analyzer (PE Applied Biosystems), using GENESCAN and GENOTYPER 456 software (Applied Biosystems) to measure allele lengths and to quantify peak heights.

457
Multiple alleles were called when there were multiple peaks per locus and where minor 458 peaks were > 33% of the height of the predominant allele. We included negative control 459 samples (human DNA or no template) in each amplification run. A subset of the 460 samples (n = 10) were analyzed in triplicate to confirm the consistency of the results 461 obtained. All pairs of primers were tested for specificity by use of genomic DNA from P. 462 falciparum or humans.  Time-to-event models were written in rstan based on the stan probabilistic programming 467 language [71,72]. Logistic and poisson mixed effects regression models were fitted using 468 the package lme4. R code along with deindentified microsatellite and time-to-event 469 datasets can be found on github at github.com/jwatowatson/RecurrentVivax. compared two Bayesian mixed-effects mixture models describing the time-to-event data 473 conditional on the treatment drug administered. Notation was chosen so as to be 474 consistent with the mathematical notation for the genetic model (next section). For 475 each individual (denoted by the subscript n ∈ 1..N ), we record the time intervals (in 476 days) between successive P. vivax episodes (the enrollment episode is denoted episode 477 0). The last time interval is right censored at the end of follow-up. The models assume 478 early dropout is not a confounding factor for drug efficacy. For the n th individual, data 479 concerning the time interval t (the time between episode t − 1 and episode t) is of the   Model 1 assumes 100% efficacy of high-dose primaquine with only reinfection possible. Model 2 does not assume 100% efficacy of high-dose primaquine and specifies different mixing proportions for the reinfection component in the non primaquine and primaquine groups, p n (AS) = p n (CQ) and p n (P M Q+), respectively. The mixing proportion between late and early relapse within the relapse component is the same across primaquine and non-primaquine groups. The likelihood for model 2 is given as: where p n (·) ∈ (0, 1) is the individual and drug-specific mixture probability of reinfection 501 (we set the prior to reflect our belief that p n (AS) = p n (CQ) > p n (P M Q+)) and 502 c(·) ∈ (0, 1) is the nested drug-specific mixture probability of recrudescence.

503
The likelihood for model 1 is the same except that p n (P M Q+) = 1 (only reinfection 504 possible). λ is the rate of reinfection; E(·) denotes the exponential distribution. E(λ RC ) 505 is an exponential distribution with rate parameter λ RC (assumed drug independent) n )). 514 We used informative prior distributions (supplementary Fig 10) to ensure 515 identifiability of the mixture components. Information content in the data, over and 516 above that specified in the prior, was visually examined using prior-to-posterior plots.

517
The prior-to-posterior plot for model 2 is shown in supplementary Figure 10.

518
All time-to-event models were coded and implemented in stan [72] and run using R 519 version 3.4.3. The complete models can be found on github at 520 github.com/jwatowatson/RecurrentVivax/Timing_Model.  There are several ways to compute allele frequencies from enrollment data. A simple 540 approach is to use monoclonal data. However, some alleles are only seen in polyclonal 541 infections. In addition, monoclonal-derived frequencies would almost certainly introduce 542 bias resulting from disproportionately low representation of rare alleles across 543 monoclonal infections [73]. A statistically rigorous treatment would jointly model 544 frequencies, polyclonality and hidden recurrence states, but is prohibitively complex. 545 We therefore took a two step approach: first we generated frequency estimates and then, 546 conditional on these estimates, we modelled relapse allowing for polyclonality (genetic 547 model described below).

548
To compute allele frequencies we used a multinomial likelihood and Dirichlet prior 549 with weight ω = 1 (equivalent to ω pseudo observations per allele) both with dimension 550 equal to the maximum repeat length seen across all the data, thereby interpolating 551 unobserved repeat lengths less than the maximum observed. Conjugacy implies the 552 posterior distribution over allele frequencies is a Dirichlet distribution with parameter 553 vector equal to ω plus the vector of allele counts (counted as 1 per episode observed).

554
The mean posterior estimate was used as a point estimate for the allele frequencies. A 555 Monte Carlo approximation of an 80% credible interval (extending from the 10th to 556 90th percentile) was constructed using 1000 random draws around the point estimates 557 (Fig 6). For each microsatellite genotyped, we calculated its 'effective cardinality' n * , defined 559 as the theoretical number of alleles (non-integer) which under a uniform distribution 560 would give rise to the same probability of identity by chance. The estimated values of 561 n * are shown for each microsatellite in Fig 6, with values varying from 4 (PV.ms1 ) to 562 28 (PV.ms8 ).

563
To lessen the probability that two alleles are identical by chance, some studies 564 discard common alleles (e.g. [55]). In this study, we account for identity by chance using 565 a model of relatedness based on identity by descent (IBD).  BPD and 137 in VHX)). The mean frequency estimate is shown by the black circles and 95% credible intervals are shown by the vertical lines. The dotted horizontal line shows the discrete uniform distribution over alleles for comparison. The number of isolates typed for each microsatellite is denoted by N . The top row shows the microsatellites which were typed most frequently (PV.3.502, PV.3.27, PV.ms8 ). The range of feasible alleles is given by the maximum observed repeat length in all the data per microsatellite combined, and all repeats are given a pseudo-observation of weight ω = 1. The estimated effective cardinality of each marker is denoted by n * . 567 In this section we describe the genetic model that outputs the probability that the t th 568 recurrence of the n th individual is a recrudescence, relapse or reinfection, given a vector 569 of prior probabilities of recrudescence, relapse or reinfection, and genetic data on a 570 small number of polyallelic microsatellite markers. In our joint model of time-to-event 571 and genetic data, the prior vector in the genetic model is derived from the posterior 572 distribution over recurrence states given the time-to-event data. Comprehensive 573 mathematical details of the model can be found in the Appendix.

574
To infer the probability of relapse, we construct a statistical model that exploits 575 evidence of relatedness expected within a single mosquito inoculum following 576 recombination (supplementary Fig 7). The only relationships between sporozoites that 577 are exclusive to a single mosquito are those between meiotic clones and siblings (Table 578 2). In other words, relapse can be caused by many types of sporozoites, including those 579 which are 'strangers' in relation to one another, but, in the absence of recrudescence, 580 only meiotic clones and meiotic siblings provide conclusive evidence of relapse ( Table 2). 581 Genetic data cannot discern clones and siblings that are meiotic or not. We thus use a 582 coarse definition of clones and siblings in this model, ignoring whether the relationship 583 is meiotic. We use the term 'stranger' to refer to all parasites whose shared ancestry 584 dates back beyond the most recent mosquito inoculation. Consequently, the term clone 585 refers to first-generation clones only. We assume zero probability of reinfection with 586 first-generation clones or siblings, since the probability of a mosquito feeding on the 587 same human host consecutively is low. Importantly, we do allow 'strangers' that are 588 genetically identical by chance. This captures some clones resulting from clonal 589 expansion (i.e. multiple generations of selfing). Parent-offspring pairs are considered 590 siblings under our model since they have an expected relatedness equal to 0.5 in the 591 absence of inbreeding. We ignore half siblings whose expected relatedness is 0.25 in the 592 absence of inbreeding. However, we do allow half-siblings in the sense that we allow two 593 or more alleles per locus within infections assumed to contain only siblings (i.e. we allow 594 for collections of siblings that together share more than two parents). The main model 595 assumptions (including those mentioned above, also summarised in Table 2) are as 596 follows.  Assumption 1 disregards half-siblings. However, as described above, within 613 infections assumed to contain only siblings, we allow more than two alleles per locus.

614
Half-siblings are possible within an inoculum if both i) a mosquito takes a blood meal 615 from an infection with COI ≥ 3 (supplementary Fig 7), and ii) two or more zygotes 616 derived from the cross fertilization of three or more genetically distinct gametes survive 617 the severe bottleneck of successful oocyst formation [21,74]. Given the severe 618 bottleneck, and the fact that only 30 of the 710 (4%) genotyped episodes in the pooled 619 dataset have COI estimates ≥ 3, half-siblings are likely to be rare in the data analysed 620 here. If any are present, the likelihood of the observed alleles will be misspecified with 621 expected IBD=0.5 under the model.

622
Assumption 1 also states that all siblings have the same expected relatedness (0.5 in 623 the absence of inbreeding). In general, this assumption holds for all full siblings, meiotic 624 or not (supplementary Fig 7). However, if we condition on meiotic siblings having one 625 or more genetic differences, the expected relatedness between two meiotic siblings is 0.4 626 (see Appendix). Within infections with COI≥ 2, we collapse clonal relationships 627 between haploid parasite genotypes. This means that the expected relatedness is very   The model would require modification before application to data from a region where P. 644 vivax antimalarial resistance is suspected, however. In comparison, inference of relapse 645 under the model is robust to assumptions 3 and 4 because we account for relapses 646 caused by siblings under the model. Conditioning on sibling relatednesss (IBD=0.5) 647 absorbs differences caused by mutation and slippage, especially since within infections 648 assumed to contain only siblings we allow more than two alleles per locus. Assumption 649 7 is likely to hold except in the rare event that an infected mosquito consecutively feeds 650 on the same human host (thereby transmitting a first-generation clone or recombinant 651 offspring to the same human host from whom the parental parasites were sourced).

652
Assumption 8 implies that all polyclonal reinfections are generated by co-inoculation. 653 That is, we do not allow coincidental blood-stage parasites from different mosquito 654 inoculations unless they are hynozoite-derived. Both the BPD and VHX trials had 655 active follow-up and all asymptomatic infections were treated so this assumption is 656 likely to hold. It would not hold in the context of passive detection or untreated 657 asymptomatic infections. This assumption also implies that we will miss reinfections in 658 individuals with frequent sibling or clonal relapses.

659
Neutrality of microsatellites (assumption 9) implies that allele frequencies of 660 hypnozoites are the same as those in the wider population. The microsatellite markers 661 used in this study were designed specifically to meet this assumption [12,69]. Future 662 data types may include non-independent markers. Extension of the relatedness model to 663 capture linkage between non-independent markers is possible (see [76,78,79] . underlying truth). Meiotic clones are parasites derived from a selfed oocyst (row 1, supplementary Fig 7), which means they must have been coinoculated. Meiotic siblings are parasites derived from an outcrossed oocyst (row 3, supplementary Fig 7), which also means they must have been coinoculated. Clones, siblings, parent-offspring and half siblings are all parasites that share common gamete genotypes but are from different oocysts (i.e. rows 2, 4, 5, and 8, respectively, supplementary Fig 7). They can occur in a single mosquito or in different mosquitoes (e.g. contemporaneous mosquitoes that share a common human source, or sequential mosquitoes linked to a common human host who acts a source to the first mosquito and sink to the second). Strangers * include all parasites whose shared common ancestry dates back beyond the most recent mosquito (rows 6, 7 and 9, supplementary Fig 7). Under the model, clones † includes both meiotic and not; siblings ‡ includes parent-offspring and meiotic and not, but excludes half siblings in the sense that they have expected relatedness of 0.25 (in the absence of inbreeding). allowing either sibling or stranger edges within infections (clonal edges within infections 677 collapse), and clone, sibling or stranger edges across infections. For each combination of 678 vertex and edge labels, we then compute the probability of the data conditional on the 679 expected relatedness given the edge label and the allele frequencies of the vertex labels. 680 We integrate over states of IBD, thereby taking into account alleles that are identical 681 due to chance. We allow for some background IBD, since serial transmission of related 682 parasites can lead to higher than expected IBD in low transmission areas, using the 683 parameter α (discussed above). Finally, we sum over all relatednesss graphs weighted by 684 their probability given relapse, recrudescence or reinfection.

685
The probability of a relatedness graph given that the recurrence is a relapse is equal 686 over all viable graphs, since all viable relatedness graphs are compatible with a relapse 687 state ( Table 2). The probability of a relatedness graph given the recurrence is a recrudescence is equal over viable graphs that have, for each haploid parasite genotype 689 in the recurrent infection, one or more clonal edges with haploid parasite genotypes in 690 the previous infection, since only clones are compatible with recrudescence under the 691 model (Table 2 and assumption 2). The probability of a relatedness graph given the 692 recurrence is a reinfection is equal over viable graphs that have only stranger edges 693 between infections, since clones and siblings are not compatible with a reinfection state 694 under the model (Table 2 and assumption 7). 695 For a given recurrence state, we do not weight the probability of each graph by the 696 relative proportion of vertices that are clones, siblings and 'strangers' in relation to one 697 another. Doing so is theoretically possible (and could provide a link to a dynamic P. Motivation Time-to-event data often only provide intermediate level evidence for or 719 against relapse and ignore rich signals from genetic data. Alone, genetic data do not 720 suffice to pinpoint relapsing infections as unrelated parasites are found both within and 721 across inocula, and are compatible with relapse or reinfection. However, by combining 722 both sources of information we can use genetic data to update the a priori belief of 723 recurrent states based on the time-to-event. In this work we combine the two models 724 informally, using the posterior of the time-to-event model as a discrete prior in the 725 genetic model. It remains to be seen whether a formally joint model of both data types 726 would add value [81].

Conditional independence of time-to-event and genetic information
For the 728 n th individual, let x n denote all available time-to-event data (as above), y n denote all 729 available genetic data, and R n denote hidden recurrent states. Under the joint model of 730 relapse we assume x n and y n are conditionally independent given R n : 731 P (x n , y n |R n ) = P (x n |R n ) P (y n |R n ) December 23, 2018 23/43 This assumption allows sequential update of the posterior probability of R n . First given information from the time-of-event data, (2) second given information from genotyping (be it microsatellite data or other), P (R n |y n , x n ) = P (y n |R n ) P (R n |x n ) AllRn P (y n |R n ) P (R n |x n ) .
We use Monte Carlo sampling to numerically approximate P (R n |x n ), Eq 2. We then 732 draw from the numerical approximation uniformly at random (while also drawing from 733 the posterior allele frequency distributions) to recover a numerical approximation of 734 P (R n |x n , y n ). Since we explicitly sum over all R n , for a given allele frequency draw,

735
(Eq 3) can be considered a transformation of the sample approximating P (R n |x n ).

736
The conditional independence assumption can be interpreted as 'no propensity for 737 relapsing stranger parasites to occur at different time intervals than relapsing parasites 738 that are related'. In reality, this assumption may be incorrect. For example, [6] 739 hypothesises that malarial illness itself activates pre-existent hypnozoites which in long 740 latency P. vivax could lead to a preferential activation of genetically unrelated 741 hypnozoites [6]. In long latency vivax, reinfection would thereby trigger activation of 742 previously inoculated parasites (unrelated) and the most recently accumulated 743 hypnozoites would stay dormant for 8-9 months. However, this is highly speculative and 744 thus in our joint model of relapse we assume conditional independence.

758
The content is solely the responsibility of the authors and does not necessarily 759 represent the official views of the funders. 760 We are grateful to all the patients who took part in these studies and for the study 761 staff who cared for them. A special thanks to Clare Ling, PhD and Pornpimon 762 Wilairisak for managing and keeping in order the large volume of study samples.   The proxy exposure to carboxy-primaquine is defined (lower bound) as the number of days of primaquine administration multiplied by the log trough concentration observed on day 7. The fitted trends using all the data are significant; they are shown by the thick lines. After removal of outliers (circles: defined as those episodes whose carboxy-primaquine trough concentrations were more than 3 standard deviations from the mean), the fitted trends (non-significant) are shown by the dashed lines.   Evaluation of the likelihood (divided into three parts), Model implementation. A running 1096 example is provided in boxes. The full set of mathematical notation used in this model 1097 is given in Table 3. n , for recurrence t experienced by individual n, given all available genetic data for that individual, denoted y n . Note that this includes the genetic data of past episodes and future recurrences. The Bayesian posterior probability of the relapse state L for the t th recurrence experienced by the n th individual can be written as: where 1100 • P (yn | Rn) denotes the likelihood and P(Rn) denotes the prior;  all genetic data available for the individual n (e.g. equation (5)) indicator function equal to one if x ∈ Ω and 0 otherwise |Ω| size (i.e. number of elements) of a set Ω Graph notation (subscripts n, a and b are dropped where fixed) a = 1, . . . , A index over labelled graphs that differ with respect to their vertex haploid genotype labels, where A is the number of ways to label a graph (disregarding edge relatedness labels) with vertex haploid genotype labels compatible with y n (i.e. only includes graphs whose probability is non-zero according to Eq (10)) b = 1, . . . , B index over labelled graphs that differ with respect to their edge relatedness labels, where B is the number of viable ways to label a graph (disregarding vertex haplotype genotype labels) with edge relatedness labels. Viable graphs are those that obey the transitive property and, additionally, have no clonal edges within an infection, see section entitled Viable graph brute-force search algorithm) nm for all i = I (i) k n ab ij ∈ {str, sib, clo} relatedness (a.k.a. kinship) label of edge e n ab ij sib sibling relatedness between a pair of parasites str stranger relatedness between a pair of parasites cln clonal relatedness between a pair of parasites f h nl im frequency of the allele at the mth microsatallite of the haploid genotype on the ith vertex of G n ab α ∈ [0, 0.5] additive effect on P(IBD | k n ab ij ) of background population-level inbreeding recurrence probability point estimates generated under the time-to-event model, namelŷ n I , for t = 1, . . . , T n , Evaluation of the likelihood 1112 The likelihood, P (y n | R n ), is evaluated by summing over person-specific (indexed by n) vertex labelled (indexed by a) and edge labelled (indexed by b) graphs of relatedness over parasite haploid genotypes within and across infections, G n ab , The solution to equation (7) is described in three parts: first we describe the set of 1113 fully labelled graphs G n ab , second we describe P (y n | G n ab ), and third we describe The edge relatedness (a.k.a. kinship) labels of both graphs are the same: k 12 = sib, k 23 = sib, k 13 = sib and k 45 = sib, while the rest are all str, such that b = 1 for both graphs. On the contrary, the vertex haploid genotype labels h 1 and h 2 differ across the example graphs and so the example graphs have different indices, a = 1 and a = 2.
above, we use a and b to index over vertex haploid genotype labels and edge relatedness 1142 labels, respectively, and so the two graphs G ab of Figure 12 have the different a ∈ {1, 2} 1143 but the same b = 1.

1144
The number of ways to allocate haploid genotype labels to a graph for the n th . To see why this is the case, first let H (t) be a matrix whose column vectors are 1149 haploid genotypes compatible with y (t) (e.g. equation (8)).
For a given y (t) , the number of possible haploid genotypes (i.e. number of columns of H (t) ) is However, of the many combinations given by equation (9), That is, only combinations where all alleles in y are represented at least once (e.g. those 1151 in Figure 12) lead to P (y | G ab ) > 0, and contribute to the total number of ways, A, to 1152 label a graph with vertex haploid genotype labels compatible with y. Since A is 1153 enumerated independently of edge relatedness labels, graphs that have P (y | G ab ) = 0 1154 due to I cln (k ij ) = 1 and I hi m (h jm ) = 0 (see below) or NA values do contribute to A.

1155
Probability of the data given a graph 1156 The probability of the data given a vertex and edge labelled graph, P (y | G ab ), is calculated assuming conditional independence between microsatellites and between edges, where y im = y

1159
Hereafter we consider a single graph with fixed vertex and edge labels thus drop the indices a and b. To evaluate P (y im , y jm | h im , h jm , k ij ) we assume conditional independence between pairs of vertices and edges given the IBD state of the m th marker, P (y im , y jm | h im , h jm , k ij ) = 1 IBD=0 P (y im , y jm | h im , h jm , IBD) P (IBD | k ij ) P (yi m , yj m | hi m , hj m , kij = clo) Probability of a graph given a series of recurrence states 1170 To calculate the probability of a graph given a series of recurrence states, we assume independence between recurrence states, where P G | R (t) = L = 1 AB (17) P G | R (t) = I = 1 AB I ∀z < t, ∀i ∈ I (t) , ∀j ∈ I (z) kij = str, if ∃ c (t) disjoint pairs ij : i ∈ I (t) and j ∈ I (t−1) and kij = cln, where B I < B and B C < B denote the number of graphs that satisfy the conditions 1171 outlined in Eq (18) and (19), both of which are determined algorithmically from the 1172 adjacency matrix of G.

1173
The condition outlined in Eq (19) that j ∈ I (t−1) specifies that a recrudescence is 1174 seeded by the most recent past infection only (assumption 2). Also in Eq (19), the 1175 condition that there are c (t) disjoint pairs follows from assumptions 5 and 6 and results 1176 in zero probability of recrudescence following an infection with lower COI (i.e. a 1177 recrudescence cannot be more diverse that the infection that seeded it -diversity cannot 1178 be be created, only lost). For example, in Eq (5), R n has zero probability of being a 1179 recrudescence because c (2) = 2 > c (1) = 1.

1180
Presently, Eq (17) to (19) do not take into the relative likelihood of parasites that 1181 are strangers, siblings or clones in relation to one another within an inoculation.

1182
However, they could be adapted to do so (e.g. by coupling to a transmission mode).

1183
Note that P (G | R) implicitly conditions on y via c (t) . 1185 Above, t = 0, . . . , T n where T n is the number of recurrences experienced by the n th 1186 individual. In the code, t = 1, . . . , T n where T n is the number of infections experienced 1187 by the n th individual.

1188
The model is implemented on the log scale to prevent under and over flow problems, using the log sum exp trick where appropriate. Instead of summing over a and b in one step as Eq (7)  P (y | G ab ) P (G ab | R) , where the subscript n is dropped since n is fixed. The algorithm is summarized as follows: 1196