Impacts of an urban sanitation intervention on fecal indicators and the prevalence of human fecal contamination in Mozambique

Fecal source tracking (FST) may be useful to assess pathways of fecal contamination in domestic environments and to estimate the impacts of water, sanitation, and hygiene (WASH) interventions in low-income settings. We measured two non-specific and two human-associated fecal indicators in water, soil, and surfaces before and after a shared latrine intervention at low-income households in Maputo, Mozambique participating in the Maputo Sanitation (MapSan) trial. Up to a quarter of households were impacted by human fecal contamination, but trends were unaffected by improvements to shared sanitation facilities. The intervention reduced E. coli gene concentrations in soil but did not impact culturable E. coli or the prevalence of human FST markers in a difference-in-differences analysis. Using a novel Bayesian hierarchical modeling approach to account for human marker diagnostic sensitivity and specificity, we found substantial uncertainty associated with human FST measurements and intervention effect estimates. The field of microbial source tracking would benefit from adding measures of diagnostic accuracy to better interpret findings, particularly when FST analyses convey insufficient information for robust inference. With improved measures, FST could help identify dominant pathways of human and animal fecal contamination in communities and guide implementation of effective interventions to safeguard health.

SYNOPSIS An urban sanitation intervention had minimal and highly uncertain effects on human fecal contamination after accounting for fecal indicator sensitivity and specificity.


Introduction
[...] probability of not detecting the indicator when contamination is absent. The probability that a positive sample is contaminated depends on the marker sensitivity and specificity and the prevalence of human fecal contamination. This marginal probability of contamination can be approximated as the frequency of indicator detection among all samples to explore indicator reliability in a specific study. [31] We assessed the probability that human feces were present in an environmental sample in which HF183 or Mnif was detected using Bayes' Theorem and the local sensitivity and specificity of the two markers (see SI). [34-36] We calculated the conditional probability of contamination for HF183 and Mnif separately and for each combination of the two indicators by sample type. The marginal probability of contamination was approximated as the detection frequency of HF183 among all samples of a given type.
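The Bayes' Theorem calculation described above can be written out explicitly. The following is a sketch rather than the formula given in the SI: the symbols $Se$ (sensitivity), $Sp$ (specificity), $\pi$ (prevalence), and the events $C$ (human feces present) and $+$ (indicator detected) are notation assumed here for illustration. The worked line uses the local HF183 sensitivity (64%) and specificity (67%) reported in this study at 15% prevalence.

```latex
\begin{align*}
P(C \mid +) &= \frac{Se \, \pi}{Se \, \pi + (1 - Sp)(1 - \pi)} \\
            &= \frac{0.64 \times 0.15}{0.64 \times 0.15 + 0.33 \times 0.85}
             = \frac{0.096}{0.3765} \approx 0.255
\end{align*}
```

With these rounded inputs the result is close to the 26% reported for HF183 at 15% prevalence; a small difference would be expected if the published figure was computed from unrounded sensitivity and specificity.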

Accounting for diagnostic accuracy
Fecal indicator measurements are used as proxies for unobserved fecal contamination to estimate its prevalence and associations of interest, such as the effects of mitigation practices. This approach is vulnerable to measurement error, illustrated by the limited diagnostic accuracy of many host-associated fecal indicators. [16] Bias due to inaccurate diagnostic tests can be mitigated by incorporating external information on the sensitivity and specificity of the test. [62] The expected detection frequency, $p$, of a test with sensitivity $Se$ and specificity $Sp$ is given by

$$p = Se \, \pi + (1 - Sp)(1 - \pi)$$

for an underlying condition with prevalence $\pi$. [62,63] We adapted the approach of Gelman and Carpenter to estimate the intervention effect on human fecal contamination prevalence from observations of human-associated fecal indicators by incorporating external information on indicator performance within a Bayesian hierarchical framework. [63] We included the product-term representation of the DID estimator and other covariates as linear predictors of the prevalence log-odds. Assuming indicator detection in the $i$th of $n$ samples, $y_i$, was Bernoulli-distributed with probability $p_i$, where $p_i$ was related to the prevalence $\pi_i$ as shown in Equation (1) [...] for positive results in human fecal samples and negative results in non-human fecal samples. Because our validation sample set was small and performance estimates vary widely between studies, we fit a third model (Model 3) featuring a meta-analysis of indicator sensitivity and specificity (see SI). We assumed the log-odds of the sensitivity in the $k$th study, $Se_{[k]}$, were normally distributed with mean $\mu_{Se}$ and SD $\sigma_{Se}$, such that

$$\operatorname{logit}\left(Se_{[k]}\right) \sim \operatorname{Normal}\left(\mu_{Se}, \sigma_{Se}\right)$$

with an equivalent structure for the specificity. We assigned $k = 1$ to our local validation study, [...]
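To make the relationship between detection frequency and prevalence concrete, the sketch below computes the expected detection frequency for a given prevalence and then inverts it to obtain a simple point estimate of prevalence (often called the Rogan-Gladen estimator). This inversion is a simpler stand-in, not the Bayesian hierarchical model used in this study. The function names are hypothetical, and the example inputs are the local HF183 sensitivity (64%) and specificity (67%) reported in the results.

```python
def expected_detection_frequency(prevalence, sens, spec):
    """Probability of a positive test: true positives plus false positives."""
    return sens * prevalence + (1.0 - spec) * (1.0 - prevalence)


def corrected_prevalence(detect_freq, sens, spec):
    """Invert the relationship above (Rogan-Gladen point estimate).

    Requires sens + spec > 1. The result is truncated to [0, 1] because
    sampling noise can push the raw estimate outside that range.
    """
    youden = sens + spec - 1.0
    if youden <= 0:
        raise ValueError("test must be better than chance (sens + spec > 1)")
    raw = (detect_freq + spec - 1.0) / youden
    return min(1.0, max(0.0, raw))


# Assumed inputs: local HF183 accuracy reported in this study.
SENS, SPEC = 0.64, 0.67

# Forward direction: 15% true prevalence yields about 38% positive samples.
p = expected_detection_frequency(0.15, SENS, SPEC)

# Inverse direction: the point estimate recovers the 15% prevalence.
recovered = corrected_prevalence(p, SENS, SPEC)

# A detection frequency below the false-positive rate (1 - spec = 0.33)
# is truncated to zero prevalence by this point estimate.
floor_case = corrected_prevalence(0.20, SENS, SPEC)

print(f"expected detection frequency: {p:.4f}")
print(f"recovered prevalence: {recovered:.2f}")
print(f"prevalence at 20% detection: {floor_case:.2f}")
```

Under these assumed accuracy values, any detection frequency at or below roughly 33% maps to an estimated prevalence of zero, which illustrates why estimates that account for diagnostic accuracy can be much less certain than the raw detection frequencies suggest.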

Sample characteristics
We collected a total of 770 environmental samples from 507 unique locations at 139 households in 71 compounds. Samples were collected both pre- and post-intervention at 263 locations (52%), for a total of 526 paired samples and 244 unpaired samples (Table S2). Characteristics expected to confound the relationship between sanitation and fecal contamination were largely similar between treatment arms during each study phase (Table 1). Cumulative precipitation was higher on average in intervention compounds at baseline and in control compounds at follow-up. Water storage containers were also more frequently covered in intervention (75%) than control households (57%) at baseline, though the majority of containers were covered in all strata. Soil surfaces were more often visibly wet in control households (51%) than intervention (33%) at follow-up, both of which were lower than at baseline (57% and 48%, respectively). Most food preparation surfaces were plastic, though more often so in control households during both study phases. A higher percentage of compounds from both treatment arms reported owning domestic animals at follow-up (80-88%) than baseline (47-68%), which [...]

Fecal indicator occurrence
At least one fecal indicator was detected in 94% of samples (720/770) and E. coli was detected in 718 samples: by culture in 81% (611/755) and by qPCR in 86% (655/763). Mean cEC concentrations were lower at follow-up for all sample types in both treatment arms, a pattern not observed for EC23S concentrations (Figure 1). Of the 763 samples tested for human-associated indicators, 28% (217) were positive for at least one human marker. Human-associated indicators were common in soils (23-65% prevalence, across treatment groups and study phases) but only HF183 was regularly detected in stored water (10-22%) and both indicators were rare on food surfaces (0-9%). qPCR calibration curves (Table S5), detection limits (Table S6), and the results of laboratory quality controls are presented in the SI.

Bootstrap DID estimates suggest the intervention reduced EC23S concentrations on food preparation surfaces and HF183 prevalence in household soil but minimally impacted fecal indicator occurrence in other sample types (Table S7). Notably, HF183 prevalence in household soil was similar among intervention households in both study phases but increased among control compounds at follow-up. By contrast, model-based DID estimates, adjusted for potential confounding, were consistent with no intervention effect on food preparation surface EC23S concentration or household soil HF183 prevalence (Table S8). Adjusted models instead indicate the intervention reduced latrine soil concentrations of EC23S [mean difference: -1.2 (95% CI: -2.1, -0.30) log10 gc/dry g]. Although several sample characteristics were imbalanced between treatment arms and study phases (Table 1), estimates from models that adjusted for these variables were largely similar to the unadjusted models, with adjusted estimates marginally closer to the null in most cases (Table S8). EC23S concentrations in latrine soil were again the exception, with a substantially larger reduction obtained under the adjusted model than the unadjusted estimate of -0.84 (95% CI: -1.6, -0.02) log10 gc/dry g. Due to low detection frequency, models were not fit for either human marker on food surfaces or for Mnif in stored water; source water samples were excluded from all analyses. [26]
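A difference-in-differences (DID) contrast on detection prevalence can be sketched as follows. This is a minimal illustration, assuming a simple nonparametric bootstrap that resamples observations within each stratum and ignores clustering by compound; it is not the study's estimation code. The detection data are hypothetical values chosen only to mimic a pattern in which prevalence is flat in one arm and rises in the other, and all function names are illustrative.

```python
import random


def prevalence(detections):
    """Share of samples in which the marker was detected (1) vs. not (0)."""
    return sum(detections) / len(detections)


def did(prev):
    """(post - pre) change in intervention minus the same change in control."""
    return (prev["int_post"] - prev["int_pre"]) - (prev["ctl_post"] - prev["ctl_pre"])


def bootstrap_did(data, n_boot=2000, seed=1):
    """Point estimate and 95% percentile interval for the DID in prevalence."""
    rng = random.Random(seed)
    point = did({name: prevalence(obs) for name, obs in data.items()})
    draws = []
    for _ in range(n_boot):
        # Resample each stratum with replacement, keeping its sample size.
        resampled = {
            name: prevalence(rng.choices(obs, k=len(obs)))
            for name, obs in data.items()
        }
        draws.append(did(resampled))
    draws.sort()
    lower = draws[int(0.025 * n_boot)]
    upper = draws[int(0.975 * n_boot) - 1]
    return point, lower, upper


# Hypothetical detections (1 = detected), 40 samples per stratum.
data = {
    "ctl_pre": [1] * 10 + [0] * 30,
    "ctl_post": [1] * 20 + [0] * 20,
    "int_pre": [1] * 12 + [0] * 28,
    "int_post": [1] * 12 + [0] * 28,
}
point, lower, upper = bootstrap_did(data)
print(f"DID = {point:+.2f} (95% interval {lower:+.2f} to {upper:+.2f})")
```

In this hypothetical case the negative DID arises entirely from an increase in the control arm, the same pattern noted above for HF183 in household soil, so a negative estimate alone does not show that prevalence fell among intervention households.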

Conditional probability of human fecal contamination
The probability that a sample is contaminated with human feces given the detection of a human indicator is a function of the indicator's sensitivity and specificity (Table S9) and the prevalence of human contamination in the study environment. At 15% prevalence (approximately the detection frequency of HF183 in stored water), the probability of human contamination given a positive test was 26% for HF183 and 30% for Mnif. Only with prevalence above 30-35% was detecting either indicator more likely than not to correctly diagnose human fecal contamination. Combining test results from both indicators improved identification of human contamination, increasing the probability of contamination to 45% when both markers were positive and the prevalence was 15% (Figure 2). However, the two human markers frequently disagreed when assessed in the same sample, conflicting in 44% of household soil, 43% of latrine soil, and 15% of stored water samples. Furthermore, at 44% prevalence (the highest detection frequency for HF183, observed in latrine soils), there remained a >20% chance that a sample positive for both indicators was not contaminated. Among lower-prevalence sample types the conditional probability never reached 50%. Unless the background prevalence in the study area was about 45% or greater, it is unlikely that the use of HF183 and Mnif reliably identified human contamination in individual samples, particularly given the frequent disagreement between the two markers.

Figure 2. [...] Values of sensitivity and specificity were obtained using human and animal feces from the study area, and are 64% and 67%, respectively, for HF183 and 71% and 70% for Mnif. The dashed vertical lines indicate the HF183 detection frequency for each sample type to illustrate relevant human contamination probabilities. FP: food preparation surfaces; SW: stored water; HS: household entrance soil; LS: latrine entrance soil.
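The combined-marker probabilities above can be approximated with a short calculation. The sketch below assumes the two markers are conditionally independent given the true contamination status, which is an assumption made here for illustration and may not match the calculation in the SI. Function names are hypothetical, and the inputs are the rounded local sensitivity and specificity values quoted in the figure caption.

```python
def prob_contaminated(prevalence, markers, detected):
    """Posterior probability of contamination given several marker results.

    markers:  list of (sensitivity, specificity) pairs
    detected: list of booleans, one result per marker
    Assumes marker results are independent given contamination status.
    """
    like_contaminated = 1.0
    like_clean = 1.0
    for (sens, spec), positive in zip(markers, detected):
        like_contaminated *= sens if positive else 1.0 - sens
        like_clean *= (1.0 - spec) if positive else spec
    numerator = prevalence * like_contaminated
    return numerator / (numerator + (1.0 - prevalence) * like_clean)


def break_even_prevalence(sens, spec):
    """Prevalence at which a single positive result implies a 50% probability."""
    return (1.0 - spec) / (sens + 1.0 - spec)


# Rounded local accuracy values from this study: (sensitivity, specificity).
HF183 = (0.64, 0.67)
MNIF = (0.71, 0.70)

both_15 = prob_contaminated(0.15, [HF183, MNIF], [True, True])
both_44 = prob_contaminated(0.44, [HF183, MNIF], [True, True])

print(f"both positive, 15% prevalence: {both_15:.2f}")
print(f"both positive, 44% prevalence: {both_44:.2f}")
print(f"break-even prevalence, HF183: {break_even_prevalence(*HF183):.2f}")
print(f"break-even prevalence, Mnif:  {break_even_prevalence(*MNIF):.2f}")
```

Under this independence assumption the double-positive probability is about 45% at 15% prevalence and about 78% at 44% prevalence, and the single-marker break-even prevalences fall at roughly 30% for Mnif and 34% for HF183, in line with the values described in the text.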

Prevalence of human fecal contamination
Posterior predictions from each of the five accuracy-adjusted models were used to estimate stratum-specific prevalence of human fecal contamination. To compare treatment assignments and study phases, we predicted prevalence for compounds with no animals or antecedent precipitation and the sample mean population density (7 persons/100 m²), wealth score (46), and previous-day temperature (20.4 °C), in which soil surfaces were dry and shaded and water storage containers possessed wide, uncovered mouths. The prevalence estimates were notably imprecise; the 95% CI of the HF183 prevalence in post-treatment latrine soil ranged from 3% to 92% for Model 2 (Table 2). The 95% CI widths were similar for Model 1 and the bootstrap estimates but were substantially wider for the other four models, which accounted for FST marker sensitivity and specificity (see SI). The intervals narrowed somewhat when both indicators were considered (Model 4) and narrowed further when all sample types were incorporated (Model 5) but were still wider than the estimates that did not account for diagnostic accuracy.

Although we did not formally assess the pairwise differences between prevalence estimates, the wide and largely overlapping posterior predictive CIs indicate a limited ability to distinguish prevalence estimates among strata or models. The DID estimates on the probability scale were strongly consistent with no effect for all model specifications, which further suggests that the available data were insufficient to assess prevalence differences between strata. The corresponding prevalence odds ratio estimates obtained directly from the DID product term were likewise imprecise (Figure S1). Nonetheless, the model-based prevalence estimates were consistently more similar between study phase and treatment group than the corresponding bootstrap estimates. This trend was notable for Model 5, which assumed that time and treatment effects acted directly on the compound-wide prevalence of human contamination, thus affecting all three sample types equally. The compound-level prevalence estimates were quite similar, particularly between study phases for the same treatment group: 27% (95% CI: 9-52%) at baseline and 28% (9-53%) at follow-up for control compounds and 22% (6-50%) at baseline and 22% (6-47%) at follow-up for intervention compounds. The corresponding estimates for household soil were nearly identical to the compound-level estimates, with somewhat higher estimates for latrine soil and lower for stored water. Although the physical interpretation of this compound-level construct is uncertain, these estimates suggest that about a quarter of compounds were measurably impacted by human fecal contamination, which was unaffected by improvements to shared sanitation facilities.

Notes to Table 2:
a All models (excluding bootstrap estimates) were adjusted for population density, presence of animals, wealth score, temperature, antecedent precipitation, and sun exposure and surface wetness for soil samples and storage container mouth width and cover status for water samples
b Difference-in-differences
c Model 1: single sample type, single marker assuming perfect sensitivity and specificity
d Model 2: single sample type, single marker with sensitivity and specificity from local validation study
e Model 3: single sample type, single marker with meta-analytic sensitivity and specificity
f Model 4: single sample type, two markers with meta-analytic sensitivity and specificity
g Model 5: three sample types, two markers with meta-analytic sensitivity and specificity

Discussion
The provision of shared latrines reduced average soil concentrations of the molecular E. coli marker EC23S at latrine entrances by more than 1-log10 but did not have a comparable effect on culturable E. coli. EC23S latrine soil concentrations rose more in control compounds than they fell in intervention compounds, which under the parallel trends assumption is interpreted as a secular trend upwards that the intervention mitigated, for a much smaller absolute reduction than suggested by the DID estimate (Figure 1). [43] However, an opposite, downward trend was observed for all cEC concentrations. This discrepancy between two tests for the same organism complicates the interpretation of the relatively strong intervention effect estimated for EC23S. While the exact reasons for this discrepancy are yet to be determined, preliminary evidence from a related analysis suggests that the modified mTEC broth used for E. coli culture may have produced colonies of the same color and morphology for Klebsiella spp., which are commonly soil-derived and not specific to feces. [70] [...] have not assessed intervention impacts on fecal contamination of soil, but several have evaluated contamination of drinking water, with some also testing child hands, food, or fomites. [15] As with the present study, all found no effect of sanitation-only interventions on any environmental compartment; combined water, sanitation, and hygiene interventions improved drinking water quality in two studies. [13,14]

Measures of human-associated FST markers demonstrated that about a quarter of compounds were impacted by human fecal contamination, with compound-level prevalence estimates not statistically different at baseline and follow-up. Similarly, two cluster-randomized trials, in India and Bangladesh, found no effect of rural sanitation interventions on the prevalence of human-associated indicators in stored drinking water. [37,39] Both studies also assessed human markers in mother and child hand rinse samples, which were not collected in this study. No effect was observed for either hand type in India or on mother hands in Bangladesh, although the human marker prevalence may have been reduced on child hands. [39]

Accounting for the diagnostic accuracy of FST markers revealed far greater uncertainty about host-specific fecal contamination, both of individual samples and population averages, than indicated by the raw indicator measurements. The relatively poor sensitivity and specificity of both human markers in this setting severely limited their ability to identify specific samples contaminated with human feces, but even moderate improvements in accuracy could substantially increase FST marker utility. For example, a study in Singapore reported 75% sensitivity and 89% specificity for HF183 [74], corresponding to a 55% chance a positive sample is contaminated at 15% background prevalence and an 84% chance at 44% prevalence, compared with 26% and 60%, respectively, for detection of HF183 in our study. Correcting for indicator sensitivity and specificity to human-source contamination, coupled with the limited observations of each sample type, yielded imprecise prevalence estimates that were consistent with both near absence and almost omnipresence of contamination. While the reduced amplification efficiency of HF183 (82%) may have contributed to its low sensitivity, it produced accuracy-corrected estimates similar to those for Mnif, which was 95% efficient (Table S5).
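Amplification efficiency is conventionally derived from the slope of the qPCR calibration curve (quantification cycle against log10 target concentration) as E = 10^(-1/slope) - 1, so that a slope near -3.32 corresponds to 100% efficiency. The sketch below applies that standard relation. The function names are hypothetical, and the slopes it prints are only what the relation implies for the stated efficiencies, not values taken from Table S5.

```python
import math


def efficiency_from_slope(slope):
    """Amplification efficiency implied by a calibration-curve slope."""
    return 10.0 ** (-1.0 / slope) - 1.0


def slope_from_efficiency(efficiency):
    """Calibration-curve slope implied by an amplification efficiency."""
    return -1.0 / math.log10(1.0 + efficiency)


# Efficiencies stated in the text; the slopes are derived, not reported.
for marker, eff in [("HF183", 0.82), ("Mnif", 0.95)]:
    slope = slope_from_efficiency(eff)
    print(f"{marker}: efficiency {eff:.0%} implies a slope of about {slope:.2f}")
```

A steeper (more negative) implied slope for HF183 means more cycles are needed per tenfold change in target concentration, which is consistent with the text's suggestion that lower efficiency may have contributed to lower sensitivity.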