Triangulation of microbial fingerprinting in anaerobic digestion reveals consistent fingerprinting profiles

The anaerobic digestion microbiome has been puzzling us since the dawn of molecular methods for mixed microbial community analysis. Monitoring of the anaerobic digestion microbiome can take place either via a non-targeted holistic evaluation of the microbial community through fingerprinting or by targeted monitoring of selected taxa. Here, we compared four different microbial community fingerprinting methods, i.e., amplicon sequencing, metaproteomics, metabolomics and cytomics, in their ability to characterise the full-scale anaerobic digestion microbiome. Cytometric fingerprinting through cytomics is a single-cell-based approach of direct microbial community fingerprinting by flow cytometry that is novel for anaerobic digestion. Three different digester types, i.e., sludge digesters, digesters treating agro-industrial waste and dry anaerobic digesters, each reflected different operational parameters. The α-diversity analysis yielded inconsistent results across the different methods, especially for richness. In contrast, β-diversity analysis resulted in comparable profiles, even when translated into phyla or functions, with clear separation of the three digester types. In-depth analysis of each method's features, i.e., operational taxonomic units, metaproteins, metabolites, and cytometric traits, revealed some similar features, yet also some clear differences between the methods, which were related to the complexity of the anaerobic digestion process. In conclusion, cytometric fingerprinting through flow cytometry is a reliable, fast method for holistic monitoring of the anaerobic digestion microbiome, and the complementary identification of key features through other methods could give rise to a direct interpretation of anaerobic digestion process performance.


Introduction
The anaerobic digestion (AD) microbiome has been one of the most studied engineered ecosystems since the dawn of molecular methods. In the 1990s and early 2000s, taxa identification took place via labour-intensive techniques, such as clone libraries (Godon et al., 1997; Mladenovska et al., 2003), while semi-quantification could be achieved through fluorescent in situ hybridization (FISH) (Raskin et al., 1994). Comprehensive microbial community profiling was performed via basic fingerprinting techniques, such as denaturing gradient gel electrophoresis (DGGE) (Liu et al., 2002)

Given the target-specific nature of these different high-throughput methods, i.e., amplicon sequencing and the "omics", and because of the complexity of natural and engineered microbiomes, more and more studies focused on combining them to answer specific questions (Franzosa et al., 2015), though this often remains challenging (Prosser, 2015). Combining 16S rRNA gene amplicon sequencing, metagenomics and metatranscriptomics revealed a high activity of methanogenic archaea compared to their low (relative) abundance in AD

Metaproteomics
Sample preparation for metaproteomics was carried out as described earlier (Heyer et al., 2019), starting from the samples frozen at -80°C. Purified peptide mixtures were analysed by reversed-phase liquid chromatography coupled to a timsTOF™ Pro tandem mass spectrometer (Bruker Daltonik GmbH, Bremen), using a 120 min gradient. For protein identification, the mass spectrometer data were searched with Mascot™ 2.6.1 (Matrix Science, London, UK) (Perkins et al., 1999) against a combined database from a previous study (Heyer et

modifications. An aliquot of 500 µL digestate was transferred to a 2 mL Eppendorf, and following the introduction of an internal standard (30 µL of a 25 ng µL⁻¹ L-alanine-d3 solution), each sample was vortexed for 15 s, followed by centrifugation at 13,300g for 5 min at 4 °C. The supernatant was passed through a PVDF filter (0.22 µm), diluted (1:2) with ultrapure H2O, and transferred to a glass LC-MS (liquid chromatography-mass spectrometry) vial. A pool of extracted samples was used as a quality control standard for normalisation. An UHPLC (ultra-high performance liquid chromatography) was achieved on a Vanquish

Fresh samples were diluted with PBS (phosphate buffer saline), which was prepared following the instructions of the manufacturer (Sigma-Aldrich, Overijse, Belgium), to a concentration of 1 g L⁻¹ TS in a 1.5 mL Eppendorf. The samples were then sonicated for 3 min at 10% amplitude on a QSonica Q700 sonicator (Qsonica L.L.C, Newtown, CT, USA) using indirect sonication with a cup horn. The samples were then directly diluted (1:1000) in 0.2 µm-filtered PBS, stained with Sybr Green I (final concentration of 1x), and incubated for 20 min at 37°C. The stained samples were analyzed on a FACSVerse flow cytometer at 60 µL min⁻¹ for a maximum of 1 min, as described earlier (Props et al., 2018).
The performance of the FACSVerse was verified by the FACSuite software performance quality check using CS&T research beads (BD Biosciences, San Jose, CA, USA). The raw flow cytometry data were exported in FCS format and imported into R (v3.6.1) through the flowCore package (v1.38.2), denoised using a gating strategy and processed with the Phenoflow package as described elsewhere

The TS, VS and TAN were determined according to standard methods (Greenberg et al., 1992). The pH and conductivity were measured with a C532 pH and C833 conductivity meter (Consort, Turnhout, Belgium), respectively. The free ammonia (NH3) concentration was calculated based on the TAN concentration, pH and temperature in the digester. Concentrations of the Na⁺ and K⁺ cations were determined on a 761 Compact ion chromatograph (Metrohm, Herisau, Switzerland) with a Metrosep C6 e 250/4 column and Metrosep C4 Guard/4.0 guard column. The eluent contained 1.7 mM HNO3 and 1.7 mM dipicolinic acid. Sample preparation was carried out by centrifugation at 10,000g for 10 min, followed by filtration over a 0.45 µm filter (type PA-45/25, Macherey-Nagel, Germany) to remove all solids, and dilution with milli-Q water to reach the desired detection limits between 2 and 100 mg L⁻¹, both for Na⁺ and K⁺. Concentrations of the different VFA were determined using gas chromatography, as described in SI1 (S3).

Microbial fingerprints were compared between the four different methods, based on α- and β-diversity. The comparison of the different α-diversity measures H0, H1, and H2 resulted in significant differences between the four different methods and for each of the three diversity measures (P < 0.0001, Wilcoxon signed-rank test, Figure 1). Exceptions were the H0 for the amplicon sequencing and metaproteomics (P = 0.74) and the H2 for the metaproteomics and metabolomics (P = 0.74). Spearman correlation (Table S1)

Table S3).
A significantly higher value for H0, H1 and H2 (P < 0.0001) was observed based on the amplicon sequencing data for the Sludge compared to the Agro and Dranco digesters, while there was no significant difference between the Agro and Dranco digesters for H0 (P = 1.00), H1 (P = 0.36) and H2 (P = 0.084). The opposite was true for the metaproteomics and metabolomics data, with a significantly lower H0 value (P < 0.0001) for the Sludge than the Agro and Dranco digesters. For the metaproteomics data, also H1 and H2 reached significantly higher values (P < 0.0001) in the Agro and Dranco digesters, compared to the Sludge digesters. The cytomics data differed from the other methods, with no significant differences between the digester types for H0. In contrast, for H1, the Dranco digesters showed significantly higher values than the Agro (P = 0.0006) and Sludge (P < 0.0001) digesters. For H2, the Dranco digesters showed the highest values again, compared to the Agro (P = 0.0008) and Sludge (P < 0.0001) digesters, but the Agro digesters also showed a significantly higher value (P = 0.034) than the Sludge digesters.
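The three α-diversity measures can be illustrated with a minimal sketch, assuming that H0, H1 and H2 denote Hill numbers of order 0, 1 and 2 (richness, the exponential of the Shannon index, and the inverse Simpson index, respectively). The function name `hill_number` and the example abundance vectors are hypothetical and are not taken from the analysis pipeline of this study.

```python
import numpy as np


def hill_number(abundances, q):
    """Hill number of order q for one sample (assumed definition of H0, H1, H2)."""
    counts = np.asarray(abundances, dtype=float)
    # Relative abundances of the features that are actually present.
    p = counts[counts > 0] / counts.sum()
    if q == 1:
        # Order 1 is defined as the limit q -> 1: exp(Shannon entropy).
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))


# Hypothetical feature counts (OTUs, metaproteins, metabolites or cytometric bins).
even_sample = [25, 25, 25, 25]
skewed_sample = [90, 10]

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    h0, h1, h2 = (hill_number(sample, q) for q in (0, 1, 2))
    print(f"{name}: H0={h0:.2f} H1={h1:.2f} H2={h2:.2f}")
```

Because higher orders give more weight to dominant features, H0 responds mainly to rare features while H2 responds mainly to evenness, which is one possible reason why richness can diverge between methods more strongly than H1 or H2.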

Microbial community fingerprinting: β-diversity
The β-diversity analysis, based on the Bray-Curtis dissimilarity index, revealed a significant clustering (Table S4)

(Table S4, Figure S1). The separate clustering of the Agro and Dranco digesters was not possible when OTUs and proteins were combined into their respective phyla.

The key parameters temperature, TAN, conductivity, free ammonia, total VFA and pH showed a significant relation with the amplicon sequencing, metaproteomics and metabolomics profiles (P < 0.05, Table S5, Figure S2), except for the free ammonia concentration and the metabolomics profile (P = 0.070). The cytomic profile showed a significant relation with temperature, free ammonia, total VFA and pH (P < 0.05, Table S5), but not with the TAN (P = 0.094) and conductivity (P = 0.072). A significantly higher degree of variance was observed in the Agro digesters compared with the Sludge digesters for the amplicon sequencing (P < 0.0001), metaproteomics (P = 0.0028), metabolomics (P < 0.0001) and cytomics (P = 0.0006) (Table S6, Figure S3). The degree of variance was also significantly higher in the Dranco digesters compared to the Sludge digesters for the amplicon sequencing (P = 0.0016), metabolomics (P = 0.0075) and cytomics (P = 0.0001), but not for the metaproteomics (P = 0.61). The degree of variance between the Agro and Dranco digesters was similar for the amplicon sequencing (P = 0.13) and cytomics (P = 0.61), but significantly different for the metaproteomics (P = 0.017) and metabolomics (P = 0.0071). The Mantel test revealed a similar β-diversity profile when comparing the amplicon sequencing with the metaproteomics (P = 0.0006, R² = 0.86) and metabolomics (P = 0.0018, R² = 0.24) profiles, and also when comparing the metaproteomics with the metabolomics (P = 0.0018, R² = 0.24).
In contrast, the cytomics profile did not show a significant similarity with the amplicon sequencing (P = 0.091, R² = 0.10), metaproteomics (P = 0.43, R² = 0.073) and metabolomics (P = 0.50, R² = 0.078) profiles.

Differential abundance analysis, which was used to identify OTUs, proteins, metabolites, or cytometric traits that show a significant difference in relative abundance between the digester types, revealed an overall stronger difference between the Sludge compared to the Agro and Dranco digesters for three of the four methods (Table 2). For the amplicon sequencing, metaproteomics and metabolomics, the percentage of OTUs, proteins and metabolites, respectively, without considering their relative abundance, that was significantly different between the Sludge vs. Agro and Dranco digesters was higher than 50% (
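The comparison of β-diversity profiles between methods can be sketched as follows: a Bray-Curtis dissimilarity matrix is computed per method, and a permutation-based Mantel test then checks whether two such matrices are correlated. This is a minimal NumPy-only illustration and not the implementation used in this study; the function names, the number of permutations, the one-sided test and the simulated abundance tables are all assumptions. The sketch returns the Pearson correlation r between the two matrices, which can be squared if an R² is wanted.

```python
import numpy as np


def bray_curtis_matrix(table):
    """Pairwise Bray-Curtis dissimilarity between the rows (samples) of a table."""
    x = np.asarray(table, dtype=float)
    n = x.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            total = (x[i] + x[j]).sum()
            value = np.abs(x[i] - x[j]).sum() / total if total > 0 else 0.0
            d[i, j] = d[j, i] = value
    return d


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided permutation Mantel test (assumed settings, Pearson correlation)."""
    rng = np.random.default_rng(seed)
    upper = np.triu_indices_from(d1, k=1)
    r_obs = np.corrcoef(d1[upper], d2[upper])[0, 1]
    hits = 0
    for _ in range(permutations):
        # Shuffle the sample labels of one matrix (rows and columns together).
        perm = rng.permutation(d1.shape[0])
        r_perm = np.corrcoef(d1[perm][:, perm][upper], d2[upper])[0, 1]
        if r_perm >= r_obs:
            hits += 1
    return float(r_obs), (hits + 1) / (permutations + 1)


# Simulated data: two hypothetical methods measured on the same 8 samples.
rng = np.random.default_rng(42)
method_a = rng.integers(1, 100, size=(8, 12))
method_b = method_a + rng.integers(0, 5, size=(8, 12))  # closely related profile

r, p = mantel(bray_curtis_matrix(method_a), bray_curtis_matrix(method_b))
print(f"Mantel r = {r:.2f}, r^2 = {r * r:.2f}, P = {p:.4f}")
```

The permutation shuffles rows and columns of one matrix together, so that the dependence between pairwise distances sharing a sample is respected; correlating the raw distance vectors with a standard parametric test would ignore that dependence.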

Correlation within and between the features
Correlation analysis, including seven main process parameters, the three digester types, and the

Thus, a cytometric fingerprint can be obtained in less than an hour, yet, this remains, at present, elusive in more complex systems, such as AD. In contrast, amplicon sequencing or other "omics" methods require a comprehensive sample preparation that takes at least one day of analysis time (Heyer et al., 2019).

The second challenge is information gain. The application of fingerprinting techniques will only be beneficial if they add novel information to the operators compared to conventional process parameters, e.g., methane production, pH, TAN, or VFA. The

An example of the complexity and our lack of knowledge is reflected in syntrophic interactions. A strong correlation was observed between bacterial acetate fermentation, ethanol fermentation, and glycerol metabolism, matching the described pathways for syntrophic ethanol degradation in a co-culture experiment (Keller et al., 2019). Glycerol metabolism correlated to this pathway, since it comprised the missing alcohol dehydrogenase to combine both pathways. In this study, the identified NADP-dependent isopropanol dehydrogenase (Meta-Protein 2708) had even the highest spectral count of all identified proteins (SI3), emphasizing the potential importance of this new pathway. On the downside, the imprecise assignment to glycerol metabolism and a proper taxonomy (in our study, the taxonomy was Entamoeba histolytica) shows the requirement for better databases and further isolation and characterization of single species.

Another example of the difficulty of feature interpretations is the issue that features can have multiple functions. Enzymes for glutamine, glutamate and aspartate metabolism were abundantly present in Sludge digesters, and correlated with low values for pH and TAN.
The reason could be (1) elevated amino acid metabolism or (2) nitrogen assimilation (Bernard and Habash, 2009). Glutamate is also required for osmotic regulation, since it is a counterion for potassium, only present in Sludge digesters at low concentrations (Yan, 2007) (Table 1 & SI3).

A strategy to utilize the omics data for improved process operation of AD would be to shift from fingerprinting methods to panel development. Analogous to clinical panels, such as the blood panel, the panels would summarize the omics methods' features (e.g., the abundance of methanogens). Subsequently, we have to define safe operation borders for these panels by extensive experiments and modelling, before providing them to AD operators. For example, biogas production and process stability can be monitored based on the abundance of methanogens and their enzymes (Heyer et al., 2013; Munk et al., 2012), and a multivariate integrative method, based on a microbial signature of AD inhibition by ammonia and phenol could be used to predict ammonia inhibition (Poirier et al., 2020). On-demand, the complexity of these panels could be extended. Besides the abundance of methanogens and their enzymes, it might be beneficial to add thermodynamic considerations (i.e., process temperature and metabolite concentrations) or the abundance of trace elements, co-factors, and hydrolytic enzymes.

Conclusions
The four different fingerprinting methods, i.e., based on amplicon sequencing, metaproteomics, metabolomics and cytomics, revealed a similar clustering of the microbiomes in AD. This finding highlights that cytometric fingerprinting, a novel approach in AD, is a valid method for fast microbiome fingerprinting in AD, and its speed is a key advantage over other methods. The information provided through fingerprinting can have its merits for process stability monitoring, and the identification of key features, either taxa, proteins or metabolites, gives rise to a more direct interpretation of process performance. To provide this knowledge to AD operators, new strategies are required. We propose that the microbiomes' key features should be summarized into panels and linked with guidance for the operators.

to 85% (red) relative abundance for the amplicon sequencing and 0 (white) to 30% for the metaproteomics data. Only phyla with an average relative abundance > 0.1% were included.

(right, purple square) digesters. The colour scale ranges from 0 (white) to 15% (red) relative abundance. Only functions with an average relative abundance > 0.1% were included.