Measuring both microbial load and diversity with a single amplicon sequencing library

The microbial population size, or load, in a host is a fundamental metric of colonization by commensals or infection by pathogens. Sequence-based measurement of DNA amount contributed by host and microbes in a sample provides a simple way of measuring microbial load, and it also has the ability to estimate microbiome diversity. Unfortunately, it is also very costly, especially when host DNA greatly outweighs microbial DNA. We introduce a robust two-step PCR strategy to quantify absolute abundance of the host-associated microbiome and describe its composition in a single, cost-effective amplicon library. We demonstrate the accuracy and flexibility of this method across multiple amplicons and hosts, and further present a simple technique that can be used, prior to sequencing, to optimize the host representation in a batch of libraries without a loss of information.


Introduction
Knowing the relative abundance of individual taxa reveals important information about any ecological community, including microbial communities, and an expedient means of learning the composition is by counting copy numbers of 16S or 18S rRNA genes (hereafter rDNA), the internal transcribed spacer (ITS), or other microbial amplicons. However, these common amplicon counting-by-sequencing methods do not provide information on the density or load of the microbes, which is typically defined by other methods such as time-consuming serial dilution plating. Critically, the microbial sequence counts lack a denominator that accounts for the amount of the habitat sampled. Typically, such samples are therefore sequenced to a similar depth and scaled to a common value, for example by subsampling each sample to the same number of sequence reads, or converting all abundances to percentages. A sparsely-colonized and a densely-colonized sample are thus processed in an indistinguishable manner. Another limitation of such compositional data is that the more abundant a microbe is, the more it reduces the relative abundance of all other microbes, because the sum of all microbes is constrained. Most study systems are open systems, and the total number of microbial cells can vary over many orders of magnitude due to nutritional contents, infection status, or other factors. Without knowing to what extent microbial abundances are actually changing in absolute terms, it can be very misleading to make inferences from compositional data alone (Gloor et al., 2017;Regalado et al., 2020;Tsilimigras & Fodor, 2016) . Unsurprisingly, experimental determination of the absolute microbial abundance in the environment, for example by relating microbial abundance to sample volume, mass, or surface area, has led to important insights in microbiome research that otherwise would have been missed with relative abundance data (Humphrey &  For many host-associated microbial samples, in particular those from plants (Regalado et al., 2020) , nematodes (Ogier et al., 2020) , insects (Parker et al., 2020;Wu et al., 2019) , and other organisms in which it is difficult or impossible to physically separate microbes from host tissues, both host and microbial DNA are co-extracted. For such samples, assuming thorough DNA extraction, the amount of host DNA is directly proportional to the amount of host tissue sampled (Davies, 1977;Massonnet et al., 2011) , and therefore the ratio of microbial DNA to host DNA is a measure of the microbial load of the sample ( Regalado et al., 2020) . Whole metagenome sequencing (WMS), which indiscriminately and quantitatively sequences all host and microbial DNA, is conceptually the simplest method to simultaneously describe the microbial community and measure the microbial load. In practice, however, host DNA is often greatly overrepresented in WMS data, and wasteful sequencing of the host is often necessary to achieve a useful depth of microbial reads (Karasov et al., 2019;Regalado et al., 2020) . For example, WMS sequencing of a leaf sample from wild Arabidopsis thaliana typically yields > 95% plant DNA and < 5% microbial DNA. This problem becomes intractable as sample number or host genome size increases. Furthermore, many WMS reads remain unclassifiable in complex samples (Karasov et al., 2019;Regalado et al., 2020) .
More commonly, researchers approach host-associated microbial samples with amplicon sequencing, which allows the use of blocking techniques or specially-designed primers to avoid PCR amplification of host DNA (Agler et al., 2016;Chelius & Triplett, 2001;Lundberg et al., 2013) . They then interpret the amplicon data with orthogonal methods to quantify the microbial load. For example, we previously proposed using shallow WMS as a way of measuring microbial load for leaf-associated microbial communities, and then used this information to scale 16S rDNA amplicon data prepared from the same samples (Regalado et al., 2020) . Alternatively, qPCR of both a microbial and host gene can be used to calculate load (Anderson & McDowell, 2015;Karasov et al., 2019;Nadkarni et al., 2002) , or qPCR of a host gene can be used to calibrate the amount of a synthetic rDNA template that can then be spiked into the sample prior to amplicon library preparation (Guo et al., 2019) . Sample volume or mass can also be used to calibrate the amount of such a synthetic spike in (Stämmler et al., 2016;Tourlousse et al., 2017) . Bacteria can be enumerated in samples of known volume or mass by counting colony forming units (CFU), by flow cytometry, or by qPCR of a specific microbial gene (Chen et Vandeputte et al., 2017) . While it can be effective, combining data from any two methods requires more work, uses more of the sample, and introduces more experimental noise and other caveats. For example, CFU counting only can quantify the culturable fraction of the community, quantitative spike-ins and uniplex qPCR are especially reliant on accurate pipetting, and qPCR does not allow identification or exclusion of off-target amplicons.
Past attempts to implement a single amplicon sequencing protocol for simultaneous analysis of bacterial 16S rDNA and eukaryotic host DNA have met significant challenges. Lebeis and colleagues used a single pair of universal SSU rDNA primers to co-amplify both the genomic 18S rDNA from Arabidopsis thaliana and the 16S rDNA from a bacterial community (Lebeis et al., 2015) . However, because plant 18S rDNA is present in hundreds or thousands of copies per genome with exact copy number varying between plant accessions (Rabanal et al., 2017) , it was a poor internal standard. Indeed, to overcome the overabundant plant signal, the authors needed to pre-amplify the bacterial template with a bacteria-specific 16S rDNA oligonucleotide primer, meanwhile blocking amplification of plastid 16S rDNA sequences with peptide nucleic acid (PNA) clamps. Humpfrey and Whiteman attempted to use plant chloroplast rDNA as a proxy for the amount of host (Humphrey & Whiteman, 2020) . They used bacteria-specific 16S rDNA primers to amplify microbial communities from leaves, employing PNA clamps (Lundberg et al., 2013) to reduce the otherwise excessive amplification of SSU rDNA from plant organelles. They showed that the unblocked chloroplast rDNA roughly correlates with the amount of host, and used it to estimate microbial load. The correlation was, however, noisy, likely because the fraction of chloroplast that is blocked by PNAs depends on the total DNA concentration, which varies between samples, and furthermore because the number of chloroplasts in plant cells is variable.
Here, we introduce host-associated microbe PCR (hamPCR), a robust and accurate protocol to co-amplify a low-copy host gene and one or more microbial regions, such as 16S rDNA, in the same reaction. We accomplish this with a two-step PCR protocol similar to that described in established amplicon protocols (Carlson et al., 2013;de Muinck et al., 2017;Gohl et al., 2016;Holm et al., 2019;Lundberg et al., 2013) . In these two-step protocols, gene-specific primers bind to 'raw' templates in the first step and 'tag' the templates with universal overhangs. In the second step, primers with complementarity to the universal overhangs add barcodes and sequencing adapters. Thus, the same set of barcoded primers can be used regardless of the amplicon(s) tagged in the first step, a major advantage in flexibility.
In hamPCR, we convert the first 'tagging' PCR step into a multiplex reaction that combines two or more template-specific primer pairs in the same reaction tube. To avoid propagating any biases due to differential primer annealing or primer availability across multiple cycles, the tagging step runs for only two PCR cycles, which is the minimum sufficient to make a single copy of each template molecule with universal overhangs at both 5' and 3' ends; a similar strategy of short initial cycling is employed in sequencing protocols that add Unique Molecular Identifiers (UMIs) (Fields et al., 2020;Karst et al., 2020;Kou et al., 2016;Tourlousse et al., 2017) and in other two-step multiplex protocols (Carlson et al., 2013;Wen & Zhang, 2012) .
We then remove the excess gene-specific primers with a clean-up reaction and use the 'tagged' templates in the second step of the PCR reaction, performed with a single barcoded primer pair for 20 or more cycles. Because all templates share the same universal overhangs, primer annealing will not bias their amplification. Such PCR co-amplification of diverse fragments is also commonplace in many RNA-seq and shotgun metagenomics protocols (Kukurba & Montgomery, 2015;Quince et al., 2017) . Similar to our work, Carlson and colleagues used a two-step PCR with a multiplexed tagging step to target and quantify both variable and joining segments at human T and B cell receptor loci (Carlson et al., 2013) . They showed through high-throughput sequencing that the method remains quantitative for between five and seven cycles of the first step of PCR. Similarly, Wen and Zhang used three tagging cycles of multiplexed PCR followed by 28-32 cycles of exponential PCR to examine the genetic purity of maize seeds, although they did not sequence their amplicons (Wen & Zhang, 2012) . Using such methods has enormous potential in microbiome research, as we demonstrate.
We examine the technical reproducibility and limitations of hamPCR and validate our results by independent microbial load quantification methods, using both synthetic and wild samples from a variety of hosts. Because hamPCR relies on a user-defined single-or low-copy host gene, it can work for large genomes, as we demonstrate by quantitative endophytic microbial profiling in Capsicum annuum (pepper,~3.5 Gb genome) (Kim et al., 2014) . To further increase flexibility of the method, we designed our host and microbe amplicons to have slightly different lengths, such that they can be resolved on an agarose gel. We show that after pooling finished sequencing libraries, an aliquot of the original pool can be re-barcoded with an unused barcode to form a reference sample that preserves the host-to-microbe ratio in the original pool. The host and microbe fractions in the remainder of the pool can then be separately purified and re-mixed at a favorable ratio for sequencing (for example with host DNA representing an affordable 5-10%). The reference sample is spiked into the re-mixed library before sequencing, and it guides bioinformatic scaling of host sequences in the re-mixed library back to original levels. Thus, in contrast to shotgun sequencing, samples with initially unfavorable host-to-microbe ratios can be easily adjusted prior to sequencing without loss of information.
Because of the practical simplicity and flexibility of hamPCR, it has the potential to supplant traditional microbial amplicon sequencing in host-associated samples, requiring only minor adjustments for labs already using a two-step PCR protocol. For labs not using a two-step PCR protocol for amplicon-based microbiome analyses, the ability to perform hamPCR offers yet another compelling reason to make the switch.

Technical reproducibility
The first 2-cycle "tagging" step of hamPCR multiplexes two or more primer pairs in the same reaction, at least one of which targets a single-or low-copy host gene (Supplementary Figure 1). The tagging primers are then cleaned with Solid Phase Reversible Immobilisation (SPRI) magnetic beads (Rohland & Reich, 2012) (Supplementary Figure 2). Next, an exponential PCR is performed using universal barcoded primers ( Figure 1A, Supplementary Figure 1). As a host amplicon in A. thaliana samples, we targeted a fragment of the GIGANTEA ( GI ) gene, which is well conserved and present as a single copy in A. thaliana and many other plant species (Duarte et al., 2010) . As microbial amplicons, we initially targeted widely-used regions of 16S rDNA.
To assess the technical reproducibility of the protocol, we made a titration panel of artificial samples combining varying amounts of pure A. thaliana plant DNA with pure bacterial DNA that reflects a simple synthetic community (Methods). These represented a realistic range of bacterial concentrations as previously observed from shotgun metagenomics of wild leaves, ranging from about 0.25% to 24% bacterial DNA (Regalado et al., 2020) . We applied hamPCR to the panel, pairing one of three commonly-used 16S rDNA amplicons for the V4, V3V4, and V5V6V7 variable regions with either a 502 bp or 466 bp GI amplicon (Methods, Supplementary Table 1), such that the host and microbial amplicons differed by approximately 80 bp in length and were resolvable by gel electrophoresis. In all pairings, the GI band intensity increased as the 16S rDNA band intensity decreased ( Figure 1B, Supplementary Figure 3). Encouragingly, we noticed the GI and 16S rDNA bands were consistently of similar intensity at around 0.5% to 1% bacterial DNA, which is approximately the concentration at which the templates should have equal copy number, assuming a 135 Mb (1C-value) plant genome with one GI copy and a 4 Mb bacterial genome with three or more 16S rDNA copies.
Focusing on the V4 16S rDNA primer set, 515F -799R, paired with the 502 bp GI amplicon, we amplified the entire titration panel in four independently-mixed technical replicates. In addition to use of the chloroplast-avoiding 799R primer (Chelius & Triplett, 2001) , plant organelle-blocking PNAs (Lundberg et al., 2013) further prevented unwanted 16S rDNA signal from organelles in the pure plant sample ( Figure 1B). We pooled the replicates and sequenced them as part of a paired-end HiSeq 3000 lane. Because the 150 bp forward and reverse reads were not long enough to assemble into full amplicons, we analyzed only the forward reads (Methods), processing the sequences into Amplicon Sequence Variants (ASVs) and making a count table of individual ASVs using Usearch (Edgar, 2010) .

Figure 1. Synthetic samples demonstrate technical reproducibility | A)
Schematic showing the two steps of hamPCR. The tagging reaction (left) shows two primer pairs: one for the host (E-a and F-b) and one for microbes (E-c and F-d). Each primer pair adds the same universal overhangs E and F. The PCR reaction (right) shows a single primer pair (P7-E and P5-i-F) that can amplify all tagged products. B) Representative 2% agarose gel of hamPCR products from the synthetic titration panel, showing a V4 16S rDNA amplicon at~420 bp and an A. thaliana GI amplicon at 502 bp. The barplot underneath shows the predicted number of original GI and 16S rDNA template copies. Numbers boxed below the barplot indicate the percent bacterial genomic DNA of total DNA. C) Relative abundance of the host and microbial ASVs in the synthetic titration panel, as determined by amplicon counting. Pure E. coli , pure A. thaliana without PNAs, and blanks were excluded. D) Data in C converted to microbial load by dividing by host abundance, shown on fourth root-transformed y-axis to better visualize lower abundances.
After identifying the ASVs corresponding to host GI and the bacteria in the synthetic community, we plotted the relative abundance of A. thaliana, the three Sphingomonas ASVs, and the single E. coli ASV across the samples of the titration panel ( Figure 1C). There was high consistency between the four replicates, more than what was visually apparent in the gel ( Figure  1C, Supplementary Figure 3). We next divided the ASV abundances in each sample by the abundance of the host ASV in that sample to give the quantity of microbes per unit of host, a measure of the microbial load. A fourth-root transformed Y axis, used for better visualization of low bacterial loads, revealed consistent and accurate quantification of absolute microbial abundance from 0 up to about 16% total bacterial DNA ( Figure 1D). Through this range, the actual sequence counts for total bacteria matched theoretical expectations based on the volumes pipetted to make the titration (solid black line, Figure 1D). At 16% bacterial DNA, bacteria contributed more than 96% of sequences, and the microbe-to-host template ratio was near 25. At higher microbial loads the trend was still apparent, and the decrease in precision was likely exacerbated by the effects of small numbers; when the host ASV abundance is used as a denominator and the abundance approaches 0, load approaches infinity and all sources of error have a greater influence on the quotient. In our case, only a minority of highly infected plants (Regalado et al., 2020) reached bacterial abundances above this highly-quantitative range.

hamPCR does not distort the detected composition of the microbial community
We amplified the same wild A. thaliana phyllosphere template DNA, either with four technical replicates using V4 16S rDNA primers alone, or alternatively with four technical replicates using hamPCR. After sequencing and deriving ASVs, we first compared ASV abundances within identically-prepared replicates of the pure 16S rDNA protocol to demonstrate best-case technical reproducibility of this established technique. As expected, this resulted in a nearly perfect correlation, with a coefficient of determination R 2 of 0.99 and abundance distributions as indistinguishable by a Kolmogorov-Smirnov test ( Figure 2B). Next, we removed from the hamPCR data the ASV corresponding to A. thaliana GI , and rescaled the remaining microbial ASVs to 100% to give relative abundance data. We then compared microbial ASVs from the four pure 16S rDNA replicates to those from the four rescaled hamPCR replicates. In this comparison as well, R 2 was 0.99 and the distributions were essentially identical ( Figure 2C). Thus, the inclusion of a host amplicon in the reaction did not introduce taxonomic biases.

Sensitivity to number of tagging cycles and template concentration
Two tagging cycles minimize amplification biases that might otherwise have compounding effects due to differential primer efficiencies for the host and microbial templates. However, for templates at low concentrations, inefficiencies due to SPRI cleanup could represent a bottleneck in amplification. Additionally, some types of blocking primers that prevent off-target amplification (Agler et al., 2016) may benefit from additional tagging cycles. To investigate the sensitivity of the results to additional tagging cycles, we applied hamPCR for 2 through 10 tagging cycles, both on the wild A. thaliana phyllosphere DNA described above and on a synthetic plasmid-borne template that contains bacterial rDNA and a partial A. thaliana GI gene template in a 1:1 ratio (Supplementary Figure 4). Surprisingly, for the primers we used, there was no apparent influence of additional tagging cycles, as 7-10 tagging cycles yielded the same distribution of host and 16S rDNA ASV abundances as 2 cycles (Kolmogorov-Smirnov test, p > 0.47). This was true for hamPCR and for 16S rDNA primers alone ( Figure 2D, 2E). This ideal result may not be the case for all primer pairs and should be tested experimentally, but it is consistent with data that either 5 or 7 tagging cycles gave comparable results for quantifying the human immune receptor repertoire (Carlson et al., 2013) , and with the fact that properly-designed multiplex reactions can be used in quantitative PCR carried out with many cycles (Vet et al., 1999) . We noticed that application of hamPCR to the 1:1 synthetic template yielded an average of 56.5% host GI and 43.5% bacteria, invariant with tagging cycle number ( Supplementary Figures 4 and 5). This slight and consistent bias in favor of GI may be a result of slight differences in tagging primer efficiency or primer concentration, and should be fine-tunable by altering primer concentration (Carlson et al., 2013) (Supplementary Figure 6).

Figure 2. hamPCR is robust and does not distort a complex microbial community | A)
DNA extracted from wild A. thaliana phyllospheres was used as a template for both V4 16S rDNA PCR (left, 515F and 799R) and hamPCR (right, V4 16S rDNA and GI 502 bp primers). Four replicates were produced with 2 cycles of the tagging reaction and 30 cycles of the exponential PCR, and additional replicates were produced with 3 to 10 tagging cycles paired with 29 to 22 PCR cycles (for a constant total of 32 cycles). The stacked columns show the relative abundance of major taxa, with the number of tagging cycles underneath. Boxed lower case letters demarcate groups of samples compared below. B) Correlation of fourth-root transformed 16S rDNA ASV abundances for the two 2-cycle 16S rDNA samples above panel A box [a] to the two 2-cycle 16S rDNA samples above panel A box [b]. Only ASVs with a minimum relative abundance of 0.05% were compared. R 2 , coefficient of determination. p -value from Kolmogorov-Smirnov test. C) Same as B , but for the four 2-cycle 16S rDNA samples above box [a] and [b] compared to the four 2-cycle hamPCR samples above box [d]. For hamPCR, the A. thaliana GI ASV was removed and the bacterial ASVs were rescaled to 100% prior to the comparison. D) Same as B and C , but for the four 2-cycle 16S rDNA samples above box  ii : An aliquot of DNA from the pool in i is re-amplified with 8 cycles of PCR to replace all barcodes in the pool with a new barcode, creating a reference sample. iii: A large aliquot of the pool from i is run on an agarose gel to physically separate and then purify the host and microbial fractions. iv: The host and microbial fractions and the reference sample are pooled in the ratio desired for sequencing. v: All sequences are quality filtered, demultiplexed, and taxonomically classified using the same parameters. vi: Host and microbial amplicon counts are summed from the samples comprising the pooled library ( h and m respectively), and from the reference sample ( H and M ). vi : H, h, M, and m are used to calculate the scaling constant f for the dataset . All host sequence counts are multiplied by f to reconstruct the original microbe-to-host ratios. vii: Reconstructed original abundances. B) Relative abundance (RA) of actual sequence counts from our original HiSeq3000 run. C) Relative abundance of actual sequence counts from our adjusted library showing reduced host and 4 reference samples. D) The data from C after reconstructing original host abundance using the reference samples. E) The total fraction of host vs. other ASVs in the original library, reduced host library, and reconstruction. RA, relative abundance F) Relative abundances in the original and reconstructed library for all ASVs with a 0.05% minimum abundance, shown on 4 th -root transformed axes. R 2 , coefficient of determination. p -value from Kolmogorov-Smirnov test.
As a further exploration of the robustness of the protocol, we applied hamPCR to a range of total A. thaliana leaf template concentrations of between 5 and 500 ng total DNA per reaction, covering a typical template range of 5 to 100 ng. Through the typical range, there was no difference in microbe or host ASV abundances. At 200 ng or above, the host amplicon seemed to be slightly favored, possibly because the 16S rDNA primers started to become limiting at these concentrations (Supplementary Figure 7).

Pre-sequencing adjustment of host-to-microbe ratio
We realized that the size difference between host and microbe bands in hamPCR affords not only independent visualization of both amplicons on a single gel, but also allows convenient and easy adjustment of the host and microbial signals in the pooled library prior to sequencing, in order to improve cost effectiveness. We developed a strategy by which the final hamPCR amplicons are pooled and one aliquot of the pool is rebarcoded to form a reference sample that preserves the original host-to-microbe ratio. The remainder of the pool is run on a gel and the host and microbial bands are separately purified, quantified, and remixed (in our case, to reduce host and gain more microbial resolution). The rebarcoded reference sample, which was not remixed and thereby preserves the original ratio, can be sequenced separately or spiked into the remixed library prior to sequencing. Following sequencing, the reference sample provides the key to the correct host and microbe proportions, allowing simple scaling of the entire library back to original levels ( Figure 3A, Supplementary Figure 8).
We made 4 replicate reference samples for our HiSeq 3000 run, which included much of the data from Figure 1 and Figure 2, and then separately purified the host and microbial fractions of the library (Supplementary Figure 9). Based on estimated amplicon molarities of the host and microbial fractions, we remixed them targeting 5% host DNA, added the reference samples, and sequenced the final mix as part of a new HiSeq 3000 lane. A stacked-column plot of relative abundances for all samples on the original run clearly showed the host A. thaliana GI ASV highly abundant in some samples, on average responsible for about 22% of total sequences in the run ( Figure 3B, 3E). The remixed reduced host library had nearly 10-fold less host GI ASV, 2.6%, slightly lower than our target of 5% ( Figure 3C, 3E). The reference samples averaged 19.2% of host GI ASV, very close to the 22% host fraction in the original library. After using the reference samples to reconstruct the original host abundance in the remixed dataset, we recreated the shape of the stacked column plot from the original library (compare Figure 3D to 3B). When the fourth-root abundances for ASVs above a 0.05% threshold were compared between the original and reconstructed libraries, the R 2 coefficient of determination was 0.99, with no significant difference between the distributions (Kolmogorov-Smirnov test, p > 0.86).

hamPCR with different 16S rDNA regions compared to shotgun metagenomics
We next applied hamPCR to leaf DNA from eight wild A. thaliana plants that we had previously analyzed by WMS, and from which we therefore had an accurate estimate of the microbial load (Regalado et al., 2020) . We applied hamPCR with primer combinations targeting the host GI gene and either the V3V4, V4, or V5V6V7 variable regions of the 16S rDNA. We produced three independent replicates for each primer set, which we averaged for final analysis (Figure 4, Supplementary Figure 10). Across WMS and the three hamPCR amplicon combinations, the relative abundance of bacterial families was consistent ( Figure 4A-D, i), with slight deviations likely due to the different taxonomic classification pipeline used for the metagenome reads (Regalado et al., 2020) , as well as known biases resulting from amplification or classification of different 16S rDNA variable regions (Graspeuntner et al., 2018) . After converting both WMS and hamPCR bacterial reads to load by dividing by the plant reads in each sample, we recovered a similar pattern despite the quantification method, with decreasingly lower total loads progressing from plant S1 to plant S8, and individual bacterial family loads showing similar patterns ( Figure 4A-D, ii, iii). Relative differences in load when comparing the different hamPCR amplicons are likely in part due to different affinities of the 16S rDNA primer pairs, and rDNA copy number variation among the microbial families (Kembel et al., 2012) . To quantify the consistency of hamPCR load estimates with WGS load estimates, we plotted the loads against each other and found strong positive correlations, with the highest correlation with the hamPCR using V4 rDNA ( Figure 4E-G).
It is important to note that while relative load ratios between samples were consistent across hamPCR primer sets, the total microbe-to-host ratio varied substantially, with the maximum V5V6V7 16S rDNA total load at less than 3 times host, and the maximum V4 16S rDNA total load near 16 times host. This is likely due to variation in GI and 16S rDNA primer efficiencies. To make a statement about the absolute ratio of plant cells to bacterial cells using hamPCR it would be important to include standard samples with known bacterial load ratios. A. thaliana GI amplicon and a~590 bp V3V4 16S rDNA amplicon (primers 341F -799R). C) Similar to B, but with the 16S rDNA primers targeting a~420 bp V4 16S rDNA amplicon (515F -806R). D) Similar to B, but with a 466 bp A. thaliana GI amplicon and a~540 bp V5V6V7 16S rDNA amplicon. E) 4 th root transformed abundance of each bacterial family determined by hamPCR of V3V4 16S rDNA plotted against the 4 th -root transformed bacterial load from shotgun metagenomics. R 2 = Coefficient of determination. F) Same as E , but for hamPCR of V4 16S rDNA. G) Same as E , but for V5V6V7 16S rDNA.

Three-amplicon hamPCR for simultaneous determination of oomycete and bacterial load
More than two amplicons can be quantitatively tagged and amplified, although it requires more initial troubleshooting to find compatible primers (Carlson et al., 2013;Shuber et al., 1995;Vet et al., 1999;Wen & Zhang, 2012) . We set up hamPCR to co-quantify bacteria and eukaryotic oomycetes on plants. These are both diverse microbial groups that include important A. thaliana pathogens and that cannot be captured by the same rDNA primer set. We first tested universal ITSo primers targeting oomycete rDNA (Methods) in combination with A. thaliana GI primers and 16S rDNA primers targeting the bacterial V4, V3V4, or V5V6V7 regions, using as template our synthetic plasmid that includes templates for the three primer sets in equal proportion (Methods). A combination of all three amplicons seemed to work efficiently for the V4 region (Supplementary Figure 11), and with this encouraging result, we set up a simple infection experiment. As pathogens, we prepared local strain 466-1 of the obligate biotrophic oomycete Hyaloperonospora arabidopsidis ( Hpa ) (Coates & Beynon, 2010) and the well-described bacterial pathogen Pseudomonas syringae pv. tomato ( Pto ) DC3000 (Xin & He, 2013) . We used two A. thaliana genotypes: the reference accession Col-0, which is resistant to Hpa 466-1 but susceptible to Pto DC3000, and an enhanced disease susceptibility 1 ( eds1-1 ) mutant, which has a well-studied defect in a lipase-like protein necessary for many disease resistance responses and which is susceptible to both pathogens (Bhandari et al., 2019) .
We infected seedlings with either Hpa 466-1 alone, a mix of Hpa 466-1 and Pto DC3000, or a buffer control, and maintained them for 7 days under cool, humid conditions ideal for Hpa growth (Methods). The eds1-1 plants inoculated with Hpa 466-1 became heavily infected and sporangiophores were too numerous to count. No visible bacterial disease symptoms were present on any of the plants, likely because the cool temperature decelerated bacterial growth and symptom appearance (Huot et al., 2017) . We ground pools of 4-5 seedlings in a buffer and used a small aliquot to count Pto DC3000 CFUs, and the remainder of the lysate for DNA isolation and hamPCR. Despite the lack of bacterial symptoms, we recovered Pto DC3000 CFUs from the inoculated plants. We applied hamPCR to these samples using the ITSo/16S/ GI primer set, but due to excessive ITSo product, we repeated library construction replacing the ITSo primers with primers for a single copy Hpa actin gene (Supplementary Figure 11) (Anderson & McDowell, 2015) . Intensity of the actin product correlated with visual Hpa symptoms (Supplementary Figure 11). Sequencing the libraries confirmed Hpa and Pto ASVs in the inoculated samples, as expected. A standard bacterial relative abundance plot, as would be obtained from pure 16S rDNA data, confirmed the presence of Pto DC3000 in the bacteria-infected samples, and in addition revealed that Hpa -infected samples had a different bacterial community than uninfected samples ( Figure 5A). Importantly, it failed to detect obvious differences between microbial communities on Col-0 and eds1-1 plants. However, after including the actin ASV from Hpa and converting all abundances to microbial load, a striking difference became apparent between Col-0 and eds1-1 , with eds1-1 supporting higher bacterial and Hpa abundances. This is expected from existing knowledge (Bhandari et al., 2019) , and supported by Pto DC3000 CFU counts from the same plants ( Figure 5B, Figure 5C). The microbial load plot also revealed that Hpa -challenged plants supported more bacteria than buffer-treated plants, indicating either that successful bacterial colonizers were unintentionally co-inoculated with Hpa , or that Hpa caused changes in the native flora ( Figure 5B).
To confirm that the sequence abundances for all three amplicons accurately reflected the concentration of their original templates, we prepared a stepwise titration panel with real samples, mixing increasing amounts of DNA from an uninfected eds1-1 plant (low load) into decreasing amounts of DNA from an Hpa -infected eds1-1 plant (high load). Sequencing triplicate hamPCR libraries revealed a stepwise increase in ASV levels for all amplicons, consistent with the expectation based on pipetting ( Figure 5D). These data, combined with the infection experiment, show that hamPCR is quantitative for at least two independent microbial amplicons in real-world samples. Hpa and Pto DC3000. The ASV corresponding to Pto DC3000 is shown separately from other Pseudomonadaceae; all other bacteria are classified to the family level, including remaining Pseudomonadaceae. B) The same data as shown in A, but converted to microbial load and with the ASV for Hpa included. The asterisk indicates what is most likely an unreliable load calculation, because host ASV abundance was below 2% of the total. Same color key as in A, with an additional color for Hpa added.
Hpa amplicon abundance was quadrupled in this panel for better visualization. C) Pto DC3000 bacteria were quantified in parallel on the Col-0 and eds1-1 samples infected with Pto DC3000 using CFU counts, the microbial load data in B, or the relative abundance data in A. D) An uninfected plant sample was titrated into an Hpa -infected sample to make a panel of eight samples. i: the relative abundance of all hamPCR amplicons. ii: after using host ASV to convert amplicons to load. The cumulative load is shown in black. iii: the load on a 4 th -root transformed y-axis, showing less-abundant families. iv: stacked column visualization of the panel as it would be seen with pure 16S rDNA data. v: stacked-column plot of the panel corrected for microbial load. Same color key as in B, but with colors for A. thaliana and sum of microbes added. E) Similar to D, but with the nematode worm P. pacificus as host. Instead of bacterial families, specific ASV abundances are shown.

Utility in diverse hosts and crops with larger genomes
To demonstrate the utility of hamPCR outside of plants, we prepared samples of the nematode worm Pristionchus pacificus , fed on a diet of either pure E. coli OP50, or alternatively a mix of E. coli OP50 with Pto DC3000 and Lysinibacillus xylanilyticus . The worms were washed extensively with PBS buffer to remove epidermally-attached bacteria, enriching the worms for gut-associated bacteria, and we prepared DNA from each sample. In the same manner as described in the previous section, we titrated the two DNA samples into each other to create a panel of samples representing a continuous range of colonization at biologically-relevant levels. Over three replicates, hamPCR accurately captured the changing bacterial loads of the gut microbes ( Figure 5E, Supplementary Figure 12 ).
As a final proof-of-concept experiment for the potential of hamPCR for microbiome research in species with larger genomes, such as crops, we used bell pepper ( Capsicum annuum ), which has a 3.5 Gb genome (Kim et al., 2014) , approximately 25× bigger than A. thaliana , and the pepper pathogenic bacterium Xanthomonas euvesicatoria ( Xe ) strain 85-10 (Thieme et al., 2005) . We first constructed a simple infiltration panel. After diluting Xe 85-10 in MgCl 2 buffer to final concentrations of 10 4 , 10 5 , 10 6 , 10 7 and 10 8 CFU / mL, we infiltrated four replicate leaves with each concentration. Immediately afterwards, without further bacterial growth, we harvested leaf discs within inoculated areas using a cork borer (7 mm diameter). From each sample, we used fresh lysate for Xe 85-10 CFU counting and processed the remaining lysate into DNA. We used the DNA both for hamPCR, targeting the V4 16S rDNA and the pepper GI gene ( CaGI ), and qPCR, targeting the xopQ gene for a Xe type III effector (Doddaraju et al., 2019) , and the C. annuum UBI-3 gene for a ubiquitin-conjugating protein (Wan et al., 2011) .
Sequencing the hamPCR libraries revealed that as the Xe 85-10 infiltration concentration increased, so did the resulting load of the ASV corresponding to Xe 85-10 ( Figure 6A, Supplementary Figure 13). The other major bacterial classes detected in the infiltration experiment, comprising commensal bacteria already present in the leaves, were Actinobacteria as well as Alpha-, Beta-, and Gammaproteobacteria. These had similar, low abundances, regardless of the amount of infiltrated Xe 85-10 ( Figure 6B). The Xe 85-10 ASV loads showed nearly the same exponential differences between samples as CFU counts recovered from the same lysates, although at lower infiltration concentrations, hamPCR gave a higher estimate than CFU counts ( Figure 6C). Like hamPCR, qPCR also gave higher estimates than CFU counts at low microbial loads. The presence of a low level of native Xe on the leaves could potentially explain this discrepancy, because they would be detected by DNA-based methods, but would not have the antibiotic resistance necessary to be counted in the CFU measurements.

Figure 6. hamPCR can be generalized to
Xanthomonas infection of C. annuum | A-C: Infiltration experiment. A) Xe 85-10 was inoculated into C. annuum leaves at five concentrations, leaves were harvested immediately afterwards, and hamPCR was performed. The corrected load is shown for the particular ASV corresponding to Xe 85-10, as well as for the major bacterial classes. B) The total load for all bacterial classes shown in A at each concentration. C) Actual CFU counts for Xe 85-10 in the infiltration experiment juxtaposed with aligned qPCR and hamPCR loads. D-F: Growth curve experiment. D) Xe 85-10 was inoculated into C. annuum leaves at 10 4 CFU/mL. Leaf samples were taken at 0, 2, 4, 7, 9, and 11 days post inoculation (dpi), and hamPCR performed. The corrected load is shown for the particular ASV corresponding to Xe 85-10, as well as for the major bacterial classes. E) The total load for all bacterial classes shown in D at 0, 2, and 4 dpi (*** = p < 0.001). F) Actual CFU counts for Xe 85-10 in the growth curve experiment juxtaposed with aligned qPCR and hamPCR loads.
Having confirmed that hamPCR could capture absolute changes in pathogen abundance in our system, we set up a growth-curve experiment to monitor the pathogen and the commensal microbes through time. We infiltrated six C. annuum leaves of six different plants with Xe 85-10 at a concentration of 10 4 CFU / mL, and took samples from each plant at 0, 2, 4, 7, 9 and 11 dpi for CFU counting, qPCR, and hamPCR. We observed a rapid increase in Xe 85-10 ASV abundance as a result of rapid bacterial growth, leveling off at 7 dpi ( Figure 6D). By 7 dpi, bacterial growth had reduced the host GI amplicon abundance to below 2%, making the load corrections unreliable from day 7 on (gray box, Figure 6D). Notably, the other major bacterial classes, Actinobacteria and the Alpha-, Beta-, and Gammaproteobacteria, also increased in absolute abundance through time, a trend significant even comparing 2 dpi to 0 dpi ( Figure 6E, Mann Whitney U-test, p < 0.001). This increase in load for the other classes was not an artifact due to high Xe 85-10 titers, because in the infiltration experiment, measurements for these classes had not changed even at higher pathogen concentrations ( Figure 6B). This subtle but biologically significant effect would be invisible in a pure 16S rDNA amplicon analysis.
In the growth curve, Xe 85-10 ASV loads compared very closely to CFU counts and to qPCR abundances up to 7 dpi ( Figure 7F). Unlike in the infiltration experiment, the three methods agreed well at low Xe 85-10 concentrations. After 7 dpi, CFU counts and qPCR continued to show slight increases in load. However, hamPCR flatlined; above the concentration reached at 7 dpi, the number of reads comprising the host ASV became too low for accurate load estimates (vertical gray bars, Figure 6F).

Discussion
With hamPCR, we developed a simple and robust method to quantitatively co-amplify one or more microbial amplicons along with an unrelated single-copy host gene in the same reaction, allowing accurate determination of microbial community composition and microbial load from a single sequencing library (Figure 1, Figure 2). Furthermore, we also developed a method to predictably optimize the amount of sequencing effort devoted to microbe vs. host, without losing information about the original microbe to host ratio ( Figure 3). This is an important advance with our approach that greatly increases cost-efficiency when sequence capacity is limited.
The principle behind hamPCR builds on a body of literature describing related, firmly established techniques. The two-step PCR protocol on which hamPCR relies is well-established for microbial amplicon sequencing (de Muinck et  . The use of multiple primer pairs in a first step of a two-step PCR protocol and limiting the cycling of the first step to reduce priming bias also has precedent (Wen & Zhang, 2012) , as does the use of high-throughput sequencing for quantifying amplicons generated through such a protocol (Carlson et al., 2013) . Co-amplification of diverse fragments from multiple organisms by PCR has proven reliable in low-input shotgun metagenomics and RNA-seq protocols, albeit with some PCR-related biases such as differential amplification based on GC content (Bowers et al., 2015;Carlson et al., 2013;Rinke et al., 2016) . Multiplexed reactions for fluorometric qPCR analysis are also well described, including in a protocol to quantitatively co-amplify microbe and host DNA (Weller et al., 2000) . This rich literature should increase confidence when implementing hamPCR in microbiome research, and it also provides rich resources for optimization and further development.
To perform hamPCR, a universal two-step PCR protocol for amplicon sequencing is required (Carlson et Lundberg et al., 2013) . A two-step protocol already provides the major flexibility advantage that only a one-time investment in a set of universal barcoding primers for step two is required, which then can be easily adapted to any amplicon by simply swapping in different template-specific primers for step one. For labs already equipped for two-step PCR, hamPCR is simple to implement, requiring only slight adjustments to cycling conditions, and ordering a handful of template-specific host and microbe primers for the first step.
Throughout our reactions, we have used 30 PCR cycles for the exponential step for consistency. However, reducing the PCR cycle number whenever possible will reduce noise and bias. We have demonstrated that the tagging step in hamPCR is sensitive to the relative concentrations between the host and microbe primers (Supplementary Figure 6), consistent with the effects of primer concentration on amplification efficiency in qPCR (Bustin & Huggett, 2017;Pierce et al., 2005;Sanchez et al., 2004) . This property was exploited by Carlson and colleagues, who adjusted primer concentrations until multiple amplicons gave the correct product ratio (Carlson et al., 2013) . The effect of primer concentration has important implications for how a large project should be prepared. We recommend that the tagging primers should be carefully pipetted into a multiplexed primer master mix sufficiently large to be used for the entire project. If such a mix is not possible, resequencing of the same control samples across sample batches can allow detection and calibration of slight batch differences in the microbe-to-host ratios.
A limitation of hamPCR is reduced accuracy at very high microbial loads. As the number of host sequences used as a denominator approaches 0, the quotient, that is the calculated load, quickly approaches infinity, and small sources of noise become amplified. Higher loads also require deeper sequencing to ensure that enough host reads are sequenced. For our samples, only the most highly-infected samples reached a level that interfered with accurate quantification. We expect that a simple application of hamPCR will be precise and accurate for the majority of colonized hosts. There are three straightforward adjustments, however, that can increase host signal and thus reduce noise for samples with the highest microbial loads. First, adjusting the host and microbe ratio prior to sequencing, as demonstrated in Figure 3, could be used to increase the overall host representation. If only a subset of the samples has very high loads, one could increase host in only this subset, as long as a reference sample is also made from it to reconstruct the original microbe to host ratio in that subset. Second, a host gene with a higher copy number could be chosen for template tagging throughout the entire project. Finally, carefully increasing the concentration of the host primer in the tagging reaction could also increase the representation of host (Supplementary Figure 6) (Carlson et al., 2013) .
In summary, we have demonstrated that hamPCR is agnostic to the taxonomic identities of the organisms studied, their genome sizes, or the functions of the genomic regions amplified. We have also shown that hamPCR can quantitatively monitor three amplicons at the same time, and in principle can multiplex more with careful design. Our focus here has been on tracking whole organisms, but hamPCR would also enable quantitatively monitoring bacterial populations and sub-genomic elements, such as a plasmid that might not be shared by all strains in a population. An exciting application of hamPCR is for the study of endophytic microbial colonization and infection in crop plants, many of which have very large genomes that preclude the analysis of any sizable number of samples by shotgun sequencing. In a previous study, we sequenced leaf metagenomes from over 200 A. thaliana plants (Regalado et al., 2020) . To invest the same sequencing resources in wheat, assuming comparable microbial loads, the same investment would barely be sufficient for two samples due to the > 16 Gb wheat genome. Other exciting applications are the recognition of cryptic infections (Stergiopoulos & Gordon, 2014) , tracking of mixed infections, and measurement of pathogen abundances on hosts showing quantitative disease resistance -this could even be accomplished by spiking hamPCR amplicons into the same sequencing run used to genotype the hosts (St Clair, 2010) . Finally, for projects with many samples, the fact that hamPCR derives microbial composition and load from the same library not only saves cost and uses less of the sample, but also simplifies analysis and project organization.

Methods hamPCR Protocol
hamPCR requires two steps: a short 'tagging' reaction of 2 cycles, and a longer 'exponential' reaction. We used 30 cycles throughout this work, although fewer can and should be used if the signal is clear. The primers employed in the tagging reaction were used at ⅛ the concentration of the exponential primers, as this still represents an excess in a reaction run for only two cycles, prevents waste, and reduces dimer formation. See Supplementary Information and Supplementary Table 1 for detailed information about the primers.

Tagging Reaction
We used Taq DNA Polymerase (NEB, Ipswich, MA, USA) for the first tagging step, and set up 25 μL reactions as follows. Each well received 20 μL of master mix and 5 μL of DNA (around 50 ng). Completed reactions were thoroughly mixed on a plate vortex and placed into a preheated thermocycler. We used the following cycling conditions: 1) 94° C for 2 min. Denature 2) 78° C for 10 sec. PNA annealing 3) 58° C for 30 sec.
Primer annealing 5) 72° C for 2 min. Extension 6) GO TO STEP 2 for 1 additional cycle 7) 16° C forever Hold The tagging reaction was cleaned with Solid Phase Reversible Immobilization (SPRI) beads (Rohland & Reich, 2012) . All ITS amplicons were cleaned with a 1.1:1 ratio of SPRI beads to DNA, or 27.5 μL beads mixed in 25 μL of tagged template. After securing beads and DNA to a magnet and removing the supernatant containing primers and small fragments, beads were washed twice with 80% ethanol, air dried briefly, and eluted in 17 μL of water. For primer sequences, see Supplementary Information and Supplementary Table 1.

Exponential Reaction
15 µL of the tagged DNA from step one was used as template for the exponential reaction. To reduce errors during the exponential phase, we used the proof-reading enzyme Q5 from NEB, with its included buffer. We prepared reactions in either 25 μL for technical tests with replicated samples. For samples prepared without sequenced replicates, we prepared bias-reducing triplicate reactions in which a 40 μL mix was split into 3 parallel reactions of~13 μL prior to PCR. We first distributed 8.75 μL (or 23 μL for 40 μL mixes) of master mix to each well. We then added 15 μL of the DNA from the tagging reaction and 1.25 μL (or 2 μL for 40 μL mixes) of 5 μM barcoded reverse primer. For the 40 μL mixes, 13 μL was pipetted into two new PCR wells. The PCR reactions were placed into a hot thermocycler and cycled with the following conditions: 1) 94° C for 2 min. Denature 2) 94° C for 30 sec.

Library Quality Control and Pooling
For visualization, 5 μL of PCR product was mixed with 3 μL of 6x loading dye and all 8 μL loaded on a 2% agarose gel and stained with ethidium bromide. The remaining PCR products were cleaned with a SPRI-to-DNA ratio of 1.1:1.0 (v/v). The DNA concentrations in the cleaned products were measured with PicoGreen (Invitrogen, Carlsbad, CA, USA) and samples were pooled at equimolar total DNA ratios. We note that because host and microbial fractions are independently visible on the gel, it would also be possible to measure the quantity of microbial products with image analysis software such as ImageJ (Rueden et al., 2017) and pool at equimolar microbial ratios.
The pooled library was diluted to~1 ng/uL and run on a Bioanalyzer High Sensitivity DNA chip (Agilent, Santa Clara, CA, USA) to check library purity and to estimate the expected ratio of host to microbial amplicons in the sample.

Pre-Sequencing Adjustment of Host : Microbe Ratio
To adjust the host-to-microbe ratio in the "synthetic template panel" and "cycle number test" prior to sequencing on a HiSeq3000 instrument (Illumina, San Diego, CA, USA), four reference samples were first made by rebarcoding the original pooled library (Supplementary Figure 8). To accomplish this,~5 ng of of the pooled library was used in a 30 μL PCR reaction as follows:

Master Mix
For 30 μL 5x Q5 buffer 6 μ L Q5 polymerase 0.3 μ L dNTPs (10 mM) 0.6 μ L 100 μM F universal PCR primer 0.075 μ L ** 5 μM reverse barcoded primer 1.5 μ L ** 5 ng original pooled library 5 μ L Water 16.45 μL ** not part of master mix After distributing 23.5 μL of master mix to each well, 5 μL of the diluted original library was added to each well (5 ng total), along with 1.5 μL of 5 μM barcoded reverse primer. Just prior to placing the reactions in the thermocycler, a 5 μL pre-PCR aliquot was removed from each one and kept on ice to preserve the pre-PCR concentrations. The remaining 25 μL reaction was placed into a preheated thermocycler and run for 8 cycles, using the following cycling conditions: 1) 94° C for 2 min. Denature 2) 94° C for 30 sec.
PNA annealing 4) 60° C for 1 min. Primer annealing 5) 72° C for 1 min. Extension 6) GO TO STEP 2 for 7 additional cycles 7) 16° C forever Hold Following PCR, the pre-PCR aliquots were run alongside 5 μL of post-PCR product on a 2% gel to confirm successful amplification of the reference libraries (Supplementary Figure 9). The remaining 20 μL of PCR reactions were then cleaned with SPRI beads (1.5 : 1.0 [v/v]) and set aside. A large aliquot of the original library (approximately 50 ng) was also run on a 2% gel to separate the host and microbe bands for individual purification. The bands were cut out of the gel and each band was put into a separate Econospin spin column (Epoch Life Sciences, Missouri City, TX, USA) without any other liquids or binding buffer. The gel slices were centrifuged at maximum speed to force the liquid containing the DNA into the bottom chamber, leaving the dried gel on top. The eluted DNA was cleaned with SPRI beads at 1.5 : 1.0 (v/v) and eluted in EB.
The purified pooled host library fraction, pooled microbe library fraction, and each of the four reference libraries were quantified with Picogreen and the molarity of each was estimated. The pools were then mixed together, targeting host molarity at 5% of the total and each reference library at 1% of the total.

Illumina Sequencing
Pooled and quality-checked sequencing libraries were cleaned of all remaining dimers and off-target fragments using a BluePippin (Sage Science, Beverly, MA, USA) set to a broad range of 280 to 720 bp. The libraries were then diluted for Illumina sequencing following manufacturers' protocols. Libraries were first diluted to 2.5 -2.8 nM in elution buffer (EB, 10 mM Tris pH 8.0) and spiked into a compatible lane of the HiSeq3000 instrument (2 x 150 bp paired end reads) to occupy 2-3% of the lane. Samples were sequenced across four total lanes (Supplementary Table 1)

Sequence Processing
The sequences were demultiplexed first by the 9 bp barcode on the PCR primers (Supplementary Table 1), of which there are 96, not allowing for any mismatches. In some cases in which two samples differed in both their host and microbe primer sets, we amplified both samples with the same 9 bp barcode to increase multiplexing; such samples were further demultiplexed using regular expressions for the forward primer and reverse primer sequences. Following demultiplexing, all samples were filtered to remove sequences with any mismatches to the expected primers. With HiSeq3000 150 bp read lengths, overlap of read 1 and read 2 was not possible for our amplicons, and therefore only read 1 was processed further.

Samples Synthetic Titration Panel
Seeds from the Arabidopsis thaliana accession Col-0 were surface sterilized by immersion for 1 minute in 70% ethanol with 0.1% Triton X-100, soaking in 10% household bleach for 12 minutes, and washing three times with sterile water. Seeds were germinated axenically on ½ strength MS media with MES, and about 2 g of seedlings were harvested after 10 days. DNA was extracted in the sterile hood as in (Regalado et al., 2020) and diluted to 10 ng/μL in elution buffer (10 mM Tris pH 8.0, hereafter EB). Pure E. coli and Sphingomonas sp. cultures were likewise grown with LB liquid and solid media respectively, and DNA was extracted using a bead beating protocol (Regalado et al., 2020) . E. coli DNA was used separately, or alternatively pooled with the mixed Sphingomonas DNA, and diluted to 10 ng/μL. The plant DNA and microbial DNA were then combined according to the following

Synthetic Equimolar Plasmid Template
The ITS1 region from Agaricus bisporus , a fragment of the GI gene from A. thaliana Col-0 accession, the 16S rRNA gene from Pto DC3000 , and the ITS1 region from H. arabidopsidis were PCR amplified individually, combined into one fragment via overlap extension PCR, and cloned into pGEM®-T Easy (Promega, Madison, WI, USA). The sequences of these templates can be found in Supplementary Information.

Wild A. thaliana Samples
DNA from chosen samples previously analyzed by conventional 16S rDNA-only sequencing of the V4 region and shotgun metagenomics (Regalado et al., 2020) was reused, chosen to capture a wide range of realistic bacterial loads. The samples were individually assayed with hamPCR using 5 μL DNA template (approximately 50 ng). V4 tagging was performed with 515_F1_G-46603 and 806_R1_G-46631 (V4 16S rDNA) and At.GI_F1_G-46602 and At.GI_R502bp_G-46614 (502 bp A. thaliana GI ). These were the only V4 samples tagged with the 806R primer instead of the nearby 799R primer, and it was used to enable direct comparison to the dataset in (Regalado et al., 2020) . V3V4 tagging was performed with 341_F1_G-46605 and 799_R1_G-46601 (V3V4 16S rDNA) and At.GI_F1_G-46602 and At.GI_R502bp_G-46614 (502 bp A. thaliana GI ). V5V6V7 tagging was performed with 799_F1_G-46628 and 1193_R1_G-46629 (V5V6V7 16S rDNA) and At.GI_F1_G-46602 and At.GI_R466bp_G-46652 (466 bp A. thaliana GI ). Each exponential PCR reaction was completed in a single reaction of 25 μL; each sample was replicated three times.

Wild A. thaliana Mixed sample
DNA from samples previously analyzed by conventional 16S rDNA-only sequencing of the V4 region and shotgun metagenomics (Regalado et al., 2020) were pooled to prepare a single abundant mixed sample to be used repeatedly for technical tests.
The infected plants were grown at 16°C for 8 days (10 hours light, 14 hours dark) and harvested by pooling all seedlings in each pot into a sterile pre-weighed tube, which was again weighed to find the mass of the seedlings. Three 5 mm glass balls and 300 μL 10 mM MgCl 2 were added to each tube and the plant cells were lysed at a speed of 4.0 m/s for 20 seconds in a FastPrep-24™ Instrument (MP Biomedicals, Illkirch-Graffenstaden, France) to release the live bacteria from the leaves. From the pure lysate, 20 μL was used for a serial log dilution series, and 5 μL of each dilution was plated on LB agar supplemented with 100 μg/mL rifampicin. Colony forming units (CFUs) were counted after 2 days of incubation at 28°C. The remaining 280 µL of lysate were combined with 520 µL DNA lysis buffer, 0.5 mL of 1 mm garnet sharp particles (BioSpec, Bartlesville, OK, USA). 60 μL of 20% SDS was added to make a final SDS concentration of 1.5%, and DNA was extracted using a bead beating protocol (Regalado et al., 2020) . The number of Hpa sporangiophores was too high to be accurately quantified by visual counting.

Titration with of plant DNA infected with H. arabidopsidis
A titration panel was made combining different amounts of DNA from uninfected plants ( eds1-1 treated only with 10 mM MgCl 2 ) and DNA from Hpa -infected plants ( eds1-1 infected with Hpa as described above). Infected and uninfected pools were each diluted to 6 ng/ μL, and combined in 0:7, 1:6, 2:5, 3:4, 4:3, 5:2, 6:1, and 7:0 ratios. These were tagged using the same three primer sets described above for Hpa actin, A. thaliana GI , and V4 16S rDNA above. Each exponential PCR reaction was completed in a single reaction of 25 μL; hamPCR was replicated on the titration three times.

Capsicum annuum Infections with Xanthomonas
Leaf infiltration log series: Using pressure infiltration with a blunt-end syringe, C. annuum cultivar Early Calwonder (ECW) leaves were inoculated with Xanthomonas euvesicatoria ( Xe ).
Growth curve: Xe strain 85-10, resuspended in 10 mM MgCl 2 to a final concentration of 10 4 CFU / mL, was infiltrated via a blunt end syringe into 6 C. annuum (ECW) leaves of 6 different plants. Upon 0, 2, 4, 7, 9 and 11 dpi (days post inoculation) 4 leaf discs (7 mm diameter) from each inoculated leaf were harvested and bacterial numbers were determined as described above. 250 μL of leaf lysates were used for a bead-beating DNA prep as described for all other samples above.

Test of Tagging Step Cycle Numbers
As templates, we used a pool of mixed wild A. thaliana leaf DNA (~50 ng / reaction) and the "synthetic equimolar plasmid template" (~0.05 ng / reaction). For the wild A. thaliana leaf DNA, we tested V4 16S rDNA primers alone in the tagging step vs. hamPCR with V4 16S rDNA primers plus primers for the host GI gene. For the "synthetic equimolar plasmid template", we used only hamPCR. Specifically, we used 515_F1_G-46603 and 799_R1_G-46601 (V4 16S rDNA) and At.GI_F1_G-46602 and At.GI_R502bp_G-46614 (502 bp GI gene). We applied hamPCR for 2, 3, 4, 5, 6, 7, 8, 9, or 10 tagging cycles, paired with 30, 29, 28, 27, 26, 25, 24, 23, or 22 PCR cycles, respectively. All tagging and PCR reactions were started together, and fewer tagging cycles than 10, or fewer PCR cycles than 30, were achieved by taking PCR tubes out of the thermocycler at the end of the appropriate extension steps and placing them on ice.

Tests of Template Concentrations
A panel of 8 concentrations of wild A. thaliana leaf DNA was prepared, ranging from 5 to 500 ng per reaction. Primers for the wild A. thaliana leaf DNA were 515_F1_G-46603 and 799_R1_G-46601 (V4 16S rDNA) and At.GI_F1_G-46602 and At.GI_R502bp_G-46614 (502 bp Gigantea gene), with both primer pairs in equal ratio.

Quantitative Real time PCR on C. annuum samples
A primer set targeting the gene for the type III effector XopQ of pathogenic Xanthomonas was used to measure abundance of Xe 85-10 (Doddaraju et al., 2019) . For C. annuum , primers targeting the UBI-3 gene encoding a ubiquitin-conjugating protein were used (Wan et al., 2011) . Two reagent mastermixes were prepared, one for each primer set, to help improve primer dose consistency. Each sample was amplified using three 10 μL technical replicates per primer set that were averaged for analysis. Each 10 μL reaction included 2.5 μL of DNA, to which was added, as a mastermix, 5 μL SYBR® Green PCR Master Mix (Life Technologies, Carlsbad, California), 1.5 μL water, 0.5 μL of 100 μM forward primer, and 0.5 μL of 100 μM reverse primer. qPCR was performed on a BioRad CFX384 Real-time System and analyzed with the CFX Manager Software. The following conditions were used for the amplification of both target genes: 1) 94° C for 5 min. Denature 2) 94° C for 30 sec. The ratio of microbial to host DNA was calculated as 1.76^(-mean xopQ Cq value) / 1.76^(-mean UBI-3 Cq value).

Data availability
All data in this manuscript has been deposited in the European Nucleotide Archive (ENA