Microbiome composition differs in hybrid and inbred maize

1 ​Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045 2 ​Kansas Biological Survey, University of Kansas, Lawrence, KS 66045 3 ​Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27695 4 ​Plant Science Research Unit, Agricultural Research Service, United States Department of Agriculture, Raleigh, NC 27695 5 ​Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27695


INTRODUCTION
The role of host genotype in shaping microbiome composition and function-and the effect of that microbiome on the host phenotype-is a crucial topic for both basic and applied research into plant-associated microbial communities. Understanding patterns and mechanisms of microbiome heritability is a key step toward understanding how plant-microbiome interactions evolved, as well as unlocking their potential to improve agricultural productivity and sustainability (Bordenstein & Theis, 2015;Busby et al. , 2017) . Heritable components of microbiome variation may reflect plant-microbe interactions that have been shaped by natural selection acting on plant fitness and underlying plant traits and which might be modified during crop breeding. Plant genotype can also affect the phenotypic response to variation in the local microbial species pool. Thus, genetic variation within and among species may have major implications for harnessing the potential benefits of plant-associated microbiomes, which include improved nutrient acquisition, drought resistance, defense, and reproductive phenology (Friesen et al. , 2011;Lau & Lennon, 2012;Wagner et al. , 2014;Panke-Buisse et al. , 2015, 2017Santhanam et al. , 2015;Busby et al. , 2016;Ritpitakphong et al. , 2016;Berg & Koskella, 2018;Hubbard et al. , 2019) .
Although many studies have documented host genetic variation for microbiome composition, most of these have compared genotypes whose relationship to each other is undefined (Bodenhausen et al. , 2013;Edwards et al. , 2015;Wagner et al. , 2016;Wallace et al. , 2018;Walters et al. , 2018) or delved into molecular mechanisms using mutants, transgenic manipulation, or QTL-mapping approaches (Bressan et al. , 2009;Bodenhausen et al. , 2014;Horton et al. , 2014;Beckers et al. , 2016;Ritpitakphong et al. , 2016) . As a result, we have limited ability to predict microbiome similarity between two host genotypes that are directly related ( e.g., within a known pedigree). Very recently, the introgression of quantitative trait loci was shown to alter leaf microbiomes across five generations of a breeding experiment for improved disease resistance in maize (Wagner et al. , 2019) ; however, additional studies of this type are needed before we will be able to anticipate what microbiome changes might follow genetic improvement in crops or evolution in natural populations.
One major outstanding question is whether hybridization between genetically distinct lineages results in hybrids that harbor microbiomes that are intermediate in composition between those of the parent genotypes. This question is important because intermating between diverged populations or species can create novel combinations of alleles that modify adaptively important traits, with major evolutionary implications (Anderson & Stebbins, 1954;Soltis & Soltis, 2009) . Hybridization is the cornerstone of many crop breeding strategies. In many cases, and particularly in maize, traits of hybrids exceed both of their parents rather than being phenotypically intermediate between their parents. This results in more robust, productive plants-a phenomenon known as heterosis or hybrid vigor. Despite this phenomenon being a high-priority research subject for over 100 years, the mechanistic basis is still not completely understood; but it likely involves various genetic mechanisms including dominance, overdominance, and genic dosage (Birchler et al. , 2003;Flint-Garcia et al. , 2009;Schnable & Springer, 2013) . To the extent that the host traits that affect microbiome composition exhibit heterosis, microbiome composition itself could potentially do the same (Figure 1).

Figure 1 | Overview of heterosis and its hypothesized relationship to microbiome composition. (a)
Each panel depicts trait values for each of three genotypes: two inbred lines (left and right, symbolized by red, blue, or yellow in the different panels) and their F 1 hybrid (center, symbolized by purple, orange, or green). The dotted grey lines denote the midparent value, or the expected trait value of the hybrid in the absence of heterosis. Midparent heterosis describes hybrid trait values that deviate from the midparent value but are within the range of the parent line values. Better parent heterosis describes hybrid trait values that are outside of the range of parental values. (b) Host phenotype defines the habitat available to potential microbial colonizers from the ambient community. Therefore, heterosis for traits that affect microbiome composition can result in heterosis for microbiome composition itself ( e.g. , relative abundance of the focal microbe highlighted in pink).
Due to the immense value of heterosis for food security and agribusiness, the genes, traits, and mechanisms involved in this phenomenon remain subjects of intense research interest. So far, however, exploration of the links between heterosis and plant microbiomes has been limited to specific microbial taxa and functional groups and has excluded aboveground plant-microbe interactions. For instance, maize hybrids are more likely than inbred lines to be colonized by arbuscular mycorrhizal fungi and by beneficial Pseudomonas strains that produce growth-promoting phytohormones and protective antibiotic compounds in the rhizosphere (Picard et al. , 2004(Picard et al. , , 2008Picard & Bosco, 2005An et al. , 2010) .
Here, we investigate the effects of hybridization on the overall composition and structure of maize leaf and rhizosphere microbiomes. In maize, some heritable variation for microbiome composition of both leaves and rhizosphere has been identified among diverse inbred lines (Peiffer et al. , 2013;Wallace et al. , 2018;Walters et al. , 2018) . The genes and traits underlying this variation are unknown, and hybrid offspring of these or other inbred lines have not been fully characterized for microbiome content. We used deep amplicon sequencing to quantify membership and diversity of both bacterial and fungal communities in eleven F 1 hybrids and their inbred parent lines, grown together in two experimental fields. With this dataset we address the following questions: (1) In general, do the microbiomes of maize hybrids differ consistently from those of inbred lines? (2) Which microbiome features, if any, show evidence of heterosis within individual crosses? and (3) Are patterns of microbiome heritability in F1 crosses comparable to those of other complex traits? We demonstrate consistent differences in the microbiome composition of hybrids and inbreds, and evidence for heterotic patterns for multiple components of the microbiome. These results are a significant step towards understanding the relationship between host genotype and microbiome composition. Leaf discs were collected from young plants three weeks after planting. Plants were uprooted seven weeks after planting (shortly before anthesis for most genotypes) and rhizosphere was collected and pooled from all root types to a depth of ~1 foot.

Field experimental design
In May 2017, 18 maize genotypes were planted in a randomized design into two fields at Central Crops Research Station, Clayton, NC. These genotypes included the inbred lines B73 and Mo17; five additional inbred lines selected from the founding members of the Nested Association Mapping population (Yu et al. , 2008) ; and 11 F 1 hybrids resulting from crosses of inbred lines to B73 and Mo17 (Figure 2a). For all hybrids, either B73 or Mo17 was the maternal parent; for the B73xMo17 hybrid, B73 was the maternal parent. Immediately prior to planting, kernels were soaked in 3% hydrogen peroxide for two minutes to reduce surface-associated microbes and then rinsed in distilled water. One seed per genotype was planted into each block, with the exception of B73 and Mo17 which were planted an average of 1.5 times per block to ensure adequate sample sizes of these key genotypes. Twenty blocks of either 18 or 20 individuals, randomized with respect to genotype, were planted into adjacent rows within each field. However, the final sample sizes were lower due to uneven germination, survival, and sequencing ( Figure 2b).

Sample collection
Three weeks after planting, we sampled leaf tissue from all emerged plants for microbiome analysis. A standard hole punch and forceps (rinsed in 70% ethanol prior to each sample) were used to collect three discs evenly spaced from base to tip of the fourth leaf, avoiding the midrib and any non-green tissue. Leaf discs were immediately placed into sterile tubes and kept on ice until transfer to -20℃ for storage.
Seven weeks after planting, we uprooted plants for collection of rhizosphere soil. Using a garden spade, we collected root crowns (approximately 6" deep x 18" wide) and shook them vigorously until little or no bulk soil was falling from the roots. We then used ethanol-rinsed garden pruners to cut all roots starting 2" below the top of the crown; for large root systems that could not be grasped in one hand, we cut all roots from one hemisphere of the crown to ensure that all root types would be represented in the sample. Cut roots were stored in a sterile plastic bag on ice until transfer to -20℃ for storage. Due to time constraints, rhizosphere collections were spread over four days (July 2 nd -5 th , 2017), with complete replicates harvested from each field on each day so that collection date would not be confounded with genotype or field ( Figure  S1). At time of rhizosphere sampling most plants had transitioned to adult phase but were not yet flowering, with the exception of the sweet corn P39 which began anthesis during the collection period.
We also collected bulk soil samples from the four corners and approximate center of each experimental plot, approximately 8 cm below the surface. The first ten soil samples were collected on July 4 th , and sampling was repeated on July 5 th for a total of 20 samples (5 per field per day). These soil samples were used as a comparison to rhizosphere samples, as well as to test the effects of rhizosphere collection protocols (see below) on inferred microbiome content.
We vortexed root samples in 50 mL of 1x phosphate-buffered saline (PBS) at maximum speed for 15 s to dislodge rhizosphere soil. The resulting rhizosphere suspension was poured through a sterile 100-micron stainless steel mesh into a sterile 50 mL tube, then centrifuged for 30 minutes at 1,600 x g to concentrate the rhizosphere soil. We then discarded the supernatant, resuspended rhizosphere pellets in 1.5 mL 1x PBS, transferred them to sterile Eppendorf tubes, and centrifuged them for 5 minutes at 10,000 x g. After discarding the remaining supernatant we transferred rhizosphere pellets into storage at -80℃ until DNA extraction.
To determine whether this procedure affected downstream data collection and inference, we tested the protocol on our bulk soil samples. Each bulk soil sample was divided into two aliquots. We extracted DNA directly from one aliquot; to the other, we first applied the same washing protocol that we used to separate rhizosphere from roots. Comparison of data from washed and unwashed bulk soil samples indicates that the collection procedure may have resulted in underestimates of alpha diversity and overestimates of beta diversity (i.e., resulted in loss of some taxa and generated additional variation among samples) ( Figure S2). Alternatively, alpha diversity in bulk soils may have been inflated by the presence of dead cells or loose DNA that were separated out by the washing process. Our data on rhizosphere microbiomes should be interpreted with these caveats in mind.

DNA extraction
Prior to DNA extraction, we vortexed leaf discs at top speed for 10 seconds in 750 μL sterile deionized water to remove dust and soil. Leaf discs were shaken dry and blotted on clean paper towels prior to freezing at -80℃ and subsequent lyophilization. Freeze-dried leaf discs were randomly arranged within 96-well plates and ground for 60 s at 25 Hz in a Retsch MM301 mixer mill. As a positive control we included a mock microbial community in three wells (ZymoBiomics Microbial Community Standard, Zymo Research, Irvine, CA, USA). We used the Synergy 2.0 Plant DNA Extraction Kit (OPS Diagnostics, Lebanon, NJ, USA) to purify DNA, doubling the manufacturer's recommended bead-beating time to improve lysis of microbial cells. Purified DNA was eluted in 30 μL of 1x TE buffer (pH 8.0).
We used the MoBio PowerSoil htp-96 kit to extract DNA from rhizosphere and bulk soil samples. Rhizosphere pellets were resuspended in 750 μL bead solution and transferred to 96-well plates, with samples randomly arranged across plates. To improve lysis of microbial cells, samples were subjected to two freeze-thaw cycles prior to extraction according to the manufacturer's instructions. Washed and unwashed bulk soil samples (see above) were randomly distributed among the rhizosphere samples.

Amplicon library preparation and sequencing
As in previous work (Wagner et al. , 2019) , we characterized bacterial and fungal communities by sequencing the V4 region of the 16S rDNA gene and the ITS1 region of the fungal rDNA gene, respectively. We amplified these barcoding genes using the standard primer pairs 515f/806r and ITS1f/ITS2, modified to include 3 to 6 random nucleotides upstream of the gene primers to increase library complexity (Lundberg et al. 2013). A second PCR step added dual-indexed universal Illumina adaptors to the amplicons. Barcoding genes were amplified in 10 μL reactions that included 5 μL of 5Prime HotMasterMix (Quanta Bio, Beverly, MA, USA) and 0.4 μL of each 10 μM primer. For 16S-V4 reactions, we added 0.15 μL of PNA (100 μM) to block amplification of host mitochondrial and chloroplast sequence (Lundberg et al. , 2013) . The template for the first PCR step was 1.5 μL of DNA extracted from leaf, rhizosphere, or mock community samples; for the second PCR step, the template was 1 μL of the products from the first step. Reaction conditions for ITS1 amplification were 2 minutes at 95℃; 27 cycles of 20 s at 95℃, 20 s at 50℃, and 50 s at 72℃; and a final 10 minutes at 72℃. Conditions for 16S-V4 amplification were identical except for an additional PNA-annealing step (5 s at 78℃) before primer annealing at 52℃. The first PCR step included 27 cycles while the second included 8 cycles. PCR products from the first step were purified using 7 μL of magnetic SPRI bead solution, two washes with 70% ethanol, and elution in 10 μL water. Finally, we pooled 1 μL of each dual-indexed library, creating four separate pools (i.e., fungal and bacterial libraries for leaf and rhizosphere samples). We used Blue Pippin (Sage Science Inc., Beverly, MA, USA) to select fragments that were between 150 bp and 600 bp, and then combined the four pools at equimolar concentrations. The final combined sample was sequenced on the Illumina HiSeq 2500 platform (250-bp paired-end reads).

Sequence processing and quality filtering
We used CUTADAPT v1.12 (Martin, 2011) to trim primers from raw sequences prior to quality filtering. Using DADA2 v1.10.1 (Callahan et al. , 2016) , we removed all reads with ambiguous bases or more than 2 expected errors. To remove low-quality sequence from 3' ends, forward and reverse 16S reads were truncated at 220 bp and 160 bp (respectively). We inferred error rates separately for 16S and ITS data using 3x10 6 bases, then de-replicated and de-noised reads to identify amplicon sequence variants (ASVs). We removed chimeric ASVs using DADA2 and then used the RDP classifier (Wang et al. , 2007) trained on the RDP training set (v. 16) and the UNITE database (supplemented with Zea sequences) to assign taxonomy to bacterial and fungal ASVs, respectively (Cole et al. , 2014;Nilsson et al. , 2019) .
We discarded "non-usable" ASVs that could not be reliably identified at the kingdom level and those that corresponded to host sequences. Samples with fewer than 500 usable reads were excluded from analysis. Within each sample we removed bacterial ASVs with less than 0.21% relative abundance, and fungal ASVs with less than 0.185% relative abundance. These cutoffs were determined using the mock community positive controls to identify relative abundance thresholds that minimized cross-contaminant ASVs while retaining true positives and discarding as little data as possible. We required ASVs to be observed at least 25 times in at least 5 samples to be considered "reproducible"; all others were removed from the dataset (Lundberg et al. , 2012) . Combined, these quality filters reduced the number of bacterial ASVs from 8,958 to 753 and the number of fungal ASVs from 4,863 to 501. However, 92.3% of the original bacterial data and 95.0% of the fungal data were retained. The final number of observations for each sample was normalized, centered, and used to control for sampling effort in downstream analyses.

Quantification of fungal abundance in the rhizosphere
For rhizosphere and soil samples only, we used droplet digital PCR (ddPCR) to measure the number of fungal 18S rRNA genes per unit of extracted DNA. We interpret this metric as an estimate of the absolute abundance of fungi ( cf. bacteria and other organisms) within the community. DNA extracts were quantified using the PicoGreen dsDNA Assay Kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions, and all samples were diluted to a working concentration of 0.102 ng/uL. We used the FungiQuant assay to quantitatively amplify the fungal 18S rRNA gene (Liu et al. , 2012) . Each ddPCR reaction mixture contained 1 ng template DNA (9.8 uL of the diluted working solution), 5 uL of 2x Evagreen Probe Supermix (Bio-Rad, Hercules, CA, USA), 0.1 uL of each 100nM primer (from 5' to 3', forward primer GGRAAACTCACCAGGTCCAG and reverse primer GSWCTATCCCCAKCACGA), and molecular-grade water to bring the reaction volume to 20uL. The reaction mixtures were loaded into DG8 droplet generation cartridges along with 70uL of droplet generation oil for EvaGreen assays, and the QX200 Droplet Generator was used to divide the mixture into droplets (Bio-Rad). The droplets were then transferred to a T100 thermal cycler (Bio-Rad) for 10 minutes of activation at 95℃; followed by 50 cycles of denaturation (30s at 95℃), annealing (60s at 55℃), and elongation (30s at 72℃); and finally held at 4C until use. After amplification, the droplets were transferred to the QX200 Droplet Reader and the QuantaSoft AP program (Bio-Rad) was used to calculate number of fungal 18S rRNA gene copies per ng of rhizosphere DNA.

Data analysis
All data analysis was performed in parallel for bacteria and fungi using R version 3.6.0 with extensive use of the packages phyloseq, vegan, DESeq2, tidyr, lme4, and lmerTest (McMurdie & Holmes, 2013;Love et al. , 2014;Bates et al. , 2015;Kuznetsova et al. , 2017;Wickham, 2017;{Oksanen et al. , 2019) . Leaf and rhizosphere communities were analyzed separately and in parallel. Raw sequence reads are available at the NCBI Sequence Read Archive under Bioproject #PRJNA597058. Processed data, metadata, and original R code are available in a Zenodo archive (doi: 10.5281/zenodo.3596465).
ASV counts were normalized using the variance-stabilizing transformation implemented in the DESeq2 package (Love et al. , 2014) . Prior to normalization we estimated the species richness and community evenness of each sample using the abundance-based coverage estimator (ACE) and Shannon metrics, respectively (Hughes et al. , 2001) . These measures of alpha (within-sample) diversity were obtained using the package vegan ({Oksanen et al. , 2019) . We calculated beta (between-sample) diversity using the function "vegan::betadisper". Briefly, we calculated the Bray-Curtis dissimilarity between all pairs of samples and calculated the centroid for each genotype in each field; then, we determined the distance of each sample to its respective centroid.
Comparing microbiome features of hybrid vs. inbred maize: First, we investigated whether the microbiomes of maize hybrids, as a group, are distinct from those of inbred lines. We modeled alpha diversity (ACE and Shannon metrics) and beta diversity (distance to centroid) using separate linear mixed models, with "category" ( i.e. , either hybrid or inbred), field, and their interaction as fixed effects. Normalized sampling effort ( i.e., sequencing depth) was included as a covariate, and genotype (nested within category), block (nested within field), and plate (to control for batch effects) were included as random-intercept terms. For rhizosphere samples, collection date was also included as a random effect. ANOVA with Type III sums-of-squares and Satterthwaite's approximation of degrees-of-freedom was used for statistical inference of fixed effects; likelihood ratio tests were used to assess the significance of random effects. The same model (excluding the sampling effort covariate) was used to analyze fungal abundance in rhizosphere.
To test whether hybrids and inbreds differed in overall community composition, we conducted permutational MANOVA of Bray-Curtis dissimilarities, with hybrid/inbred category, field, and their interaction as the predictor variables. Sampling effort and collection date were also included as nuisance variables. Finally, we used negative binomial models to explore which microbial taxa differed in relative abundance between hybrids and inbred lines. Raw counts of ASVs were modeled with host hybrid/inbred category as the predictor variable (Love et al. , 2014;McMurdie & Holmes, 2014) . The rarest taxa were excluded from this analysis: the threshold for inclusion was set at 10% of the mean abundance, which substantially reduced the number of tests performed while retaining >10% of the data. For example, the mean ASV count across the leaf fungal dataset was 14,057, so we included only ASVs that were observed at least 1,406 times in this analysis. This reduced the number of tests by 45% but still used 98.3% of the data. P -values from Wald tests were adjusted to correct for multiple comparisons using the Benjamini-Hochberg false discovery rate (Benjamini & Hochberg, 1995) . Taxonomic designations of ASVs that differed in relative abundance between hybrids and inbreds were checked by comparing their sequences to the NCBI nucleotide database using BLAST.
Testing for microbiome heterosis: Next, we looked within individual crosses to explore patterns of heterosis for a variety of microbiome features. These features included alpha and beta diversity (as described above); overall community composition (summarized using the top 5 unconstrained principal coordinates and the top 5 constrained axes of variation from a distance-based redundancy analysis constrained on genotype); and variance-stabilized counts of ASVs. For each of these features we fit a linear mixed-effects model with the predictor variables field, sampling effort, batch effects, and (for rhizosphere samples) collection date. The purpose of these models was not to test for significance, but to derive residual trait values cleansed of these sources of noise; the resulting residuals were then used to test for evidence of heterosis.
For each microbiome feature, we calculated the mean values of these residuals in the seven inbred lines, and then used these means to determine the expected mid-parent values (assuming additive genetic variance) for each of the eleven hybrids. We then conducted two-sided t -tests of the null hypothesis that each hybrid's microbiome trait value was equivalent to its respective mid-parent value (i.e., tests for "mid-parent heterosis"; Figure 1a). In addition, we tested for "better-parent heterosis" using one-sided t -tests to assess whether the hybrid value fell outside of the parental range.
Effects of host relatedness on microbiome similarity within and between environments: To test whether the relationship between two host genotypes predicts the similarity of their microbiomes, we first calculated the Bray-Curtis dissimilarity (based on variance-stabilized ASV tables) between all pairwise combinations of samples. Each pair of samples was classified as one of nine categories based on the known relationship (or lack thereof) between the two host genotypes. Each pair was also classified as a comparison of individuals within a single field or between two different fields. We fit a linear mixed-effects model of Bray-Curtis dissimilarity with host-pair relationship as the fixed predictor variable. Only comparisons of samples within the same field were used for this analysis. The log-transformed absolute difference in sequencing depth between samples was included as a covariate; the two host genotypes and the two individual plant IDs were included as random intercept terms to account for these sources of non-independence among data points.
We used a similar approach to assess whether the microbiomes of hybrids and inbred lines differed in their sensitivity to environmental variation. We modeled Bray-Curtis dissimilarity of microbiomes from pairs of individuals with the same genotype using hybrid/inbred category, within/between fields, and their interaction as predictor variables. The log-transformed absolute difference in sequencing depth between samples was included as a covariate; the host genotype (nested in hybrid/inbred category) and the two individual plant IDs were included as random intercept terms.
Estimating heritability, general combining ability (GCA), and specific combining ability (SCA) for microbiome composition: Finally, we used ASReml-R software to estimate GCA and SCA for key properties of hybrid microbiomes, including alpha diversity (ACE and Shannon metrics), beta diversity (distance to centroid), and the five major constrained and unconstrained axes of variation from the dbRDA described above (Isik et al. , 2017) . To estimate variance components for each of these microbiome properties, we fit separate linear mixed-effects models for inbreds and hybrids with field, sequencing depth, and (for rhizosphere only) collection date as fixed effects. For the inbred model, the random effects included batch, block nested within field, genotype, and genotype-field interactions. For the hybrid model, random effects included batch, block, female parent, male parent, the interaction of each parent line with field, the interaction between the parent lines, and the three-way interaction between the parent lines and field.
For hybrids, a diallel model was fit to the data, in which a single GCA effect was estimated for each parent line, regardless of whether it was a male or female parent (Möhring et al. , 2011) . GCA variance is the variation of parental breeding values on hybrids and additive variance was estimated as twice the GCA variance; SCA variance was calculated as the variance component from the interaction between the parent lines and represents variation of dominance genetic effects, such as those that could lead to heterosis. The total genetic variation was calculated as the sum of twice the GCA plus SCA; the total genotype-by-environment (GxE) variation was calculated as the sum of the variance components from interactions between each parent line and field (twice the GCA-field interaction plus the SCA-field interaction variances). Narrow-sense heritability ( h 2 ) was calculated as the additive variance divided by the total phenotypic variation; broad-sense heritability ( H ) was calculated as the total genetic variation (2*GCA + SCA) divided by the phenotypic variation. For inbreds, total genetic variation and GxE were calculated using the variance components for genotype and genotype-by-field interaction, respectively.

RESULTS
After sequence processing and quality control, the final dataset included 501 fungal ASVs and 753 bacterial ASVs. The fungal dataset comprised 481 leaf samples and 332 rhizosphere samples from a total of 547 individual plants, with a median sampling effort of 19,266 observations per sample. The bacterial dataset included 502 leaf samples and 403 rhizosphere samples from 567 individuals (median sampling effort = 9,055 observations per sample). Plant survival differed between fields and genotypes, but the median sample size ranged from N = 10.5 to N = 14 per genotype per field (Figure 2). Leaf and rhizosphere communities differed substantially in diversity and composition ( Figure S3, Figure S4). Leaf communities were dominated by Pantoaea , Entyloma , and Golubevia spp., while the most abundant taxa in the rhizosphere were Ralstonia , Burkholderia, Rhizobium, Entyloma, and Golubevia (Table S1). Only 193 bacterial ASVs were observed in both sample types; 260 bacterial ASVs were found only in leaves, whereas 290 were specific to the rhizosphere. For fungi, 132 ASVs were specific to leaves, 196 were found only in rhizosphere, and the remaining 173 were found in both ( Figure S4). As has been observed for other plants, rhizosphere communities had higher alpha diversity than leaf communities. For fungi-but not bacteria-beta diversity was also higher in the rhizosphere, suggesting that belowground fungal communities may be more strongly influenced by microhabitat variation or stochastic processes ( Figure S3c-d). This is consistent with the observation that community composition differed strongly between fields only for fungi in the rhizosphere ( Figure S3a-b). As expected, rhizosphere communities were compositionally distinct from bulk soil ( Figure S2); they also were relatively richer in fungi ( Figure S5).

Figure 3 | Constrained distance-based redundancy analysis revealed that the microbiome composition of maize hybrids is distinct from that of inbred lines.
Two axes of variation in community composition (ordination of Bray-Curtis dissimilarity) are plotted for bacterial and fungal communities in maize leaves and rhizosphere. The CAP1 axis was constrained to capture variation explained by inbred/hybrid category; MDS1 was the dominant unconstrained axis of variation. ANOVA-like permutation tests were performed to assess statistical significance of the constrained axis.

Microbiome composition differs between maize hybrids and inbred lines
First, we investigated whether the hybrids, as a group, differed from the inbred lines in microbiome diversity and composition. For bacteria and leaf-associated fungi, we found no consistent difference between these groups in either alpha diversity or beta diversity (Table S2). In contrast, fungal communities in the rhizospheres of hybrid plants had higher alpha diversity (within-sample species richness and evenness) and lower beta diversity (among-individual variability) relative to inbred lines (Tukey's HSD, P < 0.05).
Microbiomes of hybrid plants were primarily distinguished from those of inbred plants not by alpha or beta diversity, but by the overall community composition (Table 1).. Distance-based redundancy analysis (ordination of Bray-Curtis dissimilarity, constrained to emphasize variation that could be explained by hybrid/inbred category) revealed that microbiomes of maize hybrids clustered with those of other hybrids, apart from those of inbred lines (Figure 3). We confirmed this observation using permutational MANOVA; hybrid/inbred category predicted composition of leaf-associated fungal communities and rhizosphere bacterial and fungal communities (ADONIS, all P <0.01). This differentiation was strongest in the rhizosphere, explaining about 1.6% of the overall variation (compared to <1% in leaves).
Using negative binomial models, we identified 18 bacterial and fungal ASVs that were differentially abundant in the rhizosphere of hybrid versus inbred hosts (Figure 4). Some of these taxa have been linked to either positive or negative effects on plant health. For example, several Burkholderia ASVs were relatively more abundant in inbreds than in hybrids; members of this genus have been reported to improve growth and stress resilience in diverse plants including maize (Bevivino et al. , 2000;Mitter et al. , 2013) , and a sizeable minority of maize-associated Burkholderia strains appear to be able to fix nitrogen (Perin et al. , 2006) . Similarly, one Bradyrhizobium ASV was relatively more abundant in inbred rhizospheres; although maize does not make the root nodules that are central to rhizobial symbiosis in legumes, strains of Bradyrhizobium have been shown to increase aboveground biomass in maize by up to 20% (Prévost et al. , 2012) . Hybrid rhizospheres were enriched in the yeast-like fungus Tilletiopsis washingtonensis , which produces antifungal metabolites and has shown promise as a biocontrol agent in the phyllosphere (Urquhart, 1994;Urquhart & Punja, 2002) . Compared to the rhizosphere, leaves contained fewer taxa that differed consistently in relative abundance; this result is consistent with the weaker effects on overall community composition observed in leaves using multivariate methods (Figure 3). One ASV identified as Microdochium sorghi was strongly enriched in the leaves of hybrids relative to inbreds. This fungus can cause zonate leaf spot disease in maize, although it more commonly infects sorghum (De León, 1978;Korsman et al. , 2012) ; however, no symptoms were apparent in the plants in this experiment. Interestingly, a recent study in Populus also found an enrichment of foliar fungal pathogens in an interspecific hybrid relative to one of the parent species (Cregger et al. , 2018) . For this analysis, data from the two fields were pooled. Each point represents one ASV, with point size scaled by the relative abundance of that taxon. FDR-corrected P -values from Wald tests of negative binomial models of taxon counts are plotted against the estimated effect sizes from the same models. Red points reflect taxa that were significantly associated with host hybrid/inbred category (FDR < 0.05). ASVs are labeled as the closest-matching species based on comparison to the NCBI sequence database using BLAST. One leaf ASV ( Microdochium sorghi ) is omitted to improve figure clarity; it was strongly enriched in hybrids.

Patterns of heterosis in microbiome features
Next, we used the framework described in Figure 1 to test which microbiome features, if any, showed evidence of heterosis within individual crosses. These microbiome features included community-level metrics of alpha and beta diversity ( i.e., Shannon, ACE, distance-to-centroid) and composition (the top 5 constrained and unconstrained axes of variation from distance-based redundancy analysis; Figure 5b). We used t-tests to assess whether these microbiome features deviated from the midparent expectation for each hybrid, using a significance threshold of FDR < 0.05 (Figure 1). We note that historically, the term "better-parent" heterosis describes cases in which the hybrid has higher values of a desirable trait or lower values of an undesirable trait relative to both parents (Flint-Garcia et al. , 2009) .
Because the function and value of these microbiome traits (if any) are unknown, we use this term to refer to cases in which hybrid trait values are higher than both parents or lower than both parents (Figure 1a).
Heterosis for alpha and beta diversity was detected in a few crosses, but the most frequently heterotic microbiome properties were the axes of variation in community composition. Both midparent and "better-parent" heterosis were particularly common in the dbRDA axes that were constrained by host genotype (Figure 5a), which is consistent with these axes capturing the heritable components of microbiome composition. More microbiome properties displayed heterosis in the rhizosphere than in leaves, which is consistent with the overall stronger differences between hybrids and inbreds observed previously (Figure 3). Beta diversity in the rhizospheres of hybrids exceeded midparent expectations most often in crosses with Mo17.
We also tested for heterosis of individual ASVs using FDR-corrected t-tests of their variance-stabilized abundances. In most hybrids, most ASVs did not deviate from the midparent expectation ( Figure S6a). However, 40% of all rhizosphere fungal ASVs showed evidence of heterosis in at least one cross, as did 20% of all leaf-associated fungal ASVs. These ASVs represented a wide diversity of classes ( Figure S6b). For bacterial ASVs, the rates of heterosis in the rhizosphere and leaf were 35% and 17%, respectively. The prevalence of ASV heterosis also varied strongly among hybrids, ranging from 27% of the tested fungal ASVs in the B73xP39 rhizosphere, to 5.1% of the tested bacterial ASVs in the rhizosphere of B73xMo17 ( Figure S6a).
For both bacteria and fungi, the second constrained axis of rhizosphere variation (CAP2) was the most frequently heterotic microbiome property; hybrid values of the bacterial CAP2 exceeded both parental values for all 11 crosses ( Figure 5). To obtain a biological interpretation of CAP2, we identified the top 10% of ASVs with the strongest loadings on this axis (Table S3). These included 13 of the ASVs that were differentially abundant between hybrids and inbred lines (Figure 4), including three Burkholderia ASVs. Additional rhizosphere ASVs with strong influences on this axis that were not previously detected by the negative binomial model included three more Burkholderia ASVs, two Nitrospira ASVs, and four ASVs from the family Rhizobiaceae. The full list of strongly-loading ASVs on all heterotic axes for both leaves and rhizosphere is provided in Table S3.

Figure 5 | Multiple microbiome properties show evidence of heterosis in maize leaves and rhizosphere for both bacteria (top) and fungi (bottom).
We calculated the values of select microbiome features in maize hybrids, after controlling for field, sequencing depth, and batch effects; then, we used t-tests to compare hybrid trait values to the mid-parental trait values as illustrated in Figure 1. P -values were adjusted to correct for multiple tests using the false discovery rate (Benjamini & Hochberg, 1995) . (a) Heterosis was more common for some microbiome properties than others. This difference is especially stark for community composition, summarized using the top 5 constrained and unconstrained axes of variation resulting from distance-based redundancy analysis. The constrained axes (abbreviated "CAP" for Constrained Analysis of Principal Coordinates) capture the microbiome variation that is associated with host genotype; the unconstrained axes (abbreviated "MDS" for Multidimensional Scaling) capture the dominant axes of remaining variation. The percentage of the overall community composition explained by each CAP or MDS axis is specified on the y-axis. (b) Patterns of heterosis are illustrated for two axes of variation in rhizosphere community composition, as described for panel (a), and for rhizosphere beta diversity (distance to centroid), which quantifies within-group/among-individual variation in community composition. Points represent least-squares mean values +/-1 s.e.m.

Patterns of microbiome heritability in F 1 crosses
Finally, we investigated several general patterns of microbiome heritability in this set of maize crosses. In particular, we (1) partitioned variance in community composition among genotypes within hybrid/inbred categories; (2) compared microbiome similarity of pairs of genotypes with differing levels of relatedness; (3) tested whether the microbiomes of hybrid and inbred maize were equally influenced by environmental variation; and (4) investigated the relative importance of additive versus heterotic components of genetic variation for shaping hybrid microbiomes.
First, we used permutational MANOVA to test for host genetic variation among inbred lines and among hybrids for whole-community composition. We found that host genotype contributed between 4-8% of the variation in overall community composition for both bacteria and fungi, in both leaves and rhizosphere (Table S4). Alpha diversity and beta diversity also varied among host genotypes, but not as consistently as community composition. For instance, host genotype affected alpha diversity (both ACE and Shannon metrics) and distance to centroid ( i.e., among-individual variation within genotypes); but almost exclusively in the rhizosphere (Table S2).
Next, to assess how relatedness affects microbiome similarity, we modeled Bray-Curtis dissimilarity between pairs of samples as a response to host-genotype-relationship using ANOVA. Overall, host-genotype-relationship had a stronger effect on rhizosphere microbiome dissimilarity than leaf microbiome dissimilarity (Figure 6a). Bacterial rhizosphere composition differed more between pairs of hybrids (regardless of their relatedness) than between pairs of inbreds; the opposite was true in leaves. We found no evidence that hybrids' microbiomes are more similar to those of their maternal inbred than to their paternal inbred; in fact, for leaf-associated bacteria, the opposite was true. Similarly, hybrid microbiomes were equally dissimilar to those of their half-siblings regardless of whether the female or male parent was shared ( Figure 6a).
Third, we tested whether genotype-by-environment (GxE) interactions are more prominent for inbred microbiomes than hybrid microbiomes, as they are for many plant traits (Lewis, 1954;Griffing & Zsiros, 1971;Blum, 2013;Zanewich et al. , 2018) (but see (Li et al. , 2018) ). We found that pairs of siblings from hybrid genotypes had less similar rhizosphere microbiomes than pairs of siblings from inbred genotypes, both within fields and between fields (Figure 6b). In addition, absolute abundance of fungi in the rhizosphere was stable across fields for inbreds, but not hybrids ( Figure S5). For leaf microbiomes, however, there was no difference in sibling-pair dissimilarity between inbreds and hybrids. The variance components contributed by Genotype and GxE supported these findings. For many leaf microbiome traits, G/GxE was equal between hybrids and inbreds; for others (particularly fungal traits) it was higher in hybrids, and for others (particularly bacterial traits) the opposite was true ( Figure S7a).
Last, we investigated whether hybrid microbiomes were influenced more by general combining ability (GCA; the average trait values of the two parent lines, mostly involving additive gene effects) or specific combining ability (SCA; the interaction between the two parent lines, involving synergistic or heterotic effects). Using this method we detected moderate heritability for a variety of leaf microbiome features, but no heritability for rhizosphere microbiomes. For both fungi and bacteria, narrow-sense heritability ( h 2 ) was much higher among inbred lines than among hybrids ( Figure S7b). For hybrids, comparison of SCA and GCA revealed that for multiple axes of leaf microbiome variation, broad-sense heritability was considerably higher than h 2 ( Figure S7c). This indicates that leaf microbiomes, in particular, were shaped by specific combinations of parental genotypes ( i.e., heterotic or synergistic effects) beyond the general combining ability (additive effects) of those parents.

Figure 6 | Host relatedness affects dissimilarity between pairs of microbiomes. (a)
Bray-Curtis dissimilarity (horizontal axis) was calculated for each pair of samples, each of which was categorized according to the relationship between the two samples' host genotypes (vertical axis). All sample pairs were from individuals in the same field. We used ANOVA to test the response of Bray-Curtis dissimilarity to "host-pair-relationship"; the log absolute value of the difference in sequencing depth between the samples was included as a nuisance variable. Least-squares mean dissimilarity values from this model are shown +/-1 s.e.m. Non-overlapping sets of letters indicate post-hoc Tukey's Honest Significant Differences. (b) Pairs of rhizosphere samples from inbred genotypes were more similar to each other in composition than pairs of samples from hybrid genotypes. Bray-Curtis dissimilarity was modeled as a function of "inbred/hybrid category", "within vs. between field", and their interaction; least-squares means are plotted as described in (a).

DISCUSSION
Understanding patterns and mechanisms of microbiome heritability is critical to understanding the evolution of host-microbiome symbiosis in both natural and human-directed systems. Here we demonstrate that microbiome diversity and composition display some of the same patterns as other complex plant traits within F1 crosses. In general, leaf and rhizosphere microbiomes of hybrid maize differ reliably from those of inbred maize lines, much like performance-related traits such as height and yield. Within crosses, we detected evidence for heterosis of a variety of microbiome features, including beta diversity and several orthogonal axes of community composition ( Figure 5).
In this study, host genotype and its interaction with environment explained only between 4-8% of the overall variation in microbiome composition (Table S1). This is a relatively high proportion in comparison to other studies of host genetic microbiome control, but relatively low in comparison to studies of other maize traits. However, this may be an underestimate of the true importance of host genotype; if only a subset of the microbiome (potentially including functionally important groups) is actually responsive to host variation, then using the entire community as a response variable is likely diluting the host genotype signal. In support of this possibility, some studies, including in maize, have found that heritability of microbiome functional potential ( i.e., metagenome content) is higher than the heritability of taxonomy-based microbiome features such as those used in our analysis (Lemanceau et al. , 2017;Wallace et al. , 2018) .
Maize is well-known for the strength of heterosis in intraspecific crosses, and our results indicate that this phenomenon can be seen within leaf and rhizosphere microbiomes. However, not all maize crosses are equally heterotic, either for yield or for the underlying traits (Flint-Garcia et al. , 2009;Guo et al. , 2019) . This, too, was evident in our microbiome data ( Figure 5; Figure S6). However, with only 11 crosses included in our study, an explicit test for a relationship between the extent of microbiome heterosis and the extent of heterosis for other phenotypic traits was beyond the scope of this work. Future investigations into this question would hone our ability to predict microbiome composition in the offspring of controlled crosses, and also may yield clues about which host traits are involved in microbiome heterosis.
Along the same lines, the effects of hybridization on the microbiome will likely vary among other host species. A comparison to plants that display outbreeding depression rather than heterosis would be particularly interesting. Evidence from animal systems suggest a potential role for the microbiome in speciation. Crosses between two subspecies of house mice ( Mus musculus ) result in hybrid offspring with reduced fertility and gut microbiomes that diverge from both parent subspecies (Wang et al. , 2015) , while lethality of hybrids between species of Nasonia wasps can be attributed to a disordered microbiome (Brucker & Bordenstein, 2013) . To our knowledge, however, our study is the first to investigate whole microbiome composition in true intraspecific crosses of any organism. More data is needed from similar experiments, particularly in additional plant species that vary in mating system. Altogether, our results indicate that certain microbiome features differ consistently between maize hybrids and inbred lines (Figures 3-4). However, the implications of these differences for microbiome function and/or plant health remain unclear. This reflects the limitations of the amplicon-sequencing approach: due to poor taxonomic resolution ( i.e. , an ASV may represent multiple strains with different functions) and a general lack of knowledge about the biology of many microbial taxa, function cannot be reliably inferred from this type of data. Therefore, different approaches are needed to draw conclusions about the importance of microbiome heterosis for community and host function. For example, shotgun metagenome sequencing could reveal differences in or conservation of microbial gene content of hybrid and inbred rhizospheres; metatranscriptome sequencing may reveal further differences among host genotypes in the expression of those microbial genes.
The true functional consequences of microbiome heterosis, however, would be most clearly assessed by directly manipulating microbiomes and measuring the effects on host phenotype. This approach would also inform us about the direction of causality in the microbiome-heterosis relationship. In this study, we manipulated host genotype and measured the effects on microbial communities; however, given the abundant evidence that plant microbiomes can drastically affect their hosts, it is possible that the observed effects on the microbiome could feed back and contribute to heterosis of other plant traits. Is microbiome heterosis a meaningless by-product of host hybridization? Or is it a contributing mechanism of hybrid vigor itself? These questions could be addressed using synthetic microbial communities, reinoculation, or soil-conditioning experiments (Ponsford et al. ;Panke-Buisse et al. , 2015;Vorholt et al. , 2017) . These important next steps will help to clarify the role of plant-associated microbes in hybrid vigor. TABLE S2 | Host genotype influences alpha and beta diversity in maize leaf and rhizosphere microbiomes. ACE (log-transformed) and Shannon metrics were used to quantify community richness and evenness, respectively; Distance to centroid was used to quantify beta diversity (i.e., among-individual variation within genotypes). Bacterial (a) and fungal (b) communities were tested separately using univariate models and analysis of variance with Type III sums of squares and Satterthwaite's approximate denominator d.f. Likelihood ratio tests were used for significance testing of random-intercept effects. For rhizosphere fungal communities, one genotype (Oh43) was excluded from the analysis because no individuals survived in one field. *** P < 0.001; ** P < 0.01; * P < 0.05 . All P-values were adjusted to correct for multiple comparisons using the Benjamini-Hochberg false discovery rate.

Figure S1 | The effect of collection date on rhizosphere microbiome diversity and composition. (a)
The number of rhizosphere samples in the 16S dataset are shown for each host genotype in each field. Complete replicates were collected on each day so that collection date would not be confounded with either field or plant genotype; however, some genotypes with poor survival are not represented on all dates. (b) Shannon diversity varied slightly among collection dates for both bacteria (top) and fungi (bottom). (c) Principal coordinates analysis of Bray-Curtis dissimilarities shows that both bacterial and fungal communities shifted in community structure between July 3rd and July 4th, but only in one of the two fields (field G5C).

Figure S2 | Microbiome characteristics of rhizosphere, bulk soil, and "washed" bulk soil samples.
Rhizosphere soil was collected from maize roots using a series of washes and centrifuge steps (see Methods). To assess the effects of this protocol on microbiome readouts, we divided each of 20 bulk soil samples into two aliquots and applied the same protocol to one aliquot prior to DNA extraction.

Figure S3 | Maize leaves and rhizosphere harbored distinct bacterial (top) and fungal (bottom) microbiomes. (a-b)
Principal coordinates analysis of Bray-Curtis dissimilarities highlights the difference between leaf and rhizosphere microbial communities. (c-d) Beta diversity, measured as distance to centroid (based on Bray-Curtis dissimilarity), was higher in the rhizosphere for fungal communities only. The 75th and 25th percentiles are marked by the top and bottom of each box; the median is marked by the center line; and the length of each whisker is 1.5 times the interquartile range. (e-f) Alpha diversity was substantially higher for rhizosphere microbial communities than for leaf communities. Boxplot statistics are the same as for panels (c-d).    show any signal of heterosis in most crosses. ASV heterosis was more common in some hybrids than in others. (b) A diverse set of ASVs showed heterosis of relative abundance in at least one hybrid. Each point represents a single ASV, arranged on the y-axis by taxonomic class and on the x-axis by the number of hybrids in which its abundance deviated from the midparent expectation ( Figure 1). Point colors are to improve figure clarity and have no biological meaning.