Heterosis of leaf and rhizosphere microbiomes in field-grown maize

Macroorganisms’ genotypes shape their phenotypes, which in turn shape the habitat available to potential microbial symbionts. This influence of host genotype on microbiome composition has been demonstrated in many systems; however, most previous studies have either compared unrelated genotypes or delved into molecular mechanisms. As a result, it is currently unclear whether the heritability of host-associated microbiomes follows similar patterns to the heritability of other complex traits. We take a new approach to this question by comparing the microbiomes of diverse maize inbred lines and their F1 hybrid offspring, which we quantified in both rhizosphere and leaves of field-grown plants using 16S-v4 and ITS1 amplicon sequencing. We show that inbred lines and hybrids differ consistently in composition of bacterial and fungal rhizosphere communities, as well as leaf-associated fungal communities. A wide range of microbiome features display heterosis within individual crosses, consistent with patterns for non-microbial maize phenotypes. For leaf microbiomes, these results were supported by the observation that broad-sense heritability in hybrids was substantially higher than narrow-sense heritability. Our results support our hypothesis that at least some heterotic host traits affect microbiome composition in maize.


18
• Macroorganisms' genotypes shape their phenotypes, which in turn shape the habitat 19 available to potential microbial symbionts. This influence of host genotype on 20 microbiome composition has been demonstrated in many systems; however, most 21 previous studies have either compared unrelated genotypes or delved into molecular 22 mechanisms. As a result, it is currently unclear whether the heritability of host-23 associated microbiomes follows similar patterns to the heritability of other complex traits. 24 • We take a new approach to this question by comparing the microbiomes of diverse 25 maize inbred lines and their F1 hybrid offspring, which we quantified in both rhizosphere 26 and leaves of field-grown plants using  • We show that inbred lines and hybrids differ consistently in composition of bacterial and 28 fungal rhizosphere communities, as well as leaf-associated fungal communities. A wide 29 range of microbiome features display heterosis within individual crosses, consistent with 30 patterns for non-microbial maize phenotypes. For leaf microbiomes, these results were 31 supported by the observation that broad-sense heritability in hybrids was substantially 32 higher than narrow-sense heritability. 33 • Our results support our hypothesis that at least some heterotic host traits affect 34 microbiome composition in maize. 35 36 The role of host genotype in shaping microbiome composition and function is a crucial 39 topic for both basic and applied research into plant-associated microbial communities. 40 Understanding patterns and mechanisms of microbiome heritability is a key step toward 41 understanding how plant-microbiome interactions evolved, and unlocking their potential to 42 improve agricultural productivity and sustainability (Bordenstein & Theis, 2015;Busby et al., 43 2017). Heritable components of microbiome variation may reflect plant-microbe interactions that 44 have been shaped by natural selection acting on plant traits underlying fitness, which might be 45 modified during crop breeding. Plant genotype can also affect the phenotypic response to 46 variation in the local microbial species pool. Thus, genetic variation within and among species 47 may have major implications for harnessing the potential benefits of plant-associated 48 microbiomes, which include improved nutrient acquisition, drought resistance, defense, and 49 reproductive phenology (Friesen et al., 2011;Lau & Lennon, 2012;Wagner et al., 2014;Panke-50 Buisse et al., 2015, 2017Santhanam et al., 2015;Busby et al., 2016;Ritpitakphong et al., 51 2016;Berg & Koskella, 2018;Hubbard et al., 2019). 52 Although many studies have documented host genetic variation for microbiome 53 composition, most of these have compared genotypes whose relationship to each other is 54 undefined (Bodenhausen et al., 2013;Edwards et al., 2015;Wagner et al., 2016;Wallace et al., 55 2018a;Walters et al., 2018) or delved into molecular mechanisms using mutants, transgenic 56 manipulation, or QTL-mapping approaches (Bressan et al., 2009;Bodenhausen et al., 2014;57   median sample size per genotype was slightly lower in Field A5B than in Field G5C; due to 120 uneven sequencing of 16S and ITS amplicons and subsequent quality filtering, the median 121 sample size was slightly lower for the fungal dataset than the bacterial dataset. Map data: Google 122 (2019). (c) Leaf discs were collected from young plants three weeks after planting. Plants were 123 uprooted seven weeks after planting (shortly before anthesis for most genotypes) and 124 rhizosphere was collected and pooled from all root types to a depth of ~1 foot.

126
Field experimental design 127 In May 2017, 18 maize genotypes were planted in a randomized design into two fields at 128 Central Crops Research Station, Clayton, NC. The fields were 1.1 km apart and had similar soil 129 types but different recent crop rotation histories and soil microbiota (Tables S1-S2). 130 The genotypes included the inbred lines B73 and Mo17; five additional inbred lines 131 selected from the founding members of the Nested Association Mapping population (Yu et al., 132 2008); and 11 F1 hybrids resulting from crosses of those lines to B73 and Mo17 (Figure 2a). For 133 all hybrids, either B73 or Mo17 was the maternal parent; for the B73xMo17 hybrid, B73 was the 134 maternal parent. We included B73 and Mo17 due to their importance as elite representatives of 135 the stiff-stalk and non-stiff-stalk heterotic groups and because they have high-quality reference 136 genomes that will facilitate deeper dissection of host genetic microbiome control (Jiao et al., 137 2017;Sun et al., 2018). The five additional inbred lines were chosen because they showed 138 highly divergent rhizosphere microbiome composition relative to each other and to B73 and 139 Mo17 in a previous study (Peiffer et al., 2013). Some of these were sweetcorn lines or tropical in 140 origin; these groups are distinct from B73 and Mo17 in a wide variety of phenotypic traits 141 including root system architecture and phenology, all of which could potentially influence 142 microbiome composition. Thus, our study captured a wide range of maize genetic and 143 phenotypic variation; however, identifying the particular traits that caused any observed 144 microbiome changes was beyond the scope of this experiment. 145 Immediately prior to planting, kernels were soaked in 3% hydrogen peroxide for two 146 minutes to reduce surface-associated microbes and then rinsed in distilled water. One seed per 147 genotype was planted into each block, except for B73 and Mo17 which were planted an average 148 of 1.5 times per block. Twenty blocks of 18 or 20 individuals, randomized with respect to 149 genotype, were planted into adjacent rows within each field. However, the final sample sizes 150 were lower due to uneven germination, survival, and sequencing ( Figure 2b). 151

Sample collection
152 Three weeks after planting, we sampled leaf tissue from all emerged plants for 153 microbiome analysis. We chose this early timepoint to ensure that sampling was completed 154 before the establishment of foliar diseases such as southern leaf blight, which sometimes arise 155 later in the season in these fields and which could potentially skew microbiome composition (but 156 see (Wagner et al., 2020)). A standard hole punch and forceps (rinsed in 70% ethanol between 157 samples) were used to collect three discs evenly spaced from base to tip of the fourth leaf, 158 avoiding the midrib and any non-green tissue. Leaf discs were immediately placed into sterile 159 tubes and kept on ice until transfer to -20℃ for storage. 160 Seven weeks after planting, we uprooted plants for collection of rhizosphere soil. We 161 chose this timepoint so that our data would be comparable to the largest published maize 162 rhizosphere dataset available at the time (Peiffer et al., 2013). This timepoint was originally 163 chosen based on the expectation that host genetic effects would accumulate in the rhizosphere 164 microbiome until the transition to flowering, while the roots were mostly a carbon sink (Peiffer & 165 Ley, 2013). Using a spade, we collected root crowns (~6" deep x ~18" wide) and shook them 166 until little or no bulk soil was falling from the roots. We then used ethanol-rinsed pruners to cut 167 all roots starting 2" below the top of the crown. All whorls of roots attached to the stem up to 6" 168 belowground were included in each sample; for large root systems that could not be grasped in 169 one hand, we instead cut all roots from one hemisphere of the crown to ensure that primary, 170 seminal, and crown roots would be represented in each sample. A variety of root system 171 properties likely differed among genotypes from different heterotic groups, or between hybrids 172 and inbreds; however, we conducted this experiment agnostic to the particular phenotypic traits 173 that affect microbiome composition. We therefore consider such genetic variation in root 174 morphology or physiology to be a possible underlying cause of any observed microbiome 175 variation, which would need to be verified through additional experimentation. 176 Cut roots were stored in a sterile plastic bag on ice until transfer to -20℃ for storage. 177 Rhizosphere collections were spread over four days (July 2 nd -5 th , 2017), with representatives of 178 all genotypes harvested from each field on each day so that collection date would not be 179 confounded with genotype or field ( Figure S1). At time of rhizosphere sampling most plants had 180 transitioned to adult phase but were not yet flowering, with the exception of the sweet corn P39 181 which began anthesis during the collection period. We also collected bulk soil samples from the 182 four corners and approximate center of each field, approximately 8 cm below the surface. The 183 first ten soil samples were collected on July 4 th , and sampling was repeated on July 5 th for a 184 total of 20 samples. 185 We vortexed roots in 50 mL of 1x phosphate-buffered saline (PBS) at maximum speed 186 for 15 s to dislodge rhizosphere soil. The resulting suspension was poured through a sterile 187 100-micron stainless steel mesh into a sterile 50 mL tube, then centrifuged for 30 minutes at 188 1,600 x g to concentrate the rhizosphere soil. We then discarded the supernatant, resuspended 189 rhizosphere pellets in 1.5 mL 1x PBS, transferred them to sterile Eppendorf tubes, and 190 centrifuged them for 5 minutes at 10,000 x g. After discarding the remaining supernatant we 191 transferred rhizosphere pellets into storage at - 80℃ until DNA extraction. 192 To determine whether this procedure affected downstream data collection and inference, 193 we tested the protocol on our bulk soil samples. Each bulk soil sample was divided into two 194 aliquots. We extracted DNA directly from one aliquot; to the other, we first applied the same 195 washing protocol that we used to separate rhizosphere from roots. Comparison of data from washed and unwashed bulk soil samples indicates that the procedure may have resulted in 197 underestimates of alpha diversity and overestimates of beta diversity (i.e., resulted in loss of 198 some taxa and generated additional variation among samples) ( Figure S2). Alternatively, alpha 199 diversity in bulk soils may have been inflated by the presence of dead cells or loose DNA that 200 were separated out by the washing process. Our data on rhizosphere microbiomes should be 201 interpreted with these caveats in mind. 202

203
Prior to DNA extraction, we vortexed leaf discs at top speed for 10 seconds in 750 μL 204 sterile deionized water to remove dust and soil. Leaf discs were shaken dry and blotted on clean 205 paper towels, frozen at -80℃, and lyophilized. Freeze-dried leaf discs were randomly arranged 206 to block amplification of host mitochondrial and chloroplast sequence (Lundberg et al., 2013). 227 The template for the first PCR step was 1.5 μL of DNA extracted from leaf, rhizosphere, or mock 228 community samples; for the second PCR step, the template was 1 μL of the products from the 229 first step. Reaction conditions for ITS1 amplification were 2 minutes at 95℃; 27 cycles of 20 s at 230 95℃, 20 s at 50℃, and 50 s at 72℃; and a final 10 minutes at 72℃. Conditions for 16S-V4 231 amplification were identical except for an additional PNA-annealing step (5 s at 78℃) before 232 primer annealing at 52℃. The first PCR step included 27 cycles while the second included 8 233 cycles. PCR products from the first step were purified using 7 μL of magnetic SPRI bead 234 solution, two washes with 70% ethanol, and elution in 10 μL water. Finally, we pooled 1 μL of 235 each dual-indexed library, creating four separate pools (i.e., fungal and bacterial libraries for leaf 236 and rhizosphere samples). We used Blue Pippin (Sage Science Inc., Beverly, MA, USA) to 237 select fragments that were between 150 bp and 600 bp, and then combined the four pools at 238 equimolar concentrations. The final combined sample was sequenced on the Illumina HiSeq 239 2500 platform (250-bp paired-end reads). 240 Sequence processing and quality filtering 241 We used CUTADAPT v1.12 (Martin, 2011) to trim primers from raw sequences prior to 242 quality filtering. Using DADA2 v1.10.1 (Callahan et al., 2016), we removed all reads with 243 ambiguous bases or more than 2 expected errors. Forward and reverse 16S reads were 244 truncated at 220 bp and 160 bp (respectively). We inferred error rates separately for 16S and 245 ITS data using 3x10 6 bases, then de-replicated and de-noised reads to identify amplicon 246 sequence variants (ASVs). We removed chimeric ASVs using DADA2 and then used the RDP 247 classifier (Wang et al., 2007) trained on the RDP training set (v. 16) and the UNITE database 248 (supplemented with Zea sequences) to assign taxonomy to bacterial and fungal ASVs, 249 respectively (Cole et al., 2014;Nilsson et al., 2019). 250 We discarded "non-usable" ASVs that could not be reliably identified at the kingdom 251 level and those identified as host sequences. Samples with <500 usable reads were excluded 252 from analysis. Within each sample we removed bacterial and fungal ASVs with less than 0.21% 253 and 0.185% relative abundance, respectively. These cutoffs were determined using the mock 254 community positive controls to identify relative abundance thresholds that minimized cross-255 contaminant ASVs while retaining true positives and discarding as little data as possible. We 256 required ASVs to be observed at least 25 times in at least 5 samples to be considered into DG8 droplet generation cartridges along with 70uL of droplet generation oil for EvaGreen 275 assays, and the QX200 Droplet Generator was used to divide the mixture into droplets (Bio-276 Rad). The droplets were then transferred to a T100 thermal cycler (Bio-Rad) for 10 minutes of 277 activation at 95℃; followed by 50 cycles of denaturation (30s at 95℃), annealing (60s at 55℃), 278 and elongation (30s at 72℃); and finally held at 4C until use. After amplification, the droplets 279 were transferred to the QX200 Droplet Reader and the QuantaSoft AP program (Bio-Rad) was 280 used to calculate the number of fungal 18S rRNA gene copies per ng of rhizosphere DNA. 281

Data analysis
282 Data analysis was performed in parallel for bacteria and fungi using R version 3.6.0 with 283 the packages phyloseq, vegan, DESeq2, tidyr, lme4, and lmerTest (McMurdie & Holmes, 2013;284 Love et al., 2014;Bates et al., 2015;Kuznetsova et al., 2017;Wickham, 2017;Oksanen et al., 285 2019 (Love et al., 2014). Before normalization we estimated alpha (within-sample) 291 diversity using the abundance-based coverage estimator (ACE) and Shannon metrics (Hughes 292 et al., 2001). We quantified beta (between-sample) diversity by calculating the centroid for each 293 genotype in each field in multivariate space; then, using the function "betadisper" from the 294 package vegan (Oksanen et al., 2019), we determined the distance of each sample to its 295 respective centroid. 296 Comparing microbiome features of hybrid vs. inbred maize: First, we investigated 297 whether the microbiomes of maize hybrids are distinct from those of inbreds. We modeled alpha 298 diversity (ACE and Shannon metrics) and beta diversity (distance to centroid) using separate 299 linear mixed models, with "category" (i.e., either hybrid or inbred), field, and their interaction as 300 fixed effects. Normalized sequencing depth was included as a covariate. Genotype (nested 301 within category), block (nested within field), and plate (to control for batch effects) were included 302 as random-intercept terms. For rhizosphere samples, collection date was also included as a 303 random effect. ANOVA with Type III sums-of-squares and Satterthwaite's approximation of 304 degrees-of-freedom was used for statistical inference of fixed effects; likelihood ratio tests were 305 used for random effects. 306 To test whether hybrids and inbreds differed in overall community composition, we 307 conducted permutational MANOVA of Bray-Curtis dissimilarities, with hybrid/inbred category, 308 field, and their interaction as predictors. Sequencing depth and collection date were included as 309 nuisance variables. Finally, we fit negative binomial models to explore which microbial taxa 310 differed in relative abundance between hybrids and inbreds. ASV counts were modeled with 311 hybrid/inbred category as the predictor (Love et al., 2014;McMurdie & Holmes, 2014). The 312 rarest taxa were excluded from this analysis: the threshold for inclusion was set at 10% of the 313 mean abundance. For example, the mean ASV count across the leaf fungal dataset was 314 14,057, so we included only ASVs that were observed at least 1,406 times. This reduced the 315 number of tests by 45% but still used 98.3% of the data. P-values from Wald tests were 316 adjusted to correct for multiple comparisons using the false discovery rate (Benjamini & 317 Hochberg, 1995). 318 Testing for microbiome heterosis: Next, we explored patterns of heterosis within 319 individual crosses for various microbiome features, including alpha and beta diversity (as 320 described above); overall community composition (summarized using the top 5 unconstrained 321 principal coordinates and the top 5 constrained axes of variation from a distance-based 322 redundancy analysis constrained on genotype); and variance-stabilized counts of ASVs. For each feature we fit a linear mixed-effects model with the predictors field, sampling effort, batch 324 effects, and (for rhizosphere samples) collection date. The purpose of these models was not to 325 test for significance, but to derive residual trait values cleansed of these sources of noise; the 326 resulting residuals were then used to test for evidence of heterosis. 327 For each feature, we calculated the mean values of these residuals in the seven inbred 328 lines, from which we calculated the expected mid-parent values (assuming additive genetic 329 variance) for each hybrid. We then conducted two-sided t-tests of the null hypothesis that each 330 hybrid's microbiome trait value was equivalent to its respective mid-parent value (i.e., tests for 331 "mid-parent heterosis"; Figure 1a). In addition, we tested for "better-parent heterosis" using one-332 sided t-tests to assess whether the hybrid value fell outside of the parental range. Finally, we 333 used linear regression to test for a relationship between microbiome heterosis (percent 334 deviation from expected mid-parent value) and average heterozygosity, estimated using 335 genome-wide SNP data from the maize HapMap v2 (Chia et al., 2012). 336

Effects of host relatedness on microbiome similarity within and between 337
environments: To test whether the relationship between two host genotypes predicts the 338 similarity of their microbiomes, we calculated the Bray-Curtis dissimilarity between all pairwise 339 combinations of samples. Each pair of samples was classified according to the known 340 relationship (or lack thereof) between the two host genotypes. We fit a linear model of Bray-341 Curtis dissimilarity with host-pair relationship as the predictor. The log-transformed absolute 342 difference in sequencing depth between samples was included as a covariate; host genotypes 343 and individual plant IDs were included as random intercept terms to account for these sources 344 of non-independence. 345 We used a similar approach to assess whether the microbiomes of hybrids and inbred 346 lines differed in their sensitivity to environmental variation. We modeled Bray-Curtis dissimilarity 347 between pairs of individuals with the same genotype using hybrid/inbred category, 348 within/between fields, and their interaction as predictor variables. The log-transformed absolute 349 difference in sequencing depth between samples was included as a covariate; the host 350 genotype and the two individual plant IDs were included as random intercept terms. 351 Estimating heritability, general combining ability (GCA), and specific combining 352 ability (SCA) for microbiome composition: Finally, we used ASReml-R software to estimate 353 GCA and SCA for key properties of hybrid microbiomes, including alpha diversity, beta diversity, 354 and the five major constrained and unconstrained axes of variation from the dbRDA described 355 above (Isik et al., 2017). To estimate variance components for each of these microbiome 356 properties, we fit separate linear mixed-effects models for inbreds and hybrids with field, sequencing depth, and collection date as fixed effects. For the inbred model, the random effects 358 included batch, block nested within field, genotype, and genotype-field interactions. For the 359 hybrid model, random effects included batch, block, female parent, male parent, the interaction 360 of each parent line with field, the interaction between the parent lines, and the three-way 361 interaction between the parent lines and field. 362 For hybrids, a diallel model was fit to the data, in which a single GCA effect was 363 estimated for each parent line, regardless of whether it was a male or female parent (Möhring et 364 al., 2011). GCA variance is related to the variation of parental breeding values in hybrids and 365 additive variance was estimated as twice the GCA variance. SCA variance was calculated as 366 the variance component due to the interaction between the parent lines and represents variation 367 of dominance genetic effects, such as those that could lead to heterosis. The total genetic 368 variation was calculated as the sum of twice the GCA plus SCA; the total genotype-by-369 environment (GxE) variation was calculated as the sum of the variance components from 370 interactions between each parent line and field (twice the GCA-field interaction plus the SCA-371 field interaction variances). Narrow-sense heritability (h 2 ) was calculated as the additive 372 variance divided by the total phenotypic variation; broad-sense heritability (H) was calculated as 373 the total genetic variation (2*GCA + SCA) divided by the phenotypic variation. For inbreds, total 374 genetic variation and GxE were calculated using the variance components for genotype and 375 genotype-by-field interaction, respectively. 376 Leaf and rhizosphere communities differed substantially in diversity and composition 384 ( Figure S3, Figure S4). Leaf communities were dominated by Pantoaea, Entyloma, and 385

RESULTS
Golubevia spp., while the most abundant taxa in the rhizosphere were Ralstonia, Burkholderia, 386 Rhizobium, Entyloma, and Golubevia (Table S3). Only 193 bacterial ASVs were observed in 387 both sample types; 260 bacterial ASVs were found only in leaves, whereas 290 were specific to 388 the rhizosphere. For fungi, 132 ASVs were specific to leaves, 196 were found only in the rhizosphere, and 173 were found in both ( Figure S4). Alpha diversity was higher in the 390 rhizosphere than in leaves. For fungi, beta diversity was also higher in the rhizosphere, 391 suggesting that belowground fungal communities may be more strongly influenced by 392 microhabitat variation or stochastic processes ( Figure S3c-d). This is consistent with the 393 observation that community composition differed strongly between fields only for rhizosphere 394 fungi ( Figure S3a-b). As expected, rhizosphere communities were compositionally distinct from 395 bulk soil ( Figure S2); they also were relatively richer in fungi ( Figure S5). 396 397 398  Microbiome composition differs between maize hybrids and 407 inbred lines 408 First, we investigated whether hybrids, as a group, differed from inbreds in microbiome 409 diversity and composition. For bacteria and leaf-associated fungi, we found no consistent 410 difference between these groups in either alpha diversity or beta diversity (Table S4). In 411 contrast, fungal communities in the rhizospheres of hybrid plants had higher alpha diversity 412 (within-sample species richness and evenness) and lower beta diversity (among-individual 413 variability) relative to inbreds (Tukey's HSD, P < 0.05). 414 Hybrids' microbiomes were primarily distinguished from those of inbred plants not by 415 alpha or beta diversity, but by overall community composition (Table 1)

435
Patterns of heterosis in microbiome features 436 Next, we tested which microbiome features, if any, showed evidence of heterosis within 437 individual crosses. These features included community-level metrics of alpha and beta diversity 438 (i.e., Shannon, ACE, distance-to-centroid) and composition (the top 5 constrained and 439 unconstrained axes of variation from distance-based redundancy analysis; Figure 5b). We used t-tests to assess whether these microbiome features deviated from the midparent expectation 441 for each hybrid, using a significance threshold of FDR < 0.05 (Figure 1). We note that 442 historically, the term "better-parent" heterosis describes cases in which the hybrid has higher 443 values of a desirable trait or lower values of an undesirable trait relative to both parents (Flint-444 Garcia et al., 2009). Because the function and value of these microbiome traits (if any) are 445 unknown, we use this term to refer to cases in which hybrid trait values are higher than both 446 parents or lower than both parents (Figure 1a). 447 Heterosis for alpha and beta diversity was detected in a few crosses, but the most 448 frequently heterotic microbiome features were the axes of variation in community composition. 449 Both midparent and "better-parent" heterosis were particularly common in the dbRDA axes that 450 were constrained by host genotype (Figure 5a), which is consistent with these axes capturing 451 the heritable components of microbiome composition. More microbiome features displayed 452 heterosis in the rhizosphere than in leaves, which is consistent with the overall stronger 453 differences between hybrids and inbreds observed previously (Figure 3). Linear regressions 454 revealed that genome-wide heterozygosity did not reliably predict the strength of microbiome 455 heterosis, quantified as the percent deviation from expected midparent values of microbiome 456 features ( Figure S6). Heterozygosity was positively associated with heterosis only for the 457 bacterial MDS1 in the rhizosphere (R 2 = 0.53, P=0.03). For bacterial CAP1 and fungal beta 458 diversity, the relationship was inverse. Most frequently, however, we found no significant 459 relationship, although this could reflect low statistical power due to the small number of hybrids 460 tested (N=11). 461 We also tested for heterosis of individual ASVs using FDR-corrected t-tests of their 462 variance-stabilized abundances. In most hybrids, most ASVs did not deviate from the midparent 463 expectation ( Figure S7a). However, 40% of rhizosphere fungal ASVs showed evidence of 464 heterosis in at least one cross, as did 20% of leaf-associated fungal ASVs, representing a wide 465 diversity of classes ( Figure S7b; Table S5). For bacterial ASVs, the rates of heterosis in the 466 rhizosphere and leaf were 35% and 17%, respectively. The prevalence of ASV heterosis varied 467 among hybrids, ranging from 27% of the tested fungal ASVs in the B73xP39 rhizosphere, to 468 5.1% of the tested bacterial ASVs in the B73xMo17 rhizosphere ( Figure S7a). The most 469 strongly heterotic fungal ASV (showing "better-parent" heterosis most frequently) belonged to 470 the yeast Moesziomyces aphidis, which was enriched in the rhizospheres of six hybrids relative 471 to both of their inbred lines ( Figure S8). Two bacterial ASVs in the rhizosphere (Pseudonocardia 472 sp. and Sphingomonas sp.) and one in the phyllosphere (Acinetobacter sp.) each showed 473 "better-parent" heterosis in five hybrids ( Figure S8).
For both bacteria and fungi, the second constrained axis of rhizosphere variation (CAP2) 475 was the most frequently heterotic microbiome property; hybrid values of bacterial CAP2 476 exceeded both parental values for all 11 crosses ( Figure 5). To obtain a biological interpretation 477 of CAP2, we identified the top 10% of ASVs with the strongest loadings on this axis (Table S6). 478 These included 13 of the ASVs that were differentially abundant between hybrids and inbred 479 lines (Figure 4)

501
Patterns of microbiome heritability in F 1 crosses 502 Finally, we investigated general patterns of microbiome heritability in this set of maize 503 crosses. In particular, we (1) partitioned variance in community composition among genotypes 504 within hybrid/inbred categories; (2) compared microbiome similarity of pairs of genotypes with 505 differing levels of relatedness; (3) tested whether the microbiomes of hybrid and inbred maize 506 were equally influenced by environmental variation; and (4) investigated the relative importance 507 of additive versus heterotic components of genetic variation for shaping hybrid microbiomes. 508 First, we used permutational MANOVA to test for host genetic variation among inbred 509 lines and among hybrids for whole-community composition. Host genotype contributed 4-8% of 510 the variation in overall community composition for both bacteria and fungi, in both leaves and 511 rhizosphere (Table S7). Alpha and beta diversity also varied among host genotypes, but not as 512 consistently as community composition. For instance, host genotype affected alpha diversity 513 and distance to centroid (i.e., among-individual variation within genotypes); but almost 514 exclusively in the rhizosphere (Table S4). 515 Next, to assess how relatedness affects microbiome similarity, we modeled Bray-Curtis 516 dissimilarity between pairs of samples as a response to host-genotype-relationship using 517 ANOVA. Overall, host-genotype-relationship had a stronger effect on rhizosphere microbiome 518 dissimilarity than leaf microbiome dissimilarity (Figure 6a). Bacterial rhizosphere composition 519 differed more between pairs of hybrids (regardless of their relatedness) than between pairs of 520 inbreds; the opposite was true in leaves. We found no evidence that hybrids' microbiomes are 521 more similar to those of their maternal inbred than to their paternal inbred; in fact, for leaf-522 associated bacteria, the opposite was true. Similarly, hybrid microbiomes were equally 523 dissimilar to those of their half-siblings regardless of whether the female or male parent was 524 shared (Figure 6a). 525 Third, we tested whether genotype-by-environment (GxE) interactions are more 526 prominent for inbred microbiomes than hybrid microbiomes, as they are for many plant traits 527 (Lewis, 1954;Griffing & Zsiros, 1971;Blum, 2013;Zanewich et al., 2018) (but see (Li et al., 528 2018)). We found that pairs of siblings from hybrid genotypes had less similar rhizosphere 529 microbiomes than pairs of siblings from inbred genotypes, both within fields and between fields 530 ( Figure 6b). In addition, fungal load in the rhizosphere was stable across fields for inbreds, but 531 not hybrids ( Figure S5). For leaf microbiomes, however, there was no difference in sibling-pair 532 dissimilarity between inbreds and hybrids. The variance components contributed by Genotype 533 and GxE supported these findings. For many leaf microbiome traits, G/GxE was equal between 534 hybrids and inbreds; for others (particularly fungal traits) it was higher in hybrids, and for others 535 (particularly bacterial traits) the opposite was true ( Figure S9a). 536 Last, we investigated whether hybrid microbiomes were influenced more by general 537 combining ability (GCA; the average trait values of the two parent lines, mostly involving additive 538 gene effects) or specific combining ability (SCA; the interaction between the two parent lines, 539 involving synergistic or heterotic effects). Using this method we detected moderate heritability 540 for a variety of leaf microbiome features, but no heritability for rhizosphere microbiomes. For 541 both fungi and bacteria, narrow-sense heritability (h 2 ) was much higher among inbred lines than 542 among hybrids ( Figure S9b). For hybrids, comparison of SCA and GCA revealed that for 543 multiple axes of leaf microbiome variation, broad-sense heritability was considerably higher than 544 h 2 ( Figure S9c). This indicates that leaf microbiomes, in particular, were shaped by specific 545 combinations of parental genotypes (i.e., heterotic or synergistic effects) beyond the general 546 combining ability (additive effects) of those parents. 547

561
Understanding patterns and mechanisms of microbiome heritability is critical to 562 understanding the evolution of host-microbiome symbiosis in both natural and human-directed 563 systems. Here we demonstrate that microbiome diversity and composition display some of the 564 same patterns as other complex plant traits within F1 crosses. In general, leaf and rhizosphere 565 microbiomes of hybrid maize differ reliably from those of inbred maize lines, much like 566 performance-related traits such as height and yield (Flint-Garcia et al., 2009). Within crosses, 567 we detected evidence for heterosis of a variety of microbiome features, including beta diversity 568 and several orthogonal axes of community composition ( Figure 5). 569 Overall, effects of hybridization (and of host genotype more generally) were stronger in 570 the rhizosphere than in the leaves. The reasons for this are unclear. One possible explanation is 571 that the difference in sampling time (3 weeks after planting for leaves, 7 weeks for rhizosphere) 572 allowed more time for deterministic effects of host genotype to shape rhizosphere communities, 573 whereas the leaf microbiome reflected more of the stochastic variation that contributes to initial 574 community assembly (Maignien et al., 2014). It is important to note that in our study, the 575 observed differences between leaf and rhizosphere communities reflect not only the distinction 576 between aboveground and belowground organs, but also differences in abiotic environment and 577 in plant developmental stage that correspond to the two timepoints. Maize leaf microbiomes 578 change drastically over the growing season (Wagner et al., 2020). Nevertheless, studies from 579 diverse plant species confirm that leaf and rhizosphere microbiomes are reliably distinct 580 (Coleman-Derr et al., 2016;Wagner et al., 2016;Wallace et al., 2018b). In maize, individual 581 leaves can senesce within weeks, especially early in development; in contrast, much of the root 582 system remains active throughout the plant's lifespan (Fusseder, 1987). Thus, maize leaves are 583 generally more transient habitats for microbes than maize rhizosphere. However, the same is 584 true for the perennial forb Boechera stricta, in which host genotype effects were stronger in 585 leaves than in roots (Wagner et al., 2016). One potential explanation for this opposite result is 586 that maize forms partnerships with mycorrhizal fungi, while (like other mustards) B. stricta does 587 not; this hints at fundamental differences in mechanisms of interaction with belowground 588 microbes. 589 In this study, host genotype and its interaction with environment explained between 4-590 8% of the overall variation in microbiome composition (Table S3). This is a high proportion 591 relative to other studies of host genetic microbiome control, but low relative to studies of other 592 maize traits; flowering time and resistance to various diseases often have heritabilities above 80% (Robertson et al., 2006;Buckler et al., 2009). However, this may be an underestimate of 594 the true importance of host genotype. If only a subset of the microbiome is actually responsive 595 to host variation, then using the entire community as a response variable is likely diluting the 596 host genotype signal. In support of this possibility, some studies, including in maize, have found 597 that heritability of microbiome functional potential (i.e., metagenome content) is higher than the 598 heritability of taxonomy-based microbiome features such as those used in our analysis 599 (Lemanceau et al., 2017;Wallace et al., 2018a). 600 Maize is well-known for the strength of heterosis in intraspecific crosses, and our results 601 indicate that this phenomenon can be seen within leaf and rhizosphere microbiomes. However, 602 not all maize crosses are equally heterotic, either for yield or for the underlying traits (Flint-603 Garcia et al., 2009;Guo et al., 2019). This, too, was evident in our microbiome data ( Figure 5; 604 Figure S7), although we found no consistent relationship between heterozygosity (estimated 605 from genome-wide SNP data) and microbiome heterosis ( Figure S6). However, with only 11 606 crosses included in our study, an explicit test for a relationship between the extent of 607 microbiome heterosis and the extent of heterosis for other phenotypic traits was beyond the 608 scope of this work. One hurdle to addressing this question is the general lack of knowledge 609 about which plant traits cause changes in the microbiome. Mutant and transgenic studies in 610 Arabidopsis have demonstrated that properties of the leaf cuticle affect microbial colonization 611 (Bodenhausen et al., 2014;Ritpitakphong et al., 2016), and that physiological traits such as 612 hormone signalling and defensive secondary chemistry influence the root and rhizosphere 613 microbiomes (Bressan et al., 2009;Lebeis et al., 2015). However, it is unknown whether the 614 same traits are involved in shaping the maize microbiome. An additional complication is the fact 615 that the microbiome can, in turn, affect phenotypic expression in the host plant (O'Brien et al.;616 Panke-Buisse et al., 2017;Berg & Koskella, 2018) as well as the relationship between 617 phenotype and yield (Wagner et al., 2014;Chaney & Baucom, 2020). For these reasons, careful 618 experimentation-not just simultaneous measurement of plant traits and microbiome features-619 will be needed to clarify the phenotypic links between heterozygosity and microbiome heterosis 620 ( Figure 1). 621 Although no comparable studies are currently available, we expect that other sets of 622 maize genotypes, and indeed other plant species, could show different effects of hybridization 623 on the microbiome. A comparison to plant species that display outbreeding depression rather 624 than heterosis would be particularly interesting. Evidence from animal systems suggests a 625 potential role for the microbiome in speciation. Crosses between two subspecies of house mice 626 (Mus musculus) result in hybrid offspring with reduced fertility and gut microbiomes that diverge 627 from both parent subspecies (Wang et al., 2015), while lethality of hybrids between species of Track-1 grant (OIA-1656006 -transformed) and Shannon metrics were used to quantify community richness and evenness, respectively; Distance to centroid was used to quantify beta diversity (i.e., amongindividual variation within genotypes). Bacterial (a) and fungal (b) communities were tested separately using univariate models and analysis of variance with Type III sums of squares and Satterthwaite's approximate denominator d.f. Likelihood ratio tests were used for significance testing of randomintercept effects. For rhizosphere fungal communities, one genotype (Oh43) was excluded from the analysis because no individuals survived in one field. *** P < 0.001; ** P < 0.01; * P < 0.05 . All Pvalues were adjusted to correct for multiple comparisons using the Benjamini-Hochberg false discovery rate.

744
The number of rhizosphere samples in the 16S dataset are shown for each host genotype in each field.

745
Complete replicates were collected on each day so that collection date would not be confounded with  the F1 hybrid ("Hyb") for each cross. T-tests were used to test for midparent and "better parent" heterosis 806 as described in the Methods section of the main text. Statistical support (at P < 0.05) for midparent or 807 "better parent" heterosis is noted with "*MPH" and "*BPH", respectively. For each hybrid, we used genome-wide heterozygosity to predict heterosis of microbiome 813 properties (calculated as the percent change relative to the expected midparent value). Results 814 of linear models are provided; unless otherwise shown, P > 0.1 after correction for false 815 discovery rate. 816 Sequencing depth) is plotted for each taxon in the maternal inbred parent ("Mat"), the paternal inbred 831 parent ("Pat"), and the hybrid ("Hyb"). All contrasts between the hybrid and the inbred parents were 832 statistically significant after correction for false discovery rate (T-test, FDR < 0.05). (a) "bASV_3" 833 belonged to the genus Pseudonocardia and showed "better parent" heterosis in the rhizospheres of 5 834 hybrids. (b) "bASV_51" belonged to the genus Acinetobacter and showed "better parent" heterosis in the 835 leaves of 5 hybrids. (c) "bASV_224" belonged to the genus Sphingomonas and showed "better parent" 836 heterosis in the rhizospheres of 5 hybrids. (d) "fASV_28" belonged to Moesziomyces aphidis and showed 837 "better parent" heterosis in the rhizospheres of 6 hybrids.  after planting. Relative abundances were modeled using hybrid/inbred Category, Field, and 854 their interaction as fixed effect predictors, with Genotype, Block, and Batch as random-intercept 855 terms. 856 857