Population structure of the Brachypodium species complex and genome wide association of agronomic traits in response to climate

The development of model systems requires a detailed assessment of standing genetic variation across natural populations. The Brachypodium species complex has been promoted as a plant model for grass genomics with translational to small grain and biomass crops. To capture the genetic diversity within this species complex, thousands of Brachypodium accessions from around the globe were collected and sequenced using genotyping by sequencing (GBS). Overall, 1,897 samples were classified into two diploid or allopolyploid species and then further grouped into distinct inbred genotypes. A core set of diverse B. distachyon diploid lines were selected for whole genome sequencing and high resolution phenotyping. Genome-wide association studies across simulated seasonal environments was used to identify candidate genes and pathways tied to key life history and agronomic traits under current and future climatic conditions. A total of 8, 22 and 47 QTLs were identified for flowering time, early vigour and energy traits, respectively. Overall, the results highlight the genomic structure of the Brachypodium species complex and allow powerful complex trait dissection within this new grass model species.


Introduction 34
Climate change is impacting the production of food worldwide (1) while 35 increasing global demand will soon outstrip the rate of improvement in crop 36 yield by traditional breeding methods (2). To address food and climate security, 37 there is a need for agricultural innovation across a range of scientific disciplines, 38 from genomics to phenomics in new species across the landscape (3). Key to 39 breeding for more variable future climates, as well as for broad adaptability, is 40 understanding the plasticity of the genetic architecture of agronomic traits 41 across environments. The use of controlled growth cabinets, that can mimic water limiting grain filling. Early vigour, defined as an increase in the above 51 ground biomass prior to stem elongation, is a beneficial trait in many 52 environment types, especially when combined with increased transpiration 53 efficiency (5). Since vapour pressure is low in winter, increased biomass during 54 early growth improves plant water use efficiency. Early vigour also increases 55 competition against weeds, reduces soil evaporation and may improve yields by 56 increasing total seasonal biomass (6). Energy use efficiency is a relatively 57 understudied component of plant growth that represents the efficient transfer of 58 energy, acquired through photosynthesis, to the grain, and may significantly 59 affect yield. Early studies indicate that energy efficiency, via lower respiration 60 rates, are correlated with an increase in biomass in monocot species (7,8).  collections outside the native range ( Fig 1A). Conversely, B. distachyon is largely 147 limited to the native Mediterranean and Western Asian regions, with B. stacei in 148 the same area but was less common.
Gaz-3 Adi-15 BdTR11a Adi-16 Adi-9 Adi-18 BdTR10c Adi-6 BdTR9a Adi-1 BdTR12c BdTR5g Adi  (Fig 1B). The yellow lineage was the most diverged and 191 represents subspecies B, with the brown and red structure groups representing 192 the predominantly East and West populations of the A subspecies. To visualise 193 the geographic distribution, the ancestral group composition was summed 194 across accessions for each geographic site (Fig 1C). The single B. distachyon 195 accession from Australia WLE2-2 was nearly identical to BdTR9f (GBS data, 196 Supp. S02) from southern western Turkey, where it may have originated from. It 197 is shown in its ancestral location (Fig 1C arrow). four and five mainstem leaves, the dimensions of leaf #3, seedling height, total 251 leaf area, and above ground dry weight were measured and phenotypic 252 correlations were calculated (Supp. S09B). Narrow sense heritability was also 253 calculated to determine which early vigour trait would provide the most power 254 for mapping QTLs with GWAS (Supp. S09C). Leaf #3 width and length correlated 255 well with above ground biomass (r 2 =0.46, p<0.01 and r 2 =0.48, p<0.01, 256 respectively) and had quite high heritabilities of h 2 =0.60 and h 2 =0.64, 257 respectively, as compared to above ground dry mass, h 2 =0.51. Interestingly, 258 seedling height also had a strong correlation with above ground biomass 259 (r 2 =0.74, p<0.01) with a heritability of h 2 =0.74. However, this trait was also more 260 highly correlated with development, as indicated by the number of leaves 261 (r 2 =0.21, p<0.01), than the dimensions of leaf #3. To get the most direct measure 262 of early vigour the dimensions of leaf #3 were chosen as the focus for the GWAS.

298
As expected, the accessions developed quicker and grew larger in the 2050 299 temperature profile (Fig 2A, B) as is consistent with a quicker accumulation of 300 thermal time (Fig 2D). Early vigour parameters and energy use efficiency traits 301 were measured when the majority of plants were at a four-leaf stage. Growth 302 stages, tiller number and ear emergence date were monitored twice a week 303 (Supp. S12, S13, S14). The experiment was ceased after 200 days of growth, at 304 which time there were 5 and 7 lines that did not flower in the 2015 condition 305 and 2050 conditions, respectively. The remaining lines reached ear emergence at 306 a similar number of days in both the present and future conditions (Fig 2E).  For early vigour, 22 significant QTLs were identified for 5 traits across the two 346 climate conditions (Supp. S15). Two QTLs were identified in both conditions, 347 EarlyVigour_QTL1.1 and EarlyVigour_QTL3.1, and both of these were for leaf #3 348 length. The 100kb region surrounding these QTLs contained 19 and 13 genes, 349 respectively (Fig 4B and C). There was a highly significant QTL on chromosome 3 350  The 100kb region surrounding this QTL contained 13 genes (Supp. S18). A total 353 of 6 QTLs were identified for the genotype x environment interaction across the 354 two conditions for early vigour traits.

368
For the energy use efficiency traits, a total of 47 QTLs were identified across the 369 two conditions for the three measured traits and four derived traits (Supp. S15). 370 Of these QTL, none were found in both environments. However, a strong QTL, 371  we present a unique systematic method of determining the species of an 388 accession using low coverage genotyping by sequencing and bioinformatics, 389 providing a high throughput and low cost alternative for species identification. 390 We found that the majority of our accessions were in fact B. hybridum (56%) 391 including the vast majority of accessions in Australia and North America (Fig  392   1A). The wide dispersion of this species may be due to the benefit of the multiple 393 genomes resulting from polyploidisation (47). There were relatively few B. stacei 394 (3%), which were limited to the Mediterranean region ( Fig 1A). 395 396 Within B. distachyon itself we found significant population structure including 397 high level subspecies splits, with 6.5% divergence between subspecies, which is 398 greater than that found between indica and japonica rice at 1.4% divergence 399 #3 length was also found to be significant across both environments. There were 478 no obvious candidate genes for this QTL but a number of signalling proteins that 479 could be involved in molecular control of leaf size (Fig 4C, Supp. S18). To determine the population structure of B. distachyon a pairwise Identity By 550 State genetic distance was calculated to identify among 490 high quality samples 551 a core diversity set of 72 distinct genotypes using 82,800 SNPs derived from GBS 552 data and the SNPRelate package using a z-score of 3.5. Occasionally, when 553 genotypes are closely related, noise between technical replicates of an accession 554 will result in them being split across the related genotypes. Therefore, we keep 555 replicate(s) from the genotype with the majority of replicates for that accession, 556 breaking ties by keeping the replicate with the lowest missing data. In addition, 557 29 accessions whose geographic origin was suspect were also excluded. 558

559
To avoid bias from including up to 30 inbred accessions of the same genotype, a 560 reduced set was input into STRUCTURE V.2.3.4 (34). A total of six replicates were 561 run of population (K) 1-13 with a burnin setting of 10,000 sets, and 100,000 562 permutations per run (Fig 1B, Supp. S04). The optimal K was determined as K=3 563 In preparation for GWAS, the genotype data was filtered to remove non-variant 694 SNPs and redundant SNPs (i.e. SNPs whose genotypes are not different from 695 adjacent SNPs but have more missing data points). Then SNPs with a minor allele 696 frequency of < 3% were filtered out. As there was 18.5% missing data in the 697 original data set, imputation was undertaken. First, if the observed genotypes of 698 two adjacent SNPs were not different, then the missing genotype of one SNP was 699 accessions. These parameters were determined by simulations to achieve an 706 optimal imputation success rate, which was 97.95% for our data. Finally, SNPs 707 with a minor allele frequency <5% were filtered. To determine a significance threshold, the permutation test was implemented on 725 1000 permutations of the phenotype data to estimate the genome-wide 726 significance threshold at 0.05 for the trait of days to ear emergence. The 727 significance threshold was determined to be a LOD (Logarithm of ODds) of 728 4.43583. 729