A global barley panel revealing genomic signatures of breeding in modern cultivars

The future of plant cultivar improvement lies in the evaluation of genetic resources from currently available germplasm. Recent efforts in plant breeding have been aimed at developing new and improved varieties from poorly adapted crops to suit local environments. However, the impact of these breeding efforts is poorly understood. Here, we assess the contributions of both historical and recent breeding efforts to local adaptation and crop improvement in a global barley panel by analysing the distribution of genetic variants with respect to geographic region or historical breeding category. By tracing the impact breeding had on the genetic diversity of barley released in Australia, where the history of barley production is relatively young, we identify 69 candidate regions within 922 genes that were under selection pressure. We also show that modern Australian barley varieties exhibit 12% higher genetic diversity than historical cultivars. Finally, field-trialling and phenotyping for agriculturally relevant traits across a diverse range of Australian environments suggests that genomic regions under strong breeding selection and their candidate genes are closely associated with key agronomic traits. In conclusion, our combined dataset and germplasm collection provide a rich source of genetic diversity that can be applied to understanding and improving environmental adaptation and enhanced yields.

Author summary

Today's gene pool of crop genetic diversity has been shaped during domestication and more recently by breeding. Genetic diversity is vital for crop species to be able to adapt to changing environments. There is concern that recent breeding efforts have eroded the genetic diversity of many domesticated crops including barley. The present study assembled a global panel of barley genotypes with a focus on historical and modern Australian varieties. Genome-wide data was used to detect genes that are thought to have been under selection during crop breeding in Australian barley. The results demonstrate that despite being more extensively bred, modern Australian barley varieties exhibit higher genetic diversity than historical cultivars, countering the common perception that intensive breeding leads to genetic erosion of adaptive diversity in modern cultivars. In addition, some loci (particularly those related to phenology) were subject to selection during the introduction of other barley varieties to Australia – these genes might continue to be important targets in breeding efforts in the face of changing climatic conditions.

The diversity of the existing genetic pool for commercially important plant species has been shaped during plant domestication, human migration, varietal selection processes and, more recently, breeding. However, there is concern that breeding efforts have eroded genetic variation, thereby resulting in a narrow range of genotypes in the current gene pools of domesticated crops [1,2]. Although regionally adapted landraces and wild relatives represent the most diverse germplasm reservoirs, the introgression of desirable alleles into elite germplasm used by breeders, whilst minimising the introduction of other genes from the wild germplasm that might reduce the agronomic fitness of the elite cultivar, has been challenging and time consuming [3]. As a result of these challenges and the often limited availability of high-density markers and detailed information for key adaptive traits, the high degree of genetic diversity in wild crop relatives has been poorly exploited.

... with a minor allele frequency (MAF) > 0.01 were used in the present study (S1 Table). As data for all 632 accessions in the barley panel, the PIC was estimated to be 0.17 (Table 1), although we observed marked differences among different geographical regions (Australia, Africa, Asia, Europe, North America, and South America) and among historical subgroups of ...

... geographic location (Fig 1e), row type, and growth habit (Fig 1f). As expected, no clear ...

Genetic marker pairs were sorted into 100-kb bins based on the distance between pairs, and mean r² values were estimated for each bin (S2-6 Files). Owing to selection pressure on large genomic regions for positive alleles, the subsequent fixation of the alleles during breeding, and high rates of self-fertilization, Australian barley subgroups (CatA to CatC) were found to contain larger LD blocks, higher baseline LD, and higher long-range LD than the entire ...

... the earliest- (CatA) and latest-released (CatC) Australian barley cultivars (Fig 3c), as well as for the North American and Asian varieties (Fig 3d).
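The distance binning used for the LD analysis (marker pairs grouped into 100-kb bins, with a mean r² per bin) can be illustrated with a minimal Python sketch. The function name, the input layout of `(position, position, r2)` tuples, and the example values are hypothetical and chosen only for illustration; the sketch assumes that r² has already been computed for each marker pair on the same chromosome and does not reproduce the software used in the study.

```python
from collections import defaultdict

BIN_SIZE_BP = 100_000  # 100-kb bins, as stated in the text


def mean_r2_by_distance_bin(pairs, bin_size=BIN_SIZE_BP):
    """Group (pos_a, pos_b, r2) marker pairs by physical distance and
    return the mean r2 per bin, keyed by the bin's lower bound in bp."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos_a, pos_b, r2 in pairs:
        bin_start = (abs(pos_b - pos_a) // bin_size) * bin_size
        sums[bin_start] += r2
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Hypothetical marker pairs on one chromosome: (position 1, position 2, r2)
example_pairs = [
    (1_000, 51_000, 0.80),     # 50 kb apart  -> bin starting at 0
    (1_000, 91_000, 0.60),     # 90 kb apart  -> bin starting at 0
    (1_000, 151_000, 0.30),    # 150 kb apart -> bin starting at 100 kb
    (200_000, 450_000, 0.10),  # 250 kb apart -> bin starting at 200 kb
]
ld_decay = mean_r2_by_distance_bin(example_pairs)
```

Plotting the returned means against the bin start positions gives an LD decay curve; larger LD blocks and higher long-range LD, as reported for the Australian subgroups, would appear as a curve that falls more slowly with distance.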

To unravel genomic regions targeted by breeders in efforts to improve barley production in ...

... we identified 119 missense tolerated and deleterious mutations in 42 genes (S10 File). Of these variants, 29 were detected and annotated within 12 genes that exhibited large differences in allele frequency among the three historical categories (≥20%; S5 Table). These ...

... [FDR] of P < 0.05) for flowering time located within 327 unique genes with functional annotations, each explaining up to 18.7% of the phenotypic variation (S13 File, S18 Figure).
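The ≥20% allele-frequency criterion applied across the three historical categories can be sketched as follows. This is an illustrative reading only: it assumes that a "large difference" means the gap between the highest and lowest category frequency, and the variant identifiers and frequencies are hypothetical placeholders rather than values from S5 Table.

```python
THRESHOLD = 0.20  # minimum allele-frequency difference (>= 20%)


def max_frequency_gap(freq_by_category):
    """Largest difference in allele frequency between any two categories."""
    values = list(freq_by_category.values())
    return max(values) - min(values)


def differentiated_variants(freq_table, threshold=THRESHOLD):
    """Variants whose frequency differs by at least `threshold` among categories."""
    return [
        variant
        for variant, freqs in freq_table.items()
        if max_frequency_gap(freqs) >= threshold
    ]


# Hypothetical frequencies for two variants in CatA, CatB and CatC
example_table = {
    "variant_1": {"CatA": 0.10, "CatB": 0.25, "CatC": 0.45},  # gap 0.35
    "variant_2": {"CatA": 0.50, "CatB": 0.55, "CatC": 0.60},  # gap 0.10
}
selected = differentiated_variants(example_table)
```

With these placeholder values only `variant_1` passes the threshold.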

Notably, we also detected novel associations with candidate phenology-related genes that were not included in the previous target-enrichment sequencing study [10] ...

... to photoperiod [28]. For grain yield, novel and highly significant MTAs were located within ...

Z scores were used to determine above-average (positive Z score) and below-average (negative Z score) yielding cultivars, as well as cultivars that flowered earlier (negative Z score) or later (positive Z score) or were shorter (negative Z score) or taller (positive Z score) than average for all year-by-location combinations. The critical Z score values for a 95% confidence level were −1.96 and +1.96 standard deviations, equal to a P-value of 0.05. Genotype trait characteristics (e.g. early flowering, high yielding, or short stature) were defined as 'robust' if they were consistently below or above the population mean in one location, 'stable' if they were significant (less than −1.96 or greater than +1.96 standard deviations) in more than one location, and 'consistent' if they were significant (less than −1.96 or greater than +1.96 standard deviations) across at least two years at one or more locations.

Stringent filtering steps were adopted to obtain clean data as previously described [10,11].
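The Z-score definitions of 'robust', 'stable', and 'consistent' given above can be made concrete with a short Python sketch. Several details are assumptions rather than statements from the text: the population standard deviation is used, 'robust' is taken to require at least two trials on the same side of the mean at one location, 'consistent' is read as significance in at least two distinct years regardless of location, and significance must be in the same direction. The site names and scores are hypothetical.

```python
from statistics import mean, pstdev

Z_CRIT = 1.96  # two-sided critical value for a 95% confidence level


def z_scores(trait_by_cultivar):
    """Z score of each cultivar within one year-by-location trial.

    Assumption: population standard deviation; the trial must contain
    some variation, otherwise the division fails.
    """
    values = list(trait_by_cultivar.values())
    mu, sd = mean(values), pstdev(values)
    return {name: (x - mu) / sd for name, x in trait_by_cultivar.items()}


def classify(z_by_env):
    """Labels earned by one cultivar; z_by_env maps (year, location) -> Z."""
    labels = set()
    for sign in (1, -1):  # above-average direction, then below-average
        by_location = {}
        for (year, location), z in z_by_env.items():
            by_location.setdefault(location, []).append((year, sign * z))

        # 'robust': same side of the mean in every trial at some location
        # (assumed here to need at least two trials at that location).
        if any(len(t) >= 2 and all(z > 0 for _, z in t)
               for t in by_location.values()):
            labels.add("robust")

        significant = {
            loc: {year for year, z in t if z > Z_CRIT}
            for loc, t in by_location.items()
        }
        # 'stable': significant in more than one location.
        if sum(1 for years in significant.values() if years) > 1:
            labels.add("stable")
        # 'consistent': significant in at least two distinct years.
        if len(set().union(*significant.values())) >= 2:
            labels.add("consistent")
    return labels


# Hypothetical Z scores for one cultivar and one trait
example = {
    (2015, "site_1"): 2.3,
    (2016, "site_1"): 2.1,
    (2015, "site_2"): 0.4,
    (2016, "site_2"): -0.2,
}
labels = classify(example)
```

In this example the cultivar is above the mean in both years at `site_1` and significant in two years, so it is labelled 'robust' and 'consistent' but not 'stable', because it is significant at only one location.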

All genotype data were combined, filtered based on duplicates and MAF >1%, and imputed ...

... collection inferred using different methodologies were compared, and the final K-value was ascertained using ADMIXTURE [19].