Applied phenomics and genomics for improving barley yellow dwarf resistance in winter wheat

Paula Silva; Byron Evers; Alexandria Kieffaber; Xu Wang; Richard Brown; Liangliang Gao; Allan Fritz; Jared Crain; Jesse Poland

doi:10.1101/2022.01.05.475073

Abstract

Barley yellow dwarf (BYD) is one of the major viral diseases of cereals. Phenotyping BYD in wheat is extremely challenging due to similarities to other biotic and abiotic stresses. Breeding for resistance is additionally challenging as the wheat primary germplasm pool lacks genetic resistance, with most of the few resistance genes named to date originating from a wild relative species. The objectives of this study were to i) evaluate the use of high-throughput phenotyping (HTP) from unmanned aerial systems to improve BYD assessment and selection, ii) identify genomic regions associated with BYD resistance, and iii) evaluate the ability of genomic prediction models to predict BYD resistance. Up to 107 wheat lines were phenotyped during each of five field seasons in both insecticide-treated and untreated plots. Across all seasons, BYD severity was lower with the insecticide treatment, and plant height (PTHT_M) and grain yield (GY) were higher than in untreated entries. Only 9.2% of the lines were positive for the presence of the translocated segment carrying the resistance gene Bdv2 on chromosome 7DL. Despite the low frequency, this region was identified through association mapping. Furthermore, we mapped a potentially novel genomic region for resistance on chromosome 5AS. Given the variable heritability of the trait (0.211 – 0.806), we obtained relatively good predictive ability for BYD severity, ranging between 0.06 and 0.26. Including Bdv2 in the predictive model had a large effect for predicting BYD but almost no effect for PTHT_M and GY. This study was the first attempt to characterize BYD using field-HTP and to apply genomic selection (GS) to predict disease severity. These methods have the potential to improve BYD characterization, and identifying new sources of resistance will be crucial for delivering BYD-resistant germplasm.

Introduction

Wheat (Triticum aestivum L.) is one of the most essential food crops in the world and is constantly threatened by several biotic stresses (Savary et al. 2019). Among the most important viral stresses is barley yellow dwarf (BYD). This disease is widespread across the world, caused by viruses and transmitted by aphids (Shah et al. 2012), and can cause significant yield reductions in susceptible cultivars. In Kansas, BYD is the fourth most significant wheat disease in terms of estimated yield losses, with an average loss of approximately 1% over the past 20 years (Hollandbeck et al. 2019), equivalent to more than $10 million per year. However, yield losses are highly variable, ranging from 5% to 80% in a single field depending on the environment, management practices, the host, and the genetic background (Miller and Rasochová 1997; Perry et al. 2000; Gaunce and Bockus 2015). Moreover, the wide host range and the complex lifestyle of its vectors make BYD extremely difficult to manage, and different management strategies (e.g., planting date and control of vector populations) give inconsistent results depending on climate and location (Bockus et al. 2016). Thus, in many production environments, particularly in the Central and Eastern regions of Kansas, BYD is often the most economically impactful disease.

Barley yellow dwarf disease symptoms are highly variable depending on the crop, variety, time and developmental stage at which the infection occurs, aphid pressure, and environmental conditions (Shah et al. 2012; Choudhury et al. 2019b). BYD characterization in the field is extremely challenging as the symptoms can easily be confused with those of other viral diseases such as wheat streak mosaic virus, with nutrient deficiencies, or with environmental stresses like waterlogging (Shah et al. 2012). Typical BYD symptoms can be observed at all levels of plant organization: leaves, roots, and flowers. Leaf discoloration in shades of yellow, red, or purple, starting at the tip of the leaf and spreading from the margins toward the base, is common, as is a reduction in chlorophyll content (Jensen and Van Sambeek 1972; D’Arcy 1995). Often the entire plant appears stunted or dwarfed because biomass is reduced through lower tiller numbers. Spike grain yield is decreased through a reduction in kernels per spike and kernel weight, which also affects grain quality (Riedell et al. 2003; Choudhury et al. 2019b). Quality can be further reduced by a reduction in starch content (Peiris et al. 2019). Below-ground effects of BYD have also been reported, including reduced root growth (Riedell et al. 2003).

Currently, there is no simple solution to control BYD (Walls et al. 2019). However, the use of genetic resistance and tolerance is the most appealing and cost-effective option to control this disease (Comeau and Haber 2002; Choudhury et al. 2017; 2019b). Resistance and tolerance can involve different genetic mechanisms, namely stopping virus replication and minimizing disease symptoms, respectively, but within this paper all mentions of resistance include both genetic resistance and tolerance. Breeding strategies involving genetic resistance can target either the aphids or the virus. Resistance to aphids can be achieved by three different strategies: antixenosis, antibiosis, or tolerance (Girvin et al. 2017). To date, most breeding efforts have been directed to the identification of viral tolerance, also known as ‘field resistance’, which refers to the ability of the plant to yield under BYD infection and is associated with a reduction of symptoms of infection independent of the virus titer (Foresman et al. 2016). Field resistance has been reported to be polygenic, falling under the quantitative resistance class, where several genes with very small effects control the resistance response (Qualset et al. 1973; Cisar et al. 1982; Ayala et al. 2002; Choudhury et al. 2019a; c).

Presently, no major gene conferring immunity or a strong resistant phenotype to BYD has been identified in bread wheat, and only four resistance genes have been described for BYD. Located on chromosome 7DS, Bdv1 is the only gene described from the primary pool of wheat and was originally identified in the wheat cultivar ‘Anza’ (Qualset et al. 1984; Singh et al. 1993). This gene provides resistance to some but not all the viruses that cause BYD (Ayala-Navarrete and Larkin 2011). The other three named genes were all introduced into wheat through wide crossing from intermediate wheatgrass (Thinopyrum intermedium) (Ayala et al. 2001; Zhang et al. 2009). Bdv2 and Bdv3 are both located on a translocation segment on wheat chromosome 7DL (Brettell et al. 1988; Sharma et al. 1995), while Bdv4 is located on a translocation segment on chromosome 2D (Larkin et al. 1995; Lin et al. 2007). Bdv2 was the first gene successfully introgressed in wheat breeding programs from the tertiary gene pool for BYD resistance (Banks et al. 1995) and deployed into varieties.

In addition to the four known resistance genes, other genomic regions associated with BYD resistance have been identified through genetic mapping. These regions have been described on nearly all wheat chromosomes but have not been genetically characterized (Ayala et al. 2002; Jarošová et al. 2016; Choudhury et al. 2019a; b; c). Moreover, two recent studies have reported that some of these new genomic regions display additive effects (Choudhury et al. 2019a; b). Additive genetic effects had already been reported in lines combining Bdv2 and Bdv4 (Jahier et al. 2009).

Taken together, research indicates that resistance genes to BYD in wheat are rare. Because major genes are lacking and resistance in the wheat pool is difficult to characterize, likely due to its polygenic nature with many small-effect loci, the identification of resistance has been limited. Nevertheless, breeding programs have devoted large efforts to breeding for BYD resistance due to the economic importance of this disease, with some of the greatest success coming from wide crosses to the tertiary gene pool.

Breeding for BYD resistance can be improved by applying strategies for more effective evaluation and utilization of the identified resistance. To better understand BYD and its quantitative nature, consistent and high-throughput methods are needed to identify resistant wheat lines for large-scale selection in breeding programs (Aradottir and Crespo-Herrera 2021). Effective selection for quantitative resistance with low heritability can be aided by high-throughput genotyping, high-throughput phenotyping (HTP), or a combination of both.

Access to high-density genetic markers at very low cost, owing to the rapid developments in DNA sequencing, has enabled breeding programs to apply molecular breeding for quantitative traits. Genomic selection (GS) is a powerful tool to breed for quantitative traits with complex genetic architecture and low heritability (e.g., yield, quality, and diseases such as Fusarium head blight), because it has greater power to capture loci with small effect compared with other marker-assisted selection strategies (Meuwissen et al. 2001; Poland and Rutkoski 2016). In addition to molecular data, HTP using unmanned aerial systems (UAS) or ground-based sensors is providing high-density phenotypic data that can be incorporated into breeding programs to increase genetic gain (Haghighattalab et al. 2016; Crain et al. 2018; Wang et al. 2020). Using precision phenotyping for disease scoring can improve the capacity for rapid and non-biased evaluation of large, field-scale numbers of entries (Poland and Nelson 2011). Taken together, improvements in genomics and phenomics have the potential to aid breeding progress for BYD resistance.

In an effort to accelerate the development of resistant lines, we combined high-throughput genotyping and phenotyping to assess BYD severity in a large panel of elite wheat lines. We evaluated the potential of HTP data to accurately assess BYD severity, to identify genetic regions associated with BYD resistance, and to inform whole-genome prediction to identify resistant lines.

Materials and Methods

Plant Material

A total of 381 different wheat genotypes were characterized for BYD resistance, including 30 wheat cultivars and 351 advanced breeding lines, in field nurseries over five years (Table S1). In each nursery, an unbalanced set of 52 – 107 wheat entries was evaluated, including both cultivars and breeding lines (Table 1). The BYD-susceptible cultivar ‘Art’ and the BYD-resistant cultivar ‘Everest’ were included in all nurseries (seasons) as checks.

Table 1: Field experimental details for the five wheat nurseries

Field Experiments

Nurseries for BYD field-screening were conducted during five consecutive wheat seasons (2015 – 2016 to 2019 – 2020) (Table 1). Seasons 2015 – 16 and 2016 – 17 were conducted at the Kansas State University (KSU) Rocky Ford experimental station (39°13′45.60″ N, 96°34′41.21″ W), while the 2017 – 18, 2018 – 19, and 2019 – 20 nurseries were planted at the KSU Ashland Bottoms experimental station (39°07′53.76″ N, 96°37′05.20″ W). The nurseries were established for natural infections by planting about three weeks earlier than the normal planting window in mid-September. The susceptible cultivar ‘Art’ was planted as a spreader plot in the borders and, together with the resistant cultivar ‘Everest’, as a control check plot. The experimental unit was a six-row plot of 1.5 m × 2.4 m with 20 cm row spacing.

A split-plot field design with two or three replications was used where the main plot was insecticide treatment and the split plot was the wheat genotype. Three replications were used for proof of concept during the first two seasons, but two replications were then chosen as a balance of space and number of entries for the following seasons. For the treated replications, the seed was treated at planting with Gaucho XT (combination of insecticide and fungicide) at a rate of 0.22 ml/100 g of seed, followed by foliar insecticide applications starting approximately 2 – 3 weeks after planting and continuing through heading. Depending on field conditions, spray treatments were conducted every 14 – 21 days if average air temperatures remained above 10°C. Foliar insecticides were applied to the treated replications in a spray volume of 280.5 L/ha using a Bowman MudMaster plot sprayer equipped with TeeJet Turbo TwinJet tips. Insecticide applications consisted of a rotation of Warrior II, Lorsban, and Mustang Max at rates of 0.14 L/ha, 1.17 L/ha, and 0.29 L/ha, respectively. For the control insecticide treatment (untreated), the seed was treated with Raxil MD (fungicide) at a rate of 0.28 ml/100 g of seed, and no foliar insecticide was applied. The foliar fungicide Nexicor was applied to the whole experiment at a rate of 0.73 L/ha, at both planting and heading, to control all other diseases so that the main disease pressure came from BYD.

Phenotypic Data

Individual plots were assessed for i) BYD severity, characterized as the typical visual symptoms of yellowing or purpling on leaves and scored on a 0 – 100% visual scale directly after spike emergence by recording the proportion of the plot exhibiting the symptoms (Table 1), ii) manual plant height (PTHT_M, meters), and iii) grain yield (GY, tons/ha). Experimental plots were harvested using a Kincaid 8XP plot combine (Kincaid Manufacturing, Haven, KS, USA). Grain weight, grain moisture, and test weight measurements for each plot were recorded using a Harvest Master Classic GrainGage and Mirus harvest software (Juniper Systems, Logan, UT, USA). Visual phenotypic assessments were recorded using the Field Book phenoapp (Rife and Poland 2014).

High-Throughput Phenotyping

To complement the manually recorded phenotypic data, we applied HTP using a ground-based proximal sensing platform or a UAS (Table 2). Seasons 2015 – 16 and 2016 – 17 were characterized by the ground platform as described in Barker et al. (2016) and Wang et al. (2018). For the other three seasons, we used a DJI Matrice 100 quadcopter (DJI, Shenzhen, China) carrying a MicaSense RedEdge-M multispectral camera (MicaSense Inc., United States). The HTP data were collected on multiple dates throughout the growth cycle from stem elongation to ripening (GS 30 – 90; Zadoks et al. 1974) (Table 2). Flight plans were created using the CSIRO mission planner application and missions were executed using the Litchi Mobile App (VC Technology Ltd., UK, https://uavmissionplanner.netlify.app/) for the DJI Matrice 100. The aerial image overlap rate between two geospatially adjacent images was set to 80% both sequentially and laterally to ensure optimal orthomosaic photo stitching quality. All UAS flights were set at 20 m above ground level at 2 m/s and conducted within two hours of solar noon. To improve the geospatial accuracy of orthomosaic images, white square tiles with a dimension of 0.30 m × 0.30 m were used as ground control points; they were uniformly distributed in the field experiment before image acquisition and surveyed to cm-level resolution using the Emlid REACH RS+ Real-Time Kinematic Global Navigation Satellite System unit (Emlid Ltd., Hong Kong, China).

Table 2: Dates of high-throughput phenotypic data collection and details of image acquisition in the five wheat nurseries screened for BYD, Kansas, USA (2015-2020).

An automated image processing pipeline (Wang et al. 2020) was used to generate the orthomosaics and extract plot-level plant height (PTHT_D, m; Singh et al. 2019) and the normalized difference vegetation index (NDVI) (Rouse et al. 1974), calculated as:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \tag{1}$$

where NIR and Red are the near-infrared and red bands of the multispectral images and NDVI is the output image. Both traits were selected for their potential to characterize BYD, because the most typical BYD symptoms, chlorosis and stunting of the plants, influence NDVI and PTHT.
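As a minimal sketch of the NDVI calculation described above, the following Python functions apply the standard definition (NIR − Red)/(NIR + Red) to reflectance values. The function names, the handling of zero-signal pixels, and the use of a simple mean to summarize a plot are illustrative assumptions, not details of the published pipeline.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is scored as 0.
        return 0.0
    return (nir - red) / total


def plot_mean_ndvi(nir_pixels, red_pixels):
    """Summarize a plot as the mean NDVI of its pixels (assumed summary)."""
    if len(nir_pixels) != len(red_pixels):
        raise ValueError("NIR and Red must have the same number of pixels")
    values = [ndvi(n, r) for n, r in zip(nir_pixels, red_pixels)]
    return sum(values) / len(values)
```

Healthy green canopy reflects strongly in the near-infrared and absorbs red light, so chlorotic plots are expected to give lower values.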

Statistical Data Analyses

First, the adjusted mean best linear unbiased estimator (BLUE) was calculated for each entry for all the different traits for each season (Table S1), using the following model:

$$y_{ijklm} = \mu + G_i + T_j + GT_{ij} + R_{k(j)} + B_{l(kj)} + C_{m(kj)} + e_{ijklm} \tag{2}$$

where $y_{ijklm}$ is the phenotype for the trait of interest, $\mu$ is the overall mean, $G_i$ is the fixed effect of the $i$th entry (genotype), $T_j$ is the fixed effect of the $j$th insecticide treatment, $GT_{ij}$ is the fixed effect of the interaction between the $i$th entry and the $j$th insecticide treatment (genotype by treatment effect), $R_{k(j)}$ is the random effect of the $k$th replication nested within the $j$th insecticide treatment and distributed as iid $R_{k(j)} \sim N(0, \sigma^2_R)$, $B_{l(kj)}$ is the random effect of the $l$th row nested within the $k$th replication and $j$th treatment and distributed as iid $B_{l(kj)} \sim N(0, \sigma^2_B)$, $C_{m(kj)}$ is the random effect of the $m$th column nested within the $k$th replication and $j$th treatment and distributed as iid $C_{m(kj)} \sim N(0, \sigma^2_C)$, and $e_{ijklm}$ is the residual for the $ijklm$th plot and distributed as iid $e_{ijklm} \sim N(0, \sigma^2_e)$. The ‘lme4’ R package (Bates et al. 2014) was used for fitting the models.

The BLUEs were used to inspect trait distributions and to calculate Pearson correlations between all traits. In addition, BLUE values were used to calculate the reduction in GY for each entry as the difference of GY between the untreated and insecticide treated main plots. This variable reflects the level of BYD resistance of each entry, and it was used to perform GWAS and GS analyses.
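The GY reduction described above is simple arithmetic on the per-entry BLUEs. The sketch below assumes the BLUEs are held in two dictionaries keyed by entry name and takes the sign convention treated minus untreated, so that a larger positive value means a larger loss without insecticide. The function name, the data layout, and the sign convention are assumptions for illustration.

```python
def yield_reduction(gy_treated, gy_untreated):
    """Return GY reduction (t/ha) for every entry present in both treatments.

    Assumed convention: reduction = GY(treated) - GY(untreated).
    """
    shared = sorted(set(gy_treated) & set(gy_untreated))
    return {entry: gy_treated[entry] - gy_untreated[entry] for entry in shared}
```

Entries observed under only one treatment are dropped, because the difference is undefined for them.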

For NDVI and PTHT_D, the plot-level values extracted on the different phenotyping dates were fitted to a logistic non-linear regression model (Fox and Weisberg 2011):

$$y = \frac{\theta_1}{1 + \exp[-(\theta_2 + \theta_3 x)]} + \varepsilon \tag{3}$$

where y is the phenotype for the trait of interest at time point x, measured as the calendar day of the year (days after January 1), θ₁ is the maximum value (upper asymptote), represented by the final PTHT or the maximum NDVI achieved, θ₂ is the inflection point that represents the greatest rate of change in the curve, either senescence for NDVI or height growth for PTHT, θ₃ is the lag phase or the onset of senescence or growth rate from time x, and ε is the residual error (Figure S1). The ‘nlme’ R package was used for model fitting (Pinheiro et al. 2015). The model parameters obtained for each trait (θ_1NDVI, θ_2NDVI, θ_3NDVI, θ_{1PTHT_D}, θ_{2PTHT_D}, and θ_{3PTHT_D}) were used in addition to the other phenotypic traits to calculate BLUEs, distributions, correlations, and BLUPs.
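To make the role of the three curve parameters concrete, the sketch below evaluates a three-parameter logistic curve. It assumes the parameterization given by Fox and Weisberg (2011), y = θ₁ / (1 + exp[−(θ₂ + θ₃x)]); under that assumed form the curve passes through half of its asymptote at x = −θ₂/θ₃. The function names are hypothetical and no fitting is performed here.

```python
from math import exp


def logistic(x, theta1, theta2, theta3):
    """Evaluate the assumed logistic curve at calendar day x."""
    return theta1 / (1.0 + exp(-(theta2 + theta3 * x)))


def midpoint_day(theta2, theta3):
    """Day at which the curve reaches theta1 / 2 (steepest change)."""
    if theta3 == 0:
        raise ValueError("theta3 must be non-zero")
    return -theta2 / theta3
```

With this form, a positive θ₃ gives an increasing curve, as expected for plant height, while a negative θ₃ gives a decreasing curve, as expected for NDVI during senescence.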

Second, we used a mixed linear model to calculate the best linear unbiased predictors (BLUPs) for each entry in each nursery (season) (Table S1), using the same model as described in equation 2 but defining G_i, T_j, and GT_ij as random effects. BLUPs were used because of the unbalanced nature of the data (not all lines were evaluated in all the seasons). The BLUPs calculated for each season were then combined for GWAS and GS. Furthermore, we calculated broad-sense heritability on a line-mean basis, separately for each insecticide treatment (whole plot), as:

$$H^2 = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_e / r} \tag{4}$$

where $\sigma^2_G$ is the genotypic variance, $\sigma^2_e$ is the residual error variance, and r is the number of replications.
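Line-mean broad-sense heritability as defined above, genotypic variance divided by the sum of genotypic variance and residual variance over the number of replications, can be computed directly from the variance components of a fitted model. The function below is a small sketch; its name and argument names are assumptions.

```python
def broad_sense_heritability(var_genotype, var_residual, n_reps):
    """Line-mean H^2 = var_G / (var_G + var_e / r)."""
    if n_reps <= 0:
        raise ValueError("n_reps must be positive")
    return var_genotype / (var_genotype + var_residual / n_reps)
```

Because the residual variance is divided by r, the same variance components give a higher line-mean heritability with three replications than with two.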

Genotypic Data

A total of 346 wheat entries were genotyped using genotyping-by-sequencing (GBS) (Poland et al. 2012) and sequenced on an Illumina HiSeq 2000. Single nucleotide polymorphisms (SNPs) were called using the TASSEL GBSv2 pipeline (Glaubitz et al. 2014) and anchored to the Chinese Spring genome assembly v1.0 (Appels et al. 2018). SNP markers with minor allele frequency < 0.01, missing data > 85%, or heterozygosity > 15% were removed from the analysis. After filtering, we retained 29,480 SNP markers that were used to investigate the population structure through principal component analysis (PCA), genome-wide association analysis (GWAS), and GS. In addition, GBS data were used to run a bioinformatics pipeline to predict the presence or absence of the translocated segment on chromosome 7DL carrying the Bdv2 gene for each entry (Table S1). The prediction was based on a modified alien predict pipeline (Gao et al. 2021). Briefly, alien-specific or wheat-specific tags were counted in the 7DL region and tabulated using a training set of cultivars or lines known to be Bdv2 positive or negative. A simple classification was then made based on the ratio of alien to wheat tag counts.
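The three SNP filters described above (minor allele frequency, missing data, heterozygosity) can be illustrated with a short Python sketch. It assumes genotype calls are coded as the count of the alternate allele (0, 1, or 2) with `None` for a missing call. That coding and the function names are assumptions for illustration and do not reflect the TASSEL output format.

```python
def snp_stats(calls):
    """Return (missing rate, minor allele frequency, heterozygosity) for one SNP."""
    called = [c for c in calls if c is not None]
    missing = 1.0 - len(called) / len(calls)
    if not called:
        return missing, 0.0, 0.0
    alt_freq = sum(called) / (2.0 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    het = sum(1 for c in called if c == 1) / len(called)
    return missing, maf, het


def keep_snp(calls, min_maf=0.01, max_missing=0.85, max_het=0.15):
    """Apply the thresholds stated in the text; True means the SNP is retained."""
    missing, maf, het = snp_stats(calls)
    return maf >= min_maf and missing <= max_missing and het <= max_het
```

Allele frequency and heterozygosity are computed over called genotypes only, which is one common choice and is assumed here.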

Genome-Wide Association Analysis

The GWAS analysis was performed with a mixed linear model implemented in the ‘GAPIT’ R package (Lipka et al. 2012) that includes principal components as fixed effects to account for population structure and the individuals as random effects to account for familial relatedness:

$$y = X\beta + Zu + e \tag{5}$$

where $y$ is the vector of phenotypic BLUPs, $X$ and $Z$ are the incidence matrices of $\beta$ and $u$, respectively, with $u$ assumed $\sim N(0, 2K\sigma^2_G)$ where $K$ is the individual kinship matrix, and $e$ is the vector of random residual effects with $e \sim N(0, I\sigma^2_e)$, where $I$ is the identity matrix and $\sigma^2_e$ is the unknown residual variance. A false discovery rate correction with an experimental significance level of 0.01 was used to assess marker-trait associations. Manhattan plots were generated with the ‘CMplot’ package in R (Yin 2020). PCA using GBS-SNPs was performed in R. Eigenvalues and eigenvectors were computed with ‘e’ function using the ‘A.mat’ function and the ‘mean’ imputation method of the ‘rrBLUP’ package (Endelman 2011). To declare a quantitative trait locus (QTL), we considered only regions with several SNP markers in linkage disequilibrium that clearly showed a peak. We did not consider regions with a single SNP above the significance threshold as a QTL.
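The text states that a false discovery rate correction at a level of 0.01 was applied but does not name the procedure. The sketch below assumes the widely used Benjamini-Hochberg step-up procedure, so it should be read as an illustration of that technique rather than as the exact routine run inside GAPIT. The function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Return a list of booleans marking p-values significant at FDR alpha.

    Benjamini-Hochberg step-up: find the largest rank k such that
    p_(k) <= alpha * k / m, then declare the k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant
```

The output keeps the input order, so it can be paired directly with a list of marker names.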

Genomic Selection

Using data from the five seasons, GS models using the genomic best linear unbiased predictor (G-BLUP) were developed to assess predictive ability. A five-fold cross-validation method was used to assess model accuracy, where the data set was split into five sets based on season, with four seasons forming the training set and the fifth season serving as the prediction set. This process was repeated until all seasons were predicted. In addition to this leave-one-season-out scheme, a model was evaluated with a leave-two-out cross-validation strategy. This strategy was used to get a better mix of years with and without disease incidence: the training population consisted of three seasons, and the remaining two seasons were predicted from the combined training population. The GS model was fitted with the training population using the ‘rrBLUP’ kin.blup function (Endelman 2011). The GS model equation was:

$$y = Wg + \varepsilon \tag{6}$$

where $y$ is a vector of phenotypic BLUPs, $W$ is the design matrix of $g$, $g$ is the vector of genotypic values, and $\varepsilon$ is the vector of residual errors (Endelman 2011). Predictive ability was assessed using Pearson’s correlation (r) between the predicted value (G-BLUP) and the BLUP for the respective phenotype. In addition, for both GS strategies we tested the effect of adding the genotype of the Bdv2 locus as a fixed effect cofactor, using the model:

$$y = X\beta + Wg + \varepsilon \tag{7}$$

which combines the parameters described in equation 6 with $X$, the (n × 1) matrix of individual observations for presence or absence of Bdv2, and $\beta$, the fixed effect for the Bdv2 measurements.
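The two season-based cross-validation schemes and the predictive-ability statistic can be sketched as follows. The code only builds the training and prediction partitions and computes Pearson's r; it does not fit a G-BLUP model. Enumerating every possible pair of held-out seasons for the leave-two-out scheme is an assumption, as are the function names and season labels.

```python
from itertools import combinations
from math import sqrt

SEASONS = ["2015-16", "2016-17", "2017-18", "2018-19", "2019-20"]


def season_splits(seasons, n_held_out):
    """Return (training seasons, predicted seasons) for every held-out set."""
    splits = []
    for held_out in combinations(seasons, n_held_out):
        training = [s for s in seasons if s not in held_out]
        splits.append((training, list(held_out)))
    return splits


def pearson_r(predicted, observed):
    """Pearson correlation between predicted values and observed BLUPs."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / sqrt(var_p * var_o)
```

Calling `season_splits(SEASONS, 1)` yields the five leave-one-season-out folds, and `season_splits(SEASONS, 2)` yields the ten possible three-season training sets.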