Identification of QTL hotspots affecting agronomic traits and high-throughput vegetation indices in rainfed wheat

Understanding the genetic basis of agronomic traits is essential for wheat breeding programmes to develop new cultivars with enhanced grain yield under climate change conditions. The use of high-throughput phenotyping (HTP) technologies for the assessment of agronomic performance through drought-adaptive traits opens new possibilities in plant breeding. HTP together with a genome-wide association study (GWAS) mapping approach can become a useful method to dissect the genetic control of complex traits in wheat to enhance grain yield under drought stress. This study aimed to identify molecular markers associated with agronomic and remotely sensed vegetation index (VI)-related traits under rainfed conditions in bread wheat and to use an in silico candidate gene (CG) approach to search for upregulated CGs under abiotic stress. The plant material consisted of 170 landraces and 184 modern cultivars from the Mediterranean basin that were phenotyped for agronomic and VI traits derived from multispectral images over three and two years, respectively. GWAS identified 2579 marker–trait associations (MTAs). The QTL overview index statistic detected 11 QTL hotspots involving more than one trait in at least two years. A candidate gene analysis detected 12 CGs upregulated under abiotic stress in 6 QTL hotspots. The current study highlights the utility of VI to identify chromosome regions that contribute to yield and drought tolerance under rainfed Mediterranean conditions.

Agronomic data 141 The following traits were measured across the three years according to the protocol described in Rufo (NGm 2 ), thousand kernel weight (TKW, g), aboveground biomass at physiological maturity After excluding markers with more than 25% missing values and with a minor allele frequency 187 (MAF) lower than 5%, a total of 10090 SNPs were used for mapping purposes.  Gene annotation for the target region of QTL hotspots was performed using the gene models for high-214 confidence genes reported for the wheat genome sequence (IWGSC 2018), available at https://wheat-In silico expression analysis and the identification of upregulated gene models were carried out using 218 the RNA-seq data available at http://www.wheat-expression.com/ (Ramírez-González et al. 2018) for 219 the following studies: 1) drought and heat stress time-course in seedlings, 2) spikes with water stress, 220 3) seedlings with PEG to simulate drought, and 4) shoots after two weeks of cold.

224
Environmental conditions 225 The experimental site has a typical Mediterranean climate characterized by an irregular pattern of 226 yearly rainfall distribution, low temperatures in winter that rise sharply in spring and high 227 temperatures continuing until the end of the crop cycle. Figure 1 represents a graphical summary of 228 the rainfall and maximum and minimum temperatures during the crop cycle across the three years of 229 field trials. All the weather variables were representative of long-term data from the region for each 230 growing season, although 2017 was considered exceptionally dry due to the low rainfall received.

231
The last year (2018) was characterized as the wettest from December (sowing) to June (physiological 232 maturity) with 269 mm of rainfall, whereas the first and second growing seasons with 207 mm and 233 105 mm of rainfall were rather dry, respectively, suffering severe water scarcity during the grain 234 filling period with only 5 mm of precipitation.

236
The results of the analyses of variance (ANOVAs) for the agronomic traits measured during the three 237 growing seasons are shown in Table 2. The percentage of variability explained by year was the highest 238 for GA (81.6%) and GS65 (67.9%), while the sum of squares of SP was the highest for yield (76.5%), 239 NGm 2 (65.0%), HI (62.6%), PH (61.4%) and GS87 (59.0%). Finally, the highest percentage 240 explained by the interaction between year and SP was found for biomass, NSm 2 and TKW, reaching 241 86.2%, 84.9% and 71.0%, respectively. Significant differences were found between SPs and years 242 for all traits. The year x SP interaction was also significant for all traits, except for HI. 243 Table 3 shows the results of the ANOVA for the VIs and LAI estimated through the MTVI2 at the whereas at PA, the percentages ranged from 10.0% (TCARI/OSAVI) to 92.3% (LAI). The percentages of the total variation explained by SP ranged from 2.3% (RDVI) to 11.0% 249 (TCARI/OSAVI) at anthesis, while they ranged from 1.0% (MTVI2) to 11.2% (TCARI/OSAVI) PA. by the year x SP interaction at PA ranged from 6.6% (LAI) to 78.8% (TCARI/OSAVI).

255
The mean values of phenotypic traits for each year and SP are shown in Table 4. Yearly means   (Table 6). 284 However, no pattern was found for VI traits among SPs PA. SP2 and SP4 had higher mean values for 285 all traits PA, with the exception of TCARI/OSAVI.

286
To quantify the relation between trait variation and population structure, multiple linear regressions 287 were carried out between population coefficients (Supplementary Table S2  The bidimensional clustering shown in Figure 2 represents the relationships among phenotypic traits 297 and their mean values in the germplasm collection (3 years for agronomic traits and 2 years for VIs).

298
The horizontal cluster grouped accessions according to their phenotypic similarity based on the traits 299 in the vertical cluster. Horizontal clustering separated two main clusters: cluster A was composed 300 only of landraces, and cluster B included modern cultivars and two landraces: cv 'TRI 11548' from 301 Iraq and cv '1170' from Turkey. Cluster A was characterized by lower yield and yield components, 302 except NSm 2 , lower biomass, a shorter GFD but longer GS65, and taller plants than cluster B, but 303 cluster A had higher values for VIs at anthesis, except for GNDVI_A.

304
Each of these two clusters was separated into two subclusters, A1 and A2 for landraces and B1 and 305 B2 for modern cultivars. Subcluster A1 was represented mainly by south Mediterranean landraces 306 (77%), including those from east and west regions, whereas subcluster A2 contained most of the north 307 Mediterranean landraces (62%). East Mediterranean landraces were in a single cluster within A1, 308 whereas west Mediterranean landraces were distributed in other clusters within A1. Differences 309 among subclusters A1 and A2 were due to higher NSm 2 and TCARIOSAVI_PA in A1 and longer  MTAs with a phenotypic variance explained (PVE) lower than 0.10 was 97.5%, which agreed with 330 the highly quantitative nature of the analysed traits ( Figure 3C).

331
A total of 815 MTAs were identified for seven agronomic traits (Table 8). Yield showed the highest 332 number of MTAs (368), most of them (268) from 2016, whereas only one association was found for 333 NSm 2 with the mean across years. MTAs for TKW were found mainly during 2016 (96%), and those 334 for PH were found mainly during 2017 (68%).

335
A total of 1764 MTAs over -log10P>3 were identified for 15 VI traits (Table 8)    wheat TaSPL16, found that this gene was highly expressed in young panicles but expressed at low 585 levels in seeds, in agreement with the expression profile of TraesCS5B01G286000 found in our study.