Unravelling consensus genomic regions associated with quality traits in wheat (Triticum aestivum L.) using meta-analysis of quantitative trait loci

A meta-analysis of quantitative trait loci (QTLs) associated with the following six major quality traits was conducted in wheat: (i) arabinoxylan, (ii) dough rheology properties, (iii) nutritional traits, (iv) polyphenol content, (v) processing quality traits, and (vi) sedimentation volume. For this purpose, 2458 QTLs were collected from 50 mapping studies published during 2013-2020. Of these, 1126 QTLs were projected onto a consensus map saturated with 250,077 markers, resulting in the identification of 110 meta-QTLs (MQTLs) with an average confidence interval (CI) of 5.6 cM. The CI of these MQTLs was 18.84-fold narrower than that of the initial QTLs. Fifty-one MQTLs were also verified with marker-trait associations (MTAs) detected in earlier genome-wide association studies (GWAS). The physical region occupied by a single MQTL ranged from 0.12 to 749.71 Mb, with an average of 130.25 Mb. Candidate gene mining identified 2533 unique gene models in the MQTL regions. In silico expression analysis revealed 439 differentially expressed gene models with >2 transcripts per million (TPM) expression in grains and related tissues, including 44 high-confidence candidate genes known to be involved in various cellular and biochemical processes related to quality traits. Further, nine functionally characterized wheat genes associated with grain protein content, high molecular weight glutenin and starch synthase enzymes were found to be co-localized with some of the MQTLs. In addition, synteny analysis between wheat and rice MQTL regions identified 23 wheat MQTLs syntenic to 16 rice MQTLs. Furthermore, 64 wheat orthologues of 30 known rice genes were detected in 44 MQTL regions.
These genes encoded proteins mainly belonging to the following families: starch synthase, glycosyl transferase, aldehyde dehydrogenase, SWEET sugar transporter, alpha amylase, glycoside hydrolase, glycogen debranching enzyme, protein kinase, peptidase, legumain and seed storage protein enzyme.


Introduction
Wheat (Triticum aestivum L.) is a globally grown cereal crop and a major contributor of calories and protein to the human diet. Currently, wheat is widely consumed and processed into bread, noodles, cakes, pasta, beer, and other products. Wheat research has contributed greatly to yield enhancement and disease resistance, but the quality of the produce took a back seat while yield was being enhanced. Hence, developing high-yielding varieties with enhanced quality characters is the foremost concern of breeders [1]. Improving end-use quality is a difficult endeavour because, firstly, it is very difficult to measure seed quality and rheological properties such as grain protein

Since the first report on QTL analyses for wheat quality traits was published in the 1990s [4], numerous QTLs have been identified using different mapping populations to dissect the genetic architecture underlying end-use quality, including GPC [5,6], dough rheological properties [5,6], SDS [7], falling number (FN) [8] and starch pasting properties [9]. However, the reliability of these QTL mapping results is strongly influenced by the experimental conditions, the type and size of the mapping population, the density of genetic markers, and the statistical methods used, among other factors [10]. Thus, the practical application of these QTLs to quality improvement via molecular QTL cloning and marker-assisted selection has been rather limited [2]. Considering this challenge, it is desirable to identify QTLs that have a major effect on the target phenotype and are consistently detected across multiple genetic backgrounds and environments.

Meta-analysis of available QTLs enables the identification of consensus and robust QTLs, or MQTL regions, that are most frequently associated with trait variation in diverse studies, and reduces their confidence intervals (CIs) [11,12]. Software packages facilitate meta-analysis of QTLs derived from independent studies by formulating and embedding specific sets of algorithms for exact evaluation and recalculation of the genetic position of the given QTLs [13]. Among them, BioMercator is the most advanced and commonly used software for

Results
Characterization of QTL studies involving quality traits
The characteristics of 50 independent mapping studies involving 63 bi-parental populations were rigorously reviewed to compile information on available QTLs. In total, 2458 QTLs associated with quality traits were collected, and they were found to be distributed unequally across all the wheat chromosomes (Fig 1a).  Table). Peak positions of the initial QTLs also varied, ranging from 0 to 678.2 cM with a mean of 108.18 cM (Fig 1h).

Construction of consensus genetic map
The consensus map, "Wheat_Reference_GeneticMap-2021" showed significant variation for  (Fig 2). The total length of the consensus map was 13,637.4 cM, and it included a total of 250,077 markers of different types such as AFLP, RFLP, SSR and SNP. Because genetic maps with varying numbers and types of markers were used to construct the consensus map, the distribution of markers on the chromosomes was uneven, and marker density was comparatively higher at the fore-end of the chromosome (Fig 2; S4

The physical coordinates of the MQTLs identified in the present study were also compared with MTAs reported in earlier GWA studies (S6 Table). Among the 108 MQTLs, as many as 43 MQTLs (39.81%) co-localized with at least one MTA available from GWAS (Fig 4; S6 Table).
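The co-localization check described above can be sketched as a simple interval test: an MTA counts as co-localized with an MQTL when it lies on the same chromosome and within the MQTL's physical interval. This is a minimal illustration, not the authors' pipeline; the class names, the function names and all coordinates below are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class MQTL(NamedTuple):
    name: str        # hypothetical label
    chrom: str       # chromosome, e.g. "1A"
    start_mb: float  # physical start of the MQTL interval (Mb)
    end_mb: float    # physical end of the MQTL interval (Mb)


class MTA(NamedTuple):
    marker: str      # hypothetical marker name
    chrom: str
    pos_mb: float    # physical position of the associated marker (Mb)


def colocalizes(mqtl: MQTL, mta: MTA) -> bool:
    """True if the MTA falls inside the MQTL interval on the same chromosome."""
    return mqtl.chrom == mta.chrom and mqtl.start_mb <= mta.pos_mb <= mqtl.end_mb


def summarize(mqtls: List[MQTL], mtas: List[MTA]) -> Tuple[int, float]:
    """Count MQTLs with at least one co-localized MTA and report the percentage."""
    hits = sum(1 for q in mqtls if any(colocalizes(q, m) for m in mtas))
    pct = 100.0 * hits / len(mqtls) if mqtls else 0.0
    return hits, round(pct, 2)


# Toy data (assumed values, for illustration only).
toy_mqtls = [
    MQTL("MQTL_a", "1A", 10.0, 25.0),
    MQTL("MQTL_b", "3D", 40.0, 41.5),
    MQTL("MQTL_c", "7B", 100.0, 180.0),
]
toy_mtas = [
    MTA("snp_1", "1A", 12.3),
    MTA("snp_2", "3D", 90.0),
    MTA("snp_3", "7B", 150.0),
]

hits, pct = summarize(toy_mqtls, toy_mtas)
```

The same arithmetic reproduces the proportion reported in the text: 43 of 108 MQTLs corresponds to 39.81%.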

Some MQTLs co-localized with MTAs available from more than one GWA study. For Surprisingly, five genes, TraesCS1A02G040600, TraesCS2D02G531100,

(2021) GWA study overlapped with four MQTLs identified in the present study. Among them,

the last two genes, located on chromosome 3D, were present in MQTL3D.3. Functional annotation and GO analysis revealed their participation in various biological processes associated with grain quality in wheat. The genes TraesCS3D02G095700 and TraesCS3D02G096000 encode wheat allergens and the trypsin and alpha-amylase inhibitors present in seeds.

Wheat homologues of known rice genes in MQTL regions
An extensive search for known rice genes associated with quality traits led to the collection of information on 34 genes, which were then used to identify their corresponding homologues in wheat MQTL regions. These genes encode proteins/products mainly belonging to the following families: starch synthase, glycosyl transferase, aldehyde dehydrogenase,

RP6, RM1, and OsAGPL4/OsAPL4 could not be identified due to low sequence identity (<60%) and query coverage (<30%). Further, wheat homologues of nine rice genes were identified on different wheat chromosomes. For example, wheat homologues of the rice gene Wx1 are present on the 4th and 7th chromosomes. Sixty-four wheat homologues of these 30 rice genes were detected in the 44 wheat MQTL regions (S7 Table); some MQTL regions harboured more than one wheat homologue of rice genes. For example, wheat homologues of rice genes wx1, OsACS6 and

differentially expressed gene models with more than 5 TPM expression were involved in the improvement of grain quality (Fig 7). Based on their expression pattern, the gene models could be divided into two classes: genes in class-I were expressed in the seed, while genes in class-II were expressed in the different sub-tissues. Further, functional characterization of these genes revealed their involvement in transcriptional and translational regulation of various genes, signalling mechanisms, metabolism, cellular development and transfer, etc.

In a recent study, transcriptome analysis of hexaploid wheat and its diploid progenitors identified differentially expressed genes (DEGs) associated with carbohydrate metabolism [27].

Seven DEGs (Table 3) identified in the study of Kaushik et al. (2020) were common to the genes characterized in the present study. Of these seven genes, six were downregulated (Table 3) and one, TraesCS7B02G194000.1, was upregulated. Among the downregulated genes, two encoded NB-ARC domain-containing proteins and one encoded an F-box domain protein (Table 3).

The functions of the other three downregulated genes were not characterized.  CGs (Table 4). All the six CGs, TraesCS3D02G095700, TraesCS3D02G096000,

TraesCS4B02G017500, TraesCS4D02G016000, TraesCS4D02G016100 and TraesCS6D02G000200, having high expression values of >10 TPM, were shown to be involved in improving seed quality through nutrient reservoir activity.   (Table 3). The functions of the three downregulated genes have not been characterized. The upregulated gene, TraesCS7B02G194000, encodes an alpha/beta hydrolase enzyme, which takes part in a multitude of biochemical processes, including bioluminescence, fatty acid and polyketide biosynthesis, and metabolism [44].

To further support the locations of the MQTLs identified in this study, a search for co-localized quality-related genes was also carried out. This enabled the detection of functionally characterized wheat genes such as GPC-B1/NAM-B1, 1Dx2t, TaSSI, TaSSIIa, TaGBSSIa and TaSSIVb  and CGs in their study to improve the quality related traits in wheat.

Material and methods
A comprehensive bibliographic search and data retrieval on papers published from 2013 to 2020 enabled us to compile information from the available studies. From each independent study, the following information was obtained: (i) QTL name (wherever available), (ii) flanking or closely linked markers, (iii) peak position and CI, (iv) LOD score, (v) phenotypic variation explained (PVE) or R² value for each QTL and (vi) type and size of the mapping population (S1 and S2 Tables). Where the peak position of a QTL was missing, it was calculated as the mid-point of the two markers flanking the QTL, and when LOD scores of individual QTLs were unavailable from a particular study, a LOD score of 3 was assumed for all the QTLs identified in that study.
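The two fallback rules for missing values (mid-point of the flanking markers for a missing peak position, and a LOD score of 3 for studies without per-QTL LOD scores) could be applied along the following lines. This is a minimal sketch; the record layout and the names `QTLRecord` and `fill_missing` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, replace
from typing import Optional

DEFAULT_LOD = 3.0  # fallback LOD stated in the text for studies lacking per-QTL scores


@dataclass(frozen=True)
class QTLRecord:
    name: str
    left_marker_cm: float            # position of the left flanking marker (cM)
    right_marker_cm: float           # position of the right flanking marker (cM)
    peak_cm: Optional[float] = None  # None when the source study gave no peak
    lod: Optional[float] = None      # None when the source study gave no LOD


def fill_missing(q: QTLRecord) -> QTLRecord:
    """Return a copy of the record with missing peak and LOD values filled in."""
    peak = q.peak_cm
    if peak is None:
        # Mid-point of the two flanking markers.
        peak = (q.left_marker_cm + q.right_marker_cm) / 2.0
    lod = q.lod if q.lod is not None else DEFAULT_LOD
    return replace(q, peak_cm=peak, lod=lod)


# Toy records (assumed values, for illustration only).
incomplete = QTLRecord("qtl_x", left_marker_cm=10.0, right_marker_cm=20.0)
complete = QTLRecord("qtl_y", 30.0, 50.0, peak_cm=33.0, lod=7.2)

filled_incomplete = fill_missing(incomplete)
filled_complete = fill_missing(complete)
```

Values reported by the source study are left untouched; only missing fields are filled.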

All the quality traits were grouped into the following six major trait categories: (i)

QTL Projection, meta-QTL and ortho-MQTL analysis
For the QTLs with no CI available, the 95% CI was calculated using the following population-specific equations [16,57].  used for this purpose. These studies provided expression data on "morphological stages of developing wheat grain", "inner pericarp, outer pericarp and endosperm layers from developing grain of bread wheat at 12 days post-anthesis", "aleurone and starchy endosperm tissues of the wheat seed at aleurone layer development time of 6, 9 and 14-days post anthesis" and "candidate genes underlying the grain dormancy in wheat", respectively.
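The population-specific CI equations referred to at the start of this subsection are not reproduced in the text. As an illustration only, the sketch below uses the form CI = k / (N × R²) with the constants commonly used in QTL meta-analyses (530 for F2 and backcross populations, 287 for doubled haploids, 163 for recombinant inbred lines). These constants, the treatment of R² as a proportion rather than a percentage, and the function name are assumptions, not values confirmed by this manuscript.

```python
# Assumed constants k for CI = k / (N * R^2); not taken from this manuscript.
CI_CONSTANTS = {
    "F2": 530.0,
    "BC": 530.0,
    "DH": 287.0,
    "RIL": 163.0,
}


def confidence_interval_cm(pop_type: str, pop_size: int, r2: float) -> float:
    """Approximate 95% CI (cM) of a QTL from population type, size and R^2.

    r2 is the proportion of phenotypic variance explained, in (0, 1].
    """
    if pop_type not in CI_CONSTANTS:
        raise ValueError(f"unsupported population type: {pop_type!r}")
    if pop_size <= 0:
        raise ValueError("population size must be positive")
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must be a proportion in (0, 1]")
    return CI_CONSTANTS[pop_type] / (pop_size * r2)


# Toy inputs (assumed values): a RIL population of 163 lines, R^2 = 0.10.
ci_ril = confidence_interval_cm("RIL", 163, 0.10)
ci_dh = confidence_interval_cm("DH", 200, 0.25)
```

Under this form, the CI narrows as either the population size or the variance explained increases, which is why QTLs from small populations or with small effects carry wide intervals into the meta-analysis.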

Only the gene models with ≥ 2 TPM (transcripts per million) expression were considered in

The authors of this manuscript declare no conflicts of interest.

Availability of data and material

Relevant data are included in this paper and its associated Supplementary Information (SI).