bioRxiv
New Results

Phenomic selection: a low-cost and high-throughput alternative to genomic selection

Renaud Rincent, Jean-Paul Charpentier, Patricia Faivre-Rampant, Etienne Paux, Jacques Le Gouis, Catherine Bastien, Vincent Segura
doi: https://doi.org/10.1101/302117
Renaud Rincent
1GDEC, INRA, UCA, 63000 Clermont-Ferrand, France
Jean-Paul Charpentier
2BioForA, INRA, ONF, 45075 Orléans, France
3GenoBois analytical platform, INRA, 45075 Orléans, France
Patricia Faivre-Rampant
4EPGV, INRA, CEA-IG/CNG, 91057 Evry, France
Etienne Paux
1GDEC, INRA, UCA, 63000 Clermont-Ferrand, France
Jacques Le Gouis
1GDEC, INRA, UCA, 63000 Clermont-Ferrand, France
Catherine Bastien
2BioForA, INRA, ONF, 45075 Orléans, France
Vincent Segura
2BioForA, INRA, ONF, 45075 Orléans, France
  • For correspondence: vincent.segura@inra.fr

ABSTRACT

Genomic selection - the prediction of breeding values using DNA polymorphisms - is a disruptive method that has been widely adopted by animal and plant breeders to increase crop, forest and livestock productivity and ultimately secure food and energy supplies. It improves breeding schemes in different ways, depending on the biology of the species and on genotyping and phenotyping constraints. However, both genomic selection and classical phenotypic selection remain difficult to implement because of the high genotyping and phenotyping costs that typically occur when selecting among large collections of individuals, particularly in early breeding generations. To specifically address these issues, we propose a new conceptual framework called phenomic selection, which consists of a prediction approach based on low-cost and high-throughput phenotypic descriptors rather than DNA polymorphisms. We applied phenomic selection to two species of economic interest (wheat and poplar) using near-infrared spectroscopy on various tissues. We showed that one could reach accurate predictions in independent environments for developmental and productivity traits and tolerance to disease. We also demonstrated that under realistic scenarios, one could expect much higher genetic gains with phenomic selection than with genomic selection. Our work constitutes a proof of concept and is the first attempt at phenomic selection; it clearly provides new perspectives for the breeding community, as this approach is theoretically applicable to any organism and does not require any genotypic information.

INTRODUCTION

To meet the world’s current and future challenges, especially in terms of food and energy supplies, there is a great need to develop efficient crop varieties, livestock breeds or forest materials through breeding. Until recently, the selection of promising individuals in animal and plant breeding was mostly based on their phenotypic records. This approach strongly limited genetic progress because the high costs of phenotyping constrain the number of candidates that can be evaluated, all the more so when interactions between individuals and environments make it necessary to evaluate selection candidates in various environments. Another strong constraint, typically in perennial crops, trees or animals, is that it can sometimes take several years to evaluate phenotypes, which increases the duration of selection cycles. These limitations are some of the main reasons why genomic selection (GS) has become so popular in the last two decades. Its principle is to combine phenotypic records and genome-wide molecular markers to train a prediction model that can in turn be used to predict the performances of (potentially unphenotyped) individuals1. We can thus select more individuals faster, which increases selection efficiency. The development of high-throughput genotyping tools at decreasing costs has made GS possible for many animal and plant species. It can be used both in pre-breeding to screen diversity material2-3 and in breeding to make the schemes more efficient4-5. However, many species still lack any genotyping tool, and for many others, genotyping costs remain a limit to the implementation of GS in pre-breeding and breeding. In addition, genotyping thousands to millions of individuals (potentially each year) remains a challenge, if only because of the necessity of destructive sampling for DNA extraction, and is inaccessible for most species.

One of the reference GS models is the ridge regression BLUP (RR-BLUP1,6) in which a penalized regression is made on all markers simultaneously. This model assumes that the genes affecting the trait of interest are spread across the whole genome and that all of these genes have small effects. Despite its simplicity, this model has been proven to be one of the most effective in many situations, except for when major genes contribute to trait architecture. Interestingly, this model is equivalent to the genomic BLUP model (G-BLUP7-10) in which markers are used to estimate a realized genomic relationship matrix between individuals, also called kinship. This framework means that we can compress genome-wide information from numerous molecular markers into summary statistics (kinship coefficients between individuals) without diminishing prediction accuracy. Considering this fact, we should ask the question: are there easier or cheaper alternatives to genotyping to estimate the kinship matrix? For this question, we propose to evaluate the efficiency of near-infrared spectroscopy (NIRS) as a high-throughput, non-destructive, low-cost phenotyping tool to estimate relatedness between individuals and to make predictions with G-BLUP (or equivalently, RR-BLUP) using these traits instead of molecular markers.
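The equivalence between RR-BLUP and G-BLUP mentioned above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it uses toy random predictors (standing in for markers or transformed NIRS wavelengths), a fixed penalty `lam` chosen arbitrarily, and an unscaled relationship matrix K = XX'. The helper names (`solve`, `ridge_predict`, `kernel_predict`) are hypothetical. In practice the penalty is tied to variance components and K is usually scaled, which only rescales the penalty.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ridge_predict(x_train, y_train, x_new, lam):
    """RR-BLUP-like form: effects = (X'X + lam*I)^-1 X'y, one effect per predictor."""
    p = len(x_train[0])
    cols = [[row[j] for row in x_train] for j in range(p)]
    xtx = [[dot(cols[i], cols[j]) + (lam if i == j else 0.0) for j in range(p)]
           for i in range(p)]
    xty = [dot(col, y_train) for col in cols]
    effects = solve(xtx, xty)
    return [dot(row, effects) for row in x_new]

def kernel_predict(x_train, y_train, x_new, lam):
    """G-BLUP-like form: K = XX' between individuals, alpha = (K + lam*I)^-1 y."""
    n = len(x_train)
    k = [[dot(x_train[i], x_train[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(k, y_train)
    # Similarity of each new individual to the training individuals, times alpha.
    return [dot([dot(row, xi) for xi in x_train], alpha) for row in x_new]

# Toy data (assumption): 6 training individuals, 10 predictors, 2 new individuals.
random.seed(1)
x_train = [[random.gauss(0, 1) for _ in range(10)] for _ in range(6)]
y_train = [random.gauss(0, 1) for _ in range(6)]
x_new = [[random.gauss(0, 1) for _ in range(10)] for _ in range(2)]

pred_rr = ridge_predict(x_train, y_train, x_new, lam=2.0)
pred_gblup = kernel_predict(x_train, y_train, x_new, lam=2.0)
max_diff = max(abs(a - b) for a, b in zip(pred_rr, pred_gblup))
print(max_diff)  # the two forms agree up to floating-point error
```

Because the second form only needs similarities between individuals, any data source that yields a useful similarity matrix, including spectra, can in principle be plugged into the same prediction machinery.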

The application of NIRS on a biological sample yields a spectrum whose shape depends mainly on its chemical composition. Because the chemical composition of a sample is under genetic control11, we expect NIR spectra to capture some genetic variability, in turn making it possible to estimate relatedness between individuals. Numerous studies have demonstrated the usefulness of NIRS for barcoding samples and discriminating species or varieties12-19 and have thus suggested that NIRS could be considered as a genetic marker20. Moreover, a few studies have shown that NIRS can capture some genetic variability by estimating the heritability of the spectrum and even mapping corresponding quantitative trait loci (QTL)17,21-23. However, to the best of our knowledge, no studies have proposed using NIRS to perform “phenomic selection” (PS). There are several advantages to this approach. One can obtain NIRS for any plant or animal species at a lower cost than genotyping. One can also obtain NIRS directly in the field thanks to portable devices24-25 or autonomous high-throughput vectors, such as phénomobiles26 that generate hyperspectral images27-28. As a result, prediction-based selection would be possible for any species and at a low enough cost to make it worth implementing, even if its results are less accurate than those of GS. As a proof of concept of PS, we report an evaluation of the usefulness of NIRS for predicting quantitative traits of economic interest in two different species, a tree (poplar) and a cereal (winter wheat), and compare the results to those of a GS prediction based on several thousand SNPs.

RESULTS

Genetic variability captured by NIRS

We first sought to characterize the ability of NIRS to capture genetic variability by estimating genomic heritability and partitioning the variance into genetic (G), genetic by environment (G × E) and residual variances (e) along the NIR spectrum collected on a panel of winter wheat (leaves and grains) and a population of black poplar (wood) grown in two contrasting environments (see Methods). For both species and tissues, genomic heritability was highly variable along the spectrum with peaks above 60%, showing the existence of strong polygenic signals for some wavelengths (Fig. 1, Fig. S1). For a given species, the proportion of G × E variance could reach 24% (poplar), 54% (wheat leaves) or 71% (wheat grains), suggesting that it is probably valuable to combine NIRS information obtained in different growing conditions to capture more polygenic variance (Fig. 1 a, b). It is interesting to note that for at least half of the wavelengths, the cumulative proportion of G and G × E variances was above 15%, showing that the NIR signal was often partially related to genetics. The kind of tissue analyzed by NIRS seemed to matter, as shown by the comparison of variance partition along spectra obtained on wheat leaves and grains. G and G × E variances were higher and more stable along the spectrum for grains than for leaves.

Figure 1

Proportion of genetic (red), genetic by environment (green), and residual (blue) variances along the NIR spectrum of (a) winter wheat leaves, (b) winter wheat grains and (c) poplar wood. NIRS was performed on plant material collected on genotypes grown under favorable and unfavorable environmental conditions. The median normalized and derived spectra, along with their first and third quartiles across the genotypes under study, are indicated in gray.

We ran association mapping along the NIR spectrum to identify wavelengths associated with major QTL (Fig. S1). In poplar, the signal appeared to be mainly polygenic with very few QTL detected, and the largest SNP R² was below 0.025 for any wavelength. In contrast, in winter wheat, we detected numerous large-effect QTLs. For some wavelengths, a single SNP could have an R² of 0.23 for leaves and of 0.11 for grains, and this SNP could be in spectrum regions of high or of low genomic heritability (Fig. S1). This finding means that depending on the wavelength, NIRS could capture highly polygenic relationships (wavelengths with high genomic heritability) or could tag specific regions of the genome (major QTLs). These two kinds of wavelengths can be useful for making predictions because they can potentially track the two main factors responsible for GS accuracy: relatedness and linkage disequilibrium.

Comparing predictive abilities obtained with markers and with NIRS

We estimated the efficiency of GS and PS to predict the performance of new individuals within a cross-validation framework (see Methods). The performances of the individuals in the validation set were predicted with genotypic information in GS (G-BLUP and Bayesian LASSO models) and with NIRS only in PS (RR-BLUP model). We considered two scenarios: in S1, NIRS analysis and cross-validation were performed in the same environment (Fig. 2 a), whereas in S2, the environments in which the cross-validation was applied were different from those in which NIRS was obtained (Fig. 2 b). The broad-sense heritabilities of the adjusted means were above 0.8 for all traits in each environment (Table S1).

Figure 2

Schematic representation of the concept of phenomic selection, including the two scenarios tested in the present work: (a) S1, where the calibration model is trained with breeding values (BVs) and NIRS data collected at the same (reference) site and (b) S2, where the calibration model is trained with NIRS data collected at the reference site and BVs from other environment(s). In both scenarios, the outcome of the prediction consists of estimated breeding values (EBVs).

In wheat, the predictive abilities of PS were highly variable and appeared to be dependent on the predicted trait and on the environment and tissue in which NIRS was measured (Fig. 3 a, b, c, d, Fig. S2). Combining NIRS collected in different environments or different tissues could increase the predictive ability, but not systematically. One major result is that for both traits, NIRS could lead to better predictions than molecular markers, even in the six independent environments (Fig. 3 c, d, Fig. S2). The gain with NIRS in comparison to molecular markers in S2 was up to 34% and 22% for heading date and grain yield, respectively. In each S2 environment and for both traits, there was always a type of NIRS that performed better than or as well as the best GS model (Fig. S2). The gain was even stronger in S1: NIRS led to an increase in predictive ability of up to 53% and 117% for heading date and grain yield, respectively. In poplar, the predictive abilities with NIRS were always lower than those with SNPs, except for growth traits under S1 (Fig. 3 e). In the other cases, the predictive abilities with NIRS varied depending on the trait and scenario considered, but they were always significantly greater than 0. In general, they were higher when the spectra were collected in the same environment (S1) than when spectra from another environment were used (S2), except for bud flush evaluated in one site and bud set evaluated in another site. Interestingly, irrespective of the scenario, for some traits apparently unrelated to wood chemical properties, such as resistance to rust or bud set, NIRS predictive abilities were fairly high, ranging between 0.34 and 0.53.

Figure 3

Predictive ability of SNP (G-BLUP or Bayesian LASSO (BL) models) or NIRS (RR-BLUP model) when predicting the phenotypic values of individuals within a cross-validation in winter wheat (a, b, c, d) and black poplar (e). Two scenarios were considered for NIRS prediction: in S1, the RR-BLUP model was trained with NIRS data and phenotypes that were collected within the same environment (a, b, e), whereas in S2, NIRS and phenotypic data used to train the RR-BLUP model were collected in distinct environments (c, d, e). For wheat, two traits were considered: heading date (a, c) and grain yield (b, d). The bars of a, b, c, and d are labeled with the origin of the NIRS data (I: irrigated treatment, D: drought treatment), and the bars of e are labeled with the combination of trait and experiment (HT: height, CIRC: circumference, BF: bud flush, BS: bud set, RUST: resistance to rust, ORL: experimental design in Orléans, France, SAV: experimental design in Savigliano, Italy). The medians of the accuracies obtained over repeated cross-validations are reported as the height of the bars together with the first and third quartiles as confidence intervals.

Expected genetic progress with genomic and phenomic selection in a simple example

To further evaluate the potential of PS with respect to GS, the genetic progress expected with both approaches was compared in a simple scenario in which a budget of 200,000 € could be spent to genotype or analyze the NIRS of selection candidates (see Methods). The difference in efficiency between GS and PS is highly dependent on the genotyping and NIRS costs and on the reliability of the two approaches (Fig. 4). In the scenarios that we considered here, the gain of using PS instead of GS was between 11% and 127%. In extreme scenarios in which genotyping was cheap (25 €) and NIRS was expensive (8 €) or in which GS reliability (0.6) was much higher than PS reliability (0.3), PS was still better than GS. We applied the simulation process with the reliabilities and costs obtained in the wheat example (35 € for genotyping and DNA extraction and 3 € for sample treatment and NIRS acquisition). The increase of genetic progress with PS in comparison to GS was between +60% and +127% for heading date and between −10% and +222% for grain yield, depending on the tissue and environment used for NIRS acquisition and scenario considered (Table S2). In poplar, considering genotyping and NIRS acquisition costs of 50 € and 2.5 €, respectively, as well as the reliabilities estimated with cross-validation predictive abilities, the gain in genetic progress varied depending on the trait and scenario considered (Table S3). It was mainly positive for growth traits (−2 to 93%), bud set (−6 to 25%) and rust resistance (−10 to 21%), whereas for bud flush, NIRS prediction did not seem to provide any advantage over regular SNP-based prediction.

Figure 4

Theoretical increase of genetic progress (%) by using NIRS instead of genotyping. a: Gain in genetic progress for various genotyping and NIRS costs for a reliability of 0.4, a budget of 200,000 €, and a selection of 400 individuals. b: Gain in genetic progress for various reliabilities, a budget of 200,000 €, genotyping and NIRS costs of 50 € and 4 €, respectively, and a selection of 400 individuals. For each scenario, true breeding values and estimated breeding values were simulated from multivariate normal distributions with a covariance adapted to the chosen reliability.
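The trade-off between cost per candidate and reliability can be illustrated with a back-of-the-envelope calculation. The sketch below is not the simulation used for Figure 4: it relies on the classical infinite-population approximation of truncation selection, in which the expected gain (in genetic standard deviations) is the selection intensity times the accuracy, and it assumes that accuracy is the square root of reliability. The function names are hypothetical; the budget, costs, number of selected individuals and reliabilities are taken from the text and the Figure 4 legend.

```python
from statistics import NormalDist

def selection_intensity(fraction_selected):
    """Mean of the best `fraction_selected` of a standard normal distribution."""
    z = NormalDist().inv_cdf(1.0 - fraction_selected)  # truncation point
    return NormalDist().pdf(z) / fraction_selected

def expected_gain(budget, cost_per_candidate, n_selected, reliability):
    """Expected gain in genetic standard deviations: intensity * accuracy.

    Assumption: accuracy = sqrt(reliability); the whole budget is spent on
    characterizing candidates (genotyping or NIRS).
    """
    n_candidates = budget // cost_per_candidate
    intensity = selection_intensity(n_selected / n_candidates)
    return intensity * reliability ** 0.5

# Same reliability (0.4) for both approaches, costs of 50 EUR (GS) and 4 EUR (PS).
gain_gs = expected_gain(200_000, 50, 400, 0.4)
gain_ps = expected_gain(200_000, 4, 400, 0.4)
relative_gain_pct = 100 * (gain_ps / gain_gs - 1)
print(f"PS vs GS at equal reliability: {relative_gain_pct:+.1f}%")

# Less reliable but cheaper PS (0.3) against more reliable GS (0.6).
gain_gs_high = expected_gain(200_000, 50, 400, 0.6)
gain_ps_low = expected_gain(200_000, 4, 400, 0.3)
print(f"PS (0.3) vs GS (0.6): {100 * (gain_ps_low / gain_gs_high - 1):+.1f}%")
```

Under these assumptions, the cheaper method screens 50,000 candidates instead of 4,000 for the same budget, so the selected fraction drops from 10% to 0.8% and the higher selection intensity can compensate for a lower reliability.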

DISCUSSION

In pre-breeding or in early generations of breeding programs, breeders have to select among thousands to millions of individuals. This selection is often rough and based on phenotypic data with low accuracy because it is too expensive or simply impossible to make a precise phenotypic evaluation. It is also difficult and too expensive to genotype all individuals to apply GS, despite important economies of scale. Alternative approaches based on transcriptomics or metabolomics have been proposed to predict phenotypes29-31, but their relatively low throughput and high costs make them difficult to implement in breeding programs. To increase genetic progress in this context, we propose a new approach in which we use NIRS as high-throughput phenotypes to make predictions at low costs. The basic idea of this approach, which we call “phenomic selection” (PS), is that the absorbance of a sample in the near-infrared range is mainly related to its chemical composition, which itself depends on genetics. Therefore, NIRS is expected to capture at least part of the genetic variance, and as a result, one could use it to make predictions of traits unrelated to the analyzed tissue or in independent environments. The process of PS is similar to GS, but instead of reference material and selection candidates being genotyped, they are sown or planted at a reference site (trial or nursery) to obtain their NIRS (Fig. 2). The NIRS are then transformed and used as if they were molecular markers to make predictions of any trait at the reference site or in independent environments.

We applied PS to the NIR spectrum of different tissues sampled on an association population of poplar and a panel of elite winter wheat. By estimating the extent of genetic variance along the NIR spectrum of poplar wood and winter wheat leaves and grains, we could show that most wavelengths captured part of the genetic variability (Fig. 1). This result agreed with previous findings with eucalyptus wood23. The NIR spectra were specific to the environments in which they were obtained, but when they were analyzed jointly, we observed that G variance was superior to G × E variance for most wavelengths in both species. Posada et al.21 also reported a similar trend with coffee grains. This finding shows that even if the absorbances were partly environment specific, it should be possible to make predictions in independent environments. This result was indeed further demonstrated by the good predictive abilities obtained with PS for most phenotypes in both species in scenario S2, i.e., when the environment in which we trained the calibration model was different from the environment in which we collected NIRS. For both species, PS abilities were in the same range as GS abilities, with PS sometimes performing better and sometimes worse than GS. For wheat, the results were very encouraging as we always found a situation (combination of environment and tissue analyzed) for which NIRS performed better than GS, even in six independent environments. More importantly, even when the correlation between the S1 and S2 environments was as low as 0.16 for the predicted trait (Table S4, GY in Mon12N-), PS could produce better predictions than GS (Fig. S2). In other words, a relationship matrix computed with NIRS obtained in one specific environment could be used to make predictions in completely different environments. Generally, the predictive abilities obtained in the environment in which we obtained NIRS were higher than those obtained in independent environments.
This result shows that NIRS could capture G × E variance in addition to the G variance, and is useful for making environment-specific predictions. Consequently, depending on the extent of G × E for the target trait, one of the two proposed scenarios may be more advantageous than the other. These promising results obtained in scenarios S1 and S2 open up important opportunities for the breeding community. As revealed by our theoretical computations (Fig. 4), PS would be able to generate large gains in genetic progress in comparison to GS, even in pessimistic scenarios. In the realistic scenarios that we considered, the gain brought by using PS instead of GS could be up to 81% for wheat grain yield in scenario S2 (Table S2).

There are various applications of PS, which we see both as a complement and as an alternative to GS, depending on the situation. The first obvious application of PS is its use when no genotyping tool is available at a reasonable cost, which is still the case for many orphan organisms. For these species, PS could potentially be a new efficient breeding tool to increase genetic progress. A second application would be to use PS to screen material in pre-breeding or in an early generation, as we can easily apply it to numerous individuals at low cost. Even if the prediction accuracy is low, PS can be used to filter out a given proportion of selection candidates. One should define this proportion with respect to PS accuracy: the higher the accuracy, the more confident we can be in filtering out many individuals without losing the best candidates. Note that even if PS is less accurate than GS, it could nevertheless be worthwhile to filter out the worst individuals considering the low cost of NIRS acquisition (Fig. 4), and the fact that NIRS is often already routinely carried out (for example, in wheat or forest trees to predict quality traits). In a second step or in later generations, one could use GS to make complementary predictions on a limited number of selection candidates. A third application of PS would be to make environment-specific predictions to take advantage of the fact that NIRS captures both G and G × E variances. The idea here would be to collect NIRS of the reference material and the selection candidates in various environments in order to obtain environment-specific genetic similarities. A last major application of PS would be to help conservation geneticists manage diversity collections. The use of genotyping to organize seed banks and screen and define core collections is strongly limited by its cost. PS offers a new opportunity to manage seed banks because it allows distance matrices to be computed cheaply and reliably.

Considering that PS gave interesting results for both a tree and an annual crop regarding various traits related to development, productivity and tolerance to disease and using tissues of a completely different nature (wood, leaf, grain), we can expect PS to work in many other plants and possibly in animal species using NIRS on organic fluids, such as blood or milk. Our work constitutes a proof of concept and a first attempt at PS, which clearly opens new perspectives for the breeding community. Indeed, one could further optimize many parameters to increase PS efficiency. The differences observed here between the PS efficiencies reported for wheat and poplar could represent a first direction for improving the approach. Indeed, PS appeared to be more efficient in wheat than in poplar and several hypotheses could be proposed to explain this result. First, spectra were acquired on different spectrometers resulting in a broader wavelength range in wheat, which also covered the visible part of the electromagnetic spectrum. Consequently, the information brought by the spectra on wheat tissues was potentially richer than the one brought by the spectra on poplar. Second, we could see that in wheat a larger proportion of G and G × E variance could be captured by the spectra regardless of the tissue sampled and that this was especially true for the lowest wavelengths (including the visible part), which were absent in poplar (Fig. 1). Third, the tissues in which NIRS was collected differed, and this difference seems to be an important parameter as highlighted by the differences in predictive ability between leaf and grain in wheat (Fig. 3). Another possibility for the improvement of PS efficiency could be the optimization of the growing conditions of plants in the reference site. In wheat, it was typically better to use NIRS collected on plants grown in unfavorable conditions than in favorable conditions. 
This result may arise because dissimilarities between genetically distant individuals are expected to be more pronounced under stress conditions. Therefore, there is a clear need to optimize these conditions and potentially combine different tissues and/or different environments. Once the NIRS data are collected, one could also try to improve the pretreatment of the signal and the statistical model of calibration. In our case, we chose as pretreatment the first derivative of the normalized spectrum, but other options could be tested, and the best option might differ depending on the species considered, environment, tissue sampled or target trait. For calibrations, we have used RR-BLUP, but one might test other techniques, such as those typically allowing non-additive effects or involving feature selection, to improve the accuracy of PS. These points clearly indicate that there is considerable room for improvement in PS, which will likely constitute an active field of research in the near future. Finally, the recent advent of portable NIR devices as well as of hyperspectral imaging allows this technology to be used in the field. Unmanned vehicles and robots are currently being developed and can already be used to automatically collect reflectance at an industrial scale26. These new developments will considerably increase the throughput and in turn decrease the cost of NIRS data. We thus expect that these technological advances will reinforce the advantages of the proposed PS.

METHODS

Data

The wheat panel is composed of 228 European elite varieties of winter wheat released between 1977 and 2012, 89% of which have been released since 2000. 72.8% of these varieties are in the panel introduced in Ly et al.1. The full panel was sown in one trial in Clermont-Ferrand (France) in 2015/2016. This trial was an augmented design with two treatments: one drought treatment under rain-out shelters (DRY), and one irrigated treatment (IRR) next to it. There was a difference of 223 mm in water supply (rainfall and irrigation) between the two treatments at the end of the experiment. For both treatments, the panel was divided into eight blocks of precocity with one replicate within the same block for 64 varieties and no replicates for the other 164, except for four checks, which were replicated three times in each block. Phenotypes and NIRS were collected in these two reference environments. A subset of 161 varieties (together with 59 additional varieties that were not used in the present study because they were not in the panel of 228) were sown and phenotyped in six independent environments located in Estrées-Mons (France, 2011/2012 and 2012/2013) and Clermont-Ferrand (France, 2012/2013) with two treatments corresponding to two levels of nitrogen input (intermediate and high). This subpanel was divided into six groups of earliness and each group was repeated in two blocks. Four checks were present in each block.

The poplar population was an association population comprising 1,160 cloned genotypes representative of the natural range of the species in Western Europe and previously described2-4. Clonally replicated trials of subsets of this association population were established in 2008 at two contrasting sites in central France (Orléans, ORL) and Northern Italy (Savigliano, SAV). At each site, a randomized complete block design was used with a single tree per block and six replicates per genotype. Growth data collected in each design clearly indicated that the Italian site was more favorable than the French site2,4.

NIRS data

Wheat

NIRS data were obtained on flag leaves and harvested grains from the two treatments of the drought trial in Clermont-Ferrand (France) in 2015/2016. For each variety in each treatment, twenty flag leaves were sampled on one plot at 200 degree days after flowering. The samples were oven dried at 60°C for 48 h. Leaves were milled (Falaise miller, SARL Falaise, France), and the powder was analyzed with a FOSS NIRS 6500 (FOSS NIRSystems, Silver Spring, MD) and its corresponding software (ISIscan™ and WINisi™ 4.20). For each variety in each treatment, 200 g of grains harvested from one plot were analyzed with a FOSS NIRS XDS (FOSS NIRSystems, Silver Spring, MD) and its corresponding software (ISIscan™ and WINisi™ 4.20). For leaf powder and grain, absorbance was measured from 400 to 2500 nm with a step of 2 nm. Five varieties were removed from the dataset because their leaf absorbance was abnormal, resulting in a final panel of 223 varieties. The resulting spectra were loaded into R software5 to be pretreated using custom R code. They were normalized (centered and scaled) and their first derivative was computed using a Savitzky-Golay filter6 with a window size of 37 data points (74 nm) implemented in the R package signal7. In the end, each variety in each treatment was characterized by a transformed spectrum of flag leaf powder and a transformed spectrum of grains.
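The pretreatment described above (normalization followed by a Savitzky-Golay first derivative over 37 points) can be sketched as follows. This is an illustration in Python rather than the custom R code used in the study. The polynomial order of the filter is not stated here, so the sketch assumes a local quadratic fit, for which the derivative weights at the window center have the closed form k / Σk². Dropping the edge points and the toy linear spectrum are further assumptions.

```python
from statistics import fmean, stdev

def normalize(spectrum):
    """Center and scale one spectrum (zero mean, unit sample standard deviation)."""
    mu, sd = fmean(spectrum), stdev(spectrum)
    return [(v - mu) / sd for v in spectrum]

def savgol_first_derivative(values, window=37, step=2.0):
    """First derivative from a local least-squares fit (Savitzky-Golay).

    Assumption: polynomial order 1 or 2, for which the slope at the window
    center is sum(k * y[i + k]) / sum(k^2), divided by the sampling step.
    Points without a full window on both sides are dropped.
    """
    m = window // 2
    norm = sum(k * k for k in range(-m, m + 1)) * step
    return [
        sum(k * values[i + k] for k in range(-m, m + 1)) / norm
        for i in range(m, len(values) - m)
    ]

# Toy spectrum (assumption): absorbance increasing linearly from 400 to 2500 nm.
wavelengths = list(range(400, 2502, 2))       # 2 nm step, as in the wheat data
raw = [0.001 * w + 0.5 for w in wavelengths]  # slope of 0.001 per nm

slope_check = savgol_first_derivative(raw)             # recovers the slope
derived = savgol_first_derivative(normalize(raw))      # pretreated spectrum
print(len(wavelengths), len(derived))
```

A window of 37 points with a 2 nm step spans 74 nm, which matches the figure given in the text; with this sketch each transformed spectrum loses 18 points at either end.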

Poplar

NIRS was carried out on wood from stem sections collected at 1 m above ground on 2-year-old trees for 1,081 genotypes in three blocks at Orléans (total of 2,860 samples) and 792 genotypes in three blocks at Savigliano (total of 2,254 samples). After harvest, the wood samples were oven dried at 30°C for several days, cut into small pieces with a big cutter and milled using a Retsch SM2000 cutting mill (Retsch, Haan, Germany) to pass through a 1-mm sieve. The wood samples were not debarked prior to milling. After stabilization, wood powders were placed into quartz cups for NIR collection with a Spectrum 400 spectrometer (Perkin Elmer, Waltham, MA, USA) and its corresponding software (Spectrum™ 6.3.5). For each sample, the measurement consisted of an average of 64 scans done while rotating the cups over the 10,000 cm−1 - 4,000 cm−1 range with a resolution of 8 cm−1 and a zero-filling factor of 4, resulting in absorbance data every 2 cm−1. The resulting spectra were loaded into R software5 to be processed using custom R code. They were first restricted to the 8000 cm−1 - 4000 cm−1 range because the most distant part of the spectra (8000 cm−1 - 10,000 cm−1) appeared to be quite noisy. Then, the restricted spectra were normalized (centered and scaled), and their first derivative was computed using a Savitzky-Golay filter6 with a window size of 37 data points (74 cm−1) implemented in the R package signal7. Finally, these normalized and derived spectra were averaged by genotype at each site.
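Because the poplar spectra are expressed in wavenumbers (cm−1) and the wheat spectra in wavelengths (nm), a short conversion helps compare the two ranges. The sketch below uses the standard relation wavelength (nm) = 10^7 / wavenumber (cm−1); the function name is hypothetical.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm1

# Range retained for poplar after discarding the noisy part of the spectra.
poplar_range_nm = (
    wavenumber_to_wavelength_nm(8000),
    wavenumber_to_wavelength_nm(4000),
)
print(poplar_range_nm)

# Width of the Savitzky-Golay window: 37 points spaced 2 cm^-1 apart.
window_width_cm1 = 37 * 2
print(window_width_cm1)
```

The retained poplar range thus corresponds to 1250 to 2500 nm, whereas the wheat spectra start at 400 nm and therefore also include the visible region.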

SNP data

Wheat

The 228 wheat varieties were genotyped with the TaBW280K high-throughput genotyping array described in Rimbert et al.⁸. This array was designed to cover both genic and intergenic regions of the three subgenomes. Markers with a minor allele frequency below 1%, or with a heterozygosity or missing rate above 5%, were removed. Redundant markers were filtered out. Ultimately, we retained 84,259 SNPs, either polymorphic high resolution or off-target variants, with an average missing data rate of 0.83%. Missing values were imputed with the marker allele frequency.
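The marker quality filters and the frequency-based imputation described above can be sketched as follows. This is an illustrative plain-Python example, not the authors' pipeline. It assumes genotypes are coded 0 or 1 for homozygotes and 0.5 for heterozygotes, with `None` for a missing call; the function name and the data layout are hypothetical, and the removal of redundant markers is not shown.

```python
def filter_and_impute(markers, maf_min=0.01, het_max=0.05, miss_max=0.05):
    """Apply the MAF, heterozygosity and missing-rate filters, then impute.

    markers: dict mapping a marker name to a list of genotype calls coded
    0, 0.5 (heterozygote) or 1, with None for missing calls (assumed coding).
    Returns a dict with the retained markers, missing calls replaced by the
    allele frequency of the marker.
    """
    kept = {}
    for name, calls in markers.items():
        observed = [g for g in calls if g is not None]
        if not observed:
            continue  # a marker with no call at all cannot be used
        miss_rate = 1 - len(observed) / len(calls)
        het_rate = sum(1 for g in observed if g == 0.5) / len(observed)
        freq = sum(observed) / len(observed)  # frequency of the allele coded 1
        maf = min(freq, 1 - freq)
        if maf < maf_min or het_rate > het_max or miss_rate > miss_max:
            continue
        kept[name] = [freq if g is None else g for g in calls]
    return kept
```

Imputing with the allele frequency leaves the marker mean unchanged, so an imputed call contributes nothing once the marker is centered.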

Poplar

The poplar association population was genotyped with an Illumina Infinium BeadChip array³, yielding 7,918 SNPs for 858 genotypes. Missing values were rare (0.35%) and were imputed with FImpute⁹. The data were restricted to the subset of 562 genotypes with SNP data and NIRS data at both sites. Within this set, SNPs with a minor allele frequency below 1% were discarded, yielding a final SNP dataset of 7,808 SNPs.

Phenotypic data

Wheat

The 228 wheat varieties were phenotyped for heading date (HD) and grain yield (GY) in the two environments in which the NIRS analysis was conducted (drought experiment in Clermont-Ferrand, 2015/2016). The subpanel of 161 varieties was phenotyped for the same traits in six independent environments. In each environment, the phenotypic data were adjusted for micro-environmental effects using a random block effect and, when necessary, by modeling spatial trends with two-dimensional penalized spline (P-spline) models as implemented in the R package SpATS¹⁰.

Poplar

The poplar association population was evaluated at each of the two sites, on up to six replicates per genotype, for the following traits: height at 2 years at Orléans (HT-ORL); circumference at 1 m above ground at 2 years at both sites (CIRC-ORL and CIRC-SAV); bud flush (BF-ORL and BF-SAV) and bud set (BS-ORL and BS-SAV) at both sites, as discrete scores for a given day of the year (see Dillen et al.¹¹ and Rohde et al.¹² for details on the scales used); and resistance to rust at Orléans (RUST-ORL), as a discrete score of susceptibility on the most affected leaf of the tree on a 1 to 8 scale. Within each site, the phenotypic data were adjusted for micro-environmental effects using a random block effect and/or spatial position when needed, following a visual inspection of spatial effects with a variogram as implemented in the R package breedR¹³. Finally, the adjusted phenotypes were restricted to the subset of 562 genotypes with SNP and NIRS data, and an averaged genotypic value was computed for each trait by genotype within each site for further analyses.

Genetic variance captured by the NIRS

Genomic heritability and partition of phenotypic variance along spectra

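The estimation of genomic heritability was based on the following bivariate statistical model across environments: $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = X\beta + Zu + \epsilon$$ where $y_1$ and $y_2$ are the phenotypic values (absorbance at a given wavelength) in each environment, $\beta$ is the fixed effect of the environment, $u$ is a vector of random polygenic effects with $u \sim N(0, \Sigma_G \otimes K)$, $\Sigma_G$ being the 2 × 2 matrix of genetic variances ($\sigma^2_{G_1}$, $\sigma^2_{G_2}$) and covariance ($\sigma_{G_{12}}$) between environments and $K$ being the scaled realized relationship matrix (see below), $\epsilon$ is a vector of independent and normally distributed residuals with $\epsilon_m \sim N(0, \sigma^2_{\epsilon_m} I)$ in environment $m$, and $X$ and $Z$ are design matrices relating observations to the effects.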

SNPs were used to estimate the genomic relationship matrix ($A$) between individuals, following the formula of VanRaden¹⁴: $$A_{i,j} = \frac{1}{L}\sum_{\ell=1}^{L}\frac{(G_{i,\ell}-p_\ell)(G_{j,\ell}-p_\ell)}{\sigma^2}$$ where $G_{i,\ell}$ and $G_{j,\ell}$ are the genotypes of individuals $i$ and $j$ at marker $\ell$ ($G_{\cdot,\ell}$ = 0 or 1 for homozygotes, 0.5 for heterozygotes), $p_\ell$ is the frequency of the allele coded 1 at marker $\ell$, $L$ is the number of markers, and $\sigma^2$ is the average empirical marker variance. $K$ was obtained by scaling $A$ to have a sample variance of 1¹⁵,¹⁶.
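A relationship matrix of this kind can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the authors' code: genotypes are coded 0, 0.5 or 1 and contain no missing values; the matrix is built from cross-products of frequency-centered genotypes divided by the number of markers times the average empirical marker variance; and "a sample variance of 1" is interpreted as rescaling so that (trace(A) − sum(A)/n)/(n − 1) equals 1, which is one common convention. Both function names are hypothetical.

```python
def relationship_matrix(G):
    """Realized relationship matrix from genotypes coded 0, 0.5, 1.

    G: list of individuals, each a list of genotypes (no missing values).
    Assumed form: centered cross-products divided by the number of markers
    times the average empirical marker variance.
    """
    n, n_markers = len(G), len(G[0])
    p = [sum(ind[l] for ind in G) / n for l in range(n_markers)]
    marker_var = [sum((ind[l] - p[l]) ** 2 for ind in G) / n
                  for l in range(n_markers)]
    sigma2 = sum(marker_var) / n_markers
    Z = [[ind[l] - p[l] for l in range(n_markers)] for ind in G]
    return [[sum(a * b for a, b in zip(Z[i], Z[j])) / (n_markers * sigma2)
             for j in range(n)] for i in range(n)]

def scale_to_unit_sample_variance(A):
    """Rescale A so that its sample variance is 1 (assumed convention)."""
    n = len(A)
    trace = sum(A[i][i] for i in range(n))
    total = sum(sum(row) for row in A)
    factor = (n - 1) / (trace - total / n)
    return [[factor * a for a in row] for row in A]

# Usage with four individuals and four markers (illustrative genotypes).
genotypes = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0.5, 1]]
K = scale_to_unit_sample_variance(relationship_matrix(genotypes))
```

With this normalization the average diagonal element of the unscaled matrix is 1, so the scaling step only applies the factor (n − 1)/n when allele frequencies are estimated from the same individuals.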

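Genomic heritability was estimated for each wavelength within each environment ($m$) as follows: $$h^2_m = \frac{\hat{\sigma}^2_{G_m}}{\hat{\sigma}^2_{G_m} + \hat{\sigma}^2_{\epsilon_m}}$$ with $\hat{\sigma}^2_{G_m}$ and $\hat{\sigma}^2_{\epsilon_m}$ the REML estimates of the genetic and residual variances in environment $m$, obtained with the Newton-Raphson algorithm implemented in the R package sommer¹⁷.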

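Following Yamada et al.¹⁸, the variance/covariance estimates from the previously defined bivariate mixed model were used to compute estimates of the genetic ($\sigma^2_G$), genetic by environment ($\sigma^2_{G \times E}$) and residual ($\sigma^2_\epsilon$) variances across sites as follows: $$\sigma^2_G = \sigma_{G_{12}}, \qquad \sigma^2_{G \times E} = \frac{\sigma^2_{G_1} + \sigma^2_{G_2}}{2} - \sigma_{G_{12}}, \qquad \sigma^2_\epsilon = \frac{\sigma^2_{\epsilon_1} + \sigma^2_{\epsilon_2}}{2}$$ where $\sigma^2_{G_m}$ and $\sigma^2_{\epsilon_m}$ are the genetic and residual variances in environment $m$, and $\sigma_{G_{12}}$ is the genetic covariance between the two environments.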
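As a worked illustration of how variance components from a two-environment model translate into heritabilities and an across-site partition, the sketch below assumes the usual two-environment decomposition: the genetic variance across sites is taken as the genetic covariance between environments, the genotype-by-environment variance as the average within-environment genetic variance minus that covariance, and the residual variance as the average of the two residual variances. The function names and the numerical values are hypothetical and are not results from the study.

```python
def genomic_heritability(var_g, var_e):
    """Heritability within one environment: genetic / (genetic + residual)."""
    return var_g / (var_g + var_e)

def partition_across_environments(var_g1, var_g2, cov_g12, var_e1, var_e2):
    """Split variances from a bivariate model into G, GxE and residual parts.

    Assumed decomposition for two environments:
      G   = genetic covariance between the environments
      GxE = mean within-environment genetic variance minus that covariance
      E   = mean of the two residual variances
    """
    var_G = cov_g12
    var_GxE = (var_g1 + var_g2) / 2 - cov_g12
    var_E = (var_e1 + var_e2) / 2
    return var_G, var_GxE, var_E

# Illustrative variance components for one wavelength (not real estimates).
h2_env1 = genomic_heritability(var_g=0.6, var_e=0.5)
var_G, var_GxE, var_E = partition_across_environments(0.6, 0.4, 0.3, 0.5, 0.7)
```

Under this decomposition, a genetic correlation of 1 with equal genetic variances in both environments gives a genotype-by-environment variance of zero.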

Association mapping of NIRS reflectance

Association mapping was carried out along the spectra, considering the absorbance at a given wavelength as a trait in a bivariate setting and reusing the previous estimates of genetic and residual variances (the EMMAX approach, as previously proposed for the multi-trait mixed model¹⁹).

Genomic and phenomic predictions

The efficiency of genomic and phenomic predictions was evaluated by cross-validation in two types of scenarios. In scenario S1, NIRS analysis and cross-validation were applied to the same environment. In scenario S2, cross-validation was applied to independent environments: the environment(s) in which NIRS was collected and the environment in which the cross-validation was applied (calibration and prediction) were different. In S1, the objective was to limit expensive or labor-demanding phenotyping to a calibration set of reduced size and to predict the remaining individuals using NIRS. In S2, one experiment (or a nursery) was dedicated to collecting the NIRS of the calibration set and the predicted set, and a multi-environment trial was dedicated to phenotyping the calibration set. The main difference between S1 and S2 was that in S1, NIRS could potentially capture both genetic and G × E variances, leading to environment-specific predictions. For both scenarios, 5- and 8-fold cross-validation procedures repeated 20 times were used for poplar and wheat, respectively. A larger number of folds was used for wheat than for poplar because the sample size of the wheat dataset (n = 223 in the panel and n = 161 in the subpanel) was smaller than that of the poplar dataset (n = 562). Predictive ability was computed as the Pearson correlation between the predictions and the adjusted means.

For genomic predictions, we tested two complementary reference models: G-BLUP and Bayesian LASSO²⁰,²¹. G-BLUP assumes that the SNP effects are normally distributed, whereas Bayesian LASSO allows for departure from normality (i.e., SNPs with bigger effects). G-BLUP and Bayesian LASSO were run with the R packages rrBLUP and BGLR²², respectively. For Bayesian LASSO, the chain was composed of 30,000 iterations with a burn-in of 5,000 iterations, and the hyperparameter λ was chosen as recommended in table 1 of de los Campos et al.²². For phenomic predictions, we used RR-BLUP but considered NIRS data instead of molecular markers. Prior to the analysis, the pretreated NIRS matrices were centered and scaled. To avoid overfitting, the shrinkage parameter was estimated within the cross-validation scheme on the calibration set only. In other words, the reported predictive abilities for NIRS prediction were unlikely to be overestimated because of model optimization.
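The cross-validation scheme with a ridge-type predictor can be sketched as follows. This plain-Python example is a simplified stand-in, not the rrBLUP or BGLR analyses of the study: it uses a fixed shrinkage parameter `lam` rather than one estimated on the calibration set, and it computes predictive ability as the Pearson correlation between all held-out predictions of one repetition and the observed values, which is an assumption about how folds were pooled. All function names are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ridge_fit(X, y, lam):
    """Ridge regression on centered predictors: (X'X + lam I) beta = X'y."""
    n, p = len(X), len(X[0])
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = sum(y) / n
    Xc = [[row[j] - mx[j] for j in range(p)] for row in X]
    A = [[sum(Xc[i][j1] * Xc[i][j2] for i in range(n)) + (lam if j1 == j2 else 0.0)
          for j2 in range(p)] for j1 in range(p)]
    rhs = [sum(Xc[i][j] * (y[i] - my) for i in range(n)) for j in range(p)]
    return mx, my, solve(A, rhs)

def ridge_predict(model, X):
    mx, my, beta = model
    return [my + sum(b * (v - m) for b, v, m in zip(beta, row, mx)) for row in X]

def cross_validate(X, y, k, repeats, lam, seed=1):
    """Mean predictive ability over repeated k-fold cross-validation."""
    rng = random.Random(seed)
    n = len(y)
    abilities = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        pred = [None] * n
        for f in range(k):
            test = idx[f::k]  # every k-th shuffled index forms one fold
            held = set(test)
            train = [i for i in idx if i not in held]
            model = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
            for i, yhat in zip(test, ridge_predict(model, [X[i] for i in test])):
                pred[i] = yhat
        abilities.append(pearson(pred, y))
    return sum(abilities) / repeats
```

In this sketch the predictor matrix `X` would hold the pretreated spectra (one row per genotype, one column per wavelength) for phenomic prediction, or marker genotypes for a marker-based ridge regression. The primal solve is only practical for a small number of columns.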

Expected genetic progress with genomic and phenomic selection in a simple example

We ran simulations to illustrate the expected genetic progress with GS and PS that would be achieved in one cycle of selection for various combinations of costs and reliabilities. Reliability was defined as the squared correlation between true breeding values (TBV) and the genomic or NIRS estimated breeding values (EBV).

We considered a situation in which a given budget (200,000 €) was available to predict the performances of selection candidates with NIRS or genotyping. Depending on the costs of the methods (DNA extraction and genotyping for GS, or tissue sampling and NIRS acquisition for PS), we computed the number of selection candidates (N) that could be analyzed. The TBV and the genomic or NIRS EBV of these N individuals were then sampled from a multivariate normal distribution with means equal to 0, variances equal to 1 and covariance equal to the square root of reliability (R package mvtnorm²³). The genetic gain was then computed as the difference between the average TBV of the 400 individuals with the best EBV and the average TBV of the population (equal to 0). We selected 400 individuals because, for many species, it is feasible to apply heavier phenotyping (multi-environment trials) to a few hundred individuals. We considered two situations. In the first situation, the genetic progress of GS and PS was computed for various genotyping and NIRS costs with a reliability set to 0.4. In the second situation, the reliability of GS and PS varied between 0.3 and 0.6, and genotyping and NIRS costs were set to 50 € and 4 €, respectively. For each combination of parameters (reliabilities and costs of GS and PS), the simulation procedure was repeated 1,000 times to obtain stable results.
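The simulation described above can be sketched in plain Python. This is an illustration, not the authors' R code: instead of calling a multivariate normal sampler, it builds each EBV as a weighted sum of the TBV and an independent standard normal deviate, which gives the same bivariate distribution (unit variances, covariance equal to the square root of reliability). The function name is hypothetical, and the default number of repetitions is kept small for speed, whereas the study used 1,000.

```python
import random

def expected_gain(budget, cost, reliability, n_selected=400, n_rep=20, seed=1):
    """Average genetic gain from selecting the best candidates on their EBV.

    budget / cost gives the number of candidates N that can be analyzed.
    TBV and EBV are standard normal with correlation sqrt(reliability).
    Returns (N, mean TBV of the selected individuals over n_rep repetitions).
    """
    rng = random.Random(seed)
    n = int(budget // cost)
    r = reliability ** 0.5  # correlation between TBV and EBV
    gains = []
    for _ in range(n_rep):
        pairs = []
        for _ in range(n):
            tbv = rng.gauss(0, 1)
            ebv = r * tbv + (1 - reliability) ** 0.5 * rng.gauss(0, 1)
            pairs.append((ebv, tbv))
        pairs.sort(reverse=True)  # best EBV first
        selected = pairs[:n_selected]
        gains.append(sum(t for _, t in selected) / n_selected)
    return n, sum(gains) / n_rep

# Second situation of the text: 50 EUR per genotyped and 4 EUR per NIRS sample.
n_gs, gain_gs = expected_gain(200_000, 50, reliability=0.4, n_rep=5)
n_ps, gain_ps = expected_gain(200_000, 4, reliability=0.4, n_rep=5)
```

At equal reliability, the cheaper method screens more candidates for the same budget (50,000 versus 4,000 here), so the 400 selected individuals represent a smaller fraction of the population and the selection intensity is higher.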

Because genotyping and NIRS costs are highly dependent on the species and the number of samples analyzed, we let the genotyping costs (DNA extraction and genotyping itself) vary between 25 € and 100 € and the NIRS costs (sample treatment and NIRS analysis itself) vary between 1 € and 8 € in the first situation.

To provide concrete examples, we applied this simulation process with the reliabilities and costs that we experienced for wheat and poplar. GS costs were 35 € and 50 € per individual for wheat and poplar, respectively, and PS costs were 3 € and 2.5 € per individual for wheat and poplar, respectively. Reliabilities were estimated as the square of the predictive abilities estimated by cross-validation, divided by the heritability of the adjusted means. For each combination of trait, scenario, and NIRS data considered (tissue, environment), the increase in genetic progress using PS instead of GS was computed with the best performing GS model as a reference.
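The conversion from predictive ability to reliability, and the comparison of two gains, can be written as two small helpers. This is a minimal sketch: the relative increase is expressed here as a percentage of the GS gain, which is an assumption about how the increase was reported, and the function names and input values are hypothetical rather than results from the study.

```python
def reliability(predictive_ability, heritability):
    """Reliability = squared predictive ability / heritability of adjusted means."""
    return predictive_ability ** 2 / heritability

def relative_increase(gain_ps, gain_gs):
    """Increase in genetic gain of PS over GS, in percent of the GS gain."""
    return 100.0 * (gain_ps - gain_gs) / gain_gs

# Illustrative inputs only.
rel = reliability(predictive_ability=0.5, heritability=0.8)
increase = relative_increase(gain_ps=1.5, gain_gs=1.0)
```

Dividing by the heritability corrects for the fact that cross-validation compares predictions with adjusted means, which are themselves noisy measures of the true breeding values.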

Data availability

The datasets generated during and/or analysed during the current study are available in the INRA Dataverse repository (https://data.inra.fr/). They can be accessed with the following link: http://dx.doi.org/10.15454/MB4G3T.

Code availability

R code used throughout the study is available upon request.

AUTHOR CONTRIBUTIONS

V.S. and R.R. designed the study, analyzed the data and wrote the paper with input from J-P.C., P.F.R., E.P., J.L.G., and C.B.

COMPETING INTERESTS

The authors declare no competing financial interests.

ACKNOWLEDGEMENTS

The authors gratefully acknowledge the staff of the INRA GBFOR experimental unit for the establishment and management of the poplar experimental design in Orléans, the collection of wood samples in each site, and their contribution to phenotypic measurements on poplars in Orléans; Alasia Franco Vivai staff for management of the poplar experimental plantation in Savigliano, and M. Sabatti and F. Fabbrini for their contribution to phenotypic measurements on poplars in Savigliano. We acknowledge the staff of the INRA GenoBois platform for the preparation of samples and collection of NIRS on wood samples and the staff of EPGV and BioForA for their contribution to obtaining SNP data on poplar. We would like to thank J. Messaoud for NIRS acquisition on wheat samples, V. Allard, B. Adam and D. Cormier for implementation of the rain-out shelter experiment (Phéno3C, INRA Clermont-Ferrand), and E. Heumez (UE GCIE) for the experiment in Estrées-Mons. We would also like to thank A. Chateigner, L. Sanchez, G. Charmet, V. Allard, P. Martre and S. Bouchet for useful discussions and comments on the manuscript. We are grateful to the Genotoul bioinformatics platform Toulouse Midi-Pyrenees (Bioinfo Genotoul) for providing computing resources. Establishment and management of the poplar experimental sites until harvests were carried out with financial support from the NOVELTREE project (EU-FP7-211868). NIRS measurements on poplar wood samples were supported by the SYBIOPOP project funded by the French National Research Agency (ANR-13-JSV6-0001). Management of the wheat multi-environment trials was financially supported by the French National Research Agency under Investment for the Future (BreedWheat project ANR-10-BTBR-03) and by FranceAgriMer. The Phéno3C platform was funded by the French National Research Agency under the Investment for the Future Phenome project (ANR-11-INBS-12) and by the European Regional Development Fund (AV0011535).

REFERENCES

  1. Meuwissen, T. H., Hayes, B. J. & Goddard, M. E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829 (2001).
  2. Crossa, J., Jarquin, D., Franco, J., Pérez-Rodríguez, P., Burgueño, J., Saint-Pierre, C., Vikram, P., Sansaloni, C., Petroli, C., Akdemir, D., Sneller, C., Reynolds, M., Tattaris, M., Payne, T., Guzman, C., Peña, R. J., Wenzl, P. & Singh, S. Genomic prediction of gene bank wheat landraces. G3 6, 1819–1834 (2016).
  3. Yu, X., Li, X., Guo, T., Zhu, C., Wu, Y., Mitchell, S. E., Roozeboom, K. L., Wang, D., Wang, M. L., Pederson, G. A., Tesso, T. T., Schnable, P. S., Bernardo, R. & Yu, J. Genomic prediction contributing to a promising global strategy to turbocharge gene banks. Nature Plants 2, 16150 (2016).
  4. Heffner, E. L., Lorenz, A. J., Jannink, J.-L. & Sorrells, M. E. Plant Breeding with Genomic Selection: Gain per Unit Time and Cost. Crop Science 50, 1681 (2010).
  5. Meuwissen, T., Hayes, B. & Goddard, M. Accelerating improvement of livestock with genomic selection. Annual Review of Animal Biosciences 1, 221–237 (2013).
  6. Whittaker, J. C., Thompson, R. & Denham, M. C. Marker-assisted selection using ridge regression. Genetics Research 75, 249–252 (2000).
  7. Habier, D., Fernando, R. L. & Dekkers, J. C. M. The impact of genetic relationship information on genome-assisted breeding values. Genetics 177, 2389–2397 (2007).
  8. Goddard, M. Genomic selection: prediction of accuracy and maximisation of long term response. Genetica 136, 245–257 (2009).
  9. Hayes, B. J., Visscher, P. M. & Goddard, M. E. Increased accuracy of artificial selection by using the realized relationship matrix. Genetics Research 91, 47–60 (2009).
  10. Zhong, S., Dekkers, J. C. M., Fernando, R. L. & Jannink, J.-L. Factors Affecting Accuracy From Genomic Selection in Populations Derived From Multiple Inbred Lines: A Barley Case Study. Genetics 182, 355–364 (2009).
  11. Fox, H. M. Chemical Taxonomy. Nature 157, 511–511 (1946).
  12. Bertrand, D., Robert, P. & Loisel, W. Identification of some wheat varieties by near infrared reflectance spectroscopy. Journal of the Science of Food and Agriculture 36, 1120–1124 (1985).
  13. Adedipe, O. E., Dawson-Andoh, B., Slahor, J. & Osborn, L. Classification of Red Oak (Quercus Rubra) and White Oak (Quercus Alba) Wood Using a near Infrared Spectrometer and Soft Independent Modelling of Class Analogies. Journal of Near Infrared Spectroscopy 16, 49–57 (2008).
  14. Espinoza, J. A., Hodge, G. R. & Dvorak, W. S. The Potential Use of near Infrared Spectroscopy to Discriminate between Different Pine Species and Their Hybrids. Journal of Near Infrared Spectroscopy 20, 437–447 (2012).
  15. Fischnaller, S., Dowell, F. E., Lusser, A., Schlick-Steiner, B. C. & Steiner, F. M. Non-destructive species identification of Drosophila obscura and D. subobscura (Diptera) using near-infrared spectroscopy. Fly 6, 284–289 (2012).
  16. Abasolo, M., Lee, D. J., Raymond, C., Meder, R. & Shepherd, M. Deviant near-infrared spectra identifies Corymbia hybrids. Forest Ecology and Management 304, 121–131 (2013).
  17. O’Reilly-Wapstra, J. M., Freeman, J. S., Barbour, R., Vaillancourt, R. E. & Potts, B. M. Genetic analysis of the near-infrared spectral phenome of a global Eucalyptus species. Tree Genetics & Genomes 9, 943–959 (2013).
  18. Meder, R., Kain, D., Ebdon, N., Macdonell, P. & Brawner, J. T. Identifying Hybridisation in Pinus Species Using near Infrared Spectroscopy of Foliage. Journal of Near Infrared Spectroscopy 22, 337–345 (2014).
  19. Lang, C., Almeida, D. R. A. & Costa, F. R. C. Discrimination of taxonomic identity at species, genus and family levels using Fourier Transformed Near-Infrared Spectroscopy (FT-NIR). Forest Ecology and Management 406, 219–227 (2017).
  20. Cruickshank, R. H. & Munck, L. It’s barcoding Jim, but not as we know it. Zootaxa 2953, 55–56 (2011).
  21. Posada, H., Ferrand, M., Davrieux, F., Lashermes, P. & Bertrand, B. Stability across environments of the coffee variety near infrared spectral signature. Heredity 102, 113–119 (2008).
  22. Diepeveen, D., Clarke, G., Ryan, K., Tarr, A., Ma, W. & Appels, R. Molecular genetic mapping of NIR spectra variation. Journal of Cereal Science 55, 6–14 (2012).
  23. Hein, P. R. G. & Chaix, G. NIR Spectral Heritability: A Promising Tool for Wood Breeders? Journal of Near Infrared Spectroscopy 22, 141–147 (2014).
  24. Ecarnot, M., Compan, F. & Roumet, P. Assessing leaf nitrogen content and leaf mass per unit area of wheat in the field throughout plant cycle with a portable spectrometer. Field Crops Research 140, 44–50 (2013).
  25. Teixeira dos Santos, C. A., Lopo, M., Páscoa, R. N. & Lopes, J. A. A Review on the Applications of Portable Near-Infrared Spectrometers in the Agro-Food Industry. Applied Spectroscopy 67, 1215–1233 (2013).
  26. Madec, S., Baret, F., de Solan, B., Thomas, S., Dutartre, D., Jezequel, S., Hemmerlé, M., Colombeau, G. & Comar, A. High-Throughput Phenotyping of Plant Height: Comparing Unmanned Aerial Vehicles and Ground LiDAR Estimates. Frontiers in Plant Science 8, 2002 (2017).
  27. Diago, M. P., Fernandes, A., Millan, B., Tardaguila, J. & Melo-Pinto, P. Identification of grapevine varieties using leaf spectroscopy and partial least squares. Computers and Electronics in Agriculture 99, 7–13 (2013).
  28. Peerbhay, K. Y., Mutanga, O. & Ismail, R. Commercial tree species discrimination using airborne AISA Eagle hyperspectral imagery and partial least squares discriminant analysis (PLS-DA) in KwaZulu–Natal, South Africa. ISPRS Journal of Photogrammetry and Remote Sensing 79, 19–28 (2013).
  29. Riedelsheimer, C., Czedik-Eysenberg, A., Grieder, C., Lisec, J., Technow, F., Sulpice, R., Altmann, T., Stitt, M., Willmitzer, L. & Melchinger, A. E. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nature Genetics 44, 217–220 (2012).
  30. Fernandez, O., Urrutia, M., Bernillon, S., Giauffret, C., Tardieu, F., Le Gouis, J., Langlade, N., Charcosset, A., Moing, A. & Gibon, Y. Fortune telling: Metabolic markers of plant performance. Metabolomics 12, 158 (2016).
  31. Westhues, M., Schrag, T. A., Heuer, C., Thaller, G., Utz, H. F., Schipprack, W., Thiemann, A., Seifert, F., Ehret, A., Schlereth, A., Stitt, M., Nikoloski, Z., Willmitzer, L., Schön, C. C., Scholten, S. & Melchinger, A. E. Omics-based hybrid prediction in maize. Theoretical and Applied Genetics 130, 1927–1939 (2017).

ADDITIONAL REFERENCES (method section)

  1. Ly, D., Huet, S., Gauffreteau, A., Rincent, R., Touzy, G., Mini, A., Jannink, J.-L., Cormier, F., Paux, E., Lafarge, S., Gouis, J. L. & Charmet, G. Whole-genome prediction of reaction norms to environmental stress in bread wheat (Triticum aestivum L.) by genomic random regression. Field Crops Research 216, 32–41 (2018).
  2. Guet, J., Fabbrini, F., Fichot, R., Sabatti, M., Bastien, C. & Brignolas, F. Genetic variation for leaf morphology, leaf structure and leaf carbon isotope discrimination in European populations of black poplar (Populus nigra L.). Tree Physiology 35, 850–863 (2015).
  3. Faivre-Rampant, P., Zaina, G., Jorge, V., Giacomello, S., Segura, V., Scalabrin, S., Guérin, V., Paoli, E. D., Aluome, C., Viger, M., Cattonaro, F., Payne, A., Paulstephenraj, P., Paslier, M. C. L., Berard, A., Allwright, M. R., Villar, M., Taylor, G., Bastien, C. & Morgante, M. New resources for genetic studies in Populus nigra: genome-wide SNP discovery and development of a 12k Infinium array. Molecular Ecology Resources 16, 1023–1036 (2016).
  4. Gebreselassie, M. N., Ader, K., Boizot, N., Millier, F., Charpentier, J.-P., Alves, A., Simões, R., Rodrigues, J. C., Bodineau, G., Fabbrini, F., Sabatti, M., Bastien, C. & Segura, V. Near-infrared spectroscopy enables the genetic analysis of chemical properties in a large set of wood samples from Populus nigra (L.) natural populations. Industrial Crops and Products 107, 159–171 (2017).
  5. R Core Team. R: A language and environment for statistical computing (2017). https://www.R-project.org/.
  6. Savitzky, A. & Golay, M. J. E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Analytical Chemistry 36, 1627–1639 (1964).
  7. signal developers. signal: Signal processing. R package (2013). http://r-forge.r-project.org/projects/signal/.
  8. Rimbert, H., Darrier, B., Navarro, J., Kitt, J., Choulet, F., Leveugle, M., Duarte, J., Rivière, N., Eversole, K., Gouis, J. L., Davassi, A., Balfourier, F., Paslier, M.-C. L., Berard, A., Brunel, D., Feuillet, C., Poncet, C., Sourdille, P. & Paux, E. High throughput SNP discovery and genotyping in hexaploid wheat. Plos One 13, (2018).
  9. Sargolzaei, M., Chesnais, J. P. & Schenkel, F. S. A new approach for efficient genotype imputation using information from relatives. BMC Genomics 15, 478 (2014).
  10. Rodríguez-Álvarez, M. X., Boer, M. P., Eilers, P. H. C. & van Eeuwijk, F. A. SpATS: spatial analysis of field trials with splines. R package version 1.0–4 (2016). https://cran.r-project.org/package=SpATS
  11. Dillen, S. Y., Marron, N., Sabatti, M., Ceulemans, R. & Bastien, C. Relationships among productivity determinants in two hybrid poplar families grown during three years at two contrasting sites. Tree Physiology 29, 975–987 (2009).
  12. Rohde, A., Storme, V., Jorge, V., Gaudet, M., Vitacolonna, N., Fabbrini, F., Ruttink, T., Zaina, G., Marron, N., Dillen, S., Steenackers, M., Sabatti, M., Morgante, M., Boerjan, W. & Bastien, C. Bud set in poplar - genetic dissection of a complex trait in natural and hybrid populations. New Phytologist 189, 106–121 (2010).
  13. Muñoz, F. & Sanchez, L. breedR: Statistical Methods for Forest Genetic Resources Analysts. R package version 0.12-2 (2017). https://github.com/famuvie/breedR.
  14. VanRaden, P. Efficient Methods to Compute Genomic Predictions. Journal of Dairy Science 91, 4414–4423 (2008).
  15. Kang, H. M., Sul, J. H., Service, S. K., Zaitlen, N. A., Kong, S.-Y., Freimer, N. B., Sabatti, C. & Eskin, E. Variance component model to account for sample structure in genome-wide association studies. Nature Genetics 42, 348–354 (2010).
  16. Forni, S., Aguilar, I. & Misztal, I. Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information. Genetics Selection Evolution 43, 1 (2011).
  17. Covarrubias-Pazaran, G. Genome-Assisted Prediction of Quantitative Traits Using the R Package sommer. Plos One 11, (2016).
  18. Yamada, Y., Itoh, Y. & Sugimoto, I. Parametric relationships between genotype x environment interaction and genetic correlation when two environments are involved. Theoretical and Applied Genetics 76, 850–854 (1988).
  19. Korte, A., Vilhjálmsson, B. J., Segura, V., Platt, A., Long, Q. & Nordborg, M. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nature Genetics 44, 1066–1071 (2012).
  20. Park, T. & Casella, G. The Bayesian Lasso. Journal of the American Statistical Association 103, 681–686 (2008).
  21. de los Campos, G., Naya, H., Gianola, D., Crossa, J., Legarra, A., Manfredi, E., Weigel, K. & Cotes, J. M. Predicting Quantitative Traits With Regression Models for Dense Molecular Markers and Pedigree. Genetics 182, 375–385 (2009).
  22. de los Campos, G., Hickey, J. M., Pong-Wong, R., Daetwyler, H. D. & Calus, M. P. L. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding. Genetics 193, 327–345 (2012).
  23. Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F. & Hothorn, T. mvtnorm: Multivariate Normal and t Distributions. R package version 1.0-6 (2017). http://CRAN.R-project.org/package=mvtnorm
Posted April 16, 2018.