PT - JOURNAL ARTICLE
AU - Lukas Roth
AU - María Xosé Rodríguez-Álvarez
AU - Fred van Eeuwijk
AU - Hans-Peter Piepho
AU - Andreas Hund
TI - Phenomics data processing: A plot-level model for repeated measurements to extract the timing of key stages and quantities at defined time points
AID - 10.1101/2021.05.02.442243
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.05.02.442243
4099 - http://biorxiv.org/content/early/2021/05/02/2021.05.02.442243.short
4100 - http://biorxiv.org/content/early/2021/05/02/2021.05.02.442243.full
AB - Decision-making in breeding increasingly depends on the ability to capture and predict crop responses to changing environmental factors. Advances in crop modeling as well as high-throughput field phenotyping (HTFP) hold promise to provide such insights. Processing HTFP data is an interdisciplinary task that requires broad knowledge of experimental design, measurement techniques, feature extraction, dynamic trait modeling, and prediction of genotypic values using statistical models. To get an overview of sources of variation in HTFP, we develop a general plot-level model for repeated measurements. Based on this model, we propose a seamless stage-wise process that allows estimated means and variances to be carried forward from stage to stage and approximates the gold standard of a single-stage analysis. The process builds on the extraction of three intermediate trait categories: (1) timing of key stages, (2) quantities at defined time points or periods, and (3) dose-response curves. In a first stage, these intermediate traits are extracted from time series of low-level traits (e.g., canopy height) using P-splines and the quarter of maximum elongation rate method (QMER), as well as final height percentiles. In a second and third stage, extracted traits are further processed using a stage-wise linear mixed model analysis.
Using a wheat canopy growth simulation to generate canopy height time series, we demonstrate the suitability of the stage-wise process for traits of the first two above-mentioned categories. Results indicate that, for the first stage, the P-spline/QMER method was more robust than the percentile method. In the subsequent two-stage linear mixed model processing, weighting the second and third stage with error variance estimates from the previous stages reduced the root mean squared error. We conclude that processing phenomics data in stages represents a feasible approach, provided that appropriate weighting is used through all stages. P-splines in combination with the QMER method are suitable tools to extract timing of key stages and quantities at defined time points from HTFP data.
Highlights:
General plot-level model for repeated high-throughput field phenotyping measurements
Three main intermediate trait categories for dynamic modeling
Seamless stage-wise process that allows estimated means and variances to be carried forward
Phenomics data processing cheatsheet
Competing Interest Statement: The authors have declared no competing interest.
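Note: The abstract reports that weighting later stages with error variance estimates from earlier stages reduced the root mean squared error. The sketch below illustrates only the general idea of carrying estimates and their variances forward, using a plain inverse-variance weighted mean. It is not the authors' linear mixed model, and the function name and example numbers are hypothetical.

```python
def inverse_variance_mean(estimates, variances):
    """Combine stage-one estimates using weights of 1 / error variance.

    Hypothetical helper for illustration only; the record describes a
    stage-wise linear mixed model, which this simple mean does not replace.
    Returns the weighted mean and the variance of that mean.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]  # precise estimates count more
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, estimates)) / total
    return mean, 1.0 / total


# Two made-up plot-level estimates of the same trait; the second is noisier.
mean, variance = inverse_variance_mean([10.0, 20.0], [1.0, 4.0])
print(mean, variance)
```

With these made-up inputs the weighted mean (12.0) lies closer to the more precise estimate than the unweighted mean (15.0), and the returned variance can be passed on as the weight for a further stage.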