Image-based methods for phenotyping growth dynamics and fitness components in Arabidopsis thaliana

François Vasseur; George Wang; Justine Bresson; Rebecca Schwab; Detlef Weigel

doi:10.1101/208512

Abstract

Background The model species Arabidopsis thaliana has extensive resources to investigate intraspecific trait variability and the genetic bases of ecologically relevant traits. However, the cost of equipment and software required for high-throughput phenotyping is often a bottleneck for large-scale studies, such as mutant screening or quantitative genetics analyses. Simple tools are needed for the measurement of fitness-related traits, like relative growth rate and fruit production, without investment in expensive infrastructure. Here, we describe methods that enable the estimation of biomass accumulation and fruit number from the analysis of rosette and inflorescence images taken with a regular camera.

Results We developed two models to predict plant dry mass and fruit number from the parameters extracted with the analysis of rosette and inflorescence images. Predictive models were trained by sacrificing growing individuals for dry mass estimation, and manually measuring a fraction of individuals for fruit number at maturity. Using a cross-validation approach, we showed that quantitative parameters extracted from image analysis predict more than 90% of the variation in both plant dry mass and fruit number. When used on 451 natural accessions, the method allowed modelling growth dynamics, including relative growth rate, throughout the life cycle of various ecotypes. Estimated growth-related traits had high heritability (0.65 < H² < 0.93), as did estimated fruit number (H² = 0.68). In addition, we validated the method for estimating fruit number with rev5, a mutant with increased flower abortion.

Conclusions The method we propose here is based on the automated computerization of plant images with ImageJ, and subsequent statistical modelling in R. It allows plant biologists to measure growth dynamics and fruit number in hundreds of individuals with simple computing steps that can be repeated and adjusted to a wide range of laboratory conditions. It is thus a flexible toolkit for the measurement of fitness-related traits in large plant populations.

Background

Relative growth rate (RGR) and fruit number are two essential parameters of plant performance and fitness [1–3]. Proper estimation of RGR is achieved with the destructive measurement of plant biomass across several individuals sequentially harvested [4,5]. However, sequential harvesting is space- and time-consuming, which makes this approach inappropriate for large-scale studies. Furthermore, it is problematic for evaluating measurement error, as well as for comparing growth dynamics and fitness-related traits, like fruit production, on the same individuals. Thus, a variety of platforms and equipment have been developed in the last decade for high-throughput phenotyping of plant growth from image analysis, specifically in crops [6–10] and in the model species A. thaliana [11–14]. Because commercial technologies are powerful but generally expensive [6,8,11,13], low-cost methods have been proposed, for instance to estimate rosette expansion rate from sequential imaging of A. thaliana individuals [14–16]. These methods can be adapted to a variety of lab conditions, but they do not allow the quantification of complex traits like biomass accumulation, RGR and fruit production.

Strong variation in RGR has been reported across and within plant species [17–22], which has been assumed to reflect the inherent diversity of strategies to cope with contrasting levels of resource availability [3,23,24]. For instance, species from resource-scarce environments generally show a lower RGR than species from resource-rich environments, even when they are grown in non-limiting resource conditions [25,26]. Ecophysiological studies [18,26] have shown that plant RGR depends on morphological traits (e.g. leaf mass fraction, leaf dry mass per area) and physiological rates (e.g. net assimilation rate) that differ between species, genotypes, or ontogenetic stages. For instance, plants become less efficient at accumulating biomass as they get larger and older, resulting in a decline of RGR during ontogeny [4]. This is due to developmental and allometric constraints such as self-shading and increasing allocation of biomass to supporting structures, like stems, in growing individuals.

To assess plant performance, response to environment, or genetic effects, it is important to link an individual’s growth trajectory to productivity, yield or reproductive success. However, while several methods have been proposed to estimate growth dynamics from image analysis [8,11–16], methodologies for automated, high-throughput phenotyping of fruit number per plant remain surprisingly scarce [27,28]. Yet, the analysis of inflorescence images in A. thaliana could offer a valuable tool to connect growth dynamics and plant fitness. Because of their small size, inflorescences can easily be collected, imaged and analyzed with simple equipment. Furthermore, the genetic resources available in this species enable large-scale analyses (mutant screening, quantitative trait loci mapping and genome-wide association studies). For instance, the recent release of 1,135 natural accessions with complete genomic sequences allows large comparative analyses of phenotypic variation within the species (http://1001genomes.org/) [29].

With the methods proposed here, we aimed at developing flexible and customizable tools based on the automated computerization and analysis of plant images to estimate fruit number and growth dynamics, including RGR throughout the life cycle, of many A. thaliana genotypes. The estimation of biomass accumulation is semi-invasive, as it requires sacrificing some individuals to train a predictive model. This approach considerably reduced the number of plants needed to estimate RGR during ontogeny, from seedling establishment to fruiting. Furthermore, the estimation of fruit number from automated image analysis of A. thaliana inflorescence could greatly help link growth variation to plant performance and fitness, in various genotypes and environmental conditions.

Results

Estimation of biomass accumulation, RGR and growth dynamics

Description

The method for growth analysis requires a set of rosette images on which we want to non-destructively measure plant dry mass, and a set of individuals harvested to train a predictive model (Fig. 1). In the case study presented here, we evaluated the method on 472 genotypes of A. thaliana grown in trays using a growth chamber equipped with a Raspberry Pi Automated Phenotyping Array (hereafter RAPA) built at the Max Planck Institute (MPI) of Tübingen. We partitioned the whole population (n = 1920) into two subpopulations: the focal population (n = 960) on which growth dynamics (and fruit production) were measured, and the training population (n = 960) on which a predictive model of plant dry mass was developed.

Figure 1. Estimation of plant dry mass from image analysis and statistical modelling.

(a) Example of sequential tray images, analyzed with ImageJ to extract individual rosette shape descriptors during ontogeny. (b) Rosette dry mass measured at 16 DAG in the training population. (c) Series of cross-validations performed for different predictive models with different training population sizes (x axis). Dots represent mean prediction accuracy, measured as the squared Pearson’s coefficient of correlation (r²) between observed and predicted values. Error bars represent 95% CI across 100 random permutations of the training dataset. (d) Correlation between observed and predicted values for cross-validation of the best model obtained with stepwise regression, performed with 60 individuals to train the model, and tested on 300 individuals not used to train the model.

Individuals of the focal population were photographed daily during ontogeny (Fig. 1a), and harvested at the end of reproduction when the first fruits (siliques) were yellowing (stage 8.00 according to Boyes et al. [30]). Top-view images were automatically taken with RAPA, as well as with a manual setup during the first 25 days of plant growth (Fig. S1). To show the general applicability of the method, here we only used images manually taken with a regular camera. Moreover, the method can also be applied to images taken on individual pots. Plants of the training population were harvested at 16 days after germination (DAG), dried and weighed for building a predictive model of rosette biomass with top-view images (Fig. 1b). Predictive models were trained and evaluated with a cross-validation approach (Fig. 1c). Once a predictive model has been chosen and validated, rosette dry mass can be non-destructively estimated on all individuals of the focal population, which allows modelling growth trajectory, biomass accumulation and RGR throughout the plant life cycle.

Implementation

We developed an ImageJ [31] macro to extract shape descriptors of the rosette from tray or individual pot images (Fig. 1a, Additional File 1). To run the macro, users need to install ImageJ, go to Plugins > Macros > Run, and select the “RAPAmacro_RosetteShape.txt” file in the corresponding folder. The macro guides users through the different steps of image analysis: labelling individual plants, performing segmentation and measuring rosette shape descriptors. It processes all images (trays or individual pots) present in an input folder, and returns shape descriptors of individual rosettes in an output folder defined by users. Shape descriptors include individual rosette area and perimeter in pixels, rosette circularity (Circ = 4π × area / perimeter²), aspect ratio (AR = major axis length / minor axis length), and roundness (Round = 4 × area / (π × major axis length²)). Rosette area and perimeter can be converted into cm² and cm, respectively, by measuring the area and perimeter of a surface calibrator defined by users.
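The shape descriptors listed above can be illustrated with a minimal Python sketch. It is not the ImageJ macro itself: the function names are hypothetical, the formulas follow ImageJ's standard shape-descriptor definitions, and the pixel-to-cm conversion assumes a calibrator of known area imaged at the same scale as the rosettes.

```python
import math

def shape_descriptors(area, perimeter, major_axis, minor_axis):
    """Circularity, aspect ratio and roundness from basic measurements.

    All inputs share one length unit (pixels here); the three outputs
    are dimensionless.
    """
    return {
        "Circ": 4 * math.pi * area / perimeter ** 2,
        "AR": major_axis / minor_axis,
        "Round": 4 * area / (math.pi * major_axis ** 2),
    }

def pixels_to_cm(area_px, perimeter_px, calib_area_px, calib_area_cm2):
    """Convert area and perimeter using a calibrator of known area.

    Assumption: square pixels, so the length scale is the square root
    of the area scale.
    """
    cm2_per_px = calib_area_cm2 / calib_area_px
    cm_per_px = math.sqrt(cm2_per_px)
    return area_px * cm2_per_px, perimeter_px * cm_per_px

# Sanity check on a perfect circle of radius 100 px: all descriptors equal 1.
radius = 100.0
circle = shape_descriptors(math.pi * radius ** 2, 2 * math.pi * radius,
                           2 * radius, 2 * radius)

# A 50 x 50 px square imaged with a calibrator of 10,000 px = 4 cm².
area_cm2, perimeter_cm = pixels_to_cm(2500, 200, 10000, 4.0)
```

A compact, circular rosette has Circ close to 1, whereas a rosette with long petioles and separated leaves has a much lower value because its perimeter is long relative to its area.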

Predictive models of plant dry mass from shape descriptors were tested against measurements in the training population (Additional File 2). Depending on the training population size, prediction accuracy varied between models, as measured by the squared coefficient of correlation (r²) between measured and predicted rosette dry mass in individuals not used to train the model (Fig. 1c). LASSO and RIDGE models reached high prediction accuracy even with a very small training population (< 20 individuals). However, with a minimum of 50 training individuals, linear models (lm) and RIDGE/LASSO performed equally well, with a prediction accuracy (r²) > 0.90. Using stepwise regression, we showed that rosette area (RA) and circularity (Circ) alone, used as predictors in a simple linear model, reach high prediction accuracy. Thus, the final equation we used to estimate rosette dry mass from rosette pictures was Rosette DM = −0.00133 + 0.00134 × RA + 0.00274 × Circ (cross-validation r² = 0.91, Fig. 1d).
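As an illustration of the final equation and of the repeated random-split cross-validation, here is a minimal sketch in plain Python. It is not the R script of Additional File 2: the function names are hypothetical, the units of RA and DM are those of the source calibration (not restated here), and the data are synthetic, noise-free values generated from the published equation purely to exercise the code.

```python
import random

def predict_rosette_dm(ra, circ):
    """Final linear model reported in the text."""
    return -0.00133 + 0.00134 * ra + 0.00274 * circ

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return sxy ** 2 / (sxx * syy)

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    M = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(M[row][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def cross_validate(ra, circ, dm, n_train, n_perm=100, seed=0):
    """Mean test-set r² over n_perm random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(dm)))
    scores = []
    for _ in range(n_perm):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        b = fit_ols([[1.0, ra[i], circ[i]] for i in train],
                    [dm[i] for i in train])
        pred = [b[0] + b[1] * ra[i] + b[2] * circ[i] for i in test]
        scores.append(r_squared([dm[i] for i in test], pred))
    return sum(scores) / len(scores)

# Synthetic, noise-free illustration data (NOT measurements from the study).
ra = [0.5 * i for i in range(1, 31)]
circ = [0.3 + 0.05 * ((7 * i) % 11) for i in range(1, 31)]
dm = [predict_rosette_dm(a, c) for a, c in zip(ra, circ)]
mean_r2 = cross_validate(ra, circ, dm, n_train=15)
```

With real data, `n_train` would be varied (for example from fewer than 20 to 60 individuals) to reproduce the kind of accuracy curve shown in Fig. 1c.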

Application

From estimated rosette dry mass during ontogeny and final rosette dry mass measured on the same individuals at maturity, we modelled sigmoid growth curves of biomass accumulation (mg), M(t), for all individuals in the focal population with a three-parameter logistic function [4,32] (Fig. 2a and 2b), as in Equation 1:

$$M(t) = \frac{A}{1 + e^{(t_{inf} - t)/B}}$$

where A, B and t_inf are the parameters characterizing the shape of the curve, which differ between individuals depending on the genotypes and/or environmental conditions. A is the upper asymptote of the sigmoid curve, which was measured as rosette dry mass (mg) at maturity. The duration of growth was estimated as the time in days between the beginning of growth after vernalization (t₀) and the end of reproduction (maturity). B controls the steepness of the curve, as the inverse of the exponential growth coefficient r (r = 1/B). t_inf is the inflection point that, by definition, corresponds to the point where the rosette reaches half its final dry mass. Both B and t_inf were estimated for every individual by fitting a logistic growth function to the data in R (Additional File 3).
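The three-parameter logistic described above can be sketched in Python as follows. This is an illustration, not the R code of Additional File 3: with A fixed at the measured final dry mass, the sketch estimates B and t_inf by a linearised least-squares fit, ln(A/M − 1) = t_inf/B − t/B, which is a simpler stand-in for a nonlinear fit. Function names and the example values are hypothetical.

```python
import math

def logistic_mass(t, A, B, t_inf):
    """Three-parameter logistic: rosette dry mass at time t."""
    return A / (1.0 + math.exp((t_inf - t) / B))

def fit_B_tinf(times, masses, A):
    """Estimate B and t_inf with A fixed.

    Uses the linearisation ln(A/M - 1) = t_inf/B - t/B, so a straight-line
    fit against t has slope -1/B and intercept t_inf/B. Points with
    M <= 0 or M >= A cannot be transformed and are skipped.
    """
    pts = [(t, math.log(A / m - 1.0))
           for t, m in zip(times, masses) if 0 < m < A]
    n = len(pts)
    mt = sum(t for t, _ in pts) / n
    my = sum(y for _, y in pts) / n
    slope = (sum((t - mt) * (y - my) for t, y in pts)
             / sum((t - mt) ** 2 for t, _ in pts))
    intercept = my - slope * mt
    B = -1.0 / slope
    return B, intercept * B

# Hypothetical individual: A = 50 mg, B = 4 d, t_inf = 30 d.
times = list(range(5, 60, 5))
masses = [logistic_mass(t, 50.0, 4.0, 30.0) for t in times]
B_hat, t_inf_hat = fit_B_tinf(times, masses, A=50.0)
half_mass = logistic_mass(30.0, 50.0, 4.0, 30.0)  # equals A / 2 at t_inf
```

The linearisation gives heavy weight to points near 0 and near A, so with noisy estimates a nonlinear least-squares fit on the original scale is usually preferable; the linear version is mainly useful for obtaining starting values.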

Figure 2. Application of the dry mass estimation method to model growth dynamics in A. thaliana.

Statistical modelling of rosette dry mass during ontogeny, M(t), with three-parameter logistic growth curve, on one individual (a) and 451 natural accessions (b); absolute growth rate during ontogeny, GR(t), on one individual (c) and the 451 accessions (d); relative growth rate during ontogeny, RGR(t), on one individual (e) and the 451 accessions (f). t_inf (red dashed line) represents the inflection point of the growth curve. Black dashed lines in right panels represent 95% CI of fitted growth curve. Individuals on the right panels are coloured by duration (days) of plant life cycle. (g-i) Variation of M(t_inf), GR(t_inf) and RGR(t_inf) across the 451 accessions phenotyped, with broad-sense heritability (H²) on the top-left corner of each panel. Dots represent genotypic mean ± standard error (n = 2).

Growth dynamics variables were computed from the fitted parameters, such as the absolute growth rate, GR(t), the derivative of the logistic growth function (Fig. 2c and 2d), as in Equation 2:

$$GR(t) = r \, M(t) \left(1 - \frac{M(t)}{A}\right)$$

and the relative growth rate (mg d⁻¹ g⁻¹), RGR(t), measured as the ratio GR(t) / M(t) (Fig. 2e and 2f), as in Equation 3:

$$RGR(t) = r \left(1 - \frac{M(t)}{A}\right)$$

where r = 1/B.
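The growth rate variables follow directly from the logistic function, as the short Python sketch below illustrates. Function names and parameter values are hypothetical. With M in mg and t in days, the ratio GR/M is in d⁻¹; multiplying by 1000 expresses it in mg d⁻¹ g⁻¹.

```python
import math

def logistic_mass(t, A, B, t_inf):
    """Rosette dry mass M(t) from the three-parameter logistic."""
    return A / (1.0 + math.exp((t_inf - t) / B))

def growth_rate(t, A, B, t_inf):
    """Absolute growth rate GR(t) = r * M * (1 - M/A), with r = 1/B."""
    m = logistic_mass(t, A, B, t_inf)
    return (1.0 / B) * m * (1.0 - m / A)

def relative_growth_rate(t, A, B, t_inf):
    """Relative growth rate RGR(t) = GR(t) / M(t) = r * (1 - M/A)."""
    m = logistic_mass(t, A, B, t_inf)
    return (1.0 / B) * (1.0 - m / A)

# Hypothetical individual: A = 50 mg, B = 4 d, t_inf = 30 d.
A, B, t_inf = 50.0, 4.0, 30.0
gr_at_inf = growth_rate(t_inf, A, B, t_inf)            # A / (4 B)
rgr_at_inf = relative_growth_rate(t_inf, A, B, t_inf)  # 1 / (2 B), in d^-1
rgr_mg_per_g = 1000.0 * rgr_at_inf                     # in mg d^-1 g^-1

# Check GR against a central finite difference of M(t).
h = 1e-5
numeric_gr = (logistic_mass(20 + h, A, B, t_inf)
              - logistic_mass(20 - h, A, B, t_inf)) / (2 * h)
```

At the inflection point M = A/2, so GR reaches its maximum A/(4B) while RGR has already fallen to half its initial upper bound r.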

Comparing growth traits measured at t_inf, i.e. when GR is maximal for all individuals [4], revealed substantial variation between accessions (Fig. 2g-i), with a large share of phenotypic variance accounted for by genetic variability, as measured by broad-sense heritability (H² = 0.93, 0.90 and 0.65 for M(t_inf), GR(t_inf) and RGR(t_inf), respectively). To evaluate the robustness of the method, we repeated the experiment on 18 accessions selected for their highly contrasting phenotypes (Fig. S2). Results showed a good correlation between the rosette dry mass at the inflection point estimated in the first experiment and the dry mass measured in the second experiment (r² = 0.67; Fig. S3a).
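For readers unfamiliar with broad-sense heritability, the sketch below shows one common estimator for a balanced design, based on one-way ANOVA variance components. This section does not state which estimator was used, so this is an assumption for illustration only; the function name and the toy values are hypothetical.

```python
def broad_sense_h2(values_by_genotype):
    """H² = Vg / (Vg + Ve) from a balanced one-way ANOVA.

    values_by_genotype maps each genotype to a list of replicate values,
    with the same number of replicates n for every genotype.
    Ve is the within-genotype mean square; Vg = (MS_between - MS_within) / n.
    """
    groups = list(values_by_genotype.values())
    g, n = len(groups), len(groups[0])
    means = [sum(v) / n for v in groups]
    grand = sum(means) / g
    ms_between = n * sum((m - grand) ** 2 for m in means) / (g - 1)
    ms_within = sum((x - m) ** 2
                    for v, m in zip(groups, means) for x in v) / (g * (n - 1))
    var_g = max((ms_between - ms_within) / n, 0.0)  # truncate negatives at 0
    return var_g / (var_g + ms_within)

# Toy example with three genotypes and n = 2 replicates each.
toy = {"acc1": [1.0, 3.0], "acc2": [5.0, 7.0], "acc3": [9.0, 11.0]}
h2 = broad_sense_h2(toy)  # MS_between = 32, MS_within = 2, so H² = 15/17
```

Mixed-model estimators are often used instead when the design is unbalanced or includes additional factors such as trays or experiments.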