Reflectance spectroscopy allows rapid, accurate, and non-destructive estimates of functional traits from pressed leaves

More than ever, ecologists seek to employ herbarium collections to estimate plant functional traits from the past and across biomes. However, many trait measurements are destructive, which may preclude their use on valuable specimens. Researchers increasingly use reflectance spectroscopy to estimate traits from fresh or ground leaves, and to delimit or identify taxa. Here, we extend this body of work to non-destructive measurements on pressed, intact leaves, like those in herbarium collections. Using 618 samples from 68 species, we used partial least-squares regression to build models linking pressed-leaf reflectance spectra to a broad suite of traits, including leaf mass per area (LMA), leaf dry matter content (LDMC), equivalent water thickness, carbon fractions, pigments, and twelve elements. We compared these models to those trained on fresh- or ground-leaf spectra of the same samples. Our pressed-leaf models were best at estimating LMA (R2 = 0.932; %RMSE = 6.56), C (R2 = 0.855; %RMSE = 9.03), and cellulose (R2 = 0.803; %RMSE = 12.2), followed by water-related traits, certain nutrients (Ca, Mg, N, and P), other carbon fractions, and pigments (all R2 = 0.514-0.790; %RMSE = 12.8-19.6). Remaining elements were predicted poorly (R2 < 0.5, %RMSE > 20). For most chemical traits, pressed-leaf models performed better than fresh-leaf models, but worse than ground-leaf models. Pressed-leaf models were worse than fresh-leaf models for estimating LMA and LDMC, but better than ground-leaf models for LMA. Finally, in a subset of samples, we used partial least-squares discriminant analysis to classify specimens among 10 species with near-perfect accuracy (>97%) from pressed- and ground-leaf spectra, and slightly lower accuracy (>93%) from fresh-leaf spectra. These results show that applying spectroscopy to pressed leaves is a promising way to estimate leaf functional traits and identify species without destructive analysis. 
Pressed-leaf spectra might combine advantages of fresh and ground leaves: like fresh leaves, they retain some of the spectral expression of leaf structure; but like ground leaves, they circumvent the masking effect of water absorption. Our study has far-reaching implications for capturing the wide range of functional and taxonomic information in the world’s preserved plant collections.


Introduction
The world's herbaria together contain more than 390 million specimens (Thiers 2021), which are a rich source of information about global plant diversity. Herbarium specimens are collected for many reasons, often to document where a species is present or to serve as vouchers for taxonomic studies. But these specimens are often repurposed for new ends, unforeseen by their collectors (Meineke et al. 2018). More than ever, ecologists and evolutionary biologists seek to use herbarium specimens to measure functional traits (Heberling 2021): for example, to evaluate the long-term imprint of human activity on plant communities (Lang et al. 2018; Meineke et al. 2018); to fill in gaps in sparse trait databases (Perez et al. 2020); or to conduct comparative studies of clades (Jardine et al. 2020). Measuring functional traits on herbarium specimens carries the promise of letting us reach the inaccessible, including the past or distant parts of the world. Using herbarium specimens also allows researchers to benefit from the expertise of taxonomists and refer back to the same specimens for further use, for example as sources of genetic data or as references for species identification from new collections (Heberling 2021). Using specimens, researchers can address ecological and evolutionary questions that require merging functional, genetic, and distributional data at global scales.

Many functional trait measurements require destructive sampling, for example by grinding up tissue for chemical analyses. Such measurements include most protocols to determine the elemental or molecular

than the n chosen during the downsampling step. We applied the cross-validated PLS-DA model from the upsampling step to the validation subset and summarized its performance using raw classification accuracy and κ.
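As a small illustration of the two summary statistics named above, the following Python sketch computes raw classification accuracy and κ from observed and predicted species labels. It assumes that κ refers to the unweighted Cohen's kappa; the function name and any labels passed to it are hypothetical and are not taken from the analysis code behind this study.

```python
from collections import Counter


def accuracy_and_kappa(observed, predicted):
    """Return (raw accuracy, Cohen's kappa) for two equal-length label lists."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(observed)

    # Observed agreement: the fraction of samples classified correctly.
    p_obs = sum(o == p for o, p in zip(observed, predicted)) / n

    # Chance agreement: for each class, the product of its frequency among
    # observed labels and among predicted labels, summed over classes.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_chance = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n**2

    # Kappa is undefined when chance agreement is already perfect.
    if p_chance == 1.0:
        return p_obs, float("nan")
    return p_obs, (p_obs - p_chance) / (1.0 - p_chance)
```

Because κ discounts the agreement expected by chance, it is more conservative than raw accuracy when a few species dominate the validation subset.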

Results
Patterns in traits and reflectance spectra
We saw large variation among samples in each of our target traits within the CABO data, ranging from 1.4-fold variation in C to more than 20-fold variation in traits like lignin, P, K, and Zn (Table 2). The ranges of most traits in our dataset covered a large portion of the global distributions in the TRY dataset, but tended to be narrower at both extremes (Kattge et al. 2020). Many traits, including LMA, LDMC, EWT, cellulose, and many elements, had distributions with a pronounced skew (most often positive). Broadleaf trees tended to have higher LDMC, C, and lignin than other growth forms. Among the herbs, grasses had very high hemicellulose and cellulose and low lignin content, while forbs often had high N. Some of the trait variation was driven by specific projects; for example, A. flexuosa in the Warren project tended to have particularly high LMA, Na, and C and low N.

Both pressed and ground leaves had higher median reflectance across nearly the entire spectrum (Fig. 1), as expected based on changes in water content and structure (Carter 1991

Within each tissue type, the coefficient of variation (CV) of reflectance was generally highest where average reflectance was lowest. Across tissue types, pressed-leaf spectra tended to show greater absolute variation in reflectance throughout much of the spectrum, particularly towards the tails of the distribution (e.g. the middle 95 percent in Fig. 1B). The species that have the most exceptionally reflective pressed leaves across the spectrum (mainly Phragmites australis (Cav.) Trin. ex Steud., Phalaris arundinacea L., and Asclepias syriaca L.) do not have particularly reflective fresh leaves, leaving it uncertain why their pressed leaves are so reflective. In contrast, discolored leaves tended to have lower reflectance throughout the visible and NIR ranges (Fig. S5).
Unlike pressed leaves, ground leaves showed very low absolute variation in reflectance throughout the SWIR, likely because grinding eliminates variation in leaf structure. However, they showed high variation from 700 to 1100 nm, which may also result from varying degrees of discoloration.
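The distinction drawn above between relative variation (CV) and absolute variation in reflectance can be made concrete with a short Python sketch. It assumes that CV is the sample standard deviation divided by the mean, computed separately for each band; the function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def bandwise_cv(spectra):
    """Coefficient of variation (sd / mean) of reflectance for each band.

    `spectra` is a list of samples; each sample is a list of reflectance
    values, one per band, in the same band order. At least two samples
    are needed to compute a standard deviation.
    """
    n_bands = len(spectra[0])
    if any(len(sample) != n_bands for sample in spectra):
        raise ValueError("all spectra must have the same number of bands")

    cvs = []
    # zip(*spectra) regroups the values band by band across samples.
    for band_values in zip(*spectra):
        band_mean = mean(band_values)
        if band_mean == 0:
            cvs.append(float("nan"))
        else:
            cvs.append(stdev(band_values) / band_mean)
    return cvs
```

With the same absolute spread in two bands, the band with the lower mean reflectance has the higher CV, which is consistent with CV peaking where average reflectance is lowest.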

We compared pressed-leaf models to both fresh-leaf and ground-leaf models. The optimal number of components selected to predict each trait was between 3 and 26. For any given trait, ground-leaf models usually had the most components, often followed by pressed-leaf models (Table 2). Fresh-leaf models were best for predicting the structural and water-related traits: LMA, LDMC, and EWT (Figs. 2-3 and S6-12). Ground-leaf models were best for predicting chemical traits, like carbon fractions and most elements. For most traits, pressed-leaf models had intermediate performance, although for some (e.g. pigments, LDMC) both fresh- and ground-leaf models performed better. Statistics from jackknife analyses showed that model performance was more variable for traits that were predicted less accurately (Figs. S13-15). There was no correlation between the magnitude of residuals from pressed-leaf models and our discoloration index for any trait (p > 0.05).

For all traits except LMA and Fe, the VIP metric for fresh-leaf spectra showed a global maximum between 710 and 720 nm (Figs. 4 and S16-18), wavelengths slightly longer than the typical inflection point of the red edge (Richardson et al. 2002). Many traits also showed high VIP across the green hump at ~530-570 nm. Bands in the NIR range were less important for predicting most traits than much of the visible range. The SWIR range was generally important for predicting LMA, EWT, Na, and pigments, and many other traits showed several local peaks of importance, most prominently at about 1880 nm, but also near 1480 and 1720 nm.

For predicting traits from pressed-leaf spectra, the general trend held that visible reflectance and certain ranges in the SWIR were important for predicting most traits, while the NIR and much of the shorter SWIR (800-1750 nm) were less important (Figs. 4 and S16-18). The red edge peak of importance for most traits was near 705 nm. Other prominent local maxima for many traits lay close to 1440, 1720, 1920, 2130, and 2300 nm. We saw broadly similar patterns in ground-leaf spectra, except that VIP for most traits was lower at longer SWIR wavelengths (2000-2400 nm).
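For readers unfamiliar with how VIP relates to a fitted PLSR model, the Python sketch below implements one widely used formulation for a single response variable. This is an assumption: the exact implementation used for the analyses is not shown in this section, and the function and argument names are hypothetical. The inputs are the band weights, sample scores, and response loadings of an already fitted model.

```python
import math


def vip_scores(weights, scores, y_loadings):
    """VIP per band for a single-response PLS model.

    weights:    p x A nested list (bands x components) of X weights
    scores:     n x A nested list (samples x components) of X scores
    y_loadings: length-A list of response loadings, one per component
    """
    n_bands = len(weights)
    n_comp = len(y_loadings)

    # Response variation explained by each component: q_a^2 * (t_a . t_a).
    explained = [
        y_loadings[a] ** 2 * sum(row[a] ** 2 for row in scores)
        for a in range(n_comp)
    ]
    total_explained = sum(explained)

    # Squared length of each weight vector, used to normalize the weights.
    weight_norm_sq = [sum(row[a] ** 2 for row in weights) for a in range(n_comp)]

    vip = []
    for j in range(n_bands):
        # Each band's normalized squared weight, weighted by how much of the
        # response each component explains.
        contribution = sum(
            explained[a] * weights[j][a] ** 2 / weight_norm_sq[a]
            for a in range(n_comp)
        )
        vip.append(math.sqrt(n_bands * contribution / total_explained))
    return vip
```

Under this formulation the squared VIP values average to 1 across bands, so VIP expresses importance relative to the other bands rather than an absolute effect size, and it carries no sign.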

External validation
For most traits, pressed-leaf model performance on the external validation dataset from Cedar Creek was not quite as strong as the internal validation (Table 3; Fig. 5). For C, the models performed very poorly (R2 < 0.05). Among the remaining traits, R2 ranged from 0.345 (LDMC) to 0.876 (LMA). For N in particular, %RMSE was high (37.6%) due to bias: N concentrations were underestimated for conifers but slightly overestimated for remaining samples.

Since conifers were absent from the CABO training dataset, we considered whether the models we built would extend to this new functional group. For LDMC, models performed better when excluding conifers (R2 = 0.406; %RMSE = 22.8) than when retaining them (R2 = 0.345; %RMSE = 24.7). In contrast, for LMA and EWT, models performed better when retaining conifers. For LMA in particular, estimates for conifers were quite good, and their extension of the trait range raised R2 (from 0.752 to 0.876) and reduced %RMSE (from 16.5 to 10.3). Restricted-range models yielded somewhat better external validation R2 for N and LDMC both including and excluding conifers (

PLS-DA models using pressed- and ground-leaf spectra showed near-perfect performance at classifying species (Fig. 6). Models using fresh-leaf spectra were slightly worse but still showed strong performance. The optimal fresh-leaf model, which had 28 PLS components, correctly predicted the taxonomic identity

We show that we can estimate a wide range of leaf functional traits for 68 woody and herbaceous species from reflectance spectra of pressed leaves (

for LMA, C, and N, followed by a mixture of water-related traits, carbon fractions, pigments, and a few important elements (Ca, Mg, and P). Other elements could only be estimated with fairly low accuracy. These results show that pressed-leaf spectra provide an integrative measure of leaf phenotypes, much like fresh-leaf spectra (Cavender-Bares et al.
2017), but with stronger potential to characterize variation in chemical traits. Perhaps as a result, we could use pressed-leaf spectra to classify species as accurately as ground-leaf spectra, and better than fresh-leaf spectra. Our results underscore that reflectance spectroscopy on herbarium specimens could yield rapid and non-destructive estimates of many functional traits, enabling more expansive studies of trait variation across space and time.

Comparing PLSR model performance
We compared pressed-leaf models to fresh-leaf and ground-leaf models from the same samples. Our findings about which kind of tissue was best for predicting each trait mostly supported our hypotheses. Ground-leaf spectra showed the strongest performance for most chemical traits, likely because grinding homogenizes the lamina and removes the potentially confounding influence of leaf structure (Table 2). Pressed-leaf spectra showed intermediate performance for most chemical traits, perhaps because, like ground leaves, they lack the major water absorption features that mask the smaller features of other compounds in the SWIR range (Peterson et al. 1988). Contrary to our predictions, ground-leaf spectra performed about as well as fresh-leaf spectra, and better than pressed-leaf spectra, for estimating pigment concentrations. The same factors that provided an advantage to ground-leaf spectra in estimating chemical traits also perhaps explain why pressed- and (especially for LMA) ground-leaf spectra performed worse for estimating water-related and structural traits (LMA, LDMC, and EWT). Pressed-leaf models may represent a good compromise in allowing many traits to be estimated with mostly intermediate but nonetheless quite high accuracy.

Our pressed-leaf models often performed as well as fresh- and ground-leaf models published here and elsewhere. For example, our models for LMA had an RMSE (0.00970 kg m⁻²) lower than many fresh-leaf

patterns that encompass a wide range of trait variation. For some traits (e.g. EWT, N, K, Mn) many of the samples with the greatest errors were at the poorly sampled tails of the measured trait distribution, which suggests that more thorough sampling may be needed to ensure models can make reliable predictions at these extremes (Figs. 2 and 3). Nevertheless, our external validation analyses indicate that our models for some important traits (like LMA, LDMC, EWT, and N, but not C) can transfer reliably to other datasets, and even sometimes to new functional groups (Fig. 5).
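To show how the validation statistics discussed here relate to one another, the following Python sketch computes R2, RMSE, %RMSE, and bias from measured and predicted trait values. The definitions are assumptions made for illustration: R2 is taken as the squared Pearson correlation between measured and predicted values, and %RMSE as RMSE expressed as a percentage of the full measured range (the study may instead use a trimmed range). The function name is hypothetical.

```python
import math


def validation_stats(measured, predicted):
    """Return R2, RMSE, %RMSE, and bias for paired trait values."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need at least two paired values")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n

    # Residuals are predicted minus measured, so positive bias means
    # overestimation on average.
    residuals = [p - m for m, p in zip(measured, predicted)]
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)

    # Squared Pearson correlation (assumed definition of R2).
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("measured and predicted values must both vary")
    r2 = cov * cov / (var_m * var_p)

    # RMSE as a percentage of the measured range (assumed definition).
    pct_rmse = 100.0 * rmse / (max(measured) - min(measured))
    return {"r2": r2, "rmse": rmse, "pct_rmse": pct_rmse, "bias": bias}
```

Under these definitions a uniformly biased model can keep a high R2 while its %RMSE grows, which is one way a trait such as N can show high %RMSE driven largely by bias.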

Interpreting PLSR model performance
It may seem perplexing that we could succeed at all in predicting LMA from ground-leaf spectra, or LDMC and EWT from pressed- and ground-leaf spectra. The ability to estimate these traits cannot result from the optical expression of the traits themselves. We suggest that we instead sense these traits

may be so important because of its sensitivity to both chlorophyll content and leaf structure (Richardson et al. 2002). Much of the visible range was proportionally even more important for pressed- and ground-leaf models, and most of the NIR was less important (except for LMA; Figs. 4 and S16-18). There were multiple small VIP peaks in the SWIR. Although some (e.g. at 1440 and 1920 nm) lie within major water absorption features, any causal link to the leaf's fresh water content is unlikely for pressed and ground leaves. Many of these peaks also lie near broad absorption features for many components of dry matter,

With some exceptions, the VIP metric showed that the same bands are often important for predicting different traits. This pattern might be taken as an artifact of trait covariance: for example, the three pigment pools covaried strongly (R2 = 0.827-0.969) and had nearly identical VIP across the spectrum (Figs. S16-18). One might take similarities in VIP further to imply that there are a small number of traits whose tight coordination with others underlies the performance of all models through constellation effects. Nevertheless, across the whole dataset, many traits covaried only weakly but still shared VIP patterns. For example, EWT, cellulose, N, and K were not tightly coordinated (R2 = 0.003-0.152) but shared similar patterns of pressed-leaf VIP across the spectrum (Fig. 4), including peaks at 705 and 1920 nm. While VIP is a useful heuristic, it does not show the direction in which a band's reflectance alters trait estimates; the same bands may matter for different traits in different ways.
Here, similarities in VIP do not appear to result solely from strong networks of trait covariance. Nevertheless, the fact that we can estimate traits like LDMC and EWT from pressed-leaf spectra appears to imply some role for trait covariance, perhaps in a more diffuse way.

PLS-DA modeling for species classification
PLS-DA models showed that fresh-, pressed-, and ground-leaf spectra alike could be used to classify species, with perfect accuracy for ground leaves, near-perfect accuracy (>97%) for pressed leaves, and excellent accuracy (>93%) for fresh leaves (Fig. 6). In contrast to prior work that deliberately selected many congenerics (Lang et al. 2015), our most common species were often distantly related. Among the misidentified samples, most were mistaken for congenerics, which implies that related species are more

Our analysis reinforces that pressed- or ground-leaf models might be particularly suited to the task of classifying or delimiting species (Fig. 6). This finding is notable because measuring spectra of pressed leaves in an herbarium is also much simpler than measuring spectra of fresh leaves through an intensive field campaign across the range of a clade. We conjecture that these models have an advantage because drying reveals the absorption features of multiple compounds in the SWIR range that might together allow finer discrimination of species than water content does. Indeed, ground-leaf spectra have greater intrinsic dimensionality than fresh-leaf spectra (Kothari & Schweiger 2022), which suggests they have more independent axes of variation along which species may separate. Our results support the growing practice of using spectra of pressed herbarium specimens in species delimitation and identification (Prata

of functional traits.
But researchers may be deterred if they must each build their own models tailored to particular uses, and for herbarium specimens, it might not be possible to do the destructive trait measurements often needed to train the models. Ideally, spectral models would be general enough that researchers could confidently use them without further validation, but this aim is not easy to achieve: for several reasons, a model trained on any particular spectral dataset may make poor trait predictions on new data. As with any other technique, the goal for spectroscopic trait estimation is to improve model accuracy and generality as much as they can be jointly improved. Below, we discuss some challenges one by one, particularly as they concern pressed leaves.

One concern is that the new data could be outside the range of traits or optical properties in the training dataset (Schweiger 2020). A general model, if such a thing is possible, would need to represent the vast range of leaf functional traits and optical properties. Another concern about model generality involves sample preparation before spectral measurements. For example, particle size influences ground-leaf spectra (Foley et al. 1998). For pressed leaves, it may be particularly important to prepare samples in consistent ways that preserve the leaves' anatomical integrity. In our external validation analyses, we found that pressed-leaf models yielded reasonably accurate predictions of most traits, even though the validation dataset differed in sample preparation protocols and included conifers, which were absent from the training dataset. Nevertheless, even setting aside conifers, external validation for one trait (C) was

measured with different foreoptics, including integrating spheres, contact probes, or leaf clips. We used a leaf clip with pressed specimens because mounting delicate pressed leaves in an integrating sphere could damage them. While leaf clips and contact probes often have a higher signal-to-noise ratio, they are less likely to produce consistent measurements among instruments or replicate samples due to variation in viewing geometry and anisotropic surface reflectance (Petibon et al. 2021). The logistical constraint of having to use them on pressed leaves could thus make it harder to compare data among instruments. In theory, the greater inconsistency of leaf clip measurements could have reduced the performance of our pressed-leaf models compared to our integrating sphere-based fresh-leaf models, but we still found that the former performed better for most chemical traits.

Another challenge is that while many herbarium specimens are glued to a paper backing, measuring reflectance with a leaf clip or probe usually requires placing a black absorbing background under the sample to keep transmitted light from being reflected back into the sensor. When unattached leaves are not available, using spectra from these specimens may require new methods to correct for reflectance from the mounting paper.

EcoSIS, https://ecosis.org/; or the CABO data portal, https://data.caboscience.org/leaf) and spectral models (like EcoSML; https://ecosml.org/) will contribute to this goal. Lastly, we note that many of the same concerns about discrepancies among sampling and measurement protocols could arise when using existing spectral libraries to aid in species identification (Draper et al. 2020).

Implications for herbarium-based research
A particular challenge for herbarium specimens is that their optical and chemical properties (especially light-sensitive pigments) may degrade during preparation or storage. Such degradation could make it hard to distinguish changes in the traits of living plants over time from changes that arise during storage. Even in this study, where no specimens were collected before 2017, many underwent visible changes in color, including browning or blackening; ~12% were scored at 2 or higher, with large variation among species (e.g. 42% of Populus grandidentata specimens, but 0% of Betula papyrifera specimens). We found little evidence that such discoloration hinders trait estimation: both the results of our discoloration analyses and the similar performance of full- and restricted-range models suggest that PLSR is flexible enough to predict traits despite the variable influences of discoloration in our specimens. This capacity likely depends on using samples for model calibration that show a similarly wide range of discoloration.

Our specimens were collected no more than three years before measurement, but ecologists may want to use specimens collected decades ago. While our findings give reason to be optimistic that properly calibrated models could return accurate estimates of many traits from old or discolored specimens, it remains untested whether there are any limits to this potential. In general, not much is known about long-

Some of the challenges we describe pertain to projects that would measure spectra on samples already collected, but spectroscopy, like other novel uses for herbarium specimens, could also prompt changes in collection practice. For example, it underscores the potential value of gathering and storing extra leaf material (e.g. in fragment packets), which would circumvent the challenge of measuring mounted leaves and aid destructive analyses of herbarium specimens (Heberling 2021).
We propose that herbaria could also incorporate spectroscopy into their operations by measuring incoming specimens shortly after pressing, which could mitigate the challenges caused by mounting and degradation.

Linking spectral data measured on herbarium specimens to the digital record of the voucher could be a powerful tool to enable data synthesis, but it may require new informatic tools (

designed to store metadata about instrumentation and processing. We would advocate for coordination between herbarium managers and researchers who use reflectance spectroscopy, which could build agreement about best practices for spectral measurement and curation and allow data to be synthesized and compared across research groups.

We show that non-destructively measured pressed-leaf spectra retain much of the information about many leaf functional traits found in fresh-leaf spectra. While validating this technique on older specimens will require extensive further research, our findings suggest that reflectance spectroscopy could allow herbaria to take on a greater role in plant functional ecology and evolution. Our study has far-reaching implications for capturing the wide range of functional and phenotypic information in the world's preserved plant collections.

Acknowledgements
We performed this research on the ancestral and contemporary land, mostly unceded, of Native people:

Conflict of interest statement
The authors have no conflicts of interest to declare.

Author contributions
SK, JCB, and EL conceived the ideas and designed the methodology. RBR and SK collected the data. SK analyzed the data and wrote the first draft of the manuscript. All authors made substantial contributions to further drafts and gave final approval for publication.

Data availability
All fresh-leaf spectral data are available through the CABO data portal (https://data.caboscience.org/leaf). Upon publication, we will upload other spectral data, as well as metadata and trait data, to the Ecological Spectral Information System (EcoSIS, https://ecosis.org/), and upload models to the Ecological Spectral Model Library (EcoSML, https://ecosml.org/). At that stage, we will update this section accordingly.

Table 3: Summary statistics for external validation of pressed-leaf PLSR models. The models were trained on CABO data (see Table 2) and applied to a dataset collected at Cedar Creek.

Table 3 column groups: Including conifers; Excluding conifers

showed no visible discoloration, as would be indicated by loss of green color (e.g. browning or blackening). Specimens evaluated at 1 (C and D) showed <10% discoloration throughout the leaf, or in some cases a more subtle, pervasive silvery or matte white appearance to the leaf surface in areas that remained otherwise green. On close inspection, these two samples both show some browning in a small portion of the total leaf area. Note that we do not consider spots or marks due to pathogen damage that were present at sampling time to be a form of discoloration; we also exclude some immature anthocyanic leaves as in (A). When in doubt, we examined scans of fresh leaves taken shortly after they were

All photos in Figs. S2-4 depict specimens included in the study, but were taken in March 2022, nearly three years after we took pressed-leaf spectral measurements. Due to continuing discoloration, their assigned scores as part of this set of example images may be larger than they were at the time we measured their spectra; the latter are what we used for data analysis.

Fig. S5: (A) The wavelength-wise mean reflectance spectrum of leaves at each discoloration score. In general, more discolored specimens have lower visible and NIR reflectance and a blunted red edge. This representation does not separate changes in spectra due to discoloration from aspects of spectral variation that might correlate with predisposition to discoloration. (B) A histogram of the number of pressed specimens evaluated at each discoloration stage at the time of spectral measurements.

(left), pressed- (middle), and ground-leaf (right) spectra, displayed as in Fig. S6.

pressed- (middle), and ground-leaf (right) spectra, displayed as in Fig. S6.

Abbreviations: sol = solubles, hemi = hemicellulose, cell = cellulose, lign = lignin, car = total carotenoids.
The dashed horizontal line at 0.8 represents a heuristic threshold for importance suggested by Burnett et al. (2021).

Fig. S17: The variable importance of projection (VIP) metric calculated for all traits from pressed-leaf spectra. Abbreviations: sol = solubles, hemi = hemicellulose, cell = cellulose, lign = lignin, car = total carotenoids. The dashed horizontal line at 0.8 represents a heuristic threshold for importance suggested by Burnett et al. (2021).

Fig. S18: The variable importance of projection (VIP) metric calculated for all traits from ground-leaf spectra. Abbreviations: sol = solubles, hemi = hemicellulose, cell = cellulose, lign = lignin, car = total carotenoids. The dashed horizontal line at 0.8 represents a heuristic threshold for importance suggested by Burnett et al. (2021).