Quantitative analysis of spore shapes improves identification of fungi

Morphology of organisms is an important source of evidence for biodiversity assessment, taxonomic decisions, and understanding of evolution. Shape information about zoological and botanical objects is often treated quantitatively, and in this form it improves species identification. In studies of fungi, quantitative shape analysis has been almost ignored. The disseminated propagules of fungi, the spores, are crucial for fungal taxonomy, but they are currently characterised by linear measurements or subjectively defined shape categories. It remains unclear how much quantifying spore shape information can improve species identification. In this study, we tested the hypothesis that shape, as a richer source of information, outperforms size in automated identification of fungal species. We used fungi of the genus Subulicystidium (Agaricomycetes, Basidiomycota) as a study object. We analysed 2D spore shape data via elliptic Fourier and principal component analyses. With flexible discriminant analysis, we achieved a slightly higher species identification success rate for shape predictors (61.5%) than for size predictors (59.1%). However, we achieved the highest rate for a combination of both (64.7%). We conclude that quantifying fungal spore shapes is worth the effort. We provide an open access protocol which, we hope, will stimulate a broader use of quantitative shape analysis in fungal taxonomy. We also discuss the challenges of such analyses that are specific to fungal spores.


In eukaryotic organisms, morphology is an important source of evidence for biodiversity assessment, taxonomic decisions, and understanding of evolutionary and ecological processes. Morphological information is quickly accessible and can be processed at relatively low cost. Therefore, it is broadly used by both researchers and citizen scientists. Morphological information, on the one hand, provides a starting point for molecular analyses and, on the other hand, serves as reference data to validate the molecular results [1].

Morphology covers two principal concepts: size and shape. The former is easier to measure and dominated for more than a century in the quantitative analyses known as traditional morphometrics [2]. Although important, size alone is often insufficient for the delimitation of species or populations. For example, within a single taxonomic group of diatom algae (Bacillariophyceae), linear measurements allowed species to be delimited in some genera [3] but not in others [4]. The authors of the latter study concluded that involving quantitative shape descriptors, in addition to size, would make delimitation of taxa more efficient. It is the discipline of geometric morphometrics that aims to quantify shape, i.e. the geometric information about an object that remains after removing the effects of location, rotation, and scale [5]. Tools of geometric morphometrics allow the shape information to be split into symmetric and asymmetric components, which can then be analysed separately [6]. Besides dealing with richer data of a numeric nature, geometric morphometrics allows reconstruction of the original look of the object after the analyses, a valuable property that traditional morphometrics does not offer [1,2].

Fungi are among the most species-rich organism groups on Earth [7]. Generally, for morphology-based fungal taxonomy, the features of the disseminated propagules, the spores, are of the highest priority among phenotypic characters [8]. However, there is a difference in how the size and shape of spores are usually treated. Spore size is routinely used for species delimitation, mostly in the form of quantitative traits such as length and width [9,10]. Spore shape also plays a big role but has been treated differently. The length to width ratio is frequently used as a proxy for spore shape [9].

Image acquisition and pre-processing

We acquired and pre-processed images from light microscopy as described in detail in our online protocol [29]. In this paper, we highlight the most essential steps and illustrate the image processing workflow in Fig 2. We performed all work on images on a desktop computer with a 64-bit Windows 10 operating system (build 19041). We obtained images of spores from squash preparations of fungal herbarium specimens examined at 1000× magnification (Fig 2A). […] We started with grayscaling the BMP images in the program ChainCoder (Fig 2C).

Then we converted the images to binary (assigning white pixels to spores and black to background) based on a threshold that was mostly automatically selected by ChainCoder and was only rarely adjusted manually. To remove the shape artifacts, we applied an erosion-dilation filter (that excludes the noisy pixels from the outline) and rarely also a dilation-erosion filter (to fill in artifact cavities).
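The binarisation and filtering steps can be illustrated with a minimal sketch in plain Python. It is not ChainCoder's implementation: the 3×3 square neighbourhood, the treatment of pixels outside the image as background, the assumption that spore pixels are brighter than the threshold, and all function names are assumptions made for illustration only.

```python
def binarize(gray, threshold):
    """Map a grayscale grid to 1 (spore, assumed brighter) or 0 (background)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


def _filter_3x3(img, require_all):
    """Apply a 3x3 minimum (erosion) or maximum (dilation) filter to a 0/1 grid."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Pixels outside the image are treated as background (assumption).
            window = [
                img[yy][xx] if 0 <= yy < height and 0 <= xx < width else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = int(all(window)) if require_all else int(any(window))
    return out


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return _filter_3x3(img, require_all=True)


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return _filter_3x3(img, require_all=False)


def erosion_dilation(img):
    """Erosion followed by dilation: removes small noisy pixels."""
    return dilate(erode(img))


def dilation_erosion(img):
    """Dilation followed by erosion: fills small cavities."""
    return erode(dilate(img))
```

In this sketch, an isolated foreground pixel disappears after `erosion_dilation`, while a one-pixel hole inside a solid region is filled by `dilation_erosion`. Objects touching the image border are trimmed by the border rule chosen here.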

Then a chain code, i.e. a sequence of x and y coordinates describing the outline, was recorded for each spore. We imported the chain codes into the Ch2Nef program [35] and checked that all spores had the same orientation, i.e. with the spore hilar appendix placed in the upper left quarter of the image (Fig 2D). […] Fourier Descriptors, we used the approach based on the first harmonic [35]. Using the NefViewer program, we checked that the NEFDs were alignable (Fig 2E).
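To make the idea of first-harmonic normalisation concrete, the following sketch computes elliptic Fourier coefficients from a closed outline and normalises them using the ellipse of the first harmonic, following the commonly described Kuhl and Giardina formulation. It is an illustration under stated assumptions, not the code of Ch2Nef: the function names are hypothetical, the outline is assumed to be a list of (x, y) points without repeated consecutive points, and the start-point ambiguity that the orientation check above addresses is not handled.

```python
import math


def elliptic_fourier(outline, n_harmonics):
    """Return [(a_n, b_n, c_n, d_n), ...] for a closed polygon of (x, y) points."""
    k = len(outline)
    dx = [outline[(i + 1) % k][0] - outline[i][0] for i in range(k)]
    dy = [outline[(i + 1) % k][1] - outline[i][1] for i in range(k)]
    dt = [math.hypot(u, v) for u, v in zip(dx, dy)]  # segment lengths
    t = [0.0]
    for step in dt:
        t.append(t[-1] + step)  # cumulative arc length
    total = t[-1]
    coeffs = []
    for n in range(1, n_harmonics + 1):
        scale = total / (2.0 * n * n * math.pi ** 2)
        w = 2.0 * n * math.pi / total
        a = b = c = d = 0.0
        for i in range(k):
            dcos = math.cos(w * t[i + 1]) - math.cos(w * t[i])
            dsin = math.sin(w * t[i + 1]) - math.sin(w * t[i])
            a += dx[i] / dt[i] * dcos
            b += dx[i] / dt[i] * dsin
            c += dy[i] / dt[i] * dcos
            d += dy[i] / dt[i] * dsin
        coeffs.append((scale * a, scale * b, scale * c, scale * d))
    return coeffs


def normalize_first_harmonic(coeffs):
    """Remove start-point phase, rotation, and size using the first harmonic."""
    a1, b1, c1, d1 = coeffs[0]
    # Phase shift that moves the start point onto the major axis of the first ellipse.
    theta = 0.5 * math.atan2(2.0 * (a1 * b1 + c1 * d1),
                             a1 ** 2 + c1 ** 2 - b1 ** 2 - d1 ** 2)
    shifted = []
    for n, (a, b, c, d) in enumerate(coeffs, start=1):
        cs, sn = math.cos(n * theta), math.sin(n * theta)
        shifted.append((a * cs + b * sn, -a * sn + b * cs,
                        c * cs + d * sn, -c * sn + d * cs))
    a1, _, c1, _ = shifted[0]
    psi = math.atan2(c1, a1)     # orientation of the major axis
    size = math.hypot(a1, c1)    # semi-major axis length
    cp, sp = math.cos(psi), math.sin(psi)
    return [((cp * a + sp * c) / size, (cp * b + sp * d) / size,
             (-sp * a + cp * c) / size, (-sp * b + cp * d) / size)
            for a, b, c, d in shifted]
```

After this normalisation, the first harmonic has a1 = 1 and b1 = c1 = 0 (up to rounding), so the remaining coefficients describe shape independently of the position, rotation, and size of the outline, provided the start point is chosen consistently.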

We manually combined the NEFDs for individual specimens into a single .txt file (Fig 2F).

Maximum likelihood phylogenetic analysis showed all ten Subulicystidium species on the phylogenetic tree as clearly separate clades, mostly with high bootstrap support (Fig 3). […] In the discriminant analysis, the species identification success rate among the individual trait groups was highest for the global shape variation (61.5%, Fig 6) […]

Discussion

Quantitative analysis of shapes helps to better identify and describe organisms. In studies of fungi, despite their immense morphological diversity, quantitative shape analysis has been almost ignored. In this study, we confirmed the hypothesis that shape, as a richer source of information, outperforms size during automated identification of fungal species by their spores. The highest identification success rate was achieved in a discriminant model that combined shape and size descriptors. The symmetric shape variation outperformed the classical length to width ratio. In general, we found that

(i) It is possible to adequately extract the shape information from microscopy images of fungal spores;
(ii) It is possible to recognise by eye the spore shape differences reconstructed after the multivariate analyses;
(iii) It is justified to split the spore shape information into symmetric and asymmetric portions for separate analyses.

As quantitative shape analysis has been barely applied in mycology, we first had to ensure that available tools for extracting shape information can be used for our goal. Many of the tools were

With our study design, we observed high correlations between several trait variables. The most important to note is the correlation between the shape descriptors (PC1 of symmetric and asymmetric variation) and size descriptors (especially length). This is due to our choice to represent size as linear measurements, which are known to be geometrically dependent on shape [2].
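A toy calculation can illustrate why a linear measurement is geometrically tied to shape. The sketch below does not use our data: it assumes idealised elliptical outlines of equal area, uses the length to width ratio as a stand-in shape descriptor, and all names and values are hypothetical. Even with overall size (area) held constant, length changes with elongation, so the two variables are strongly correlated.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ellipse_length(area, elongation):
    """Major-axis length of an ellipse with the given area and length/width ratio."""
    # area = pi * (L / 2) * (W / 2) and W = L / elongation
    # => L = 2 * sqrt(area * elongation / pi)
    return 2.0 * math.sqrt(area * elongation / math.pi)


# Hypothetical outlines: same area, increasingly elongated shape.
elongations = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
lengths = [ellipse_length(30.0, q) for q in elongations]
r = pearson(elongations, lengths)
print(f"correlation between elongation and length at constant area: {r:.3f}")
```

A size variable that is not built from a single linear dimension, such as area in this toy case, avoids this particular dependence.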

We kept linear measurements as size descriptors to demonstrate their properties and to explore the performance of these classical variables for species identification. We also generated a size variable that is a combination of separate linear measurements and is less dependent on shape […] other fungal taxa, the lateral and frontal faces are both important [28] and their shapes should be analysed with the same attention. The authors of [32] combined three faces of the objects to achieve a high identification success rate. In fungi, it may be easy to identify the spore as exposed in the lateral face if its hilar appendix is large enough. Unfortunately, this is not the case for all species. Special care should be taken to identify the position of the hilar appendix in needle-like spores (as in S. cochleum and S. perlongisporum in our dataset). Applying scanning electron microscopy may help not only to identify the hilar appendix correctly but also to get a detailed picture of the surface of the fungal spore. While our study focused on fungi with smooth spores, the spores of many taxa bear additional projections such as warts, ridges, or distinct spines. These elements would be difficult to capture with the elliptic Fourier descriptors because of the mathematical properties of the method [1]. Therefore, ornamented spores of fungi give a chance to bring another approach, the landmark-based methods of geometric morphometrics, to mycology. These can be implemented in two-dimensional and even three-dimensional space. Finally, the different properties of the spores in the lateral versus frontal face, as well as the availability of several spore types per species in many fungal lineages, offer a possibility to bring machine learning techniques to mycology.

We conclude that quantifying fungal spore shapes is worth the effort. We provide an open access protocol to promote a broader use of quantitative shape analysis in fungal taxonomy and to stimulate the development of more efficient solutions to address the challenges we discussed above.