Spectral measure of color variation of the black-orange-black (BOB) pattern in small parasitic wasps (Hymenoptera: Scelionidae), a statistical approach

A group of eight scelionid genera was studied by means of microspectrophotometric measurements for the first time. The orange and black colors were analyzed quantitatively and, in combination with Functional Data Analysis (FDA) and a statistical analysis of Euclidean distances for color components, these measurements were used to describe and test the color differences between genera. FDA proved to be a better method for treating the reflectance data because it gave a better representation of the physical information. When comparing the differences between curves of the same color but different genera, maximum differences were present in different ranges of the spectra, depending on the genus. Reflectance spectra were separated into their spectral color component contributions (red, blue and green). Each component had its own dominant wavelength at the maximum of the spectrum. We found differences in the dominant wavelength for specimens of the same genus, which are equivalent to differences in hue. A correlation between the mean values of characteristics of the color components was used in an attempt to group the genera that show similar values. The spectral blue components of the orange and black areas were almost identical, suggesting that there is a common compound for the pigments. The results also suggest that cuticle from different genera, but with the same color (black vs black, orange vs orange), might have a similar chemical composition.

and third tergite of metasoma (2). (B) This figure also shows frozen insects recently brought from the field with the BOB color pattern. The focus-stacked macro photograph (specimen of 0.6 mm) was obtained with a Reflex 850 camera coupled to a 20x microscope lens. The photograph of field insects was taken with a Canon SX10 IS camera with a 105 mm macro lens.

It has been reported that most animals perceive color differently than humans [16], which has created a new surge of interest motivating widespread efforts to quantify both color traits and their visual environments. This new focus on generating objective information about the coloration of insects provides a more functional and evolutionary perspective, giving us a new tool to understand the real magnitude of the diversity of color in insects.

Color characterization of specimens, especially in taxonomy, has been done mostly subjectively, although more quantitative methods have been proposed [17]. There are few reports treating the measurement of color rigorously or analyzing spectral data and its variability as a functional measurement, which potentially can show differences detected by insects and shed light on their biology.

Among the many studies that measure and compare colors in insects, the methods vary widely; see for example [18], [19], [11]. Endler et al. [20] take into account the continuous nature of the wavelengths as measures of color and use Principal Component Analysis (PCA) to analyze patterns of color, while recognizing its limitations.

Functional data analysis (FDA) is a powerful method to study spectrally measured data [21]. For example, Rivas et al. [22] propose the use of FDA techniques to test color changes in stone, and present a comparison with a multivariate method.

A statistical approach to reflectance measurements of the dorsal surface of eight Scelionidae genera was used in order to study the BOB color pattern. The reflectance spectra were measured by means of a microspectrophotometer. A statistical description of the BOB color pattern was obtained by means of different statistical approaches,

Microspectrophotometry (MSP)

The reflectance spectra of 8 genera belonging to the hymenopteran family Scelionidae were measured with a 508 PV UV-Visible-NIR Microscope Spectrophotometer (MSP) (CRAIC, Los Angeles, USA). The MSP is coupled to an Eclipse LV100 ND Microscope using episcopic illumination (Nikon, Tokyo, Japan). A spot size area of 200 µm x 200 µm was measured. A Spectralon standard was used as a white reference because of the diffuse reflection produced by the roughness of the surface of the cuticle. Table 1 includes further information regarding the measurement settings.

Table 1. Information about the analysis of color data.

Triteleia (TR). They were found and caught with entomological nets over a period of 12 months. Additionally, some specimens of the genus Evaniella (EV), family Evaniidae, were included, because they exhibit a similar BOB pattern but are quite far removed from scelionids in the phylogeny of Hymenoptera [23].

The collection site was a patch within the protected zone of El Rodeo, which is part of the Mora Municipality, located 30 km southwest of San José, Costa Rica. This natural reserve is composed of a secondary forest and the last remnant of primary forest (approximately 200 ha). Specimens were identified by R.M. and P.H. and voucher specimens were deposited in the Museum of Zoology of the University of Costa Rica.

Due to the lack of taxonomic studies, species level identifications were not possible for most of the included genera, but by collecting from just one locality the problem of mixing species within a genus should be minimal (though not completely eliminated). A previous study demonstrated that males and females do not differ in color [15].

All experimental protocols were carried out in accordance with the involved institution's guidelines and regulations. The specimens were killed by freezing and then, by using a needle cut and polished on both ends, an area of about one square millimeter, or a small notch, was dissected from the scutum of the mesosoma (orange) and from the third tergite of the metasoma (black) (Fig 2). The orange and black samples of 4 specimens of each of the 8 genera were analyzed. For each specimen, three samples were taken from the mesosoma (orange) and three from the metasoma (black).

By using the measured spectra (Fig 2), the CIE-XYZ tristimulus values were calculated (see S1). A D65 daylight standard illuminant for the CIE observer was chosen. Following the CIE recommendations, a spacing of 1 nm for wavelength was used. Since the data measured with the microspectrophotometer have a different wavelength spacing, an interpolation was made using the splinefun function in R [24], in order to generate a list of spectral points with the desired spacing. After the tristimulus values were obtained, they were used to calculate the CIE 1976 (L* a* b*) (CIELAB) color space coordinates, using the corresponding definitions as detailed in Appendix S1.

As a way to access information about similarities and differences between the reflectance spectra in different portions of the visible electromagnetic spectrum, a trichromatic decomposition is proposed using the CIE color matching functions (CMFs) as follows:

and where Z(λ) is the measured reflectance, x̄(λ), ȳ(λ) and z̄(λ) are the CIE 1976

each of the 8 genera, and 4 specimens for Evaniella, which was observed as a control.
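The step from tristimulus values to CIELAB coordinates described in this section can be illustrated with a minimal Python sketch. It is not the authors' R code. It uses the standard CIE 1976 definitions and assumes the commonly tabulated D65 white point for the CIE 1931 2° observer (`D65_WHITE`); the function names `xyz_to_lab` and `_f` are hypothetical, and the paper may instead derive X_n, Y_n and Z_n directly from the illuminant data.

```python
# Minimal sketch: CIE XYZ -> CIELAB (CIE 1976 L*a*b*), standard definitions.
# Assumption: D65 white point for the CIE 1931 2-degree observer, scaled so Y_n = 100.
D65_WHITE = (95.047, 100.0, 108.883)

def _f(t):
    """Piecewise cube-root function used in the CIELAB definition."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    # Linear segment near zero keeps the function finite-sloped at t = 0.
    return t / (3 * delta ** 2) + 4 / 29

def xyz_to_lab(x, y, z, white=D65_WHITE):
    """Return (L*, a*, b*) for tristimulus values given on the same scale as `white`."""
    xn, yn, zn = white
    fx, fy, fz = _f(x / xn), _f(y / yn), _f(z / zn)
    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return (lightness, a_star, b_star)
```

Under these assumptions the reference white maps to L* = 100 with a* = b* = 0, which is a convenient sanity check before converting measured spectra.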

The response variable is either a difference of reflectance or a measure of reflectance, and it can be measured at three different levels:

• The univariate mean measure: $m_{ijk}$, where m is the mean difference between black and orange measures over all wavelengths in curve ijk, and where k represents repetitions, j specimens, and i genera. This is used as a naive base comparison.
• The functional measure: $Y_{ijk}(\lambda)$, where Y represents the difference in reflectance between the black and orange curves Z(λ), measured at each wavelength λ and repeated k times in each specimen j and genus i. In this case, $Y_{ijk}(\lambda)$ describes a reflectance difference curve.

• The multivariate distance measure: a Euclidean distance ∆E, as defined in Equation 1, for each pair of curves being compared.
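The third response in the list above can be sketched in Python, assuming that Equation 1 (not reproduced in this section) is the usual CIE 1976 color difference, that is, the Euclidean distance between two (L*, a*, b*) triplets. The function name `delta_e` is hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def delta_e(lab1, lab2):
    """Euclidean distance between two (L*, a*, b*) triplets.

    Assumption: this matches the paper's Equation 1, which is not shown here.
    """
    if len(lab1) != 3 or len(lab2) != 3:
        raise ValueError("expected two (L*, a*, b*) triplets")
    return dist(lab1, lab2)
```

For example, two colors with equal lightness that differ by 3 units in a* and 4 units in b* are 5 units apart under this definition.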

A table was created relating the different factors (spots, specimen, and genera) with the response in the univariate and functional case. In total, 96 combinations of 3 spots, 4 specimens, and 8 genera were recorded. In the multivariate distance case, a total of $N = \binom{192}{2} = 18336$ comparisons were used as observations, and for each pair it was recorded whether the two curves differed in genus, color, spot and specimen.
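The count of pairwise comparisons can be checked with a short Python sketch that enumerates the 192 curves (8 genera x 4 specimens x 3 spots x 2 colors) and flags how each pair differs. The genus list is assembled from names that appear in the text. The coding of the indicator factors in `pair_factors` is a hypothetical choice (specimens and spots are treated as nested within genus) and may differ from the coding used by the authors.

```python
from itertools import combinations, product
from math import comb

# Genus names taken from the text; the ordering is arbitrary.
GENERA = ["Acanthoscelio", "Baryconus", "Chromoteleia", "Lapitha",
          "Macroteleia", "Scelio", "Sceliomorpha", "Triteleia"]
SPECIMENS = range(1, 5)          # 4 specimens per genus
SPOTS = range(1, 4)              # 3 spots per colored area
COLORS = ("black", "orange")

# One record per measured curve: (genus, specimen, spot, color).
curves = list(product(GENERA, SPECIMENS, SPOTS, COLORS))

def pair_factors(a, b):
    """Indicator factors for one pair of curves (hypothetical coding)."""
    return {
        "diff_genus": a[0] != b[0],
        # Assumption: specimens are numbered within genus.
        "diff_specimen": a[:2] != b[:2],
        # Assumption: spots are numbered within specimen.
        "diff_spot": a[:3] != b[:3],
        "diff_color": a[3] != b[3],
    }

pairs = [pair_factors(a, b) for a, b in combinations(curves, 2)]
n_pairs = len(pairs)             # equals comb(192, 2)
```

Each element of `pairs` would correspond to one row of the comparison table, to which the distance ∆E for that pair of curves is attached as the response.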

Prior to the analysis, descriptive statistics were generated for each genus and color section (black and orange). The objective of the data analysis is to test the null hypothesis of no difference between the means of each combination of genus and color area, and for that, univariate and functional ANOVA were performed. The multivariate distance case was also analyzed to match the physics literature on color differences. The following methods were used in each case:

also, specimen and spot were controlled.

The objective is to illustrate the functional method compared to the usual univariate practice, and with a multivariate distance alternative, to highlight the importance of taking into account all the information from the difference curves. More specifically, for the univariate and functional responses, the ANOVA model can be written as the following:

for i = 1, 2, ..., p = 8 genera, j = 1, 2, ..., r = 4 specimens, and k = 1, 2, 3 sampled spots in each colored area from each of the 4 specimens. Usual restrictions and assumptions for the ANOVA model apply:

In the model, the global mean µ = 0 given that we are working with difference curves, $\alpha_i$ is the genus effect of level i, and $\beta_j$ is a block effect to explain the intra-genus variance. In order to account for the dependence between spots from the same specimens (in the two colored areas), a random effect $\eta_k$ was tested, and different covariance structures for $\Sigma_\eta$ were fitted (assuming dependence, $\eta_k \sim N(0, \Sigma_\eta)$, or independence between spots). Lastly, $\epsilon_{ijk}$ is the residual accounting for the unexplained variation specific to the kth observation, ith genus and jth specimen. Depending on the response $Y_{ijk}$, the residuals $\epsilon_{ijk}$ are a number (univariate) or a curve (functional).

Assumptions were tested for all the model options.

For the multivariate distances, the ANOVA model differed in the factors:

where the global mean µ represents the average change for the N comparisons, and each of the factors adds the change given that the pair has different genera ($\gamma_i$), color ($\eta_i$), spots ($\omega_i$) or specimens ($\rho_i$) for each pair i. Finally, $\epsilon_i$ is the error, modeled as white noise. Assumptions were tested for this model as well.

Statistical analysis was performed using the computing environment R [24]. The code and data to perform each of the tests mentioned in this paper are available for download at https://github.com/malfaro2/Mora_et_al.

The data provide support for reflectance difference variations between and within genera of the scelionids observed, when testing both the univariate data and the functional data (p-value < 0.00001 in all cases). The result is supported by the multivariate distance test, where there is evidence that the overall change µ is greater than zero (p-value < 0.00001). Details about the significant factors that explain that change are summarized in the following sections.

bioRxiv preprint, doi: https://doi.org/10.1101/652594. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder.

have low reflectance at lower wavelengths for both black and orange spots. It is important to point out that the measurements were performed grouping genera and not species, a factor that may account for the variability in some cases such as Chromoteleia. Moreover, the effect of the topography of the cuticle and the homogeneity of the pigment distribution can partly explain the data dispersion in some of the genera.

The reflectance difference curves, defined as $Y_{ijk}(\lambda)$, are plotted in

In order to test the null hypothesis $H_0: \alpha_i = 0$ for all genera, models were fitted to the data for each response. The best fit in all cases came from the models that assume independence between spots, so linear models were used subsequently. The univariate

The results for the univariate comparison between genera can be found in

distribution is closer to zero, although its box (50% of its data) is not close to zero or the threshold. In contrast, the distances ∆E between Acanthoscelio (AC) and Acanthoscelio (AC) curves of the same color (in vermilion) clearly include the threshold and can be taken as not significantly different.

The difference between distributions is not that clear when comparing colors.

The differences between curves can be described as an average of all ∆E, for each genus and color. By that metric, the maximum and minimum mean differences can be established, as described in Table 3 for comparisons between curves of different colors and Table 4 for comparisons between curves of the same color, pointing to the specific contrasts with wider differences. Similarly, comparisons of different genera and the same genus are presented in S3 in the supplementary material.

The control genus Evaniella (EV) appears to separate from the group in the black measurements, but the separation is not that obvious for the orange measurements. Meanwhile, Sceliomorpha (SM) and Macroteleia (MA) group together for the green and red components of the orange observations. The genus Chromoteleia (CR) is outside the main group in all three color components for the black observations and in the blue component of the orange observations. Chromoteleia (CR) and Baryconus (BA) also pair in the green and red components of the black and orange observations. Additionally, Sceliomorpha (SM) and Acanthoscelio (AC) present means that are very close in the black measurements for the blue, green and red contributions. The genera Lapitha (LA) and Triteleia (TR) group together in the blue and green components for the orange contribution.

The differences between genera shown in Table 5 only take into account the aggregated data collected at all wavelengths considered. The difference in intra-genus variances

The details about multiple comparisons can be even more specific if the color components are separated for each curve and the analysis is repeated.

The component analysis showed that the nature of the differences found using the FDA and ∆E methods is related to the green and red color components, but no correlation was found for the blue component. This is consistent with results obtained qualitatively using the difference curves $Y_{ijk}(\lambda)$ shown in Fig 3. Moreover, ANOVA results for model 7 (Table 2) confirm the reproducibility of the experimental protocol and, for most of the cases studied, the results suggest that the sampling is homogeneous, even though the taxonomic identification to just the genus level means that at least some of the genera could consist of more than one species.

This particular method for statistically testing for color differences differs from those in the literature due to the level of scrutiny. It goes from the general question, "Is there a difference between colors?" to more specific contrasts, using both the FDA and the

Recent studies report that the color phenotype is clearly associated with the concentration of certain pigment forms, but there could be other relevant factors besides concentration related to the expression of color [30]. In the case of pigments of different color (black vs. orange), the spectral blue components remain almost identical, suggesting that there is a common compound for the pigments.

Among the few phylogenetic correlations detected was that for Evaniella, which belongs to a completely different family (Evaniidae) and might therefore be expected to differ markedly from all the other genera. This indeed appears to be the case for its black color, but not its orange color (Fig. 7). This is an interesting result in that it suggests that the orange color shows more convergent evolution than does the black color. Yet, within Scelionidae there are more differences in the orange color than in the black (Table 4). The above results merit further investigation.

In some other insects, contrasting black and orange color patterns are known to serve as aposematic (warning) coloration for potential predators [32] and it is likely that the same is true of the BOB pattern, although this has not yet been tested. However, the orange mesosoma becomes less common at higher elevations [15]

where x̄(λ), ȳ(λ) and z̄(λ) are the standardized color-matching functions (CMF) of the CIE 1931 standard observer. The constant k in equations 8, 9 and 10 is defined so that Y = 100 for objects with perfect reflectivity. Finally, $\varphi_\lambda$ is the relative color stimulus function, defined as:

with R(λ) being the spectral reflectance factor (the measured spectra) and S(λ) is the

with $X_n$, $Y_n$ and $Z_n$ being the tristimulus values of an idealized white reflector.

S2 Univariate comparison for the inter-genera case

For the univariate case, Table 5 shows how Baryconus, Lapitha, and Scelio absolute differences are not significantly different from zero. This contradicts the functional data analysis result presented before, where all the difference curves are different between genera.

When describing the difference between genera, Table 5 shows how there are three groups of genera without univariate mean differences: group a consists of Baryconus,