Pushing Raman spectroscopy over the edge: purported signatures of organic molecules in fossils are instrumental artefacts

Claims for the widespread preservation of fossilized biomolecules in many fossil animals have recently been reported in six studies, based on Raman microspectroscopy. Here, we show that the putative Raman signatures of organic compounds in these fossils are actually instrumental artefacts resulting from intense background luminescence. Raman spectroscopy relies upon the detection of photons scattered inelastically by matter as a result of its interaction with a laser beam. For many natural materials, this interaction also generates a luminescence signal that is often orders of magnitude more intense than the light produced by Raman scattering. Such luminescence, coupled with the transmission properties of the spectrometer, induced quasi-periodic ripples in the measured spectra that have been incorrectly interpreted as Raman signatures of organic molecules. Although several analytical strategies have been developed to overcome this common issue, Raman microspectroscopy as used in the studies questioned here cannot be used to identify fossil biomolecules.


Introduction
The fossil record contains unique information documenting the evolutionary history of life through geological time. Fossils mostly consist of mineralized remains or impressions of organisms; however, in exceptional cases, remnants or derivatives of ancient biomolecules can be preserved and used, for example, to clarify the phylogenetic affinities of enigmatic fossils [1,2]. Several analytical approaches have been developed to study these fossilized organics at the molecular level [2]. Fossilized pigments in ancient invertebrates, feathered dinosaurs, and mammals were identified by gas chromatography-mass spectrometry (GC-MS) and time-of-flight secondary ion mass spectrometry (ToF-SIMS), and used to infer the original colours and colour patterns of these extinct organisms [3]. The identification of chitin and/or chitosan by Fourier-transform infrared (FTIR) spectroscopy in 1-billion-year-old microfossils, in conjunction with detailed morphological and ultrastructural features documented by transmission electron microscopy (TEM), was used to interpret them as the earliest fungi [4]. Taking advantage of some of the previously mentioned tools, as well as other advanced spectroscopic techniques such as scanning transmission X-ray microscopy (STXM) coupled with X-ray absorption near-edge structure (XANES) spectroscopy, numerous studies have highlighted that organic (bio)molecules can sometimes experience only partial degradation during diagenesis and even metamorphism, and be identified in the geological record [5–14].

Recently, preservation of diverse organic degradation products of biomolecules in more than a hundred different metazoan fossils was inferred from spectroscopy data collected with a Raman microspectrometer using a 532 nm laser as the excitation source under continuous illumination [15–20]. The reported spectroscopic data were interpreted as evidence for the preservation of organic pigments in eumaniraptoran dinosaur eggshells [15] and in a non-avian dinosaur skin [18], as well as of protein, lipid and/or sugar fossilization products in fossil bones [16], dinosaur eggshells [20], and vertebrate and invertebrate soft tissues [17,19]. Unfortunately, the claims of biomolecules in these fossils are not supported by the data provided, which actually result from instrumental and analytical artefacts. In this paper, we outline the limitations of Raman spectroscopy with respect to the identification of biomolecules in fossil materials, and then describe in detail the origin of the misinterpreted signal.

Raman spectroscopy and its limitations in the study of organic fossils
Raman spectroscopy is widely used in geosciences as it probes the vibration modes of chemical bonds in solids, liquids, and gases, with minimal sample preparation [21]. Yet, the technique as used in the studies questioned here has several limitations in terms of sensitivity and of the chemical fingerprints it can access. First, excitation with a 532 nm laser only provides specific information on C-C bonds, and not on other covalent linkages, in diagenetically altered organic materials such as fossils [22]. As a result, Raman spectra of organic materials preserved in (meta)sedimentary rocks display only the so-called graphite (G) and defect (D1-D4) bands, which provide information about the structural organization of the aromatic skeleton [23]. Consistently, Raman spectra of geologically altered organic materials can be similar even when they have significantly different elemental and molecular compositions [13,14,24–26]. Second, under continuous illumination, luminescence occurs concurrently with Stokes Raman scattering and generates a signal that overlaps with the Raman spectral window [21,27]. Raman cross sections (the probability that Raman scattering takes place) are typically 8 to 10 orders of magnitude smaller than those of luminescence. As a result, a number of precautions are often necessary to detect and interpret Raman spectral features among the other spectral variations.

The reported periodic broad bands are not Raman features
In all the studies questioned here [15–20], the spectral bands assigned to organic molecules are broader than the bands usually associated with Raman scattering, and appear quasi-periodic, in contrast to the non-periodic spectral features typically attributed to Raman scattering of organic compounds.

We investigated the periodicity of these bands using wavelet transform (Fig. 1), an effective signal processing technique that is used to decompose a distorted signal into different frequency scales at various resolution levels. Unlike classical Fourier spectral analyses, wavelet transform analyses are well suited to describing non-stationarities, i.e. localized variations in frequency or magnitude, and provide a direct visualization of the changing statistical properties. The wavelet transform has become a common tool for analysing localized variations within a time series [28,29], but also for spike removal, denoising and background elimination of Raman spectra [30,31]. We selected one spectrum from each of the two studies for which data were made available [15,19]. For the wavelet analysis displayed in Fig. 1a,b, we selected the spectrum corresponding to the eggshell of the extant flightless bird Rhea americana [15], as it is more likely that a pigment is preserved in a modern sample than in a fossil. For the wavelet analysis displayed in Fig. 1c,d, we selected the spectrum collected from the crustacean Acanthotelson stimpsoni specimen YPM52348 [19], as the chitin-protein complex of crustacean cuticles has a high preservation potential [8,32], and this specimen appears to be one of the best preserved (see fig. 1f in [19]), with the spectrum clearly measured from the specimen (unlike for the specimen shown in fig. 1d of [19]).
Note that these two spectra, as well as all other reported ones, were provided by the original authors as baseline-subtracted spectra, not as raw data.

Both spectra display numerous broad bands for which our wavelet transform analysis reveals clear high-frequency periodicities of ~64–96 cm⁻¹ for wavenumber shifts below ~1000–1200 cm⁻¹, and of ~128 cm⁻¹ for higher wavenumber shifts (Fig. 1a,c). A similar periodicity of 130.9 cm⁻¹ is obtained by fast Fourier transform. The 1086 cm⁻¹ carbonate Raman peak present in the R. americana spectrum reflects the calcified composition of the eggshell, in contrast to all the other (broader) bands, which are best described as a superposition of quasi-periodic wavelets (Fig. 1b,d). These broad, quasi-periodic bands are not the consequence of any Raman effect, but rather result from physical and instrumental artefacts. Indeed, when a sample is illuminated by the laser, the presence of structural defects and inorganic/organic components can generate significant luminescence, often overwhelming the weak Raman signal [21,27]. When this background luminescence is intense, the transmission properties of the interferometric edge filter used to reject the Rayleigh line induce quasi-periodic "ripples" in the measured spectrum [33].

To further illustrate this point, we performed a wavelet analysis on a transmission spectrum of a 532 nm RazorEdge® ultrasteep long-pass edge filter, provided by the manufacturer (Semrock), that is designed to be used as an ultrawide and low-ripple passband edge filter for Raman spectroscopy (Fig. 2). The transmission spectrum of the filter exhibits the aforementioned ripples (Fig. 2a,b). Our wavelet analysis highlights high-frequency periodicities of 64–96 cm⁻¹ for low wavenumbers, and of 128 cm⁻¹ for higher wavenumbers (Fig. 2b,c), similar to those found in the spectra reported in the studies questioned here [15–20]. Such edge filter-related instrumental artefacts explain the presence of most, if not all, of the broad bands that were attributed to organic molecules.
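The mechanism described above can be sketched numerically. The following Python example is a minimal illustration, not the analysis performed in this study: it multiplies a smooth synthetic luminescence background by a hypothetical filter transmission curve carrying a 2% ripple with a 128 cm⁻¹ period, subtracts the (here known) smooth background, and recovers the ripple period with a plain discrete Fourier transform. All numerical values, and the function name `dominant_period`, are assumptions chosen for illustration.

```python
import math


def dominant_period(signal, step):
    """Return the period (same unit as step) of the strongest non-zero DFT bin."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2):
        angle = 2.0 * math.pi * k / n
        re = sum(c * math.cos(angle * i) for i, c in enumerate(centered))
        im = sum(c * math.sin(angle * i) for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return n * step / best_k


# Synthetic wavenumber axis: 500 to 1522 cm^-1 in 2 cm^-1 steps (assumed values).
step = 2.0
shifts = [500.0 + step * i for i in range(512)]

# Broad luminescence background (hypothetical Gaussian shape, arbitrary units).
luminescence = [1000.0 * math.exp(-((x - 1000.0) / 600.0) ** 2) for x in shifts]

# Hypothetical edge-filter transmission: 95% on average with a 2% periodic ripple.
transmission = [0.95 + 0.02 * math.sin(2.0 * math.pi * x / 128.0) for x in shifts]

# The detector sees the luminescence multiplied by the filter transmission.
measured = [lum * t for lum, t in zip(luminescence, transmission)]

# Idealized baseline subtraction: remove the known smooth part of the signal.
residual = [m - 0.95 * lum for m, lum in zip(measured, luminescence)]

period = dominant_period(residual, step)
print(f"dominant ripple period: {period:.1f} cm^-1")
```

No Raman scatterer is included in this synthetic signal, yet the residual shows regularly spaced broad bands whose spacing is set entirely by the filter ripple.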

Sample composition does not affect the position of ripples but impacts the shape of the background
The transmission properties of the edge filter induce ripples on the measured spectra when luminescence is intense, making it challenging to identify Raman features without appropriate data processing for background subtraction [33]. The data provided in the publications questioned here [15–20] are only the baseline-subtracted spectra, not the raw data, which makes it impossible to precisely assess the impact of non-Raman processes and sample composition on the corrected spectra from which the presence of organic molecules was inferred. To address these issues, we collected Raman microspectroscopy data on modern and fossil crustaceans in analytical conditions similar to those of the aforementioned studies (for details, see Material and Methods in SI).

We reproduced the experiment performed by McCoy et al. [19] using a specimen of the crustacean Peachocaris strongi (Fig. 3a) from the same fossil locality (Mazon Creek, Carboniferous, USA). As with other fossils from Mazon Creek, this specimen is preserved as aluminosilicates and calcite in a sideritic concretion (Fig. S1). In order to further assess the impact of the sample's chemical composition on the measured spectra, we also performed Raman spectroscopy on (i) a specimen of the penaeid shrimp Cretapenaeus berberus from the Cretaceous of Morocco (Fig. 3b) preserved as a mixture of calcium phosphates and iron oxides in an illite mudstone (Fig. S1; see also Gueriau et al. [34] and references therein), and (ii) a specimen of the modern shrimp Neocaridina davidii (Fig. 3c) dried after death and still rich in organic carbon, likely in the form of chitin (Fig. S1). Whether or not organic carbon is present, and whatever the mineralogical composition of the specimen or its mineral matrix, all the measured spectra (Fig. 3d, solid lines) display broad bands, which all occur at the same wavenumber shifts and sit on top of a significant background (Fig. 3d, dotted lines). Yet, the shape of the background differs significantly from one measurement to another, and the more intense it is, the more the ripples are expressed. In the baseline-subtracted spectra, the differences in the relative intensity between bands from one measurement to another (Fig. 3e) only result from distinct background profiles of the measurements. A wavelet transform analysis reveals high-frequency periodicities of 64–128 cm⁻¹ (Fig. 3f), as was the case for the spectra questioned in the previous section [15–20]. Finally, other than the presence of sharp peaks around 964 and 1086 cm⁻¹ (Raman peaks of fluorapatite and calcite, respectively), as well as one unidentified peak at 1156 cm⁻¹ in the modern shrimp, spectral differences are limited to relative variations in the ripple band intensity that result from analytical artefacts.

In short, the ripples observed in the Raman microspectroscopy data questioned here represent remnant instrumental signals that result from confounding broad luminescence and inappropriate data processing. The broad luminescence transmitted by the edge filter induced the ripple-shaped features above the cut-off wavelength on the raw spectrum. Background correction did not eliminate the induced ripple-shaped distortions, and instead accentuated them to give the appearance of putative signatures of organic molecules.
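The reasoning of this section can be summarized with a simple multiplicative model. This is a sketch under stated assumptions rather than a formal description of the instrument: the symbols S (measured spectrum), L (luminescence), R (Raman signal), T (filter transmission), T_0 (mean transmission), ε (relative ripple amplitude), P (ripple period), φ (phase) and B (fitted baseline) are introduced here for illustration only.

```latex
\begin{align}
S(\tilde{\nu}) &= T(\tilde{\nu})\,\bigl[L(\tilde{\nu}) + R(\tilde{\nu})\bigr],
\qquad
T(\tilde{\nu}) = T_0\Bigl[1 + \varepsilon \cos\!\Bigl(\tfrac{2\pi\tilde{\nu}}{P} + \varphi\Bigr)\Bigr], \\
S(\tilde{\nu}) - B(\tilde{\nu}) &\approx T_0\,R(\tilde{\nu})
 + \varepsilon\,T_0\,L(\tilde{\nu}) \cos\!\Bigl(\tfrac{2\pi\tilde{\nu}}{P} + \varphi\Bigr),
\qquad \text{with } B(\tilde{\nu}) \approx T_0\,L(\tilde{\nu}).
\end{align}
```

In this model, the ripple positions depend only on the filter (through P and φ), whereas their amplitude scales with the local luminescence intensity L. The ripple term dominates the baseline-subtracted spectrum whenever εL is much larger than R, which can happen even for a small ε when luminescence exceeds the Raman signal by orders of magnitude.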

Conclusion and Outlook
Broad bands interpreted to be Raman signatures of diverse organic molecule degradation products in various metazoan fossils [15–20] are artefactual quasi-periodic ripples induced by the edge filter due to intense luminescence, and there is no evidence for any preserved organic molecular information. Raman microspectroscopy as used in these papers does not provide information about fossil biomolecules [22], but rather informs on the degree of crystallization of carbonaceous materials, which can be used to quantify the peak temperature they reached during geological burial [23].

Raman spectroscopy can thus provide crucial information about the chemical structure of organic materials, but requires robust and optimized analytical strategies and/or data processing. Several methods have been developed to remove, a posteriori, the undesired contribution of luminescence and ripples in Raman spectra [33,35]. Note that in the papers discussed here [15–20], such processing would leave no signal other than the mineral peaks. Alternatively, non-conventional time-resolved Raman spectroscopy offers new ways to limit or exploit luminescence signals, while techniques based on coherent anti-Stokes Raman scattering (CARS), surface-enhanced Raman spectroscopy (SERS), and ultraviolet resonance Raman spectroscopy allow the Raman signal to be considerably enhanced (see Beyssac [27] for review). Furthermore, synchrotron-based X-ray Raman scattering can probe the chemical speciation of light elements such as carbon in the heterogeneous materials usually encountered in life, earth, environmental and materials sciences [36,37].

The search for biomolecules in fossils is a very exciting field of research, offering critical knowledge on both evolutionary events and fossilization processes, yet conventional Raman spectroscopy cannot be used to identify fossil biomolecules. Instead, non-conventional Raman spectroscopy, mass spectrometry, and infrared and X-ray absorption spectroscopy techniques allow paleontologists to correctly identify fossil biomolecules in the geological record [2,38].

Wavelet analyses
Continuous wavelet transform and wavelet multiresolution analyses were performed in R using the dplR and waveslim packages, respectively. In this study, the Morlet wavelet (a Gaussian-modulated sine wave) was chosen for the continuous wavelet transform, and we used a 95% level for the significance test. The hatched area delineating the edges of the spectrum (the so-called "cone of influence") marks parts of the spectrum where energy bands are likely to appear less powerful than they actually are because of the increasing importance of edge effects.
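For readers who want to see what a Morlet continuous wavelet transform computes, the following Python sketch implements it by direct convolution on a synthetic signal. It is not the dplR implementation used in this study and omits the significance test and the cone of influence. The function name `morlet_power`, the synthetic ripple (64 cm⁻¹ period below 1000 cm⁻¹, 128 cm⁻¹ above), and the candidate periods are assumptions chosen for illustration. The conversion between wavelet scale and Fourier period is the standard relation for the Morlet wavelet.

```python
import cmath
import math


def morlet_power(signal, step, periods, omega0=6.0):
    """Wavelet power |W|^2 for each candidate period (rows) and position (columns)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]
    # Morlet relation: fourier_period = fourier_factor * scale.
    fourier_factor = 4.0 * math.pi / (omega0 + math.sqrt(2.0 + omega0 ** 2))
    power = []
    for period in periods:
        scale = period / fourier_factor
        half = int(4.0 * scale / step)  # truncate the Gaussian envelope at 4 scales
        norm = math.sqrt(step / scale) * math.pi ** -0.25
        # Complex-conjugated, normalized Morlet wavelet sampled on the data grid.
        kernel = []
        for j in range(-half, half + 1):
            t = j * step / scale
            kernel.append(norm * cmath.exp(-1j * omega0 * t) * math.exp(-0.5 * t * t))
        row = []
        for i in range(n):
            lo, hi = max(0, i - half), min(n, i + half + 1)
            acc = sum(centered[idx] * kernel[idx - i + half] for idx in range(lo, hi))
            row.append(abs(acc) ** 2)
        power.append(row)
    return power


# Synthetic non-stationary ripple: the period changes at 1000 cm^-1 (assumed values).
step = 4.0
shifts = [step * i for i in range(512)]
signal = [
    math.sin(2.0 * math.pi * x / 64.0) if x < 1000.0 else math.sin(2.0 * math.pi * x / 128.0)
    for x in shifts
]
periods = [32.0, 64.0, 128.0, 256.0]
power = morlet_power(signal, step, periods)


def strongest_period(index):
    """Candidate period with the highest wavelet power at a given position."""
    column = [row[index] for row in power]
    return periods[column.index(max(column))]


low = strongest_period(125)   # position 500 cm^-1
high = strongest_period(375)  # position 1500 cm^-1
print(low, high)
```

Unlike a single Fourier spectrum of the whole signal, the wavelet power map assigns a dominant period to each position, which is what allows a change in ripple periodicity along the wavenumber axis to be localized.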

Baseline subtraction
Baseline subtraction (Fig. 3d, dotted line) was performed using the SpectraGryph 1.2 spectroscopic software (adaptive baseline, 15%, no offset, minimally smoothed through rectangular averaging over an interval of 4 points), following protocols of the questioned studies [15–20]. Depending on the composition of the sample, the shape and quality of the baseline fit, and therefore the subtracted spectrum (Fig. 3e), vary when the same adaptive baseline percentage is used for all spectra instead of adapting the parameter to each spectrum.
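The dependence of the subtracted spectrum on the background shape can be illustrated with a short Python sketch. The adaptive baseline algorithm of SpectraGryph is not described here, so a centred moving average is used as a crude stand-in. The two background shapes, the 2% ripple with a 128 cm⁻¹ period, and the function names are all hypothetical. The same baseline setting is applied to both synthetic spectra, and the height ratio of two ripple bands is compared.

```python
import math


def moving_average_baseline(signal, half_window):
    """Crude smooth baseline: centred moving average, truncated at the edges."""
    baseline = []
    for i in range(len(signal)):
        window = signal[max(0, i - half_window): i + half_window + 1]
        baseline.append(sum(window) / len(window))
    return baseline


def band_height(shifts, residual, center, half_width=32.0):
    """Maximum of the baseline-subtracted signal within +/- half_width of center."""
    return max(r for x, r in zip(shifts, residual) if abs(x - center) <= half_width)


# Synthetic axis and a hypothetical filter transmission shared by both "samples".
shifts = [500.0 + 2.0 * i for i in range(512)]
transmission = [0.95 + 0.02 * math.sin(2.0 * math.pi * x / 128.0) for x in shifts]

# Two hypothetical luminescence backgrounds with different shapes.
background_a = [1000.0 * math.exp(-((x - 600.0) / 1000.0) ** 2) for x in shifts]  # decaying
background_b = [300.0 + 0.6 * x for x in shifts]                                  # rising


def band_ratio(background):
    """Height ratio of the ripple bands near 672 and 1312 cm^-1 after subtraction."""
    measured = [b * t for b, t in zip(background, transmission)]
    # Same baseline setting for every spectrum (64 points = 128 cm^-1 on each side).
    baseline = moving_average_baseline(measured, half_window=64)
    residual = [m - b for m, b in zip(measured, baseline)]
    return band_height(shifts, residual, 672.0) / band_height(shifts, residual, 1312.0)


ratio_a = band_ratio(background_a)
ratio_b = band_ratio(background_b)
print(f"band ratio, decaying background: {ratio_a:.2f}")
print(f"band ratio, rising background:   {ratio_b:.2f}")
```

Both synthetic spectra contain the same ripple at the same positions and no Raman signal, yet the relative band heights differ because the ripple amplitude follows the local background intensity.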

Conflict of Interest
The authors declare no conflict of interest.