In situ measurement of absolute concentrations by Normalized Raman Imaging

We introduce Normalized Raman Imaging (NoRI), a Stimulated Raman Scattering (SRS) microscopy method that computationally removes the effect of tissue light scattering. NoRI provides high resolution measurements of the absolute concentration of total protein, total lipid and water from live or fixed thick tissue samples with single cell resolution, and can also be applied to other Raman bands. NoRI enables study of the protein, lipid, and water concentration variation associated with development and diseases.


Introduction
Accurate measurement of size and chemical composition at single cell resolution is critical for understanding cell physiology, including size regulation, growth, differentiation, and homeostasis at the cell, tissue, and organism level, as well as in the context of disease and degeneration 1. Numerous methods have been developed to accurately measure single cell mass or volume, including fluorescence reporters 2, the Coulter counter 3, the Suspended Microchannel Resonator (SMR) 4, Quantitative Phase Microscopy (QPM) 5,6, and Fluorescence eXclusion Microscopy (FXM) 7. However, these methods require cell suspension or 2D cell culture, and consequently anatomical and tissue-context information is lost. Moreover, these methods do not distinguish between the mass contributions of proteins, lipids and other materials. Stimulated Raman scattering (SRS) microscopy can measure Raman signals of proteins and lipids in thick tissue specimens without staining 8. SRS intensity is linearly proportional to the concentration of the target chemical structures, which enables many quantitative applications. However, in tissue samples SRS intensity is not fully quantitative because of the irregular light scattering caused by the sample. This problem is partially overcome by slicing the tissue into thin sections or by reducing tissue light scattering through optical clearing. However, 3-dimensional information is lost with thin sectioning, and conventional optical clearing is incompatible with imaging lipids 9. To overcome these limitations, we instead computationally remove the effect of light scattering in thick biological samples by directly measuring the attenuation due to light scattering, providing a powerful and versatile new method for assessing protein and lipid mass in situ.

Methods
Absolute quantification requires a highly stable SRS microscope that can measure SRS intensity with excellent repeatability. To that end, we built a custom SRS microscope composed of a tunable femtosecond pulse laser and a dense flint dispersion element for spectral focusing (Fig. 1a) 10. This design eliminates any need for optical adjustment such as OPO tuning, and therefore significantly enhances the repeatability of SRS intensity measurements. Different Raman bands can be selected by tuning the wavelength of the pump beam and the optical delay position. When the pump beam center wavelength is fixed, a spectral scan can be acquired over a bandwidth of up to ~250 cm⁻¹ by scanning the motorized optical delay (Fig. 1b). Raman bands at 2935 and 2853 cm⁻¹ show strong SRS signals originating from methyl groups and methylene groups, respectively. Water has a strong peak at 3420 cm⁻¹ from oxygen-hydrogen stretching modes. We acquire SRS images at these Raman bands, referred to as the CH3 band, the CH2 band and the H2O band in the rest of this manuscript. The SRS intensities of the CH3, CH2 and H2O Raman bands are mapped to protein, lipid, and water fractions through 3-component spectral decomposition 11.
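As a rough illustration of how the pump wavelength selects a Raman band, the sketch below converts between pump wavelength and Raman shift using the 1045 nm Stokes wavelength quoted in the Discussion. It considers only the difference of the center frequencies; in a spectral-focusing setup the optical delay also contributes to band selection, which is not modeled here. The function names are hypothetical.

```python
STOKES_NM = 1045.0  # fixed Stokes wavelength, as stated in the Discussion

def raman_shift_cm1(pump_nm, stokes_nm=STOKES_NM):
    """Raman shift (cm^-1) from pump and Stokes center wavelengths (nm)."""
    # 1e7 / wavelength_in_nm gives the wavenumber in cm^-1.
    return 1e7 / pump_nm - 1e7 / stokes_nm

def pump_for_shift_nm(shift_cm1, stokes_nm=STOKES_NM):
    """Pump center wavelength (nm) whose offset from the Stokes beam equals shift_cm1."""
    return 1e7 / (shift_cm1 + 1e7 / stokes_nm)

# Band positions from the text: CH2 at 2853, CH3 at 2935, H2O at 3420 cm^-1.
for name, shift in [("CH2", 2853.0), ("CH3", 2935.0), ("H2O", 3420.0)]:
    print(f"{name}: pump center wavelength ~ {pump_for_shift_nm(shift):.1f} nm")
```

Under this simplification the three bands map to pump wavelengths of roughly 770 to 805 nm, consistent with the tuning range quoted in the Discussion.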
To establish the spectral decomposition process for absolute quantification, we examined the quantitative aspect of the SRS intensities of pure water and solution samples. We selected bovine serum albumin (BSA) and dioleoyl-phosphocholine (DOPC) as the "calibration standards" for protein and lipid, respectively. BSA solution in water and DOPC solution in 4-deuterated methanol were mixed by weighing the components. Spectral decomposition relies on the linear relation between the SRS signal intensity and the concentration of the measured molecule, which is constant in a solution sample. However, the signal intensity from a solution sample varies with the x, y and z position in the image. The variation in x-y was corrected by flat field correction. The z-dependence, in contrast, arose from the refractive index mismatch of the samples (Fig. 1c). When we use a water immersion objective lens, the SRS intensity of the water sample is independent of the z position, as the objective lens is corrected for water. For the BSA and DOPC samples, however, the intensity is strongest at the surface immediately past the cover glass, because the focus degrades as the light propagates further through the index-mismatched solutions. Therefore, all calibration standards are assembled into a single sample holder and imaged at the maximum intensity z position, where the optical aberration from the sample is minimal. By measuring the SRS intensity at the CH3, CH2 and H2O bands of the BSA, DOPC and water calibration standards, the decomposition matrix is solved by a simple matrix inversion, $M_{ji} = \sum_k C_{jk} (S^{-1})_{ki}$ (Eq. 1), where $S_{ik}$ is the SRS signal at the i-th Raman band (i = CH3, CH2 and H2O) of the k-th standard sample (k = BSA, DOPC and water samples) and $C_{jk}$ is the concentration of the j-th component (j = protein, lipid and water) in the k-th sample in units of volume fraction (ml/ml) (Fig. 1d) 12.
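The calibration step of Eq. 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the signal and concentration values are invented placeholders, and the function name is hypothetical. The deuterated methanol solvent of the DOPC standard is assumed to contribute no signal in these bands, so that column of C does not sum to 1.

```python
import numpy as np

def decomposition_matrix(S, C):
    """Return M with M @ S == C (Eq. 1).

    S[i, k]: SRS signal at Raman band i of calibration standard k.
    C[j, k]: volume fraction (ml/ml) of component j in standard k.
    """
    S = np.asarray(S, dtype=float)
    C = np.asarray(C, dtype=float)
    # Solve M S = C as S^T M^T = C^T instead of forming S^-1 explicitly.
    return np.linalg.solve(S.T, C.T).T

# Placeholder numbers for illustration only (not measured values).
# Rows: CH3, CH2, H2O bands. Columns: BSA, DOPC, water standards.
S_std = np.array([[0.30, 0.25, 0.02],
                  [0.10, 0.40, 0.01],
                  [0.70, 0.05, 1.00]])
# Rows: protein, lipid, water. Columns: BSA, DOPC, water standards.
C_std = np.array([[0.25, 0.00, 0.00],
                  [0.00, 0.30, 0.00],
                  [0.75, 0.00, 1.00]])

M = decomposition_matrix(S_std, C_std)
```

Applying `M` to the signals of any one standard returns that standard's known composition, which is a convenient sanity check on the calibration.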
We next analyzed the effect of sample light scattering on the spectral decomposition of sample images. The SRS intensity images of a biological sample show intensity variation arising from both the abundance of the corresponding chemical groups and the attenuation by the sample (Fig. 1e). Spectral decomposition is performed by matrix multiplication of the decomposition matrix with the SRS images at the CH3, CH2 and H2O Raman bands: $R_j(\vec{r}) := \sum_i M_{ji} S_i(\vec{r})$ (Eq. 2). The inverse of the decomposition matrix, $(M^{-1})_{ij}$, is interpreted as the SRS intensity of the i-th Raman band measured from a pure material of the j-th component. It is proportional to the Raman cross section of the molecule $\sigma_{ij}$ as well as to the efficiency of the imaging system $A_0$ as optimized for the imaging of calibration standards; that is, $(M^{-1})_{ij} = \sigma_{ij} A_0$, or $S_{ik} = \sum_j \sigma_{ij} A_0 C_{jk}$ (Eq. 3). While the Raman cross section $\sigma_{ij}$ is an intrinsic property of the chemical constituent, the efficiency of the microscope $A_0$ is affected by the optical aberration of the imaging system and the detector efficiency. When imaging samples, the light scattering and aberration caused by the sample introduce an extra attenuation factor $A(\vec{r})$ to the signal, which can be expressed as $S_i(\vec{r}) = \sum_j \sigma_{ij} A_0 A(\vec{r}) C_j(\vec{r})$ (Eq. 4). As we argue below, we assume that the sample-induced attenuation factor is independent of the Raman band. When spectral decomposition is applied to sample images by multiplying the decomposition matrix with the SRS intensity, we do not get the true concentration because of this factor. Instead, the output of spectral decomposition is proportional both to the concentration of the respective chemical component and to the spatially heterogeneous attenuation (Fig. 1f): $\sum_i M_{ji} S_i(\vec{r}) = A(\vec{r}) C_j(\vec{r})$ (Eq. 5).
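The relation between Eq. 4 and Eq. 5 can be checked numerically for a single voxel. The sketch below assumes a hypothetical matrix of pure-component band intensities (the product of cross section and system efficiency) and a band-independent attenuation factor; all numbers and names are illustrative.

```python
import numpy as np

# Hypothetical pure-material band intensities, sigma_ij * A_0.
# Rows: CH3, CH2, H2O bands. Columns: protein, lipid, water.
pure = np.array([[1.00, 0.80, 0.02],
                 [0.30, 1.20, 0.01],
                 [0.10, 0.10, 1.00]])
M = np.linalg.inv(pure)  # decomposition matrix, since M^-1 = sigma * A_0

def simulate_signal(conc, attenuation):
    """Eq. 4 for one voxel: S_i = sum_j sigma_ij A_0 A C_j."""
    return attenuation * (pure @ conc)

def decompose(signal):
    """Eq. 2 for one voxel: R_j = sum_i M_ji S_i."""
    return M @ signal

conc = np.array([0.20, 0.05, 0.75])   # protein, lipid, water volume fractions
attenuation = 0.6                     # unknown in practice; chosen here for the demo
R = decompose(simulate_signal(conc, attenuation))
```

Here `R` equals `attenuation * conc` rather than `conc`, which is the content of Eq. 5: decomposition alone recovers the composition only up to the local attenuation.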
The unknown attenuation factor varies with position and from sample to sample, and limits the quantitative interpretation of spectral decomposition. We found that the unknown $A(\vec{r})$ can be calculated by making the reasonable approximation that the tissue is mostly composed of protein, lipid and water. In other words, the sum of the protein, lipid and water (P/L/W) volume fractions is 1 ml/ml, or $\sum_j C_j = 1$ (Eq. 6), where j = protein, lipid and water. In fact, this is the reason we defined the decomposition matrix using volume fractions in Eq. 1. Under this assumption, the sum of the spectral decomposition over P/L/W provides $A(\vec{r})$: $A(\vec{r}) = \sum_j \sum_i M_{ji} S_i(\vec{r})$ (Eq. 7). $A(\vec{r})$ is defined throughout the sample imaging volume and serves as the normalization mask (Fig. 1g). It measures the collective sample property that attenuates the SRS intensity at each voxel, which includes the presence of scatterers or absorbers above and below the imaging plane. The absolute concentration of P/L/W in volume fraction is obtained by dividing the spectral decomposition from Eq. 5 by the normalization mask (Fig. 1h). The mass concentration of protein and lipid can be estimated by multiplying the volume fraction by the mass density of pure protein or lipid (Fig. 1i). We used 1.364 g/ml and 1.0101 g/ml as representative densities for protein and lipid, respectively. In addition to absolute P/L/W concentration, light scattering normalization can be applied to other Raman bands or to raw SRS intensity data thanks to the general nature of the linear relationship in Eq. 4.
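The normalization described above (Eqs. 5 to 7 followed by conversion to mass density) can be sketched for whole image stacks as follows. This is a minimal sketch under stated assumptions: the band order of the input stack matches the columns of M, the `eps` guard against division by zero is an addition for numerical safety, and the synthetic test data and function names are hypothetical. The two density constants are the values given in the text.

```python
import numpy as np

PROTEIN_DENSITY_G_PER_ML = 1.364   # representative protein density from the text
LIPID_DENSITY_G_PER_ML = 1.0101    # representative lipid density from the text

def nori_normalize(srs, M, eps=1e-12):
    """srs has shape (3, ...) with bands ordered like the columns of M.

    Returns (fractions, mask): P/L/W volume fractions and the normalization mask.
    """
    R = np.tensordot(M, srs, axes=([1], [0]))   # Eq. 5: attenuated decomposition
    mask = R.sum(axis=0)                        # Eq. 7: sum over P/L/W
    fractions = R / np.maximum(mask, eps)       # divide out the attenuation
    return fractions, mask

def mass_density_mg_per_ml(fractions):
    """Convert protein and lipid volume fractions (ml/ml) to mg/ml."""
    protein = fractions[0] * PROTEIN_DENSITY_G_PER_ML * 1000.0
    lipid = fractions[1] * LIPID_DENSITY_G_PER_ML * 1000.0
    return protein, lipid

# Synthetic 1 x 2 pixel image with different attenuation at each pixel.
pure = np.array([[1.00, 0.80, 0.02],
                 [0.30, 1.20, 0.01],
                 [0.10, 0.10, 1.00]])        # hypothetical sigma_ij * A_0
M = np.linalg.inv(pure)
true_fractions = np.array([[[0.20, 0.10]],
                           [[0.05, 0.30]],
                           [[0.75, 0.60]]])  # shape (3, 1, 2), sums to 1 per pixel
attenuation = np.array([[1.0, 0.4]])
srs = np.tensordot(pure, true_fractions * attenuation, axes=([1], [0]))

fractions, mask = nori_normalize(srs, M)
protein_mg_ml, lipid_mg_ml = mass_density_mg_per_ml(fractions)
```

In this noise-free example the recovered mask equals the imposed attenuation and the fractions match the ground truth at both pixels; with real data the mask also absorbs noise, so low-signal voxels yield noisier concentrations.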

Results
To demonstrate the light scattering normalization in thick tissues, we acquired an xyz scan from a 90 µm thick section of mouse growth plate cartilage (Fig. 1j). Both the raw SRS intensity (data not shown) and the relative protein and lipid concentrations obtained from spectral decomposition decrease with imaging depth. By contrast, NoRI provides an absolute concentration measurement that does not decrease with depth (Fig. 1k). The noise level of the absolute concentration measurement increases when the raw SRS intensity decreases, as in the deep layers of a tissue. Imaging depth is limited by the detection sensitivity of SRS signals and varies with tissue type and optical clearing.
We validated the accuracy of absolute concentration measurements by NoRI by imaging bovine serum albumin (BSA) solutions of known concentrations (Fig. 1l). The NoRI measurement shows excellent agreement with the actual concentration of the solutions, with a sensitivity of approximately 15 mg/ml as measured by the standard deviation of protein concentration across the pixels within the image. Next, we demonstrated in situ mass measurement by quantifying chondrocyte hypertrophy with NoRI. The growth plate of a postnatal day 5 mouse was fixed and sectioned to 100 µm thickness for imaging. Individual cells were segmented in 3D, and the protein density and the lipid density were integrated within each cell volume to calculate the total protein and the total lipid mass of single cells (Fig. 1m). Since no existing technique allows the protein mass and the lipid mass to be compared separately, we compared the sum of protein and lipid mass density with the dry mass density measured by refractive index tomography (Tomocube, HT-2). The in situ measurement by NoRI shows good agreement with the measurement from dissociated chondrocytes (Fig. 1n).
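The single-cell integration step can be illustrated with a short sketch: given a density image in mg/ml and a 3D label image from segmentation, the mass of each cell is the sum of density over its voxels times the voxel volume. The label convention (0 for background), the voxel volume, and the function name are assumptions made for this example; the unit conversion uses 1 mg/ml = 1e-3 pg/µm³.

```python
import numpy as np

def cell_masses_pg(density_mg_per_ml, labels, voxel_volume_um3):
    """Integrate a density image over each labeled cell.

    labels: non-negative integer array, 0 = background, 1..N = cell IDs.
    Returns an array whose entry n is the mass (pg) of label n.
    """
    density = np.asarray(density_mg_per_ml, dtype=float)
    labels = np.asarray(labels)
    # Sum of density over the voxels of each label.
    sums = np.bincount(labels.ravel(), weights=density.ravel())
    # 1 mg/ml = 1e-3 pg/um^3, so mass_pg = sum(density) * voxel volume * 1e-3.
    return sums * voxel_volume_um3 * 1e-3

# Tiny synthetic example: two cells in a 1 x 2 x 3 volume, uniform 100 mg/ml.
labels = np.array([[[0, 1, 1],
                    [2, 2, 2]]])
density = np.full(labels.shape, 100.0)
masses = cell_masses_pg(density, labels, voxel_volume_um3=0.5)
```

Dividing each cell's mass by its volume (voxel count times voxel volume) gives the mean mass density per cell, the quantity plotted against cell volume in Fig. 1n.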

Application examples
NoRI makes it possible, for the first time, to measure the absolute protein and lipid mass concentration of unlabeled cells, subcellular compartments and extracellular tissue. Here we include an analysis of a few cell and tissue types to highlight the range of potential applications of NoRI microscopy (Fig.2).
Cells and organelles can be distinguished by the quantitative value of their distinctive protein and lipid concentrations. For example, myelin sheath (Fig. 2a and 2e) and lipid droplets in hepatocytes showed high lipid concentration over 200 mg/ml (Fig. 2g). Cytoplasmic lipid density, excluding lipid droplets, also varied with cell type. Individual myofibers were recognizable by their distinct cytoplasmic lipid levels (Fig. 2b and 2f) even at much lower concentrations of around 10-30 mg/ml. Kidney tubules also showed variation in cytoplasmic lipid density (Fig. 2d). Cell nuclei are distinguished by the complete absence of lipid (Fig. 2c, h, i, k). Nucleoli showed higher protein concentration than nucleoplasm (Fig. 2h). Non-cellular material such as extracellular matrix or yolk appeared to have characteristic protein and lipid concentrations (Fig. 1k and 2i). Unexpectedly, in some cases protein and lipid concentration varied between cells of the same cell type (Fig. 2h). HeLa cells in asynchronous culture and cerebellar Purkinje neurons showed greater than 2-fold variation in protein concentration in both cytoplasm and nucleoplasm. This variation is intriguing and was reproducible under varying conditions.
To demonstrate live imaging capability, we imaged a live zebrafish embryo at the 6-somite stage (Fig. 2i). The protein and lipid concentrations at this stage did not vary much between embryonic cell types. As expected, yolk granules are packed with proteins and lipids. Individual yolk granules adjacent to the embryo could be recognized because they had lower protein and lipid concentrations, their stores having been partially depleted by the embryo.
Differences in protein and lipid concentration can serve as image contrast for histological analysis 13. To demonstrate the potential for quantitative histology by NoRI, we acquired large area scans from mouse cerebellum and pancreas (Fig. 2j-k). The time to image a 1 cm x 1 cm area was 6 hours at 60x magnification, with 0.38 µm xy resolution and approximately 1.5 µm axial resolution, and 2 hours at 30x magnification at the cost of decreased axial resolution. Various known structures of the cerebellum could be recognized. Large Purkinje neurons of varying cytoplasmic concentration line the boundary between the granular layer and the molecular layer. The neuropil, the tangled mass of submicroscopic dendrites and unmyelinated axons, appears as a diffuse matrix with relatively low protein and lipid concentrations in the molecular layer. Cells of the granular layer could be recognized by their compact nuclei. Dendritic glomeruli between the granular cells showed elevated lipid levels, as expected from their dense membrane structure. White matter was marked by the high lipid content of myelin (Fig. 2j). The NoRI image of the mouse pancreas also shows recognizable histological features. Acinar cells contained a large number of high-protein vesicles, which are likely zymogen granules for storing and secreting digestive enzymes. Protein-dense membranes surround blood vessels and ducts (Fig. 2k). To demonstrate the utility of NoRI quantitative histology, we imaged the brain of an Alzheimer's disease model mouse (APP-PS1) (Fig. 2l). Compared to the normal histology, the APP-PS1 mouse brain showed a large number of lesions in both protein and lipid images 14. The protein image showed amyloid plaques of high protein concentration in the extracellular space, which were surrounded by patches of high lipid concentration (Fig. 2m). The lipid-rich patches were greater in number than the amyloid plaques in the APP-PS1 mouse brain. By contrast, in the wild type brain higher protein concentrations were localized to nucleoli or the cell body, and high lipid signal correlated exclusively with myelin (Supplementary Fig. 5).

Discussion
The accuracy of the absolute concentration measurement by NoRI depends on these key assumptions and approximations: (1) the tissue sample is made up of protein, lipid and water, (2) the Raman spectra of BSA and DOPC are a good representation of the Raman spectra of the average proteome and lipidome of the sample, (3) the wavelength dependence of optical aberration and light scattering is weak, and (4) non-Raman processes are absent.
(1) The first assumption is a good approximation in many mammalian cells, where the combined fraction of protein, lipid and water is greater than 90% of the total chemical composition 15. Exceptions would occur in calcified tissue with a high content of hydroxyapatite, bacterial cells with a high content of nucleic acids, and cells that store large amounts of polysaccharide. Polysaccharides and nucleic acids have a large number of carbon-hydrogen bonds, whose Raman signal overlaps significantly with the CH3 Raman band. As a result, the current 3-component spectral decomposition of NoRI places nucleic acids and polysaccharides into the protein fraction. By measuring additional Raman bands, it is possible to generalize NoRI to measure nucleic acids or polysaccharides in addition to proteins, lipids and water 16.
(2) BSA is chosen as the calibration standard for the protein Raman spectrum because it is economical and available in large quantity, dissolves easily in water, and has a Raman spectrum in the CH2 and CH3 bands similar to that of the average vertebrate proteome. The majority of the carbon-hydrogen bonds of proteins are in the side chains. Therefore, the size of the CH2 and CH3 Raman bands is determined by the average amino acid frequency of the proteome. The frequency of methyl groups, methylene groups and other hydrocarbons in BSA is very similar to that of the average vertebrate proteome (Supplementary Table 1) 17. DOPC is the most abundant lipid in cell membranes, and its methyl and methylene group frequency is representative of a typical lipid with fatty acid chains. However, lipids without long carbon chains, such as sterols, have the CH2 Raman band shifted from the position defined by DOPC. As a result, under the current spectral decomposition scheme sterols are partially miscategorized into the protein fraction, and the lipid and protein fractions should be interpreted with this caution in mind. Identifying a Raman band that enables differential quantification of sterols and lipids with fatty acid chains is a subject for further investigation.
(3) We hypothesized that the intensity attenuation caused by sample light scattering does not change with the Raman band. With a near-infrared light source, the intensity attenuation from tissue light scattering and absorption is indeed nearly independent of wavelength. The Stokes beam wavelength is fixed at 1045 nm and the pump beam wavelength changes between 770 nm and 805 nm to select different Raman bands. The reduced scattering coefficient of typical tissue at 770-805 nm is on the order of 2 mm⁻¹, and its change over the 770-805 nm range is typically less than 0.5 mm⁻¹ 18. Hence, the length scale of tissue thickness over which the light scattering property will show a significant (1/e) deviation between 770 and 805 nm is on the order of 2 mm. Because the SRS process requires a tight focus of light, the depth range for SRS signal detection is <200 µm, which is an order of magnitude smaller than the length scale of the wavelength dependence of tissue light scattering. Optical clearing of the tissue can increase the imaging depth limit to several hundreds of microns, but the light scattering will decrease as well.
(4) SRS imaging is generally free of non-Raman background compared to other nonlinear Raman imaging modalities 19 . However, pigment molecules in the tissue, which absorb infrared light, will give rise to strong non-Raman signals and may even damage the sample by heating. We observed that hemoglobin in red blood cells has a strong non-Raman signal resulting in erroneous absolute concentration output 12 . On the other hand, infrared absorption by melanin in retinal pigmented epithelium or skin led to sample damage by burning, which was prevented by bleaching with hydrogen peroxide.
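The length-scale argument in point (3) can be reproduced with a few lines of arithmetic. The sketch below assumes a simple exponential attenuation model, exp(-µ z), which is an illustrative simplification rather than the authors' analysis; the two input numbers are the bounds quoted in the text, and the function names are hypothetical.

```python
import math

DELTA_MU_PER_MM = 0.5   # upper bound on the change of reduced scattering coefficient (text)
MAX_DEPTH_MM = 0.2      # SRS depth range of < 200 µm (text)

def divergence_length_mm(delta_mu_per_mm):
    """Depth at which two bands differ in attenuation by a factor of 1/e."""
    return 1.0 / delta_mu_per_mm

def relative_attenuation(delta_mu_per_mm, depth_mm):
    """Ratio of attenuation between two bands at a given depth (exponential model)."""
    return math.exp(-delta_mu_per_mm * depth_mm)

length_mm = divergence_length_mm(DELTA_MU_PER_MM)
ratio_at_max_depth = relative_attenuation(DELTA_MU_PER_MM, MAX_DEPTH_MM)
print(f"1/e length: {length_mm:.1f} mm, depth/length: {MAX_DEPTH_MM / length_mm:.2f}")
print(f"band-to-band attenuation ratio at max depth: {ratio_at_max_depth:.3f}")
```

Under this simplified model the imaging depth is one tenth of the 1/e length, so the attenuation ratio between bands stays close to 1 across the accessible depth range when the upper-bound coefficient difference is used.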

Conclusion
Compared to the existing methods for total protein and lipid quantification, NoRI's unique benefit is that absolute quantification can be done with high spatial resolution in intact tissues. Conventional quantification methods are either bulk assays, like the Bradford assay, or semi-quantitative approaches using dyes. Existing cutting-edge techniques such as refractive index tomography and the suspended microchannel resonator can provide absolute mass measurement, but they are limited in the breadth of samples to which they can be applied and cannot distinguish protein from lipid. NoRI expands the technical capability for protein and lipid mass quantification by utilizing broadly applicable assumptions. We expect that NoRI will enable diverse research topics where the amount and distribution of protein and lipid mass is central, including but not limited to cell growth, skeletal development, and neurodegenerative diseases. For example, NoRI measurements of protein density may provide a label-free way to visualize condensation of proteins from liquid-liquid phase separation. The concentration of protein and lipid can serve as a histological marker 20,21. Unlike conventional histological images, which may have different staining intensities from batch to batch, NoRI histology takes advantage of the protein and lipid concentrations, which are intrinsic properties of tissues that can be directly compared between different samples.

Conflicts of interest
The authors filed a patent application on NoRI. XSX has a financial interest in Invenio Imaging, Inc.

Fig. 1. Light scattering normalization pipeline for in situ measurement of absolute concentration of protein, lipid and water. ... Fig. 1e. P stands for protein, L for lipid, and W for water. g. Light scattering normalization mask is calculated from the pixel-by-pixel sum of f. h. NoRI image of absolute concentration is obtained by dividing f by g. Concentration is measured as volume fraction in ml/ml. i. Mass density of protein (P, magenta) and lipid (L, green) is calculated by multiplying volume fraction by the density ρ of pure protein or lipid. j. Spectral decomposition shows decrease of signal with imaging depth. Sample is the rib growth plate cartilage of a mouse on postnatal day 5. P (magenta) is protein signal. L (green) is lipid signal. k. NoRI image of protein and lipid mass density from j after light scattering normalization. Readout is absolute concentration in mg/ml and does not attenuate with imaging depth. l. Protein concentration of BSA solution and NoRI measurement show good agreement. Error bar is standard deviation from all pixels in each image. m. Single cell 3D segmentation of chondrocytes in a mouse growth plate. n. Single cell volume and dry mass density scatter plot of growth plate chondrocytes. Larger cells have lower mass density. NoRI data is acquired from a single thick tissue section (red). Refractive index tomography data is acquired from dissociated chondrocytes pooled from multiple animals (gray). Scale bars are 40 µm in e-i and 20 µm in j and k.
bioRxiv preprint doi: https://doi.org/10.1101/629543; this version posted May 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.