Virtual biomarkers: predicting immune status using label-free holotomography of individual human monocytes and machine learning analysis

Sepsis is an abnormally dysregulated immune response against infection in which the human immune system ranges from a hyper-inflammatory phase to an immune-suppressive phase. Current assessment methods are limiting owing to time-consuming and laborious sample preparation protocols. We propose a rapid label-free imaging-based technique to assess the immune status of individual human monocytes. High-resolution intracellular compositions of individual monocytes are quantitatively measured in terms of the three-dimensional distribution of refractive index values using holotomography, which are then analyzed using machine-learning algorithms to train for the classification into three distinct immune states: normal, hyper-inflammation, and immune suppression. The immune status prediction accuracy of the machine-learning holotomography classifier was 83.7% and 99.9% for one and six cell measurements, respectively. Our results suggested that this technique can provide a rapid deterministic method for the real-time evaluation of the immune status of an individual.


Introduction
Sepsis is a life-threatening condition related to an imbalanced immune response against pathogens 1 .In the early stage of sepsis, the serum levels of several proinflammatory cytokines increase and multiple vital signs change in response to the inflammatory response, called the hyper-inflammatory phase 2 .A prolonged hyper-inflammatory phase can result in patient mortality due to septic shock and multiple organ failure.The human immune system suppresses the inflammatory response to prevent these fatal consequences; however, excessive suppression of the immune response can result in secondary opportunistic infections directly related to high mortality 3 .Such an immune condition that falls below a normal immune response is considered to be an immune paralysis, also known as immune suppression state 4 .In this case, patients may be treated with immunostimulants such as granulocytemacrophage colony-stimulating factor, interferon gamma, or anti-programmed cell death protein 1 which have been tried to increase the capacity of immune response [5][6][7] .Thus, accurate evaluation and understanding of dynamic immune status can significantly affect the treatment and prognosis of patients with sepsis 8,9 .
Although biochemical blood tests are commonly conducted to evaluate the immune status of a sepsis patient, the limited specificity hinders an accurate assessment 10,11 .Moreover, the variations in interpatient reference values of inflammatory biomolecules such as C-reactive protein and procalcitonin limit their validity as universal diagnostic biomarkers 12 .In an effort to find more immunologically specific biomarkers of sepsis, scientists and researchers have highlighted the role of monocytes in the pathogenesis of sepsis.For instance, the level of human leukocyte antigen DR, expressed on circulating monocytes (mHLA-DR), is highly associated with the immune status of sepsis patients 13 .Several studies show that the mHLA-DR expression levels could be used as a deterministic biomarker for evaluating the immune status in sepsis patients, as a decrease in mHLA-DR expression showed an immune paralysis followed by poor prognosis [14][15][16] .Nonetheless, the measurement of mHLA-DR expression levels entails several hours of experimental procedures, and the resulting signal is often qualitative.For these reasons, investigating monocytes has not been suitable for determining the immune status of sepsis patients in real time.
Here we addressed the current limitations of evaluating the immune status by exploiting holotomography, a three-dimensional (3D) quantitative phase imaging (QPI) technique, and machine learning approaches in assessment of individual label-free monocytes.Holotomography enabled 3D reconstruction of refractive index (RI) distribution of individual live unlabeled monocytes without exogenous labeling 17 .Using the correlations between 3D RI distributions of subcellular structures, we extracted a single monocyte's morphological and RIdriven biophysical features, which we termed virtual biomarkers, in three different stages of sepsis: control, hyper-inflammation, and immune paralysis.We found significant alterations in the virtual biomarkers including intracellular mean RI of monocytes as the septic stage progresses.Further analysis, based on artificial intelligence, enabled end-to-end predictions of the immune status of individual monocytes with an accuracy of 83.7%.Taking the ensemble average of predictions from six individual monocytes secured >99% prediction accuracy of immune status, suggesting this study will provide significant evidence for determining the therapeutic scheme of sepsis, which requires a rapid immune status evaluation.

Holotomograms of monocytes in different immune status
We measured and analyzed a total of 4059 holotomograms of monocytes from healthy donors, and induced different immune status to the monocytes.To isolate monocytes from the peripheral blood mononuclear cells (PBMCs) of 11 healthy donors, we used a magnetic sorting method (Fig. 1A; for details see also Methods).After isolating primary monocytes, we cultured and differentiated them into three immune statuses according to a wellestablished in vitro sepsis model (control, hyper-inflammation, and immune paralysis model) [18][19][20] .The viability of individual monocytes was greater than 95% (data not shown).Single monocytes of each immune status were imaged using a holotomography system (Fig. 1B; HT-2H, Tomocube Inc., Daejeon, Korea).In our experiments, 49 off-axis holograms were acquired at varying illumination angles, from which 3D RI holotomograms of monocytes were reconstructed using the principle of optical diffraction tomography (Fig 1C) 21 .The resolution of the used holotomography system were 110 nm and 330 nm laterally and axially, respectively.The reconstructed holotomograms of monocytes clearly exhibited detailed subcellular structures in agreement with our knowledge of intracellular organelles such as nucleus, nucleoli, and intracellular granules (Fig. 1D).The RI range of monocytes was 1.34-1.41,which can be found in typical biological samples 22 .Representative holotomograms showed marked structural differences between the control and the other groups (hyper-inflammation and immune paralysis).The differences were in the cytoplasmic RI and the number of high-RI granules (Fig. 1D).

Quantitative analysis of a single monocyte
The monocytes of varying immune status were quantitatively compared, using cellular properties that were acquired from each holotomogram: the cell volume, surface area, aspect ratio, mean RI, and dry mass.We first assessed morphometries of monocyte by segmenting the holotomogram into single cells and background using the Otsu method (Fig. 2A) [23][24][25] .We extracted several morphological features, including cell volume and surface area, and several RI-related biophysical features, including mean RI and the standard deviation of RI distribution (Fig. 2B and Table 1).Moreover, partial properties of single monocyte for specific RI values were extracted by setting the RI range at a resolution of 0.005 from 1.36 to 1.39 (Table 1 and Extended Data Fig. 1).Since the RI values of biological samples are linearly proportional to their protein concentrations (and most subcellular components can be regarded as proteins), we could determine the dry mass of the monocytes using mean RI values and the RI increment (0.185 ml/g) 26,27 .The monocytes of immune paralysis model displayed a significantly smaller volume on average, compared to those of the control and hyper-inflammation model.In the case of the hyper-inflammation model, it showed significant decrease in surface area and aspect ratio.The mean RI exhibited a decrease in the hyper-inflammation model and an increase in the immune paralysis model as compared to that in the control.The dry mass showed no significant difference according to the immune status.These results suggested that the biophysical features derived from intracellular RI distribution reflect the immune status of the monocytes.

Identification of high-RI granules as LD
Holotomograms of the monocytes in septic condition revelaed an increased number of intracellular granules, which represent particularly high-RI value compared to those of the control.With the evidence of previous studies showing intracellular high-RI granules as lipid droplets (LDs) 27 , we stained monocytes with BODIPY and compared fluorescent areas and segmented areas of intracellular granules obtained from maximum intensity projection (MIP) images with varying RI threshold value (Extended Data Fig. 2A).We could identify the high-RI granules as LDs and determine RI threshold value in consistency with a previous study in which LDs were segmented from holotomograms (Fig. 2C and Extended Data Fig. 2B) 28 .Next, we examined LD volume in single monocytes, and the result showed an increase in LD volume as the septic condition progressed (Fig. 2D).The average RI value of LDs was constant regardless of the immune status, implying that the concentrations of LDs were not significantly altered according to the immune status (Fig. 2D).We then calculated the intracellular LD mass of single monocytes using the linearly proportional relationship between lipid density and RI (the average RI increment of LDs used in this study was 0.135 ml/g) 27 .The results showed that the intracellular LD mass was increased under prolonged septic condition (Fig. 2D).

The correlation between the immune status and the virtual biomarkers
To identify the correlation between the virtual biomarkers and the immune status, we performed feature-based machine learning.We calculated Z-scores of the 64 virtual biomarkers between subgroups of monocytes that were in different immune status models, as well as from varying donors, and summarized it as a heat map (Fig. 3A and Table 1).The heat map revealed unique holotomographic feature distribution of each immune status including the features that we verified by quantitative analysis.To analyze the dominant features over the immune status, we averaged and normalized each holotomographic feature of total monocytes in the same immune status; this was substantiated by a polar plot of each facet that was representative of all 64 features (Fig. 3B).The results showed that the dominant features differed among the immune statuses (control; F1-F4 and F58-F64, hyper-inflammation; F14-F31, and immune paralysis; F39-F56).In particular, we observed that the LD-related features (F39-F41) in the immune paralysis model were high, whereas the hyper-inflammation model exhibited larger RI standard deviations in all RI subranges (F23-F27) (Fig. 3B and Table 1).These findings suggested that the difference in features appeared for each immune status was reflected in the spatial distribution of the RI based on different physiologies and depended on the immune status.
To visualize the overall distribution of the total monocytes in the 64-dimensional feature space, we used the tstochastic neighbor embedding algorithm (t-SNE) that reduces the multi-dimensional feature space into a threedimensional feature space (Fig. 3C).We compartmentalized the t-SNE space into three regions with spherical gate and compared the ratio of immune status contained in each gate (Fig. 3D).The results showed that monocytes belonging to the same immune status were located close to each other in the holotomographic feature space.To verify that the correlation between the immune status and virtual biomarkers is consistent regardless of monocytes from different donors, we calculated and normalized the Euclidean distances between the averaged Z-score of each feature and presented them as a heat map (Fig. 3E).The inter-immune status Euclidean distance was larger than the inter-individual Euclidean distance, indicating that the variations of virtual biomarkers between individual donors of the same immune status were small relative to variations between immune statuses.These findings suggest that the virtual biomarkers of monocytes can be a general biomarker applicable to various individuals.

Immune status prediction using deep learning classification
Motivated by the differences in monocytes among immune statuses, statistical prediction of the immune status from the holotomograms of a monocyte was implemented via a deep learning classifier for holotomograms of monocytes with determination of the immune status at an accuracy of 99.9%, which was achieved by using six holotomograms for prediction.The deep image-processing architecture used in this study includes four dense blocks comprising 12, 24, 64, and 64 convolutional layers, with each dense block comprising a series of 3D convolutional layers (Extended Data Fig. 3A).The connections between distant layers of the architecture facilitate the construction of a deep network that effectively utilizes features of various scales.From a total of 4,059 monocyte holotomograms, 407 holotomograms were separated for blind tests, and another set of 407 holotomograms was separated to validate the generalization ability of the trained models.The remaining holotomograms were used for training, with the loss function calculated from the classification result of these holotomograms.The blind test accuracy based on the holotomogram of an individual monocyte was 83.7% (Fig. 4A).Among the three immune statuses, the control group was the most accurately differentiated at an accuracy of 89.1%.In the immune paralysis and hyper-inflammation models, the accuracy was 82.1% and 79.4%, respectively (Fig. 4B).The most frequent error was the misclassification of the hyper-inflammatory monocytes into the control class, occupying 12.7% of the total test data in the hyper-inflammatory class.In addition to the accurate prediction based on a single monocyte, superior performance was obtained by reflecting multiple monocyte holotomograms in the prediction (Fig. 4C).The output of the deep learning classifier is averaged over multiple holotomograms, securing high accuracy.In Figure 4C, the immune status was correctly determined as hyper-inflammatory, even when two out of three holotomograms of hyper-inflammatory state might have led to incorrect decisions.By taking the ensemble average of the classifier outputs resulting from multiple holotomograms, the accuracy in predicting the immune status increased considerably with the number of holotomograms, and the overall accuracy reached 99.9% with six holotomograms (Fig. 4D and Extended Data Fig. 4B-C).

Discussion
In summary, this study suggested that 3D RI imaging using holotomography can be leveraged to identify and predict the immune status of sepsis without exogenous labeling.Holotomography enabled quantifications of intracellular structural changes of single monocyte under pathological conditions, which is revealed by 3D RI distribution.Multi-variate analysis based on a machine learning approach indicated that monocytes were clustered into subsets for each immune status and were independent of subjects.Most importantly, deep learning classification method could be implemented to enable label-free prediction of the immune status of monocytes with an accuracy higher than 99%.Future work will be focused on improving the examination speed for applications in clinical settings.Combined with immune cell enrichment technology, more immediate immune status identification is possible in clinical settings using the proposed strategy.A recent technology for rapidly separating PBMCs from whole blood using a microfluidics chip can reduce the sample preparation time to approximately few minutes 29 ; however, there is currently no label-free method for isolating specific subtype of the immune cells.The application of image-based deep learning methods can potentially address these limitations to support the exclusive selection of monocytes from a cocktail of several immune cells.
Further investigations to prove the correlation between biomolecular information and virtual biomarkers would clarify the underlying mechanism of immune status change at the single-cell level.Except for a few subcellular organelles that have distinct RIs (e.g., LDs), the RI of subcellular structures mostly composed of proteins is affected only by protein concentration; biomolecular changes in subcellular structure will not affect RI data unless protein concentration changes.However, the deep learning can potentially address this ambiguity of biomolecular information.A previous study reported accurate prediction of subcellular organelles using holotomograms by applying deep learning algorithms to "learn" pairwise RI and fluorescence-labeled data of several cell lines 28 .Similarly, by training paired data of holotomograms and fluorescent images of biomolecules related to the immune status such as mHLA-DR, a mechanism of the immune status change can be explained and analyzed in a biomolecular specific manner.
Evaluating the immune status of patients requires promptness and accuracy for clinicians to establish an appropriate treatment plan.However, there has been a lack of methods for both rapidly and accurately measuring the immune status so far.With the potential diagnostic capacity of holotomography, we believe that our study would provide an important technique for the identification of the patient's immune status and establishment of a treatment strategy in cases of sepsis.

Blood collection and monocyte enrichment
This study was approved by the Internal Review Board (IRB) of KAIST (Approv.No. KH2017-004).All procedures were in accordance with the Helsinki Declaration of 2000 and informed consent was obtained from all participants.Peripheral blood obtained from healthy volunteers was collected in EDTA tubes.After peripheral blood mononuclear cells (PBMCs) are collected by density gradient centrifugation, monocytes are enriched using Pan Monocyte Isolation Kit (Miltenyi Biotec, Bergisch Gladbach, Germany) according to the manufacturer's protocol.In brief explanation, 10mL of whole blood was diluted 1:1 with commercialized buffer, autoMACS® Rinsing Solution (Miltenyi Biotec, Bergisch Gladbach, Germany) containing bovine serum albumin (Miltenyi Biotec, Bergisch Gladbach, Germany).20 mL of diluted blood was gently layered on top of 15 mL of Histopaque®-1077 (Sigma-Aldrich, MO, USA) and centrifuged at 700 g for 30 minutes at room temperature.The mononuclear layer was carefully collected and washed twice.Collected PBMCs are resuspended with MACS buffer and appropriate antibodies conjugated with magnetic beads (Miltenyi Biotec, Bergisch Gladbach, Germany).Subsequent magnetic cell separation proceeded with a column in the magnetic field followed by a collection of unlabeled cells, representing the enriched monocytes.Collected monocytes were resuspended in RPMI-1640 (Gibco®, CA, USA) supplemented with 10% Fetal bovine serum (Gibco®, CA, USA) and 1% penicillin/streptomycin (Sigma-Aldrich, MO, USA) and aliquoted.

In vitro sepsis model preparation
A method for inducing the in vitro sepsis model can be found in several pieces of literatures 19,30 .To induce the Immune paralysis model, the prepared monocytes were treated with LPS at a concentration of 10ng/ml and humidified temperature containing 5% carbon dioxide (CO2) at 37°C for 4 hours.Afterward, the monocytes were washed twice with PBS and resuspended with complete media.After 16 hours, LPS was treated to a concentration of 10ng/ml, and after 4 hours, monocytes were imaged.To induce the hyper-inflammation model, the prepared monocytes were cultured for 20 hours in a 37°C 5% CO2 incubator.Thereafter, LPS was treated to a concentration of 10ng/ml, and after 4 hours, monocytes were loaded into the imaging dish at a concentration of 10^6 and measured.The control model was measured by loading the enriched monocytes at a concentration of 10^6 in the imaging dish after culturing in complete media for 24 hours without any treatment.

Lipid droplet staining and fluorescence imaging
To verify the high RI value representing granules, BODIPY™ 493/503 (Invitrogen, Carlsbad, California) was stained.After mixing with complete media at 1:1000 (v/v ratio), monocytes were cultured at humidified temperature containing 5% carbon dioxide (CO2) at 37°C for 5 minutes.Then stained monocytes were washed with PBS to remove the remaining staining agent and resuspended with complete media.To image stained lipid droplets, we utilized the fluorescence channel to measure the 3D tomograms using stacked wide-field acquisition with a 0.313μm step size followed by 3D deconvolution (excitation center wavelengths of 470nm).Excitation light intensity and exposure time were manually adjusted to visualize the target structure at consistent intensity levels.

Holotomography imaging
Once monocyte isolation and sample preparation are done, prepared monocyte suspensions were loaded onto special imaging dishes (Tomodish, Tomocube Inc., Daejeon, Korea) and covered with a coverslip.After the monocytes had sunk into the monolayer, 3D RI tomograms of single monocyte were obtained using standardized microscopes (HT-2H, Tomocube Inc., Daejeon, Korea).The principles and implementations of holotomography have been extensively reviewed elsewhere 31 .The specific implementation used here was based on holographic field retrieval with multi-angle Mach-Zehnder interferometry using coherent 532-nm laser light steered by a digital micromirror device (DMD).The 3D RI tomograms were reconstructed by mapping the measured field information to the 3D Fourier space and filling the missing cone using 40 iterations of the non-negativityconstrained missing cone recovery algorithm.The acquisition time for gaining 49 interferograms for 3D RI tomogram reconstruction was about 0.5 seconds.All the measurements were obtained according to monocyte viability within 30 min for each dish, after which the sample was loaded into a new dish and the process repeated.

Gross morphological features extraction
Using the holotomography microscope (Tomocube Inc, Daejeon, Korea), we took several cells and quantitatively analyzed the morphological features.After region of interest (ROI) is segmented, we could calculate the volume, surface area, solidity, aspect ratio, and other morphological features depicted in SI using MATLAB internalized function of region properties.
A cell can be assumed to be suspensions composed of proteins in the lipid membrane and the cytoplasm, we can infer the protein concentrations through the RI distribution of the cell.When the concentration of media is known, the concentration of protein can be calculated by using the following equation.
Here,   is the protein concentration,   is the RI of segmented cell,   is the RI of the suspension media, and  is the refractive index increment, which is the same as the value of 0.185 ml/g for all proteins 26,27 .The protein mass of a cell can also be calculated from the protein concentration and volume.The mass of the lipid droplet can be obtained in the same way.
Here,   is the concentration of LDs,   is the RI distribution of segmented LDs,   is the RI of the suspension media, and  is the refractive index increment, which is the same as the value of 0.135 ml/g for all lipid component 27 .The LD mass of a cell can also be calculated from the LDs concentration and volume.

Deep learning algorithm
The stochastic gradient descent algorithm was used for optimization at a mini-batch size of 32.To prevent the model from halting at the local minimum of the loss, the step size of the optimization was varied with cosine annealing at an initial step size of 0.001 and a period of 32 epochs.To mitigate overfitting, the training dataset was numerically augmented every epoch.This random augmentation involved horizontal crop, horizontal rotation, and the addition of Gaussian noise.The artificial neural network and the optimization were implemented using PyTorch (v1.0.0; https://pytorch.org/).We boosted the classification performance by integrating the predictions of several of the best-performing classifier models.Additionally, we chose the models with the highest accuracies for the training and validation datasets to exploit rich features and reduce oversights in individual models.Four parameters related to the multi-model integration were scanned to seek the best integration strategy: the number of integrated models, weighting between the accuracies for the training and validation datasets, result normalization, and the approaches to integrate the predictions.Four approaches to integrate the predictions included taking the average, taking the exponential average, voting, and taking the maximum projection of the output.We chose the combination of these parameters that resulted in the highest validation accuracy.We optimized the classifier using gradient-based reduction of the cross-entropy loss function.

Fig. 1
Fig. 1 The experimental process and 3D holotomograms of monocytes.A Experimental scheme of in vitro sepsis model.B Schematic representation of holotomography.C Reconstruction principles of 3D holotomograms.Interference pattern of the hologram is shown in yellow box.D Representative images of monocytes.3D rendered images are shown in a dotted cube.2D maximal intensity projection images are shown in gray scale.Scale bar: 5 μm.Peripheral blood mononuclear cell: PBMC; Magnetic-activated cell sorting: MACS; Lipopolysaccharide: LPS; L1-4: Lens 1-4; M1-2: Mirror 1-2; CL: Condenser lens; OL: Objective lens; BS: Beam splitter; FC: Fiber coupler.

Fig. 2
Fig. 2 Quantitative analysis of individual monocytes.A Cell boundary was determined using the Otsu method.The blue boundary represents a cell boundary.Scale bar: 5 μm.B Statistical comparison of the morphological features of the monocytes according to the immune status.C Comparison of BODIPY staining location with high-RI granules greater than the RI threshold (1.39).High-RI granules and BODIPY staining are identical in location, with high-RI granules marked in crimson.Scale bar: 5 μm.D Comparison of LD contents according to immune status.Lipid droplet: LD; Refractive index: RI; Maximal intensity projection: MIP.Unpaired two-sided Student's t-test was used to calculate the P values.P < 0.05 was considered statistically significant.Central black lines depict the median, upper and lower hinges depict the interquartile range (IQR), and whiskers depict the range of data within hinges ± 1.5 × IQR.ns: P ≥ 0.05; *: P < 0.05; **: P < 0.01; ***: P < 0.001.

Fig. 3
Fig. 3 Monocytes exhibit different morphological parameters by immune status.A Z-score heat map of the morphological features of monocytes according to the immune status.Rows represent the Z-scores of morphological features extracted from the monocyte holotomograms, whereas columns represent blood donors.Each pixel is an individual average value of a morphological feature.The feature number is indicated by F. The dendrogram shows the Euclidian distance between features.Color bar represents Z-score.B Average polar distribution of features according to immune status.Radial value represents normalized Z-score of features.The angular value indicates the feature number.C Sixty-four morphological features were reduced to two dimensions using t-stochastic neighbor embedding (t-SNE).Each dot represents a single monocyte with each color representing immune status.The spherical gates are shown as a dotted line.D The composition ratio by immune status in each gate.Each color represents the immune status same as (C).E Inter-individual Euclidian distance heat map.Green mark indicates the control model, red mark indicates hyper-inflammation, and blue mark indicates immune paralysis.Color bar represents normalized Euclidian distance.

Fig. 4
Fig. 4 Deep learning-based determination of the immune state A, B The performance of determining the immune status from a single monocyte holotomogram is illustrated in (A) a confusion matrix and (B) a bar plot.C Strategy for a more accurate determination of the immune state, based on multiple monocyte tomograms.All the three MIP images of holotomograms are from the hyperinflammatory model.Scale bar: 5 μm.D The accuracy of determining the immune for various numbers of available tomograms.

Fig. 1
Partial RI features extraction.A Segmentation scheme for each partial RI range.Purple indicates RI range from 1.37 to 1.375 and orange indicates RI range from 1.38 to 1.385.B Distance skewness and kurtosis are defined as the voxel distribution for the distance from the centroid (R) of the segmented cell to the voxel corresponding to the selected RI ranges.C The number of voxels segmented by the RI threshold according to the distance from the centroid is shown in histogram.Purple indicates RI range from 1.37 to 1.375 and orange indicates RI range from 1.38 to 1.385 Extended Data Fig. 2 Identification of high RI granules as LDs.A Comparison of segmented and fluorescent areas according to RI thresholds.It can be seen that the segmented region is different according to the RI threshold on the upper right panel.BODIPY labeled LDs are fluorescence in green, and regions with values exceeding the RI threshold are indicated in pink.B The voxel counts with an RI value of 1.39 or higher in total monocyte voxel.C LDs segmented to the RI 1.39 threshold are shown in pink in MIP image.