Skip to main content
bioRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search
New Results

Data-driven multiplexed microtomography of endogenous subcellular dynamics

View ORCID ProfileYoungJu Jo, Hyungjoo Cho, Wei Sun Park, Geon Kim, Donghun Ryu, Young Seo Kim, Moosung Lee, Hosung Joo, HangHun Jo, Sumin Lee, Hyun-seok Min, Won Do Heo, YongKeun Park
doi: https://doi.org/10.1101/2020.09.16.300392
YoungJu Jo
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
6Present address: Departments of Applied Physics and of Biology, Stanford University, Stanford, CA 94305, USA
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for YoungJu Jo
Hyungjoo Cho
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Wei Sun Park
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Geon Kim
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Donghun Ryu
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Young Seo Kim
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Moosung Lee
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Hosung Joo
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
HangHun Jo
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Sumin Lee
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Hyun-seok Min
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Won Do Heo
4Department of Biological Sciences, KAIST, Daejeon 34141, Republic of Korea
5Center for Cognition and Sociality, Institute for Basic Science (IBS), Daejeon 34141, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
YongKeun Park
1Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
2KAIST Institute for Health Science and Technology, KAIST, Daejeon 34141, Republic of Korea
3Tomocube Inc., Daejeon 34051, Republic of Korea
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: yk.park@kaist.ac.kr
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Preview PDF
Loading

Abstract

Simultaneous imaging of various facets of intact biological systems across multiple spatiotemporal scales would be an invaluable tool in biomedicine. However, conventional imaging modalities have stark tradeoffs precluding the fulfilment of all functional requirements. Here we propose the refractive index (RI), an intrinsic quantity governing light-matter interaction, as a means for such measurement. We show that major endogenous subcellular structures, which are conventionally accessed via exogenous fluorescence labeling, are encoded in 3D RI tomograms. We decode this information in a data-driven manner, thereby achieving multiplexed microtomography. This approach inherits the advantages of both high-specificity fluorescence imaging and label-free RI imaging. The performance, reliability, and scalability of this technology have been extensively characterized, and its application within single-cell profiling at unprecedented scales has been demonstrated.

Introduction

Imaging is the process of mapping a variable, called contrast, in space and time. The tradeoffs between different contrast mechanisms fundamentally determine the distinct characteristics of each imaging modality (Mertz, 2019). In biomedicine, fluorescence (FL) has been a canonical imaging contrast over several decades for visualizing specific elements within biological systems, powered by chemical, immunological, and genetic labeling strategies (Lichtman and Conchello, 2005). Despite the excellent biochemical specificity, however, a number of drawbacks are associated with FL: photobleaching and phototoxicity (limited temporal window), spectral overlap (limited multiplexing), variability of labeling quality (limited reproducibility and potential bias), and exogenous labeling-induced side effects (potential perturbation of endogenous biology).

Recent advances in machine learning triggered an interesting approach, known as cross-modality inference or in silico labeling. This approach achieves FL contrast by measuring another contrast with complementary characteristics (Christiansen et al., 2018; Nygate et al., 2020; Ounkomol et al., 2018; Rivenson et al., 2019). Most notably, the paring of images from simultaneous FL and bright-field (BF) microscopy, based on light absorption contrast, was used to train neural networks to convert BF images into FL images (Christiansen et al., 2018; Ounkomol et al., 2018) (“BF2FL”). Despite the ingenuity of this method enabling computational staining of unlabeled samples, the minimal absorption at the cellular and subcellular levels raises a fundamental question about BF2FL: is light absorption an optimal contrast mechanism for cross-modality inference?

An alternative approach utilizes optical phase delay as the measured contrast (Nygate et al., 2020; Rivenson et al., 2019). Unlike absorption, phase distributions have a significant contrast in space even at the subcellular level, and forms the basis of phase-contrast microscopy. Emerging quantitative phase imaging (QPI) technologies measure the phase images of unlabeled samples with high sensitivity (Park et al., 2018), which could be paired with FL images for cross-modality inference. Although the improved contrast mechanism is expected to facilitate robust performance, so far, this method has not been extended beyond 2D imaging, which has significant limitations for the detailed characterization of subcellular dynamics.

In this study, we report a new technology for data-driven multiplexed microtomography of endogenous subcellular structures and dynamics across various spatiotemporal scales. We fundamentally improved the cross-modality inference framework by introducing the contrast mechanism of the 3D refractive index (RI), which is an endogenous quantity governing light-matter interaction including both absorption and phase delay. The diffraction-limited 3D RI tomograms, measured by high-speed multi-angle holography (or holotomography), enabled the scalable inference of multiple 3D FL tomograms for the corresponding subcellular targets (“RI2FL”). Importantly, we found that RI2FL generalizes across cell types, outperforming BF2FL in all quantitative measures. Moreover, Bayesian uncertainty quantification schemes provided measure of spatiotemporal reliability. We demonstrated a proof-of-concept application of RI2FL in cell biology and high-throughput screening via unprecedented single-cell profiling.

Results

We sought to determine the quantitative relations linking RI distribution to subcellular targets in a data-driven manner by training 3D convolutional neural networks to translate a RI tomogram into FL tomograms corresponding to multiple subcellular targets (Figure 1A). We first created a large-scale dataset consisting of ~1,600 3D RI tomograms (at 532 nm wavelength) and the corresponding 3D FL tomograms from 6 subcellular targets (actin, mitochondria, lipid droplets, plasma membranes, nuclei, and nucleoli) and 6 eukaryotic cell types (NIH3T3, COS-7, HEK 293, HeLa, MDA-MB-231, and astrocyte) using standardized holotomographic microscopes equipped with FL channels (Kim et al., 2017) (Methods). To train the networks, we used a subset of NIH3T3 tomograms only and held out all other data, in order to test the generalization of the discovered RI-target relations across cell types (Table S1).

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table S1.

Dataset summary.

Figure 1.
  • Download figure
  • Open in new tab
Figure 1. Data-driven microtomography of multiple subcellular structures.

(A) Concept of RI2FL. (B) Examples of RI2FL across cell types and subcellular targets. All bounding boxes represent a volume of 76.8 × 76.8 × 12.8 μm3.

The distinct nature of RI and FL presents several experimental and computational challenges. Firstly, while RI is an absolute and unbiased quantity independent of the experimenter or instrument, FL signals are heavily dependent on labeling quality, illumination power, and exposure time (Lichtman and Conchello, 2005; Mertz, 2019). To amend this, we implemented tight quality control procedures carried out by trained cell biologists throughout the data acquisition and processing pipeline, thereby establishing ground truth subcellular targets defined by 3D FL (Methods). Secondly, the drastic differences between the FL channels require the painstaking optimization of individual target-specific network architectures. We instead utilized a single, highly flexible network architecture for all subcellular targets, powered by a large-scale neural architecture search, which was also potentially advantageous for an extension to additional subcellular targets. (Figure S1). Thirdly, high-resolution 3D RI tomograms have enormous memory demands infeasible for most GPUs. To avoid this, we assumed that the target-specific patterns could be identified from local (~10 μm) distribution of RI, and implemented patch-based parallel processing (Figure S2). With these strategies in hand, we successfully trained the networks for RI2FL inference (Figure 1B, Figure S3, and Video S1).

Figure S1.
  • Download figure
  • Open in new tab
Figure S1. Network architecture.

We used a single encoder-decoder network, systematically discovered by SCNAS, for all subcellular targets.

Figure S2.
  • Download figure
  • Open in new tab
Figure S2. Patch-based processing.

An example inference from an RI tomogram to the corresponding actin tomogram is illustrated as a flow chart, highlighting the patch-based processing for GPU memory management. All images represent maximum intensity projections of 3D data.

Figure S3.
  • Download figure
  • Open in new tab
Figure S3. Comparison between RI2FL and BF2FL.

An example inference with ground truth actin tomogram is shown. Clearly, RI tomograms enable more accurate FL inference compared to BF stacks, as quantified in Figure 2C. The bounding boxes represent 76.8×62.1 ×12.8 μm3 volumes corresponding to an identical FOV.

We characterized the performance of RI2FL by quantitatively comparing the inferred and ground truth FL in the held-out dataset. The prediction accuracy across cell types is illustrated in Figure 2A. Strikingly, not only NIH3T3 cells but also all other cell types, which were never presented to the networks during training, showed high performance. In particular, an excellent accuracy for astrocytes, which were obtained from primary cultures unlike other immortalized cell lines, strongly supported that RI2FL captured fundamental RI-target relations generalizing across cell types. The per-target performance is presented in Figure 2B. While the high accuracy for nuclei and lipid droplets is consistent with the high RI contrast of these targets (Lee et al., 2019; Park et al., 2020), all the remaining targets, which are hardly recognizable via the visual inspection of RI tomograms, showed comparable performances. In addition, in order to rule out the labeling-induced bias of the dataset (which is unlikely due to the low density of fluorophores (Yoon et al., 2018)), we performed RI2FL with unstained samples, and then stained the samples to obtain the corresponding FL ground truth, which showed consistent results (Figure S4). Together, our results confirmed the seamless identification of endogenous subcellular targets by RI2FL (see Table S2 for additional performance metrics).

Figure S4.
  • Download figure
  • Open in new tab
Figure S4. Comparison between pre- and post-staining data.

In order to validate the inference of endogenous subcellular targets, we imaged identical cells before and after staining. The qualitative correspondence between the pre- and post-staining data, despite the irreducible discrepancy due to the temporal difference (from minutes to an hour) and fixation, further validates the successful operation of RI2FL in unlabeled cells.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table S2.

Performance metrics.

Figure 2.
  • Download figure
  • Open in new tab
Figure 2. Performance, reliability, and scalability.

(A-C) RI2FL performance quantification across (A) cell types, (B) subcellular targets, and (C) methods. All quantifications were conducted on the held-out data. Mean±S.D.; further statistics are provided in Table S2. (D) Uncertainty quantification example. Top: The dynamics of cell division observed with timelapse RI tomograms. White arrows indicate the dividing cell and its daughter cells. Bottom: Corresponding nuclei inference and its data uncertainty maps. The red arrow highlights the larger uncertainty due to the putative breakdown of the nuclear envelope, which is specific to this cell state. The bounding boxes represent a volume of 49.2 × 49.2 × 12.8 μm3. (E) Scaling up RI2FL to large FOV tomograms exploiting the shift-invariance of the RI-target relations. The larger bounding boxes represent a volume of 480 × 480 × 13 μm3. (F) Scaling up RI2FL in space and time, further exploiting the time invariance of the RI-target relations. The bounding boxes represent a volume of 307×307×13 μm3. In (E and F), only four FL (actin, mitochondria, nuclei, and lipid droplets) channels are shown for visual clarity.

Next we quantified the advantage of RI over BF (Christiansen et al., 2018; Ounkomol et al., 2018). Because RI includes both absorption and phase delay information, one can reconstruct stacked BF intensity images from RI tomograms using Fourier optics (Mertz, 2019) (Methods). We trained BF2FL networks with the reconstructed BF images and the corresponding FL tomograms (Figure S3), and compared the performances of BF2FL and RI2FL (Figure 2C). Clearly, there was a considerable margin between the two methods. This is not surprising because the optical information in stacked BF images is only a tiny subset of the full RI information, particularly in cellular systems where phase delay dominates over absorption (Park et al., 2018).

Despite the exciting opportunities provided by RI2FL in addition to previous cross-modality inference approaches, the extent to which we can “trust” the model predictions in space and time has not been clear. We aimed to elucidate this by applying the recent advances in Bayesian deep learning (Gal and Ghahramani, 2016; Wang et al., 2019). To this end, we quantified the uncertainty maps to guide the endusers with “error bars” accompanying the FL predictions. Intuitively, uncertainty can be estimated as the voxel-wise variability of predictions upon perturbation of the data or model (Figure S5). An example demonstrating uncertainty quantification in RI2FL is presented in Figure 2D (also see Video S2). During animal cell division, the nuclear envelope breaks down to facilitate the separation of aligned chromosomes by the spindle apparatus. This specific event, which is rare in the training dataset, makes nuclei prediction by RI2FL particularly challenging around the periphery of the nuclei. The specific increase in uncertainty at this stage (red arrow) casts a cautionary signal for downstream analyses. As such, uncertainty quantification can provide spatiotemporal reliability measures for the end-users, on top of the holistic accuracy metrics. The uncertainty maps can also guide data collection to strengthen the model.

Figure S5.
  • Download figure
  • Open in new tab
Figure S5. Uncertainty quantification schemes.

(A) Data uncertainty and (B) model uncertainty quantifications were conducted by test-time augmentation and Monte Carlo dropout, respectively. All images represent maximum intensity projections of 3D data.

RI2FL is intrinsically scalable in space and time because the RI-target relations are shift- and time-invariant. The trained models, together with the patch-based processing (Figure S2), can be readily applied to the large field-of-view (FOV) RI tomograms obtained by image stitching or high space-bandwidth product techniques (Baek et al., 2019; Zheng et al., 2013). We successfully operated RI2FL for tomograms with large FOVs up to 480×480×13 μm3 without trading off the high spatial resolution (Figure 2E). In addition, RI2FL can be sequentially applied to time-lapse large-FOV tomograms, as we demonstrated in the recordings up to 72 hours (Figure 2F and Video S3). Importantly, there is no theoretical upper limit in the spatial and temporal scales by virtue of holotomography that is free from photobleaching and phototoxicity; the only practical limitations are computing time and memory, which linearly scale with the data dimensions.

One exciting application of RI2FL is time-resolved hybrid single-cell profiling for use in cell biology and high-throughput screening. The image-based profiling of single cells with standardized data acquisition and interpretable feature extraction has provided insights into new phenotypes and cellular heterogeneity, thereby complementing genomics (e.g., Cell Painting with CellProfiler (Bray et al., 2016)). While this capability is critically dependent on highly multiplexed FL imaging, the spectral overlap issue limits such measurement to fixed cells via multi-round imaging. On the other hand, multiplexed microtomography with RI2FL in intact living cells allows for the time-resolved profiling of single-cell phenotypes. Furthermore, RI contributes additional information orthogonal to FL. Traditionally, RI has been a uniquely suitable modality for ultrasensitive quantification of subcellular mass (Barer, 1953), which is particularly relevant for the study of cell cycle and growth (Cooper et al., 2013; Mir et al., 2011). This quantitative nature of RI can be synergistically combined with the specificity of FL to access new dimensions for single-cell profiling (Figure 3A; Methods).

Figure 3.
  • Download figure
  • Open in new tab
Figure 3. Application to single-cell profiling at unprecedented spatiotemporal scales.

(A) Fully automated single-cell profiling pipeline based on RI2FL. The bounding boxes represent a volume of 76.8 × 76.8 × 12.8 μm3. (B) Unsupervised embedding of single cells based on the extracted single-cell features. (C) Visualization of three representative features mapped onto single cells. (D–G) Perturbation experiments with (D and E) growth factor and (F and G) chemogenetic stimulation. (D and F) Time-lapse RI2FL. RI and three FL (actin, nuclei, and plasma membranes) channels are shown. The bounding boxes represent a volume of (D) 76.8 × 55.4 × 12.8 μm3 and (F) 76.8 × 76.8 × 12.8 μm3. (E and G) The corresponding feature dynamics over time. Only three out of 65 features are shown for visual clarity.

For a sanity check, we profiled ~4,000 single cells detected in the dataset with the six cell types. Our fully automated pipeline robustly segmented individual cells and extracted a variety of interpretable features defined by morphology, FL, RI, or their combinations (Table S3). We defined a minimally redundant set of 65 features spanning the representative facets of the single cells, while one can readily define thousands of single-cell features with the current level of multiplexing (Bray et al., 2016). The unsupervised low-dimensional embedding of the single-cell features revealed an intriguing variability across and within the cell types (McInnes et al., 2018) (Figure 3B). Three exemplary features underlying this variability are shown in Figure 3C. Astrocytes generally had large cell volumes, consistent with their complex stellar morphology. NIH3T3 cells, which are fibroblasts with characteristic actin structures, showed a high actin density. HEK 293 cells had a high mass density, which can be attributed to the high rates of protein production by this cell type. Notably, for these three, as well as all the other features, an enormous variability was observed within each cell type, which is partially dependent on the cell cycle, making single-cell profiling attractive.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table S3.

Single-cell feature statistics.

Next, we proceeded to time-resolved interrogations. For proof-of-concept, we carried out a series of perturbation experiments. Firstly, NIH3T3 fibroblasts were stimulated with platelet-derived growth factor (PDGF) to promote cell growth in a physiological manner (Martin et al., 2014), while observing the cells with a high volume rate (0.8 sec/volume) exploiting the high speed of holotomography without photobleaching and phototoxicity. The inferred FL channels clearly visualized lamellipodia formation (white arrows) and actin reorganization in 3D at a time scale of minutes (Figure 3D and Video S4). The time-resolved profiling was able to measure the fast dynamics of the features in response to PDGF stimulation (Figure 3E). This measurement represents a new regime in cell biology that has been previously inaccessible due to technical limitations. The temporal resolution can be readily improved beyond the video rate (Kim et al., 2013).

To determine the roles of the distinct signaling pathways, we specifically targeted RhoA, a Rho family small GTPase downstream of PDGF, using chemogenetics (Inoue et al., 2005). The rapamycin-induced formation of the FKBP-FRB-rapamycin complex recruited constitutively active RhoA to the plasma membrane (Figure 3F and Video S5). Unlike PDGF, RhoA stimulation specifically promoted the formation of actin stress fibers, resulting in characteristic cell morphology. Consistently, only a subset of the features illustrated in Figure 3E showed similar dynamics to PDGF stimulation (Figure 3G). The whole variety of features, including those based on multiple FL channels, may be relevant for studying inter-organelle interactions during these cellular responses. Taken together, our results demonstrate that RI2FL provides a powerful means by which to quantitatively profile single living cells via simultaneous access to a variety of information.

Discussion

In summary, we developed and extensively characterized RI2FL, a scalable framework with which to infer endogenous subcellular structures and dynamics from 3D RI tomograms. The high performance of this approach was found to result from the nature of RI, which encompasses both absorption and phase delay information. Together with the uncertainty quantification schemes to measure the prediction reliability in space and time, RI2FL represents a powerful platform technology for cell biology and high-throughput screening, as demonstrated by its capacity for time-resolved single-cell profiling.

The high-dimensional observation and perturbation of single-cell dynamics at scale would facilitate the systems-level understanding of cellular behavior and decision-making. So far, most studies in systems biology have been relying on snapshot (e.g., transcriptomics or multi-round imaging) or low-dimensional time-series (e.g., one or two fluorescence biosensors) measurement of cellular states, which had significant limitations to infer the network-based logic governing cellular dynamics. With the highdimensional time-series measurement (or single-cell state space trajectories) in hand, one could directly infer the underlying dynamical systems at the single-cell level, analogously to systems neuroscience (Pandarinath et al., 2018). We are currently applying this approach to search for the dynamical phenotypes specific to clinically-relevant cellular malfunctions (e.g., cancer) or to high-efficacy drug actions (Topol, 2019).

We believe that this study, at least partly, addresses a long-sought goal of the label-free imaging community: biochemical specificity by RI. While the overlapping RI values of many subcellular structures have precluded RI-based imaging beyond nuclei and lipid droplets (Figure S6), we previously proposed that the spatial distribution of RI may encode enough information to infer these structures (Jo et al., 2019), as experimentally demonstrated in the present study. We look forward to extending this technology to tissues and ultimately to in vivo applications, synergizing with new approaches to 3D QPI in highly scattering systems (Chen et al., 2020; Lim et al., 2019).

Figure S6.
  • Download figure
  • Open in new tab
Figure S6. RI ranges of the subcellular targets.

The target-specific RI ranges were estimated using the inferred FL data. For each FL channel, the voxels corresponding to the targets were determined by Otsu’s method. Mean±S.D.; the results were calculated from 703 NIH3T3 tomograms.

An important next step would be reverse-engineering the trained models to interpret the discovered RI-target relations (Doshi-Velez and Kim, 2017; Sussillo and Barak, 2013) (Figure S7). At the moment, the feasibility of RI2FL for a new target can be tested only empirically by target-by-target training and characterization. The interpretability might reveal general principles governing light-matter interaction in biological systems and clarify the fundamental limits of RI2FL as well as other cross-modality approaches.

Figure S7.
  • Download figure
  • Open in new tab
Figure S7. Deep feature visualization.

In order to facilitate interpreting the operation of the trained networks, feature map activations for a single input tomogram were visualized. (A) Average feature map at each layer. (B and C) Individual feature maps at the last layers of encoder (B) and decoder (C) parts of the network inferring plasma membranes. All images represent maximum intensity projections of 3D data.

Author Contributions

Y.-J.J. and Y.-K.P. conceived the idea. Y.-J.J. coordinated the project and carried out all analyses. Y.-J.J., H.C., and H.-S.M. designed the deep learning pipeline, and H.C. implemented and optimized the pipeline. Y.-J.J., W.S.P., and W.D.H. designed the main experiments, and W.S.P. collected the data. W.S.P. and W.D.H. established the molecular biology and imaging protocols. G.K., D.R., Y.S.K., and M.L. contributed to processing the data. H.J., H.-H.J., and S.L. collected and processed the large FOV data. Y.-J.J., H.C., W.S.P., and Y.-K.P. wrote the manuscript with input from all co-authors. Y.-K.P. supervised all aspects of the work.

Declaration of Interests

H.C., M.L., H.-H.J., S.L., H.-S.M., and Y.-K.P. have financial interests in Tomocube Inc., a company that commercializes optical diffraction tomography and quantitative phase imaging instruments and is one of the sponsors of the work.

Methods

Sample preparation

NIH3T3 (ATCC CRL-1658), COS-7 (ATCC CRL-1651), HEK 293 (ATCC CRL-1573), and HeLa (ATCC CCL-2) cells, as well as primary-cultured astrocytes, were maintained in Dulbecco’s modified Eagle’s medium (DMEM; ATCC 30-2002) supplemented with 10% fetal bovine serum (Life Technologies) and 100 U/mL penicillin-streptomycin at 37°C in a 5% CO2 incubator. For MDA-MB-231 (ATCC HTB-26) cells, DMEM was replaced with DMEM/F12 medium (Gibco). Astrocytes were obtained by cortical and hippocampal dissection of embryos from C57BL/6J mice (Jackson Laboratory), followed by astrocyte enrichment with a glial culture medium. All animal procedures were performed according to the guidelines of the Animal Care and Use Committee at KAIST.

The fluorescence labeling strategies were as follows. Actin was stained by phalloidin (A12379; Invitrogen) or genetically labeled by expressing mCherry-Lifeact (constructed by inserting the F-actin peptide-encoding sequence into the mCherry-C1 vector). Mitochondria were stained by MitoTracker Red CMXRos (M7512; Invitrogen). Lipid droplets were stained by LipiDye (FDV-0010; Funakoshi). Plasma membranes were stained by CellMask (C10046; Invitrogen) or genetically labeled by expressing GFP-MEM. Nuclei were stained by Hoechst 33342 (H3570; Life Technologies). Nuclei were genetically labeled by expressing FBL-mCherry. For the chemogenetic stimulation experiment, Lyn-FRB and YFP-FKBP-RhoA(CA) were co-expressed in NIH3T3 cells. All genetic labeling processes were driven by CMV promoters. The cells were transfected via electroporation (Neon Transfection System; Invitrogen) under the following conditions: voltage, 1,280 V; pulse width, 20 ms; number of pulses, 2.

Holotomography-optimized cell culture dishes (Tomodish; Tomocube Inc.) were seeded with approximately 450,000 cells per dish. The culture dishes were coated with 0.01% poly-D-lysine for 15 minutes, washed three times with distilled water, and fully dried before usage.

Imaging and perturbation

Holotomography was conducted using standardized microscopes (HT-2; Tomocube Inc.) implementing optical diffraction tomography (ODT) to solve the RI-thickness coupling problem in phase images. The principles and implementations of ODT have been extensively reviewed elsewhere (Park et al., 2018). The specific implementation used here was based on holographic field retrieval with multi-angle Mach-Zehnder interferometry using coherent 532 nm laser light steered by a digital micromirror device. The 3D RI tomograms were reconstructed by mapping the measured field information to the 3D Fourier space and filling the missing cone using the non-negativity-constrained iterative algorithm. The acquisition time for a single volume was less than a second. In addition, we optionally utilized the three FL channels to measure the ground truth 3D FL tomograms using stacked widefield acquisition with a ~0.3 μm step size followed by 3D deconvolution (Kim et al., 2017) (excitation center wavelengths: 385 nm, 470 nm, and 565 nm). Excitation light intensity and exposure time were manually adjusted by trained cell biologists to clearly visualize the target structures. For the perturbation experiments, the concentrations of PDGF (PDGF-BB, PeproTech) and rapamycin (Calbiocam) in the imaging medium were 10 nM and 0.5 μM, respectively.

Data processing

All tomograms were resized to have a voxel size of 0.15×0.15×0.2 μm3 before inference. The default FOV used in training and evaluation was 512×512×64 voxels, which corresponds to a volume of 76.8×76.8×12.8 μm3 volume. This volume was then further subdivided for patched-based processing. The large FOV RI tomograms were obtained by offline 3D stitching after the rapid acquisition of slightly overlapping tiles of FOVs. Specifically, we estimated the tile-to-tile displacement using the phase correlation of the single-FOV RI tomograms, and the overlapping volumes were processed by image blending. The RI values were clipped into the range between 1.337 and 1.390. While we targeted the identical FOV for the RI and FL tomograms by sharing most optical path, we noticed a small axial discrepancy due to the intrinsic differences between the modalities (e.g., aberration and latency). In order to secure voxel-wise correspondence to facilitate supervised learning and evaluation, we estimated and corrected this discrepancy using the axial cross-correlation of the RI and FL tomograms. All the tomograms were manually inspected after the registration to be included in the dataset. It is worth noting that this procedure is not necessary after training and evaluation.

To reconstruct the stacked BF intensity images corresponding to the RI tomograms, we took advantage of the full optical field information measured by multi-angle interferometry (Mertz, 2019; Park et al., 2018). We first mapped the fields to a 2D Fourier space corresponding to the focal plane, thereby obtaining an amplitude and phase image with suppressed coherent noise due to the synthetic aperture. This 2D field was numerically refocused by convolving the propagation kernel with a varying distance of propagation. The BF intensity at each plane was calculated as the square of the amplitude image. We used MATLAB (MathWorks) to implement the BF reconstruction script.

Model design, training, and inference

We used a single network architecture for all subcellular targets in order to avoid optimizing individual target-specific architectures. We automatically designed a highly flexible architecture through a scalable neural architecture search (Kim et al., 2019) (SCNAS). Specifically, SCNAS utilizes a stochastic sampling algorithm in a gradient-based bi-level optimization framework to jointly search for the optimal network parameters at multiple levels with generic 3D medical imaging datasets. As a result, a U-Net-like encoder-decoder structure with skip connections was discovered (Figure S1). At the end of every micro-level architecture, known as a motif, we added a dropout operation. The network parameters were as follows: activation function, leaky ReLU; normalization function, instance normalization; size of initial feature map, 12; number of layers, 8; feature map multiplier, 3.

To promote precise inference of both the large- and small-scale structures in the FL tomograms, for network training, we used a loss function, l, with both mean squared error (MSE), lMSE, and gradient difference loss (GDL), lGDL, terms: l = lMSE + lGDL. Each term is defined as follows: Embedded Image where y and Embedded Image are the ground truth and inferred FL channel, respectively, h(·) is the 3D Sobel operator, and Embedded Image is the expectation over voxels and operations.

To train the networks, we used an Adam optimizer with an initial learning rate of 0.001, where the learning rate was reduced by a factor of 5 if there was no improvement in the validation metrics for 30 epochs. Randomly sampled parameters were used for data augmentation techniques such as flip, rotation, cropping, elastic deformation, and gamma correction. Hyperparameter optimization was based on a grid search algorithm whose search space consisted of hyperparameter combinations with similar memory and FLOPS requirements (Tan and Le, 2019). We used PyTorch in Python 3 to implement the deep learning pipeline.

Due to the memory constraints of GPU computing, we trained the networks using 3D patches instead of the whole tomograms. During training, the patches were randomly cropped from regions with registered FL data. For post-training inference, an RI tomogram was symmetrically padded, divided into overlapping patches with regular spacing, individually processed by the networks, and then stitched into whole FL tomograms with a spline kernel-based blending (Figure S2). The default size of a patch was 256×256×64 voxels.

Performance and uncertainty quantification

Three performance metrics were used, namely peak signal-to-noise ratio (PSNR), Pearson correlation coefficient (PCC), and structural similarity index (SSIM). Each metric is defined as follows: Embedded Image where μ and σ are the mean and standard deviation, respectively, covariance Embedded Image, c1 = 0.012, and c2 = 0.032 by default. Following convention, we used a non-zero minimum standard deviation for PCC and a 3D Gaussian kernel with a size of 7 voxels for SSIM. The metrics were complementary to each other. PSNR is relevant for MSE, but suffers from poor perceptual performance and noise vulnerability. PCC has a decent perceptual performance, but hardly captures local differences. SSIM, which is relatively complicated, is a comprehensive metric designed to overcome the shortcomings of PSNR or PCC. SSIM can be factored into three terms called luminance, contrast, and structure. The luminance term is similar to MSE but uses mean values instead of voxel values. The contrast term quantifies the similarity of high-frequency components relevant to the GDL. The structure term is nearly identical to PCC. Various quantifications of the three metrics are shown in Table S2.

Following recent Bayesian deep learning approaches for computer vision, two types of uncertainty were considered, namely data (aleatoric) and model (epistemic) uncertainty. While the precise origins and mathematical derivations have been extensively reviewed elsewhere (Kendall and Gal, 2017), here, we describe the uncertainty quantification schemes well-suited for RI2FL. Data uncertainty was quantified by test-time augmentation (Wang et al., 2019) with image transforms, such as flip and rotation, which was compatible with the aforementioned loss function. Model uncertainty was quantified using a Monte Carlo dropout (Gal and Ghahramani, 2016). In both cases, we quantified the mean and standard deviation in the FL output space upon the perturbation of either the data or model. The two calculated standard deviation maps defined the data and model uncertainty. The average of the two mean prediction maps defined the final inferred FL, which slightly increased the performance as well. We did not apply these schemes to the stitching or time-series (except for the cell division example) data due to high computational costs.

Single-cell profiling

Upon inference with RI2FL, a variety of open-source computational tools developed for FL data could be readily utilized. To segment the single cells and nuclei in the tomograms, we first trained a random forest voxel classifier, provided by Ilastik (Berg et al., 2019), based on the inferred nuclei and plasma membranes channels. The voxels were sparsely annotated as background, cytoplasm, or nucleus for a handful of tomograms, and the trained classifier generated the voxel-wise class probability maps for the entire dataset. Then the single nuclei could be readily segmented by thresholding the nuclei probability. The tentative cells, obtained by thresholding the summation of the cytoplasm and nuclei probability, were segmented by marker-controlled watershed segmentation, provided by CellProfiler (Bray et al., 2016), using the identified nuclei as segmentation markers. The segmentation performance was robust due to the high specificity of the inferred FL channels.

We extracted a variety of single-cell features from the segmented single cell/nuclei volume masks, the additional inferred FL channels (actin, mitochondria, lipid droplets, and nucleoli), and the measured RI channel aligned in a common coordinate system. The calculation of the mass-related features was based on the well-characterized linear dependence of RI, n(x,y,z), to the dry mass density, C(x,y,z), for biological samples (Barer, 1953; Park et al., 2018): n(x,y,z) = nm + αC(x,y,z), where nm and α are the RI of the imaging medium (nm = 1.337 at λ = 532 nm) and the RI increment (α = 0.190 mL/g at λ = 532 nm), respectively. The mass density and actin density described in the main text indicate the cellular dry mass density and cytoplasmic actin mean, respectively, in Table S3. For the time-lapse experiments, we used the framewise application of the segmentation and feature extraction procedures to quantify the feature dynamics. We used MATLAB (MathWorks) to implement the feature extraction script.

Uniform manifold approximation and projection (UMAP), implemented in Python 3, was used for the unsupervised nonlinear embedding of the features for 2D visualization (McInnes et al., 2018). All features were z-scored before UMAP, and the hyperparameters were as follows: minimum distance, 0.5; number of neighbors, 5.

Data and code availability

Data and materials availability: RI2FL source code and data are available at https://github.com/NySunShine/ri2fl.

Video Captions

Video S1. RI2FL across cell types and subcellular targets.

Video S2. Dynamics of cell division and RI2FL.

Video S3. Large-scale imaging and RI2FL.

Video S4. Growth factor stimulation.

Video S5. Chemogenetic stimulation.

Acknowledgments

We thank the members of KAIST Biomedical Optics Laboratory for helpful discussions. This work was supported by KAIST, BK21+ program, Tomocube Inc., and National Research Foundation of Korea (2017M3C1A3013923, 2015R1A3A2066550, and 2018K000396). Y.-J.J. was supported by a KAIST Presidential Fellowship, an Asan Foundation Biomedical Science Scholarship, and a Stanford Bio-X Bowes Fellowship.

Footnotes

  • ↵8 Lead Contact

REFERENCES

  1. ↵
    Baek, Y., Lee, K., Shin, S., and Park, Y. (2019). Kramers–Kronig holographic imaging for high-space-bandwidth product. Optica 6, 45–51.
    OpenUrl
  2. ↵
    Barer, R. (1953). Determination of dry mass, thickness, solid and water concentration in living cells. Nature 172, 1097–1098.
    OpenUrlCrossRefPubMedWeb of Science
  3. ↵
    Berg, S., Kutra, D., Kroeger, T., Straehle, C.N., Kausler, B.X., Haubold, C., Schiegg, M., Ales, J., Beier, T., and Rudy, M. (2019). Ilastik: interactive machine learning for (bio) image analysis. Nature Methods, 1–7.
  4. ↵
    Bray, M.-A., Singh, S., Han, H., Davis, C.T., Borgeson, B., Hartland, C., Kost-Alimova, M., Gustafsdottir, S.M., Gibson, C.C., and Carpenter, A.E. (2016). Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nature protocols 11, 1757.
    OpenUrl
  5. ↵
    Chen, M., Ren, D., Liu, H.-Y., Chowdhury, S., and Waller, L. (2020). Multi-layer Born multiplescattering model for 3D phase microscopy. Optica 7, 394–403.
    OpenUrlCrossRef
  6. ↵
    Christiansen, E.M., Yang, S.J., Ando, D.M., Javaherian, A., Skibinski, G., Lipnick, S., Mount, E., O’Neil, A., Shah, K., and Lee, A.K. (2018). In silico labeling: predicting fluorescent labels in unlabeled images. Cell 173, 792–803. e719.
    OpenUrlCrossRefPubMed
  7. ↵
    Cooper, K.L., Oh, S., Sung, Y., Dasari, R.R., Kirschner, M.W., and Tabin, C.J. (2013). Multiple phases of chondrocyte enlargement underlie differences in skeletal proportions. Nature 495, 375–378.
    OpenUrlCrossRefPubMed
  8. ↵
    Doshi-Velez, F., and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:170208608.
  9. ↵
    Gal, Y., and Ghahramani, Z. (2016). Dropout as a bayesian approximation: Representing model uncertainty in deep learning. Paper presented at: international conference on machine learning.
  10. ↵
    Inoue, T., Do Heo, W., Grimley, J.S., Wandless, T.J., and Meyer, T. (2005). An inducible translocation strategy to rapidly activate and inhibit small GTPase signaling pathways. Nature methods 2, 415–418.
    OpenUrl
  11. ↵
    Jo, Y., Cho, H., Lee, S.Y., Choi, G., Kim, G., Min, H., and Park, Y. (2019). Quantitative Phase Imaging and Artificial Intelligence: A Review. IEEE Journal of Selected Topics in Quantum Electronics 25, 1–14.
    OpenUrlCrossRef
  12. ↵
    Kendall, A., and Gal, Y. (2017). What uncertainties do we need in bayesian deep learning for computer vision? Paper presented at: Advances in neural information processing systems.
  13. ↵
    Kim, K., Kim, K.S., Park, H., Ye, J.C., and Park, Y. (2013). Real-time visualization of 3-D dynamic microscopic objects using optical diffraction tomography. Optics express 21, 32269–32278.
    OpenUrl
  14. ↵
    Kim, K., Park, W.S., Na, S., Kim, S., Kim, T., Do Heo, W., and Park, Y. (2017). Correlative three-dimensional fluorescence and refractive index tomography: bridging the gap between molecular specificity and quantitative bioimaging. Biomedical optics express 8, 5688–5697.
    OpenUrl
  15. ↵
    Kim, S., Kim, I., Lim, S., Baek, W., Kim, C., Cho, H., Yoon, B., and Kim, T. (2019). Scalable Neural Architecture Search for 3D Medical Image Segmentation. Paper presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer).
  16. ↵
    Lee, J., Kim, H., Cho, H., Jo, Y., Song, Y., Ahn, D., Lee, K., Park, Y., and Ye, S.-J. (2019). Deep-learning-based label-free segmentation of cell nuclei in time-lapse refractive index tomograms. IEEE Access 7, 83449–83460.
    OpenUrl
  17. ↵
    Lichtman, J.W., and Conchello, J.-A. (2005). Fluorescence microscopy. Nature methods 2, 910–919.
    OpenUrl
  18. ↵
    Lim, J., Ayoub, A.B., Antoine, E.E., and Psaltis, D. (2019). High-fidelity optical diffraction tomography of multiple scattering samples. Light: Science & Applications 8, 1–12.
    OpenUrl
  19. ↵
    Martin, K., Vilela, M., Jeon, N.L., Danuser, G., and Pertz, O. (2014). A growth factor-induced, spatially organizing cytoskeletal module enables rapid and persistent fibroblast migration. Developmental cell 30, 701–716.
    OpenUrlCrossRefPubMed
  20. ↵
    McInnes, L., Healy, J., and Melville, J. (2018). Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:180203426.
  21. ↵
    Mertz, J. (2019). Introduction to optical microscopy (Cambridge University Press).
  22. ↵
    Mir, M., Wang, Z., Shen, Z., Bednarz, M., Bashir, R., Golding, I., Prasanth, S.G., and Popescu, G. (2011). Optical measurement of cycle-dependent cell growth. Proceedings of the National Academy of Sciences 108, 13124–13129.
    OpenUrlAbstract/FREE Full Text
  23. ↵
    Nygate, Y.N., Levi, M., Mirsky, S.K., Turko, N.A., Rubin, M., Barnea, I., Dardikman-Yoffe, G., Haifler, M., Shalev, A., and Shaked, N.T. (2020). Holographic virtual staining of individual biological cells. Proceedings of the National Academy of Sciences 117, 9223–9231.
    OpenUrlAbstract/FREE Full Text
  24. ↵
    Ounkomol, C., Seshamani, S., Maleckar, M.M., Collman, F., and Johnson, G.R. (2018). Label-free prediction of three-dimensional fluorescence images from transmitted-light microscopy. Nature methods 15, 917–920.
    OpenUrl
  25. ↵
    Pandarinath, C., O’Shea, D.J., Collins, J., Jozefowicz, R., Stavisky, S.D., Kao, J.C., Trautmann, E.M., Kaufman, M.T., Ryu, S.I., Hochberg, L.R., et al. (2018). Inferring single-trial neural population dynamics using sequential auto-encoders. Nature Methods 15, 805–815.
    OpenUrl
  26. ↵
    Park, S., Ahn, J.W., Jo, Y., Kang, H.-Y., Kim, H.J., Cheon, Y., Kim, J.W., Park, Y., Lee, S., and Park, K. (2020). Label-Free Tomographic Imaging of Lipid Droplets in Foam Cells for Machine-Learning-Assisted Therapeutic Evaluation of Targeted Nanodrugs. ACS nano.
  27. ↵
    Park, Y., Depeursinge, C., and Popescu, G. (2018). Quantitative phase imaging in biomedicine. Nature Photonics 12, 578–589.
    OpenUrl
  28. ↵
    Rivenson, Y., Liu, T., Wei, Z., Zhang, Y., de Haan, K., and Ozcan, A. (2019). PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning. Light: Science & Applications 8, 1–11.
    OpenUrl
  29. ↵
    Sussillo, D., and Barak, O. (2013). Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Computation 25, 626–649.
    OpenUrlCrossRefPubMed
  30. ↵
    Tan, M., and Le, Q.V. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:190511946.
  31. ↵
    Topol, E.J. (2019). High-performance medicine: the convergence of human and artificial intelligence. Nature medicine 25, 44–56.
    OpenUrlCrossRefPubMed
  32. ↵
    Wang, G., Li, W., Aertsen, M., Deprest, J., Ourselin, S., and Vercauteren, T. (2019). Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks. Neurocomputing 338, 34–45.
    OpenUrl
  33. ↵
    Yoon, J., Jo, Y., Kim, Y.S., Yu, Y., Park, J., Lee, S., Park, W.S., and Park, Y. (2018). Label-Free Identification of Lymphocyte Subtypes Using Three-Dimensional Quantitative Phase Imaging and Machine Learning. JoVE (Journal of Visualized Experiments), e58305.
  34. ↵
    Zheng, G., Horstmeyer, R., and Yang, C. (2013). Wide-field, high-resolution Fourier ptychographic microscopy. Nature photonics 7, 739.
    OpenUrl
Back to top
PreviousNext
Posted September 18, 2020.
Download PDF

Supplementary Material

Email

Thank you for your interest in spreading the word about bioRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Data-driven multiplexed microtomography of endogenous subcellular dynamics
(Your Name) has forwarded a page to you from bioRxiv
(Your Name) thought you would like to see this page from the bioRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Data-driven multiplexed microtomography of endogenous subcellular dynamics
YoungJu Jo, Hyungjoo Cho, Wei Sun Park, Geon Kim, Donghun Ryu, Young Seo Kim, Moosung Lee, Hosung Joo, HangHun Jo, Sumin Lee, Hyun-seok Min, Won Do Heo, YongKeun Park
bioRxiv 2020.09.16.300392; doi: https://doi.org/10.1101/2020.09.16.300392
Reddit logo Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
Data-driven multiplexed microtomography of endogenous subcellular dynamics
YoungJu Jo, Hyungjoo Cho, Wei Sun Park, Geon Kim, Donghun Ryu, Young Seo Kim, Moosung Lee, Hosung Joo, HangHun Jo, Sumin Lee, Hyun-seok Min, Won Do Heo, YongKeun Park
bioRxiv 2020.09.16.300392; doi: https://doi.org/10.1101/2020.09.16.300392

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Bioengineering
Subject Areas
All Articles
  • Animal Behavior and Cognition (4237)
  • Biochemistry (9158)
  • Bioengineering (6797)
  • Bioinformatics (24054)
  • Biophysics (12149)
  • Cancer Biology (9563)
  • Cell Biology (13816)
  • Clinical Trials (138)
  • Developmental Biology (7653)
  • Ecology (11731)
  • Epidemiology (2066)
  • Evolutionary Biology (15535)
  • Genetics (10664)
  • Genomics (14351)
  • Immunology (9504)
  • Microbiology (22883)
  • Molecular Biology (9119)
  • Neuroscience (49088)
  • Paleontology (357)
  • Pathology (1487)
  • Pharmacology and Toxicology (2576)
  • Physiology (3851)
  • Plant Biology (8349)
  • Scientific Communication and Education (1473)
  • Synthetic Biology (2299)
  • Systems Biology (6203)
  • Zoology (1302)