Automatic whole cell organelle segmentation in volumetric electron microscopy

Cells contain hundreds of different organelles and macromolecular assemblies intricately organized relative to each other to meet cellular demands. Obtaining a complete understanding of their organization is challenging and requires nanometer-level, three-dimensional reconstruction of whole cells. Even then, the immense size of the datasets and the large number of structures to be characterized require generalizable, automatic methods. To meet this challenge, we developed an analysis pipeline for comprehensively reconstructing and analyzing the cellular organelles in entire cells imaged by focused ion beam scanning electron microscopy (FIB-SEM) at a near-isotropic resolution of 4 or 8 nm per voxel. The pipeline uses deep learning architectures trained on diverse samples to automatically reconstruct 35 different cellular organelle classes, ranging from endoplasmic reticulum to microtubules to ribosomes, in multiple cell types. The automatic reconstructions were used to directly quantify various previously inaccessible metrics about these structures, including their spatial interactions. We show that automatic organelle reconstructions can also be used to automatically register light and electron microscopy images for correlative studies. We created an open-data, open-source web repository, OpenOrganelle, to share the data, computer code, and trained models, enabling scientists everywhere to query and further reconstruct the datasets.

datasets covering four different cell types (Extended Data Fig. 2). We further included 30 and 15 blocks that only contained extracellular space and nucleus, respectively.
The training blocks were used to train large, multi-channel 3D U-Net architectures 29,30 to predict signed tanh boundary distances of the binary labels for each organelle class, which was shown to be beneficial compared to using a cross-entropy loss in prior work on synaptic cleft segmentation in EM 10. We trained all networks to minimize the sum of mean squared errors with respect to the signed tanh boundary distance of each binary label (one per output channel).
Each output channel takes values in [-1, 1] and can be converted into binary labels by thresholding at 0. To test the effect of jointly training varying numbers of classes, one network was trained on all classes jointly ("all"), one network on the 14 classes that had positive annotations in at least 10 blocks ("many"), and 14 specialized networks on up to four closely related classes each ("few") (Supplementary Table 1). The specialized networks for classes with very few positively annotated voxels in the ground truth data (i.e. ribosomes, vesicles, and Golgi) were trained with a sampling scheme that prioritized blocks containing instances of those labels.
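For illustration, such a signed tanh boundary-distance target, and its conversion back to binary labels by thresholding at 0, can be sketched as follows (the tanh length scale is an assumed hyperparameter, not a value from our pipeline):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_tanh_distance(binary_label, scale=50.0):
    """Signed distance to the object boundary, squashed with tanh.

    Positive inside the object, negative outside; `scale` (in voxels)
    controls how quickly the target saturates to +/-1 and is an
    illustrative hyperparameter. Networks are then trained with a
    mean squared error against this target, one channel per class.
    """
    inside = distance_transform_edt(binary_label)       # distance to background
    outside = distance_transform_edt(1 - binary_label)  # distance to foreground
    return np.tanh((inside - outside) / scale)

def to_binary(prediction):
    """Recover a binary label from a signed tanh distance prediction."""
    return prediction > 0
```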
Using the trained networks, we predicted organelle segmentations on whole cells, excluding regions that contain only resin to save computation time. Depending on the network architecture, inference speeds varied between 0.5 and 4.8 megavoxels per second on GeForce RTX 2080 Ti cards. For example, on a single GPU, inference for the 14 organelles of the "many" network on the cell shown in Fig. 1a, jrc_hela-2 (74 gigavoxels), took approximately 12.5 hours.
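As a sanity check, the reported example is consistent with the stated speed range, assuming the speeds are given in megavoxels per second:

```python
# 74 gigavoxels in ~12.5 hours on a single GPU implies a throughput
# inside the reported 0.5-4.8 range, if those speeds are megavoxels
# per second (an assumption; the unit is inferred, not quoted).
voxels = 74e9            # jrc_hela-2 volume
seconds = 12.5 * 3600    # reported inference time
mv_per_s = voxels / seconds / 1e6
print(round(mv_per_s, 2))  # ~1.64 MV/s
```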
We developed a manual evaluation method based on pairwise comparisons of whole-cell predictions (Supplementary Methods: Evaluations) to determine the optimal training iteration and network architecture for unseen datasets without the costly generation of validation blocks. Shown in Fig. 1b are the resulting raw predictions, smoothed with σ = 18 nm and rendered with cubic interpolation. The inset of Fig. 1b shows a region ~3 µm away. To quantitatively assess the performance of the machine learning method, as well as to validate the manual approach, we annotated additional 4 μm³ holdout blocks (Extended Data Fig. 2) in four of the datasets that had annotations for training. The holdout blocks did not contain instances of all organelles, so we only report results for classes present in all four. A comparison between test and validation performances is shown in Fig. 2a. We chose the Dice coefficient (F1 score) as our performance measure (for alternative metrics, see Extended Data). Note the close correlation between the ground-truth (red) and automatically reconstructed (cyan) microtubules (Fig. 2c, d).

The complete post-refinement predictions for each dataset are presented in Fig. 2e (Supplementary Table 5). Using the holdout blocks, we confirmed that these estimates agree with those extracted from ground truth (Extended Data Fig. 7). Fig. 3a plots the volume proportion occupied by 8 different organelle classes throughout an entire cell, comparing two HeLa cells, a Jurkat cell and a macrophage cell. Of note, the jrc_jurkat-1 cell, an immortalized T lymphocyte, had a higher fraction of its total cell volume occupied by the nucleus (~40%) compared to the other cell types, likely reflecting its origin as a leukaemic T-cell poised for proliferation. The jrc_macrophage-2 cell, a cell type involved in the detection, phagocytosis, and destruction of bacteria and other agents, had an increased volume ratio of ER, which could be relevant for its extensive membrane trafficking activity.
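Per-class volume proportions of the kind plotted in Fig. 3a can be read directly off a whole-cell segmentation; a minimal sketch (the label IDs are hypothetical, and the voxel size cancels out of the fractions):

```python
import numpy as np

def volume_fractions(segmentation):
    """Fraction of cell volume occupied by each organelle class.

    `segmentation` is an integer volume where 0 marks background /
    extracellular space and positive integers are (hypothetical)
    organelle class IDs.
    """
    counts = np.bincount(segmentation.ravel())
    cell_voxels = segmentation.size - counts[0]  # exclude background
    return {cls: counts[cls] / cell_voxels
            for cls in range(1, len(counts)) if counts[cls] > 0}
```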
Microtubules play a critical role in enabling organelles to distribute throughout the cell 33. Our ability to segment out membrane-bound organelles as well as single microtubules enabled us to investigate how many different organelles could be contacting a single microtubule.

Comparing the ratio of planar and tubular ER regions between these two cell types (Fig. 3f), we found that the jrc_hela-2 cell had fewer 2D planar regions than the jrc_macrophage-2 cell. In line with this qualitative assessment, a larger fraction of the planar ER regions were supported by mitochondria in the jrc_hela-2 cell (Fig. 3e, bottom right panel, and Fig. 3g). These findings underscore the complexity of ER morphology and its relationship with organelles like mitochondria in different cell types.
Given the above differences between planar and tubular ER, we next examined the relative abundance of ribosomes associated with these ER domains. Using a contact site distance of 10 nm, we categorized ribosomes as either ER-bound or cytosolic. ER-bound ribosomes (Fig. 3h and Supplementary Video 2) were further categorized based on whether they were bound to planar ER, tubular ER, or nuclear envelope regions (Fig. 3h, i). The percentage of ribosomes bound to tubular peripheral ER and to the nuclear envelope remained consistent across the datasets for the four different cells. By contrast, the relative percentage of ribosomes bound to planar peripheral ER was 3-fold higher in the jrc_macrophage-2 cell, which also had a greater volume of ER than the other three cells examined (see Fig. 3a).

We also applied the trained networks to datasets that were not used for training. The results were satisfying, especially considering that our training data did not contain examples from these datasets. Fig. 4a shows the segmentation of another HeLa cell, which should be fairly well represented by our training data. Notably, our training data currently contains no tissue data, which makes the early experiments on a sample of a chemically fixed mouse choroid plexus 38 (shown in Fig. 4b) particularly interesting for a future extension of the approach to tissue data and different fixation protocols. For a quantitative assessment, we reused the holdout blocks with downsampled raw data as we did for training. As shown in Fig. 4c, for most organelles the quality of segmentation suffers only slightly or, for larger organelles such as mitochondria, even improves a bit compared to full-resolution input.
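The contact-distance categorization of ribosomes described above can be sketched as follows; the distance-transform approach, the function names, and the 4 nm voxel size are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def classify_ribosomes(ribosome_centers, er_mask, voxel_size_nm=4.0,
                       contact_nm=10.0):
    """Flag each ribosome as ER-bound if within `contact_nm` of the ER.

    `ribosome_centers` are integer voxel coordinates; `er_mask` is a
    boolean ER segmentation. A 10 nm contact distance follows the text;
    the rest is an illustrative sketch.
    """
    # Euclidean distance (in nm) from every voxel to the nearest ER voxel.
    dist_to_er = distance_transform_edt(~er_mask) * voxel_size_nm
    return [bool(dist_to_er[tuple(c)] <= contact_nm)
            for c in ribosome_centers]
```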
Fluorescent light microscopy (LM) in combination with highly specific molecular markers provides information complementary to high-resolution, non-specific EM. To associate correlative LM and EM (CLEM) acquisitions of the same sample, the acquisitions need to be registered. This is challenging due to resolution and contrast differences between the modalities and nonlinear transformations induced during sample preparation, and can take a skilled user several days to accomplish. Using our predictions for mitochondrial membranes, we automated the registration process for CLEM datasets. We demonstrate this using a previously published COS-7 cell transiently expressing ER luminal and mitochondria membrane markers, mEmerald-ER3 and Halo/JF525-TOMM20, respectively 39, imaged by both PALM and SIM, and by FIB-SEM. Supplementary Video 3 shows the registration results. We report the Jacobian determinant of the transformation field to measure the local distortion resulting from registration. Fig. 4e shows a slice through these measurements; the cell volume is outlined for reference. Fig. 4f shows a histogram computed over the whole volume. The narrow standard deviation of this distribution (0.05) indicates that the transformation field is smooth.
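The Jacobian-determinant distortion measure can be computed from a displacement field as follows; a minimal sketch assuming the field is sampled on a unit voxel grid:

```python
import numpy as np

def jacobian_determinant(displacement):
    """Jacobian determinant of a 3D transformation field.

    `displacement` has shape (3, Z, Y, X): the displacement (in voxels)
    along each axis. The transform is x -> x + u(x), so the Jacobian is
    I + grad(u); a determinant near 1 means little local volume change.
    """
    grads = np.stack([np.stack(np.gradient(displacement[i]), axis=0)
                      for i in range(3)], axis=0)   # (3, 3, Z, Y, X)
    jac = grads + np.eye(3)[:, :, None, None, None]  # add identity
    jac = np.moveaxis(jac, (0, 1), (-2, -1))         # (Z, Y, X, 3, 3)
    return np.linalg.det(jac)                        # det at every voxel
```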
We evaluated the registration accuracy with respect to human-generated ground truth using manually placed landmarks within the box in Fig. 4e. These landmarks were placed at corresponding points in the ER light channel and the ER predictions of the EM image, which were not used for automatic registration. This enables us to measure errors in an unbiased way with respect to the true underlying transformation, not only the "part" of the transformation that can be inferred from the mitochondria membrane channel. We compared the independent registrations of the EM to both PALM and SIM. Fig. 4h shows a map of the spatial error between the two transformations, and Fig. 4i shows a histogram of the spatial error over a region where cells are present, indicated by the white dotted line in Fig. 4h. Errors are small, especially near mitochondria, indicating that the registration is consistent across modalities. Altogether, we show that automatic organelle segmentations can be used to successfully register CLEM images, making registration accessible to less experienced users and reducing the time required from days to an hour or less.
In conclusion, large-scale reconstruction of complete cells and tissues imaged with FIB-SEM 2 promises to greatly expand our understanding of the structure and organization of cellular organelle interactions. Fully automatic reconstruction of cellular organelle ensembles is the only economically feasible way to exhaustively analyze these large datasets. To be readily applicable to new questions and domains, automatic reconstruction methods must generalize robustly across cell types, tissues, and preparation methods. To date, state-of-the-art methods have been specialists that are designed for, trained on, and applied to very specific, yet often very large, datasets and a small number of cellular organelles. These methods fail when presented with unseen samples and require costly re-design and re-training.
We have described here a first successful method to fully automatically reconstruct a large number of cellular organelles from FIB-SEM volumes of diverse cell types, based on carefully generated ground truth annotations. We demonstrated that we can extract quantitative information about the distribution, size, shape, and other structural properties, as well as the interactions between different cellular organelles, directly from automatic reconstructions of complete cells. This comprehensive information, enabling the global comparison of various cell types at the ultrastructural level, has not previously been accessible through either sparsely labeled light microscopy or focused local reconstructions of electron microscopy data. It will enable us to establish a better integrated view of cellular function and of interactions between cells in tissues. We further showed that automatic organelle reconstructions enable fully automatic registration of light and electron microscopy for correlative studies, previously an error-prone manual process.
The capability of our networks to generalize across cell types is a large step towards a push-button solution for analyzing whole-cell FIB-SEM volumes fully automatically. However, it is important to understand that more work is necessary to achieve better performance on diverse tissue samples and cell types that are vastly different from those included in our training data, and for organelles and structures that are currently underrepresented. In the Supplementary Discussion, we analyze this in detail and outline our future plans.

(b) Comparison of networks using different multi-class strategies. Each data point represents the F1 score (test performance) on a holdout block, with the color denoting the multi-class strategy ("all"/"many"/"few").
(c) 2D FIB-SEM slice with ground truth and reconstructed microtubules after refinement in the plane, and 3D renderings of the ground-truth (red) and reconstructed (cyan) microtubules in a selected test block of jrc_hela-2.

(g) Quantification of the peripheral ER curvature at contact sites between peripheral ER and mitochondria, for jrc_hela-2 and jrc_macrophage-2.
(h) 3D rendering of the peripheral ER-bound ribosomes (those within 10 nm of the ER) in jrc_hela-2. Ribosomes are color coded based on whether they are contacting planar ER (planar measure ≥ 0.6, green) or tubular ER (planar measure < 0.6, blue).

Training data and machine learning
Organelles were manually identified using morphological features established in the literature 6 .
Because every voxel within an annotated block must be classified, 'hollow' organelles, e.g. the endoplasmic reticulum, which contains lumen bounded by a membrane, are defined by 'lumen' (lum) and 'membrane' (mem), or 'inside' (in) and 'outside' (out), depending on organelle composition. Examples of each class can be found in Extended Data Fig. 1.
We manually segmented blocks at 2 nm resolution from datasets of HeLa, Jurkat, Macrophage and SUM159 cells acquired at 4 nm resolution using Amira-Avizo (ThermoFisher) and BigCat 7 .
The expert annotators used various image processing filters to enhance the raw data, miscellaneous selection tools for label assignment, and preliminary 3D renderings as a contextual guide. After final cross-annotator quality checks, each block was stored as an N5 dataset 5 in preparation for model training. To optimize the type of network as well as the training iteration for each label and dataset, we used a manual evaluation method based on repeated pairwise comparisons of random crops from two predictions. We validated this method using holdout blocks from four datasets, on which we computed validation and test F1 scores by comparing the thresholded predictions to ground truth.
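The F1 (Dice) score used for these comparisons can be computed per class from a pair of binary masks; a minimal sketch:

```python
import numpy as np

def dice_score(prediction, ground_truth):
    """Dice coefficient, equivalently the F1 score for binary masks:
    2 |A intersect B| / (|A| + |B|)."""
    a = np.asarray(prediction, dtype=bool)
    b = np.asarray(ground_truth, dtype=bool)
    intersection = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    # convention: two empty masks agree perfectly
    return 2.0 * intersection / denom if denom else 1.0
```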

Refinements and quantifications
To obtain organelle instance information, we performed a number of refinement steps on the predictions.
In most cases, we first smoothed the predictions with a Gaussian filter (σ = 12 nm) before thresholding.
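A minimal sketch of such a refinement step follows; the threshold at 0 matches the signed-distance convention, while the connected-component step and the small-component cutoff are illustrative assumptions beyond what the text specifies:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def refine(prediction, voxel_size_nm=4.0, sigma_nm=12.0, min_voxels=100):
    """Smooth, threshold, and split a distance prediction into instances.

    Gaussian smoothing (sigma = 12 nm) follows the text; `min_voxels`
    is a hypothetical cutoff for discarding tiny, likely spurious
    components.
    """
    smoothed = gaussian_filter(prediction, sigma=sigma_nm / voxel_size_nm)
    binary = smoothed > 0                 # signed tanh distances cross 0
    instances, _ = label(binary)          # connected components
    sizes = np.bincount(instances.ravel())
    keep = sizes >= min_voxels
    keep[0] = False                       # never keep the background
    instances[~keep[instances]] = 0
    return instances
```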

CLEM registration
We downsampled and blurred mitochondria membrane predictions to generate a "synthetic light" image. We registered this image to light images labelling Halo/JF525-TOMM20. First, a human annotator coarsely masks corresponding cells in both images for automatic processing.
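The construction of the "synthetic light" image can be sketched as follows; the PSF width, voxel sizes, and function name are illustrative assumptions, not the values used in the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def synthetic_light_image(mito_prediction, em_voxel_nm=8.0,
                          lm_voxel_nm=100.0, psf_sigma_nm=150.0):
    """Turn an EM-resolution mitochondria membrane prediction into a
    light-microscopy-like image: blur to an approximate PSF, then
    downsample to the LM voxel grid.
    """
    blurred = gaussian_filter(mito_prediction.astype(np.float32),
                              sigma=psf_sigma_nm / em_voxel_nm)
    factor = em_voxel_nm / lm_voxel_nm
    return zoom(blurred, factor, order=1)  # linear downsampling
```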
We used elastix 18 to register these images in two steps: affine-only followed by deformable. For additional details, see Supplementary Methods: CLEM Registration.
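The two-stage elastix configuration can be expressed as a pair of parameter maps; all settings below are illustrative assumptions, not the parameters used in our pipeline:

```
// Stage 1: affine pre-alignment (illustrative settings)
(Transform "AffineTransform")
(Metric "AdvancedMattesMutualInformation")
(NumberOfResolutions 4)
(MaximumNumberOfIterations 500)

// Stage 2: deformable refinement
(Transform "BSplineTransform")
(Metric "AdvancedMattesMutualInformation")
(FinalGridSpacingInPhysicalUnits 500.0)
(NumberOfResolutions 4)
(MaximumNumberOfIterations 1000)
```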

Open data and open source
We are committed to sharing the raw and derived datasets presented in this paper with the broader scientific community, for educational purposes and further analysis and discovery.
Sharing data at this scale presents a technical challenge, which we addressed through a combination of technologies: we stored datasets on the cloud using Amazon Web Services (AWS) S3 storage platform using a variety of "cloud-friendly" data formats. To make this collection of datasets viewable, we built a web platform "OpenOrganelle" (openorganelle.janelia.org) that provides a catalog of datasets and access to tools for data visualization.

Cloud Storage
We used AWS S3 for hosting our public-facing datasets, as the AWS S3 cloud storage platform offers high scalability, availability, and extensively documented application programming interfaces (APIs).

Data formats
We stored all of our datasets in "cloud-friendly" formats, i.e. file formats that enable web services to easily access data using standard web communication protocols like hypertext transfer protocol (HTTP). For large imaging datasets a cloud-friendly format is one that provides a web-compatible interface for making spatial queries against the data. This constraint ruled out traditional bioimaging formats like mrc, tiff and hdf5 because these formats do not expose their spatial structure to HTTP. Thus we exclusively stored volumetric image data using N5 and "Neuroglancer precomputed" file formats, both of which represent image volumes as collections of separate files, or "chunks". When hosted on a cloud storage platform like S3, each chunk has its own URL and thus can be individually accessed over HTTP. To save storage space and speed up data transmission each chunk is compressed with a browser-compatible compression scheme (GZip for lossless compression or JPEG for lossy compression).
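For example, because each N5 chunk is a separate file, the HTTP address of a chunk can be derived from its chunk-grid position alone; a minimal sketch (the URL and dataset names are hypothetical, and the axis ordering of the path follows the N5 chunk-layout convention):

```python
def grid_position(voxel_coord, chunk_shape):
    """Chunk-grid coordinates of the chunk containing a given voxel."""
    return tuple(v // c for v, c in zip(voxel_coord, chunk_shape))

def n5_chunk_url(base_url, dataset, grid_pos):
    """URL of a single N5 chunk on HTTP-accessible storage: N5 lays
    chunks out as nested paths, one component per chunk-grid axis."""
    return "/".join([base_url.rstrip("/"), dataset.strip("/"),
                     *(str(i) for i in grid_pos)])
```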

OpenOrganelle
Although our data can be publicly accessed directly through the cloud storage API, this is not particularly user friendly or amenable to data exploration. OpenOrganelle addresses this by providing a catalog of the datasets and integrated visualization tools.

Data organization
We stored all of our large volumetric datasets as image pyramids, i.e. a given volume is stored as a collection of arrays at different levels of detail. Although this approach increased the amount of storage space needed, it is the preferred format for interactively visualizing large volumes. Creating an image pyramid involves repeatedly downsampling an image volume using increasing downsampling factors. We performed downsampling by partitioning the source data into a collection of contiguous cubes of voxels with edge length 2^N, where N ranged from 1 to 5, and computing a reducing function over the values within each cube. For image volumes with voxel values representing a scalar quantity, such as EM volumes or unrefined predictions, that reducing function was the arithmetic mean, while for volumes representing labels and segment IDs (e.g., segmentation volumes) we reduced by computing the modal value in each cube. Most of our volumes are stored in the N5 format, in which case we stored each image pyramid as a collection of datasets in a group. The group metadata contains an enumeration of the datasets that comprise the multiresolution pyramid, as well as a specification of the spatial transformation for each dataset. A minority of our volumes (those that warranted lossy JPEG compression) are stored in the Neuroglancer precomputed format, which is specific to the Neuroglancer viewer tool and natively supports image pyramids through its own metadata specification.
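The two reducing functions can be sketched for a single 2x pyramid level as follows (a minimal illustration, not our pipeline code):

```python
import numpy as np

def downsample_mean(volume, factor=2):
    """One pyramid level for scalar data: mean over each factor^3 cube."""
    z, y, x = (s // factor for s in volume.shape)
    v = volume[:z * factor, :y * factor, :x * factor]
    return v.reshape(z, factor, y, factor, x, factor).mean(axis=(1, 3, 5))

def downsample_mode(labels, factor=2):
    """One pyramid level for label data: modal value in each cube."""
    z, y, x = (s // factor for s in labels.shape)
    v = labels[:z * factor, :y * factor, :x * factor]
    blocks = v.reshape(z, factor, y, factor, x, factor)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5).reshape(z, y, x, -1)
    out = np.empty((z, y, x), dtype=labels.dtype)
    for idx in np.ndindex(z, y, x):
        out[idx] = np.bincount(blocks[idx]).argmax()  # most frequent label
    return out
```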

Data visualization and access
Data was made available for online viewing via Neuroglancer 19,21 , an open source WebGL data viewer. Neuroglancer provides intuitive navigation of 3D datasets and supports orthogonal slice views, image pyramids, mesh rendering and a variety of other features making it ideal for displaying the large 3D datasets produced by this project. For visualization, we provide the raw data, predictions, segmentations, contact sites, skeletons and medial surfaces discussed throughout the text. For select organelles and contact sites, we also include meshes 22 in the Neuroglancer legacy mesh format.
We provide two plugins for Fiji 23 to access and interact with the data we provide. N5-Viewer enables viewing these large datasets in their entirety with BigDataViewer 24 . N5-ij 25 allows users to open datasets or subsets of datasets in Fiji to perform their own analyses.
To download datasets, we recommend using the AWS command line interface (AWS CLI).
Instructions for using this tool can be found on their user guide 26 .
We have also made data analysis results available as a Neo4j graph database (downloadable from openorganelle.janelia.org). Neo4j allows users to easily interact with and visualize the data, including organelle properties like volume and surface area and contact site properties like contact area and planarity. Neo4j also allows users to create custom queries to search through and analyze data and relationships in novel ways.
We welcome feedback about how these data are used and about your biological discoveries. We hope these data will inspire new insights, and we expect they will generate more questions than answers. Please email us at cosemdata@janelia.hhmi.org.

Data Availability
All data, software, and source code generated and analyzed during this study can be found at https://openorganelle.janelia.org/.

Code Availability
All data, software, and source code generated and analyzed during this study can be found at https://openorganelle.janelia.org/.