Abstract
Understanding the function of the nervous system necessitates mapping the spatial distributions of its constituent cells, defined by function, anatomy or gene expression. Recent developments in tissue preparation and microscopy allow cellular populations to be imaged throughout the entire rodent brain. However, mapping these neurons manually is prone to bias and is often impractically time-consuming. Here we present an open-source algorithm for fully automated 3D detection of neuronal somata in mouse whole-brain microscopy images using standard desktop computer hardware. We demonstrate the applicability and power of our approach by mapping the brain-wide locations of large populations of cells labelled with cytoplasmic fluorescent proteins expressed via retrograde trans-synaptic viral infection.
Introduction
To understand the circuits underlying computations in the brain, it is necessary to map cell types, connections and activity across the entire structure. Advances in labelling (1–3), tissue clearing (4–6) and imaging (7–12) now allow for the meso- and microscopic study of brain structure and function across the rodent brain. Analysis of these whole-brain images has lagged behind the developments in imaging. Although there are many relevant commercial and open-source bio-image analysis packages available (13–16), these have traditionally been developed for 2D images or for 3D volumes much smaller than a rodent brain.
In rodent studies, an increasingly common whole-brain image analysis task is the identification of individual, labelled cells across the entire brain. Traditionally, this was carried out manually (17–20), but this approach does not scale to all biological questions, particularly when many cells are labelled. Considering that a mouse brain has around 100 million neurons (21), even if only 0.01% of cells in the brain are labelled, a manual approach becomes impractical for any kind of routine analysis. Given the difficulty and computational complexity of 3D analysis, a number of studies have instead used automated approaches to segment neurons across the brain in serial 2D sections (22–27).
These existing methods can suffer from several sources of bias. Manual analysis is subjective, and 2D analysis can under- or overestimate cell numbers depending on sampling in the third dimension. To our knowledge, one study has shown unbiased 3D cell identification in whole-brain images (28); however, in that case nuclear labels were used. Although nuclear labels are much simpler to detect than membrane or cytoplasmic markers (they have a simple shape, can be approximated as spheres, and are far less likely to overlap in the image), there are many applications in which a nuclear label is not practical or even useful, as in the case of in vivo functional imaging. Detecting cells using cytosolic reporters in 3D throughout an entire brain therefore remains an unmet and highly desired goal within systems neuroscience. To overcome the limitations of traditional computer vision, machine learning, and particularly deep learning (29), has revolutionised the analysis of biomedical and cellular imaging (30). Deep neural networks (DNNs) now represent the state of the art for the majority of image analysis tasks, and have been applied to analyse whole-brain images, to detect cells in 2D (22, 27) or to segment axons (31). However, they have two main disadvantages when it comes to 3D whole-brain analysis. Firstly, they require large amounts of manually annotated training data (e.g. for cell segmentation, this would potentially require the painstaking annotation of hundreds or thousands of cell borders in 3D). Secondly, the complex architecture of DNNs means that for big data (e.g. whole-brain images at cellular resolution), substantial computing infrastructure is required to train these networks and then process the images in a reasonable time frame.
To harness the power of deep learning for 3D identification of labelled cells in whole-brain images, we developed a computational pipeline which uses classical image analysis approaches to detect potentially labelled cells (cell candidates) with high sensitivity, at the expense of detecting false positives (i.e. geometrically similar objects). A DNN is then applied to classify each cell candidate as either a true cell or an artefact to be rejected. Using deep learning for object classification rather than voxel-level cell segmentation speeds up analysis (since there are billions of voxels, but far fewer cell candidates) and simplifies the generation of training data: rather than annotating cell borders in 3D, cell candidates from the initial step need only be given a single (cell or artefact) label.
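The strategy can be illustrated with a toy example: a deliberately permissive first pass proposes candidates, and a second, stricter test rejects the false positives. In the sketch below (a minimal illustration, not the released pipeline), a simple size criterion stands in for the trained classification network, and the synthetic volume, thresholds and sizes are all assumptions for demonstration.

```python
import numpy as np
from scipy import ndimage

# Synthetic volume: noisy background, two "cells" and one bright artefact.
rng = np.random.default_rng(0)
volume = rng.normal(100, 5, (64, 64, 64))
for z, y, x in [(20, 20, 20), (40, 40, 40)]:
    volume[z - 2:z + 2, y - 2:y + 2, x - 2:x + 2] += 60  # cell-sized blobs
volume[10, 10, 10] += 80                                  # single-voxel artefact

# Step 1: permissive detection. Threshold well above background and keep
# every connected component as a candidate, accepting false positives.
binary = volume > volume.mean() + 5 * volume.std()
labels, n = ndimage.label(binary)
candidates = ndimage.center_of_mass(binary, labels, range(1, n + 1))

# Step 2: refinement. A crude size test stands in for the trained DNN,
# rejecting candidates that do not look like cells.
sizes = ndimage.sum(binary, labels, range(1, n + 1))
cells = [c for c, s in zip(candidates, sizes) if s > 10]
print(f"{n} candidates detected, {len(cells)} accepted as cells")
```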
Results
To illustrate the problem and demonstrate the software, whole mouse brain images were acquired following retrograde rabies virus labelling. Viral injections were performed into visual or retrosplenial cortex, causing many thousands of cells to be cytoplasmically labelled throughout the brain. Data were acquired using serial two-photon microscopy as previously described (18) (Fig. 1). Briefly, coronal sections are imaged at high resolution (2 μm × 2 μm × 5 μm voxel size) and stitched to provide a complete coronal section. This is carried out for ten imaging planes, after which a microtome removes the most superficial 50 μm of tissue, and the process is repeated until the entire brain dataset is collected. Light emitted from the specimen is filtered and detected via at least two channels: a primary signal channel containing the fluorescence signal from labelled target neurons, and a secondary ‘autofluorescence’ channel that does not contain target signals but provides anatomical outlines. An example single-plane image is shown in Fig. 2a.
A: 50 μm of tissue (between 40 μm and 90 μm below the tissue surface) is imaged, and then an in-built microtome physically removes a 50 μm-thick section from the surface. This process is repeated to generate a complete 3D dataset of the brain. B: The emitted light is split into two channels: the primary channel contains the fluorescence signal of interest from labelled cells (e.g. mCherry at 610 nm) and the secondary channel (e.g. at 450 nm) contains the tissue autofluorescence signal that reveals gross anatomical structure.
A: Raw data (single coronal plane). B: Enlarged inset of cortical region from A showing structural features (artefacts) that are often erroneously detected. C: Detected cell candidates overlaid on raw data. Labelled cells as well as numerous artefacts are detected. D: Illustration of training data. A small number of cells (yellow) and artefacts (purple) are selected to train the network. E: Classified cell candidates. The cell classification network correctly rejects the initial false positives.
Cell candidate detection
When developing any object detection algorithm, a balance must be struck between false positives and false negatives. In traditional (two-dimensional) histology, simple thresholding (e.g. (32)) can often work well for cell detection, but this does not necessarily apply to whole-brain images. In samples with bright, non-cellular structures (artefacts, Fig. 2b) or lower signal-to-noise ratios, simple thresholding detects many non-cellular elements. Image preprocessing and subsequent curation of detected objects can overcome some of these issues, but no single method works reliably across the brain in multiple samples: either some cells are missed (false negatives), or many artefacts are also detected (false positives). To overcome this, a traditional image analysis approach was used to detect cell candidates, i.e. objects of approximately the correct brightness and size to be a cell. This list of candidates is later refined by the deep learning step. Crucially, this refinement allows the traditional analysis to produce many false positives while minimising the number of false negatives. Images are median filtered, and a Laplacian of Gaussian filter is used to enhance cell-like objects. The resulting image is thresholded, and objects of approximately the correct size are saved as candidate cell positions (Fig. 2c). The thresholding is tuned to pick up every detectable cell, but this also results in the detection of many false positives, which often appear as debris on the surface of the brain and, in some cases, unidentified objects within blood vessels (Fig. 2b-e).
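A minimal sketch of this enhancement-and-threshold step on a single image plane is shown below, assuming scikit-image and SciPy are available; the function name, parameter values and the use of an Otsu threshold are illustrative choices, not the released defaults.

```python
import numpy as np
from scipy import ndimage
from skimage import filters, measure

def detect_candidates_2d(plane, soma_diameter_px=8):
    """Return centroids of blobs of roughly somatic size in one plane."""
    smoothed = ndimage.median_filter(plane, size=3)  # suppress shot noise
    # Laplacian of Gaussian enhances blobs near the soma scale;
    # negate so that bright cells become positive peaks.
    log = -ndimage.gaussian_laplace(smoothed.astype(float),
                                    sigma=soma_diameter_px / 3)
    binary = log > filters.threshold_otsu(log)  # deliberately permissive
    labelled = measure.label(binary)
    centroids = []
    for region in measure.regionprops(labelled):
        # keep only objects of approximately somatic area
        if 0.25 * soma_diameter_px**2 < region.area < 4 * soma_diameter_px**2:
            centroids.append(region.centroid)
    return centroids
```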
Cell candidate classification using deep learning
A classification step, using a 3D adaptation of the ResNet (33) convolutional neural network (Figs. S1 & S2), is then used to separate true from false positives. To train the classifier, a subset of cell candidate positions was manually annotated (e.g. Fig. 2d). In total, ~100,000 cell candidates (50,653 cells and 56,902 non-cells) were labelled from five brains. Small cuboids of 50 μm × 50 μm × 100 μm around each candidate were extracted from the primary signal channel, along with the corresponding cuboids in the secondary autofluorescence channel (Fig. 3a). This allows the network to “learn” the difference between neuron-based signals (only present in the primary signal channel) and other, non-neuronal sources of fluorescence (potentially present in both channels).
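A sketch of this two-channel cuboid extraction is shown below, assuming both channels are loaded as NumPy arrays indexed (z, y, x); the function name and the lack of edge handling (candidates near the volume border would need padding) are illustrative simplifications.

```python
import numpy as np

def extract_cuboid(signal, background, centre, shape=(20, 50, 50)):
    """Cut matching cuboids from both channels around one candidate.

    `shape` is in voxels, (z, y, x); (20, 50, 50) matches the
    50 x 50 x 20-voxel network input described in the Methods.
    """
    slices = tuple(slice(c - s // 2, c - s // 2 + s)
                   for c, s in zip(centre, shape))
    # Stack signal and autofluorescence as two channels, so the network
    # can reject fluorescence that appears in both (i.e. artefacts).
    return np.stack([signal[slices], background[slices]], axis=-1)
```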
A: The input data to the modified ResNet are 3D image cuboids (50 μm × 50 μm × 100 μm) centred on each cell candidate. There are two cuboids: one from the primary signal channel, and one from the secondary autofluorescence channel. The data are then fed through the network, resulting in a binary label of cell or non-cell. During training, the network “learns” to distinguish true cells from other bright, non-cellular objects. See Figs. S1 & S2 for more details of the 3D ResNet architecture. B: Classification accuracy as a function of training data quantity.
The trained classification network is then applied to the cell candidates from the initial detection step (Fig. 2e): artefacts (such as those at the surface of the brain and in vessels) are correctly rejected, while the labelled cells are correctly classified. To quantify the performance of the classification network, and to assess how much training data is required for good performance, the manually annotated data was split into a new training dataset from four brains, and a test dataset from the fifth brain (15,872 cells and 18,168 non-cells). New networks were trained on subsets of the training data, and performance was tested on the fifth brain. Fig. 3b shows that relatively little training data was required for good performance on unseen test data, with 95% of cell candidates classified correctly using ~7,000 annotated cell candidates.
Application
To illustrate the method, the cell detection software was applied to data that was not used to develop or train the classification network. Neurons presynaptic to inhibitory retrosplenial cortical cells were labelled using rabies virus tracing (expressing mCherry) in a GAD2-Cre mouse. On a desktop workstation (with 2 × 14-core Intel Xeon Gold 6132 CPUs and an NVIDIA TITAN RTX GPU), detection of 60,881 cell candidates took 65 minutes and classification (resulting in 11,469 cells) took 72 minutes (a total of 2 hours and 17 minutes).
To assign detected cells to a brain region, the Allen Mouse Brain Reference Atlas (ARA (34)) annotations were registered to the secondary autofluorescence channel using brainreg (35), a Python port of the validated aMAP pipeline (36).
These annotations were overlaid on the raw data (Fig. 4a), and cell positions were warped to the ARA coordinate space (Fig. 4b) to visualise the distribution of detected cells. The number of cells in each brain region was then reported, allowing for quantitative analysis (Table 1).
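Once the atlas annotations have been registered into sample space, assigning each detected cell to a region reduces to a voxel lookup. The sketch below illustrates this, assuming `annotations` is the registered annotation volume (one integer region ID per voxel) and `cells` is an (N, 3) array of cell coordinates in sample-space voxels; both names are assumptions for illustration.

```python
from collections import Counter
import numpy as np

def count_cells_per_region(annotations, cells):
    """Map each cell to the region ID of the voxel it sits in."""
    idx = tuple(np.round(cells).astype(int).T)  # nearest-voxel lookup
    return Counter(annotations[idx].tolist())   # region ID -> cell count
```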
A: Detected cells (yellow) overlaid on raw data, along with the brain region segmentation. B: Visualisation of detected cells warped to the ARA standard space in 3D (yellow), along with the injection site target (retrosplenial cortex, green).
The ten regions containing the greatest number of cells are shown (193 regions in total).
Discussion
Mapping the distribution of labelled neurons across the brain is critical for a complete understanding of the information pathways that underlie brain function. Many existing methods for cell detection in whole-brain images rely on classical image processing, which can be affected by noise and may not detect complex cell morphologies. DNNs can be used for highly sensitive image processing, but often require laborious generation of training data and are prohibitively slow for the analysis of large, 3D images. The method presented here overcomes these limitations by combining traditional image processing methods (for speed) with a DNN (to improve accuracy).
Recent developments in microscopy technology (e.g. (12)) now allow for quicker, more routine acquisition of whole-brain datasets. It is important that the image analysis can be carried out in a timely fashion, and without relying on large-scale computing infrastructure. Processing time for the 240GB image in Fig. 4 on a desktop workstation was less than three hours, so eight datasets could be analysed in a single day, likely much quicker than the sample preparation and imaging steps. Once parameters are optimised, and the classification network is trained, the software can run entirely without user intervention.
In traditional DNN approaches to image analysis, generation of training data is often a major bottleneck. While large-scale “citizen science” approaches can be used to generate large amounts of training data (37), this is not practical for the majority of applications, e.g. when anatomical expertise is required. Our method overcomes this by requiring only a binary label (cell or non-cell) for each cell candidate in the training dataset, rather than a painstaking 3D outline of each cell. Given that classification performance is very good (95%, Fig. 3b) with ~7,000 annotated cell candidates, sufficient training data can be annotated in a single day. The amount of training data required is also likely to be much lower if new data is used to re-train the existing pre-trained network, rather than training from scratch.
The ability to quickly detect, visualise and analyse cytoplasmically labelled cells across the mouse brain brings a number of advantages over existing methods. Analysing an entire brain rather than 2D sections has the potential to detect many more cells, increasing the statistical power and the likelihood of finding novel results, particularly when studying rare cell types. Whole-brain analysis is also less biased than analysing a series of 2D planes, especially in regions with low cell densities, or differing cell sizes.
This software is fully open-source, and has been written with further development and collaboration in mind. In future, we aim to make the network flexible in the number of input channels and output labels. Analysing a single channel would halve the amount of data that must be collected (although autofluorescence channels are optimal for atlas registration). Training a network to produce multiple labels (rather than just cell or non-cell) would allow for cell-type classification based on morphology, or on gene or protein expression levels if additional signal channels were supplied. Lastly, although this approach was designed for fast analysis of large whole-brain datasets, the proposed two-step approach could be applied to any kind of large-scale 3D object detection.
DATA AVAILABILITY
The methods outlined in this manuscript are available within the cellfinder software, part of the BrainGlobe suite of computational neuroanatomy tools. The software is open-source, written in Python 3 and runs on standard desktop computing hardware (although a CUDA compatible GPU allows for a considerable reduction in processing time). Source code is available at github.com/brainglobe/cellfinder and pre-built wheels at pypi.org/project/cellfinder. Documentation, including tutorials for training the network, and for analysing the data in Fig. 4 is available at docs.brainglobe.info/cellfinder.
Materials and methods
All experiments were carried out in accordance with the UK Home Office regulations (Animal Welfare Act 2006) and the local Animal Welfare and Ethical Review Body.
Sample preparation
All mice used were transgenic Cre-expressing lines (Ntsr1-Cre, GAD2-IRES-Cre & Rbp4-Cre) bred on a C57BL/6 background. The mice were anaesthetised, and an AAV Cre-dependent helper virus encoding both the envelope protein receptor and the rabies virus glycoprotein was stereotactically injected into visual or retrosplenial cortex. Four days later, a glycoprotein-deficient form of the rabies virus expressing mCherry was delivered into the same site. After ten further days, the animal was deeply anaesthetised and transcardially perfused with cold phosphate buffer (PB, 0.1 M) followed by 4% paraformaldehyde (PFA) in PB (0.1 M), and the brain was left overnight in 4% PFA at 4 °C.
Imaging
For imaging, the fixed brain was embedded in 4% agar and placed under a two-photon microscope containing an integrated vibrating microtome and a motorised x-y-z stage, as previously described (18). Coronal images were acquired via two optical pathways (red and blue) as a set of 6 by 9 tiles with a pixel size of 1 μm × 1 μm, acquired every 5 μm in depth using an Olympus 10× objective (NA = 0.6) mounted on a piezoelectric element (Physik Instrumente, Germany). Following acquisition, image tiles were corrected for uneven illumination by subtraction of an average image from each physical section. Tiles were then stitched using a custom FIJI (13) plugin (modified from (38)) and downsampled to a 2 μm × 2 μm × 5 μm voxel size.
Cell candidate detection
To detect cell candidates (broadly defined as anything of sufficient brightness and of approximately the correct size to be a cell), data from the primary signal channel were first processed in 2D. Images were median filtered, and a Laplacian of Gaussian filter was then applied to enhance small, bright structures (e.g. cells).
The filtered image was thresholded and passed to 3D filtering. First, an ellipsoidal spatial filter was applied: every position at which the majority of the filter overlapped with thresholded voxels was saved as a potential cell candidate, suppressing noise from, e.g., neurites. Individually detected cell candidates were then merged (if close together) or split (if the total volume was too large for a single cell), and the coordinates were saved as an XML file. These steps were all carried out using the default software parameters (Table 2).
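A simplified stand-in for this 3D step is sketched below. It uses connected components rather than the ellipsoidal filter of the released code, and the noise and cell-volume cut-offs, the splitting heuristic and all names are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def structures_to_candidates(binary, min_vox=10, mean_cell_vox=200):
    """Turn a thresholded 3D volume into candidate cell coordinates."""
    labels, n = ndimage.label(binary)
    sizes = ndimage.sum(binary, labels, range(1, n + 1))
    candidates = []
    for idx, size in enumerate(sizes, start=1):
        if size < min_vox:
            continue  # too small to be a soma: treat as noise (e.g. neurites)
        coords = np.argwhere(labels == idx)
        if size <= 2 * mean_cell_vox:
            candidates.append(coords.mean(axis=0))  # single cell: keep centroid
        else:
            # Over-large structure: split into roughly cell-sized parts.
            k = int(round(size / mean_cell_vox))
            for part in np.array_split(coords, k):
                candidates.append(part.mean(axis=0))
    return candidates
```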
Cell candidate classification using deep learning
Cell candidates were classified using a ResNet (33), implemented in Keras (39) for TensorFlow (40). 3D adaptations of all networks from the original paper are implemented in the software (i.e. 18-, 34-, 50-, 101- and 152-layer) and can be chosen by the user, but the 50-layer network was used throughout this study. The general architecture of these networks is shown in Figs. S1 & S2. To generate training data, output from the candidate detection step (cell candidate coordinates) was manually classified using a custom FIJI (13) plugin (for users of the released software, an integrated tool using napari is supplied). Image cuboids of 50 μm × 50 μm × 100 μm (resampled to 50 × 50 × 20 voxels) were extracted from both the primary signal and secondary autofluorescence channels, centred on the coordinates of the manually classified cell candidate positions. To increase the size of the training set, data were randomly augmented. Each of the following transformations was applied with a 10% likelihood (a sketch of this scheme follows the list):
Flipping in any of the three axes
Rotation around any of the three axes (between −45° and 45°)
Circular translation along any of the three axes (up to 5% of the axis length)
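The augmentation scheme above can be sketched as follows, assuming a single-channel 3D cuboid indexed (z, y, x); the function name and the use of linear interpolation for rotation are illustrative choices.

```python
import numpy as np
from scipy import ndimage

def augment(cuboid, rng, p=0.1):
    """Apply each flip/rotation/shift independently with probability p."""
    out = cuboid.copy()
    for axis in range(3):
        if rng.random() < p:  # flip along this axis
            out = np.flip(out, axis=axis)
        if rng.random() < p:  # rotate about this axis, between -45 and 45 deg
            planes = tuple(a for a in range(3) if a != axis)
            out = ndimage.rotate(out, rng.uniform(-45, 45), axes=planes,
                                 reshape=False, order=1, mode="nearest")
        if rng.random() < p:  # circular shift of up to 5% of the axis length
            shift = int(round(out.shape[axis] * rng.uniform(-0.05, 0.05)))
            out = np.roll(out, shift, axis=axis)
    return out
```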
The networks were trained with a batch size of 32, and the Adam (42) method was used to minimise the loss (categorical cross-entropy) with a learning rate of 0.0001. Cell candidates were then classified using the trained network and saved as an XML file with a cell or artefact label.
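In Keras, this training configuration looks roughly as follows. The small convolutional network below is only a placeholder for the 3D ResNet of Figs. S1 & S2, and the variable names and epoch count are assumptions; only the optimiser, learning rate, loss and batch size are taken from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder model standing in for the 3D ResNet (Figs. S1 & S2).
# Input: 50 x 50 x 20-voxel cuboids with two channels
# (signal + autofluorescence).
model = tf.keras.Sequential([
    layers.Input(shape=(20, 50, 50, 2)),
    layers.Conv3D(16, kernel_size=3, activation="relu"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(2, activation="softmax"),  # cell vs non-cell
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # lr from the text
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_cuboids, train_labels, batch_size=32, epochs=...)
```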
Image registration and segmentation
To allow detected cells to be assigned an anatomical label, and to allow them to be analysed in a common coordinate framework, a reference atlas (Allen Mouse Brain Atlas (34), provided by the BrainGlobe Atlas API (43)) was registered to the autofluorescence channel. This was carried out using brainreg (35), a Python port of the automatic mouse atlas propagation (aMAP) software (36), which itself relies on the medical image registration library niftyreg (44). Firstly, the sample brain was downsampled to the voxel spacing of the atlas (10 μm isotropic) and reoriented to match the atlas template brain. These two images were then filtered to remove high-frequency noise (greyscale opening and flat-field correction). The images were registered using an affine transform (reg_aladin (45)), followed by freeform non-linear registration (reg_f3d (44)). The resulting transformation was applied to the atlas brain region annotations (and a custom hemispheres atlas) to bring them into the same coordinate space as the sample brain.
Analysis
For numerical analysis the atlas annotations in the sample coordinate space were used to determine the number of cells in each brain region. For visualisation of data in standard space, detected cells must be transformed to the atlas coordinate space. Firstly, the affine transform from the initial registration was inverted (using reg_transform). The sample brain was then registered non-linearly to the atlas (again using reg_f3d) and a deformation field (mapping points in the sample brain to the atlas) was generated (using reg_transform). This deformation field was applied to the coordinates of the detected cells for each sample, transforming them into atlas coordinate space.
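Once the deformation field has been generated, moving detected cells into atlas space is a per-cell lookup. The sketch below assumes `deformation` is an array of shape (z, y, x, 3) giving, for each sample-space voxel, its atlas-space coordinate, and uses nearest-voxel lookup rather than interpolation; names and conventions are illustrative.

```python
import numpy as np

def cells_to_atlas_space(deformation, cells):
    """Transform (N, 3) sample-space cell coordinates into atlas space."""
    idx = tuple(np.round(cells).astype(int).T)  # nearest-voxel lookup
    return deformation[idx]                     # (N, 3) atlas coordinates
```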
Supplementary figures
Architecture of the 3D ResNet
Architecture of the bottleneck 3D ResNet
ACKNOWLEDGEMENTS
This work was supported by grants from the Gatsby Charitable Foundation (GAT3361) and Wellcome Trust (090843/F/09/Z and 214333/Z/18/Z) to T.W.M. This manuscript was typeset using a modified version of the HenriquesLab bioRxiv template.