Breast Tumor Cellularity Assessment using Deep Neural Networks

Alexander Rakhlin; Alexey A. Shvets; Alexandr A. Kalinin; Aleksei Tiulpin; Vladimir I. Iglovikov; Sergey Nikolenko

doi:10.1101/628693

Abstract

Breast cancer is one of the main causes of death world-wide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumor’s response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patient’s survival. In current clinical practice, tumor cellularity is manually estimated by pathologists; this process is tedious and prone to errors or low agreement rates between assessors. In this work, we evaluated three strong novel Deep Learning-based approaches for automatic assessment of tumor cellularity from posttreated breast surgical specimens stained with hematoxylin and eosin. We validated the proposed methods on the BreastPathQ SPIE challenge dataset that consisted of 2395 image patches selected from whole slide images acquired from 64 patients. Compared to expert pathologist scoring, our best performing method yielded the Cohen’s kappa coefficient of 0.70 (vs. 0.42 previously known in literature) and the intra-class correlation coefficient of 0.89 (vs. 0.83). Our results suggest that Deep Learning-based methods have a significant potential to alleviate the burden on pathologists, enhance the diagnostic workflow, and, thereby, facilitate better clinical outcomes in breast cancer treatment.

1. Introduction

Breast cancer is one of the most common cancer types diagnosed in women in the United States and worldwide [1]. Biopsies and histological assessment allow pathologists to analyze microscopic structures of breast tissues and, in particular, assess the cancer’s aggressiveness.

Multiple options are available to manage and monitor the breast cancer treatment based on the information provided from the tumor’s response to it. In addition to the treatment effect on the tumor size, the therapy may also alter the tumor’s cellularity [2]. During anticancer therapy, the size of the tumor may remain the same, but the overall cellularity may be drastically reduced [3]. As a result, it makes the residual tumor cellularity an important factor in assessing the response treatment.

Currently, tumor cellularity is manually assessed by pathologists from hematoxylin and eosin (H&E)-stained slides [3]. The costs of such estimation are high, the process is tedious and subjective, and the quality and reliability might be also be affected by high inter-observer variability even among senior pathologists. This potentially may affect prognostic power assessment in clinical trials [4]. The subjectivity in visual tissue assessment motivates the use of computer-aided methods to improve the diagnosis accuracy, reduce human error and increase inter-observer agreement and reproducibility [5, 6]. Automated analysis of the H&E slide using computer vision could provide immediate benefits to patient care. Recent success in Deep Learning (DL) [7, 8], and in particular the advances in convolutional neural networks (CNN), have recently shown high potential in this realm [9].

In this work, we evaluate three DL-methods to score the cellularity of the breast tissue from histopathological images. In particular, our first approach employs a weakly-supervised segmentation model with Resnet-34 [10] encoder and Feature Pyramid Network (FPN)[11] and a second-stage regression network that predicts the cellularity score using the predicted segmentation maps. Our second approach is also based on segmentation, however, instead of using the segmentation maps directly, we extract various features from them and use the gradient boosting trees (GBT) [12] to predict the cellularity score. Finally, we also evaluate using H&E image patches directly to predict the cellularity score.

2. Related work

CNNs have recently been successfully applied to many tasks in biomedical image analysis, often outperforming conventional machine learning methods [9, 13, 14]. As such, they have successfully been utilized for digital pathology image analysis and have demonstrated great potential for improving breast cancer diagnostics [15, 16, 17, 18].

Although there are not many studies focusing directly on automated quantitative cellularity assessment, it has been shown that this task can be solved by first segmenting malignant cells and then computing the tumor’s area [19]. Many efforts have been devoted to developing supervised and unsupervised methods for automated cell and nuclear segmentation and detection [20, 21]. Supervised segmentation models have superior performance but require hand-labeled nuclear mask annotations [20]. In these approaches, segmented nuclear bodies are used to extract features that are typically inspired by visual markers recognized by pathologists. Commonly used features describe morphology, texture, and spatial relationships among cell nuclei in tissue [19, 22].

The conventional approach most relevant to our work is by Peikari et al. [19] who proposed an automated cellularity assesement protocol. First, they used smaller patches, or regions of interest (RoI), extracted from whole slide images to segment all present cell nuclei. Then they extracted a number of predefined features from segmented nuclei and used support vector machines to distinguish lymphocytes and normal epithelial nuclei from malignant ones. Cellularity estimation was done using distinguished malignant epithelial figures for every RoI.

Alternatively, segmentation-free methods that directly estimate cellularity from histopathology imaging data and nuclei locations annotated by human observers were also shown promising. In particular, Veta et al. [23] proposed a deep learning-based method that leverages an information from a tumor’s cells nuclei locations (centroids) and predicts the areas of individual nuclei and mean nuclear area without the intermediate step of nuclei segmentation. In particular, this approach was based on a 10-layer deep neural network predicting nuclear areas quantized into 20 histogram bins. The results showed that predicted measurements had substantial agreement with manual measurements, which suggests that it is possible to compute the areas directly from imaging data, without the intermediate step of nuclei segmentation. This is in spirit similar to one of our approaches, but we do not directly compare our methods to Veta et al. since we use different datasets and performance metrics.

Recent works by Akbar et al. [24, 25] have compared the conventional approach based on segmentation and feature extraction and direct applications of deep CNNs to image patches in both regression and classification settings. Overall, they showed that the DL-based approach outperformed hand-crafted features in both accuracy and intraclass correlation (ICC) with expert pathologist annotations. Specifically, their best result was achieved by using a pretrained Inception [26] model that reached ICC of 0.83 and 0.81 with two expert pathologists. In this study we evaluate even wider range of DL-based approaches, including segmentation-based and segmentation-free, in both regression and classification settings. We provide appropriate performance comparisons with previously reported results. All the methods developed in this study are fully automatic and do not require any involvement of the human annotators at the test time.

3. Methods

In this study, we propose and evaluate three different methods. The first two methods are based on the nuclei segmentation and the third method leverages the raw image without preceding segmentation step. Graphical illustration of our approach is presented in Figure 1

Figure 1.

Generic description of the methods developed and evaluated in this study. Our first approach leverages segmentation model, feature extraction and gradient boosted trees. The second approach directly predicts the cellularity from the raw data. Finally, in our third setting, we combine the first and the second approach and used a deep convolutional neural network to predict the cellularity score from segmentation mask.

3.1. Segmentation

Network Architecture

Most modern segmentation architectures inherit the encoder-decoder architecture similar to U-Net [27], where convolutional layers in the contracting branch (encoder) are followed by an upsampling branch that brings segmentation back to the original image size (decoder). In addition, skip connections are used between contracting and upsampling modules to help the localization information propagate through the complex multilayer structure and eventually improve segmentation accuracy [27]. U-Net and architectures inspired by this idea have produced state of the art results in various segmentation problems, and many improvements for the architecture and its training protocols have recently been proposed. In particular, Iglovikov et al. [28, 29] used batch normalization [30] and exponential linear unit (ELU) as the primary activation function and an ImageNet pre-trained VGG-11 network [31] as an encoder. Liu et al. [32] proposed an hourglass-shaped network (HSN) with residual connections, which is also very similar to the U-Net architecture. Rakhlin et al. [33] used the Resnet-34 network [34] as the encoder and the Lovász-Softmax loss function [35] along with Stochastic Weight Averaging (SWA) [36] for training.

In our proposed architecture, the segmentation module also inherits the U-Net architecture. The contracting branch (encoder) of our model is based on the Resnet-34 [34] network architecture where we have introduced several useful modifications. In particular, we have replaced ReLU activations with ELU that does not saturate gradients and keeps the output close to zero mean and have changed order of batch normalization [30] and activation layers. In Section 4 we compare encoders initialized with random He’s initialization [37] and pretrained on ImageNet.

A major difference between the proposed model and the classical U-Net architecture originates in the limited dataset size, characteristic for the BreastPathQ Cancer Cellularity Challenge and for medical imaging in general. We utilize two regularization techniques to alleviate the problem of overfitting to limited training data: (1) data augmentation and (2) spatial 2D dropout incorporated into the upsampling branch.

The upsampling branch is implemented as a Feature Pyramid Network (FPN) [38], reconstructing high-level semantic feature maps at four scales simultaneously. We implement a feature pyramid block as a convolutional layer with 64 activation maps followed by upsampling to the original resolution with upsampling rate of 8, 4, 2, or 1 depending on the feature map depth (see Fig. 2). In Section 4, we compare the performance of standard and FPN decoders.

Figure 2.

Encoder-decoder segmentation network architecture with Resnet-34 encoder and feature pyramid network decoder. Spatial Dropout 2D is added after multi-layer concatenation.

We concatenate upsampled maps into a single layer of 64 × 4 = 256 maps and add after it a spatial 2D dropout layer, which acts as a regularizer and prevents coadaptation of the network weights, but unlike conventional dropout it drops out not individual neurons but rather entire activation maps. Throughout the work, we use dropout rate 0.5, randomly dropping 128 out of 256 activation maps.

Finally, the output of the model is a 4-channel sigmoid layer that assigns every pixel with four values from 0 to 1 that represent the probabilities of belonging to the Normal, Lymphocyte, Malignant, and Background classes.

Loss functions

Binary cross entropy (BCE), while convenient for training, does not directly translate into Jaccard index, the metric commonly used to evaluate segmentation accuracy. Hence, as the loss function we use a weighted sum of BCE and the soft Jaccard loss for class c; in this work we set α = 0.15, a value found via cross-validation. The soft Jaccard loss is defined as where w are network parameters, is the binary label for pixel i and class is the predicted probability of c for pixel i, and N is the total number of pixels. The total loss function is a weighted sum of class losses: where v^c is a loss weight for class c. In this work we weigh Normal, Lymphocyte, and Background as 1 and Malignant, the class of primary importance in our problem, as 4.

3.2. Cellularity estimation from segmented cells

In this subsection, we describe the method for cellularity assessment that leverages the output of the trained segmentation network Fig. 2. We feed the segmented output into a Resnet-34 CNN model. The model automatically learns deep features from the 4-channel segmentation input and regresses it onto continuous cellularity score. In this approach, the segmentation model acts as a filter the aim of which is to extract only the information about the cell morphology. We hypothesized that this structured approach makes our method similar to methods employed by expert pathologists, makes it transparent and less sensitive to data acquisition settings.

3.3. Feature extraction-based cellularity estimation

The second type of model is Gradient Boosted Trees (GBT) [39]. This approach represents an example of inter-disciplinary connection in Machine Learning. The general idea and handcrafted features are borrowed from the second place solution for 2017 Kaggle contest for Sea Lion Population Count in aerial imaginary [40]. The authors would like to thank Konstantin Lopuhin for valuable discussion they had while incorporating his method. In this study, GBT operates on a vector of hand-crafted features extracted from nuclei segmentation maps, including:

activations and their areas aggregated over segmentation maps with different thresholds; for every segmentation map in Normal, Lymphocyte, Malignant and for 7 thresholds 0.02, 0.04, 0.08, 0.16, 0.24, 0.32, 0.5, we obtain 2 values: total area above threshold and total activation above threshold (see Fig. 4 for an illustration);

Figure 3.

Light micrograph of a histologic specimen of breast tissue stained with hematoxylin and eosin (top). The bottom row shows nuclei segmentation masks synthesized from weak labels: Malignant — red, Normal — green, Lymphocyte — blue.

Figure 4.

Segmentation results at thresholds (a) - (d) of Malignant channel superimposed with the original image. Masks generated after thresholding were used for feature extraction.

using the Laplacian of Gaussian (LoG) method as implemented in the OpenCV library [41], we find blobs in segmentation maps at 6 thresholds: 0.02, 0.04, 0.08, 0.16, 0.24, 0.5; for each threshold we find the number of blobs and total activation in blob centers (Fig. 5);

Figure 5.

Nuclei blobs detected from the Malignant segmentation maps using the Laplacian of Gaussian method at thresholds (a)-(d). The blobs were used for feature extraction.

total activation for every channel, computed as a sum of the activations at every pixel after sigmoid.

In total, we obtain 3 ×(7 × 2 + 6 × 2 + 1) = 81 features to train the GBT model.

3.4. Cellularity estimation from the raw images

The third type of model is a deep convolutional network implemented in regression or classification settings.

These models do not use intermediate segmentation and predict the cellularity score immediately from the microscopic image. For classification, we categorize cellularity into 101 class using regular bins with thresholds 0.00, 0.01, …, 1.00. The idea of direct regression of an image into continuous value using CNN is not new. In particular, it was implemented in [42] where the authors use CNN to predict bone age from radiograph.

3.5. Evaluation metrics

We assessed the results using several metrics. The main evaluation metric is the mean squared error (MSE) between the cellularity score obtained in our experiments and ground truth provided by an expert pathologist.

In order to make our results comparable with previous work, we also report Cohen’s kappa coefficient agreement and the intra-class correlation coefficient (ICC) between expert and automated methods, similar to [19]. In all experiments, we find our results superior to our predecessors; however, the cellularity score itself in [19] is evaluated based on binning it into four categories of 0–25%, 26–50%, 51–75%, and 76–100%. Such 4-class categorization is relatively coarse and, in our opinion, does not represent a suitable evaluation metric for continuous cellularity estimation that is our goal in this work.

4. Experiments and results

4.1. Data

The data used in this study had been acquired from the Sunnybrook Health Sciences Centre with funding from the Canadian Cancer Society and was made available for the BreastPathQ challenge sponsored by the SPIE, NCI/NIH, AAPM, and the Sunnybrook Research Institute [19].

In our experiments we used 2, 395 patches of 512 × 512 pixels in size, extracted from 96 haematoxylin and eosin (H&E) stained whole slide images (WSI) acquired from 64 patients. Each patch in the training set has been assigned a tumor cellularity score by an expert pathologist. In Figure 6, we present a distribution of the cellularity scores in the dataset.

Figure 6.

Cellularity score distribution.

Besides the image data, we used the annotations (X and Y coordinates) to identify lymphocytes, malignant epithelial, and normal epithelial cell nuclei in the additional 153 patches. Using these weak annotations, we generated the segmentation masks that were used in out experiments. Here, at each XY location, we simply fit a blob of 15 pixels in diameter. In Figure 3 we present the generated masks for various classes.

4.2. Weakly-supervised cell segmentation

Highly limited amount of nuclei annotation data poses a serious challenge for training segmentation models. In our ablation studies, we evaluated our model in four different settings to find how different design choices influence the segmentation accuracy and generalization. Namely, we compared the model as described in Section 3 with a standard U-Net decoder against an FPN decoder, and with the encoder initialized randomly against the encoder initialized with weights pretrained on ImageNet. In all settings, the model was trained for 150 epochs with the Adam optimizer and gradually decreasing learning rate from 10⁻⁴ to 10⁻⁵.

To obtain training patches, we downscaled the microscopy images × 2 times, randomly cropped a 256 × 256 area, and rescaled pixel values from [0, 255] to [− 1, 1]. As mentioned previously, segmentation targets were generated as 4-channel masks with round blobs, 15 pixels in diameter (the characteristic nucleus size), drawn in the nuclei centers. During training, we dynamically augmented images with vertical and horizontal flips, rotation, gamma, hue, and saturation utilizing the Albumentations library [43].

In the first series of experiments, we evaluated segmentation quality as an important intermediate metric for the evaluation of our methods. The segmentation performance as a function of the decoder, initialization, and training epoch is shown in Fig. 8 and Table 1. As we can see, the model with the feature pyramid decoder and encoder pretrained on ImageNet achieved significantly higher and more stable Jaccard index on the validation set than the alternatives.

View this table:

Table 1.

Segmentation results: the Jaccard index for different decoders and initializations.

Figure 7.

Examples of the generated segmentation masks in the Malignant channel. Left to right: (a) original patch; (b) ground truth segmentation superimposed on the original image; (c) activation map; (d) nuclei blobs reconstructed from the activation map with the LoG method.

Figure 8.

Evolution of segmentation performance during training for different decoders and initialization strategies.

4.3. Segmentation-based cellularity assessment

Prediction from the segmented cells

As mentioned previously, we used the output of the segmentation model as input for the cellularity regressor and then trained this cascade end-to-end. We froze the segmentation model and stack its 4-channel output with a randomly initialized Resnet-34 in the regression setting. We trained the regression part with cellularity targets and MSE loss until convergence. Then we unfroze segmentation weights and fine-tuned both modules in an end-to-end fashion, as a single model. We repeated this experiment with Resnet-34 pretrained on ImageNet. In the latter case, we excluded the background channel from segmentation output to comply with the vanilla Resnet-34 architecture that has a 3-channel input.

In these experiments, we found that after fine-tuning the accuracy of segmentation itself slightly decreases, while the accuracy of the overall cellularity scoring increases. This is in line with [44], which found that perfect segmentation of nuclei figures does not ensure better classification of malignant objects from breast cancer tissues. This finding suggests that the two branches of future work, tumor bed segmentation and cellularity assessment, are relatively independent.

Feature extraction-based method

In this series of experiments, we extracted the 81 features from segmentation masks as discussed in Section 3 and trained the LightGBM [39] regression model with mean squared error (MSE) objective. The model was trained for 600 epochs with learning rate 0.01. The maximum tree depth was set to 5; the number of leaves, to 8. These parameters have been selected through cross-validation.

We report LightGBM accuracy in Table 2 and show the resulting feature importance on Fig. 10. Feature importance was calculated based on the total gain of the loss function from the splits formed according to this feature. As expected, all highly important features come from the Malignant channel. The most important feature is the total activation above 0.5 threshold, and the second and third most important features are the activations above 0.32 and 0.24 thresholds, as expected since activations at different thesholds are highly correlated, and the segmentation quality at threshold 0.5 was the best, so the feature based on this mask is a natural candidate for the most important feature. Activations at lower thresholds provide additional value, but a big part of the information that they contain has already been conveyed via the 0.5 threshold feature. Interestingly, malignant cell count (detected at threshold 0.24) is only the 9th feature in order of importance.

View this table:

Table 2.

Cellularity MSE with 95% confidence intervals for the segmentation-based (first row) and for the end-to-end methods. Our results demonstrate the importance of ImageNet pre-training. C in the parentheses indicates classification, R – regression and S – segmentation.

Figure 9.

Cellularity MSE evolution during training.

Figure 10.

GBT top feature importance.

4.4. Direct cellularity assessment from the raw images

In our final experiments, we evaluated several deep neural architectures that take the original microscopy images as input and output the cellularity score without intermediate segmentation. Similarly to previous experiments, we trained the models with random He initialization [37] or initialized them with weights pretrained on ImageNet. In all cases, ImageNet initialization was superior to random, and the overall accuracy was slightly better than for the models with intermediate segmentation. The Xception model implemented in a classification setup with random initialization performed slightly better than its counterparts (MSE 0.012 vs. 0.017-0.025). The mean squared error of cellularity prediction as a function of the training epoch for different initializations is shown in Figure 9. All the performance evaluation metrics are presented in Table 2 and Table 3.

View this table:

Table 3.

Cellularity Kappa (4 class binning) and Intra-Class Correlation Coefficient (ICC) with 95% confidence intervals for the segmentation-based (1^st and 2^nd rows) and for the methods predicting cellularity directly, withou segmentation. All the models here utilize ImageNet pre-training. C in the parentheses indicates classification, R – regression and S – segmentation.

4.5. Discussion

As we can see in Table 2 and Table 3, direct cellularity assessment method slightly outperforms the segmentation-based approach, where the regression module works on top of the segmentation feature extractor. We believe that performance improves due to two main reasons. First, segmentation models were not trained on accurate segmentation masks but rather on approximate masks generated from weakly supervised labels. Second, the cellularity score depends not only on the tumor masks but also on a broader set of features, some of which could be lost during the segmentation step.

While we note the record results of our end-to-end models, we believe that the modular form of the prediction pipeline provides benefits that more than compensate for this small difference in the final score.

The segmentation-based approach has two significant advantages: generalizability and interpretability. In practice, the data used for medical imaging tasks comes from different hospitals and is collected by different hardware. Images may differ in quality, level of noise, color and brightness distributions. In [42], the authors proposed to use segmentation to clean and standardize the data, which helps with overall robustness and performance of various task-specific models.

Better interpretability is achieved by the fact that we can visually verify the quality of the intermediate step, i.e., segmented tumors. Furthermore, the decision trees model allows to estimate the feature importance for every feature based on the information gain. If segmented tumors are correct, and the most informative features make intuitive sense, we obtain additional confidence in our model, which is very important in the medical setting.

5. Conclusion

In this paper, we evaluate three automatic methods to assess the cellulalarity of residual breast tumors in H&E stained samples. Our first method leverages the weakly-supervised segmentation masks as inputs for deep CNN. We believe that this method will be more generalizable and robust towards the data acquisition and easier to interpret.

Our second method that leverages feature extraction from the weakly-supervised segmentation mask yields the highest score among the all previously published feature extraction-based methods [24, 19].

Finally, the third method in this study is an end-to-end approach that predicts the cellularity score without any intermediate segmentation step. Although it is attractive and produces the best results it lacks interpretability of the segmentation-based methods and could perform best due to the dataset bias.

The main limitation of this study is the dataset size and the weak labels for the segmentation model. We think that given a bigger dataset and good quality annotations, segmentation-based approach could produce better results that less deviate from the end-to-end trained models.

Footnotes

shvets{at}mit.edu, akalinin{at}umich.edu, aleksei.tiulpin{at}oulu.fi, iglovikov{at}gmail.com, snikolenko{at}neuromation.io

References

[1].↵
Rebecca L. Siegel, Kimberly D. Miller, and Ahmedin Jemal. Cancer statistics, 2018. CA: A Cancer Journal for Clinicians, 68(1):7–30, 2018.
OpenUrl CrossRef PubMed
[2].↵
Detailed pathology methods for using residual cancer burden. https://www.mdanderson.org/education-and-research/resources-for-professionals/clinical-tools-and-resources/clinical-calculators/calculators-rcb-pathology-protocol2.pdf.
[3].↵
Radhika Rajan, Anna Poniecka, Terry L Smith, Ying Yang, Deborah Frye, Lajos Pusztai, Derek J Fiterman, Eva Gal-Gombos, Gary Whitman, Roman Rouzier, et al. Change in tumor cellularity of breast carcinoma after neoadjuvant chemotherapy as a variable in the pathologic assessment of response. Cancer: Interdis-ciplinary International Journal of the American Cancer Society, 100(7):1365–1373, 2004.
OpenUrl
[4].↵
W Fraser Symmans, Florentia Peintinger, Christos Hatzis, Radhika Rajan, Henry Kuerer, Vicente Valero, Lina Assad, Anna Poniecka, Bryan Hennessy, Marjorie Green, et al. Measurement of residual breast cancer burden to predict survival after neoad-juvant chemotherapy. Journal of Clinical Oncology, 25(28):4414–4422, 2007.
OpenUrl Abstract/FREE Full Text
[5].↵
John S Meyer, Consuelo Alvarez, Clara Milikowski, Neal Olson, Irma Russo, Jose Russo, Andrew Glass, Barbara A Zehnbauer, Karen Lister, and Reza Par- waresch. Breast carcinoma malignancy grading by bloom–richardson system vs proliferation index: reproducibility of grade and advantages of proliferation index. Modern pathology, 18(8):1067, 2005.
OpenUrl CrossRef PubMed Web of Science
[6].↵
Joann G Elmore, Gary M Longton, Patricia A Carney, Berta M Geller, Tracy Onega, Anna NA Tosteson, Heidi D Nelson, Margaret S Pepe, Kimberly H Allison, Stuart J Schnitt, et al. Diagnostic concordance among pathologists interpreting breast biopsy specimens. Jama, 313(11):1122–1132, 2015.
OpenUrl CrossRef PubMed
[7].↵
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. nature, 521(7553):436, 2015.
OpenUrl CrossRef PubMed
[8].↵
Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural networks, 61:85–117, 2015.
OpenUrl CrossRef PubMed
[9].↵
Travers Ching, Daniel S Himmelstein, Brett K Beaulieu-Jones, Alexandr A Kalinin, Brian T Do, Gregory P Way, Enrico Ferrero, Paul-Michael Agapow, Michael Zietz, Michael M Hoffman, et al. Opportunities and obstacles for deep learning in biology and medicine. Journal of The Royal Society Interface, 15(141), 2018.
[10].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
[11].↵
Tsung-Yi Lin, Piotr Dollàr, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2117–2125, 2017.
[12].↵
Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in neurorobotics, 7:21, 2013.
OpenUrl
[13].↵
Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Petri Lehenkari, and Simo Saarakkala. Automatic knee osteoarthritis diagnosis from plain radiographs: A deep learning-based approach. Scientific Reports, 8:1727, 2018.
OpenUrl
[14].↵
Alexey Shvets, Alexander Rakhlin, Alexandr A Kalinin, and Vladimir Iglovikov. Automatic instrument segmentation in robot-assisted surgery using deep learning. arXiv preprint arXiv:1803.01207, 2018.
[15].↵
Fabio Alexandre Spanhol, Luiz S Oliveira, Caroline Petitjean, and Laurent Heutte. Breast cancer histopathological image classification using convolutional neural networks. In Neural Networks (IJCNN), 2016 International Joint Conference on, pages 2560–2567. IEEE, 2016.
[16].↵
Teresa Araújo, Guilherme Aresta, Eduardo Castro, José Rouco, Paulo Aguiar, Catarina Eloy, António Polónia, and Aurélio Campilho. Classification of breast cancer histology images using convolutional neural networks. PloS one, 12(6):e0177544, 2017.
OpenUrl
[17].↵
Babak Ehteshami Bejnordi, Mitko Veta, Paul Johannes van Diest, Bram van Ginneken, Nico Karsse- meijer, Geert Litjens, Jeroen AWM van der Laak, Meyke Hermsen, Quirine F Manson, Maschenka Balkenhol, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. Jama, 318(22):2199–2210, 2017.
OpenUrl CrossRef
[18].↵
Stephanie Robertson, Hossein Azizpour, Kevin Smith, and Johan Hartman. Digital image analysis in breast pathology—from image processing techniques to artificial intelligence. Translational Research, 2017.
[19].↵
Mohammad Peikari, Sherine Salama, Sharon Nofech- Mozes, and Anne L Martel. Automatic cellularity assessment from post-treated breast surgical specimens. Cytometry Part A, 91(11):1078–1087, 2017.
OpenUrl
[20].↵
F. Xing and L. Yang. Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: A comprehensive review. IEEE Reviews in Biomedical Engineering, 9:234–263, 2016.
OpenUrl
[21].↵
Daisuke Komura and Shumpei Ishikawa. Machine learning methods for histopathological image analysis. Computational and structural biotechnology journal, 16:34–42, 2018.
OpenUrl
[22].↵
Alexandr A Kalinin, Ari Allyn-Feuer, Alex Ade, Gordon-Victor Fon, Walter Meixner, David Dilworth, Syed S Husain, Jeffrey R de Wett, Gerald A Higgins, Gen Zheng, et al. 3D shape modeling for cell nuclear morphological analysis and classification. Scientific Reports, 8, 2018.
[23].↵
Mitko Veta, Paul J Van Diest, and Josien PW Pluim. Cutting out the middleman: measuring nuclear area in histopathology slides without segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 632–639. Springer, 2016.
[24].↵
Shazia Akbar, Mohammed Peikari, Sherine Salama, Azadeh Y Panah, Sharon Nofech-Momes, and Anne L Martel. Automated and manual quantification of tumour cellularity in digital slides for tumour burden assessment. bioRxiv, page 571190, 2019.
[25].↵
Shazia Akbar, Mohammad Peikari, Sherine Salama, Sharon Nofech-Mozes, and Anne L Martel. Determining tumor cellularity in digital slides using resnet. In Medical Imaging 2018: Digital Pathology, volume 10581, page 105810U, 2018.
OpenUrl
[26].↵
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826, 2016.
[27].↵
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015.
[28].↵
Vladimir Iglovikov and Alexey Shvets. Ternausnet: U-net with vgg11 encoder pre-trained on imagenet for image segmentation. arXiv preprint arXiv:1801.05746, 2018.
[29].↵
Vladimir Iglovikov, Selim Seferbekov, Alexander Buslaev, and Alexey Shvets. Ternausnetv2: Fully convolutional network for instance segmentation. arXiv preprint arXiv:1806.00844, 2018.
[30].↵
Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456, 2015.
[31].↵
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[32].↵
Yu Liu, Duc Minh Nguyen, Nikos Deligiannis, Wenrui Ding, and Adrian Munteanu. Hourglass-shapenetwork based semantic segmentation for high resolution aerial imagery. Remote Sensing, 9(6):522, 2017.
OpenUrl
[33].↵
Alexander Rakhlin, Alex Davydow, and Sergey Nikolenko. Land cover classification from satellite imagery with u-net and lovàsz-softmax loss. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
[34].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
[35].↵
Maxim Berman, Amal Rannen Triki, and Matthew B. Blaschko. The lovàsz-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[36].↵
Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. Averaging weights leads to wider optima and better generalization. arXiv preprint arXiv:1803.05407, 2018.
[37].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pages 1026–1034, 2015.
[38].↵
Tsung-Yi Lin, Piotr Dollàr, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie. Feature pyramid networks for object detection. CoRR, abs/1612.03144, 2016.
[39].↵
1. I. Guyon,
2. U. V. Luxburg,
3. S. Bengio H. Wallach,
4. R. Fergus,
5. S. Vishwanathan, and
6. R. Garnett
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree. In I. Guyon, U. V. Luxburg, S. Bengio H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3146–3154. Curran Associates, Inc., 2017.
OpenUrl
[40].↵
Konstantin Lopuhin. Noaa fisheries steller sea lion population count. https://www.kaggle.com/c/noaa-fisheries-steller-sea-lion-population-count/discussion/35422, 2017, online; accessed April 18, 2019.
[41].↵
G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.
[42].↵
Vladimir I Iglovikov, Alexander Rakhlin, Alexandr A Kalinin, and Alexey A Shvets. Paediatric bone age assessment using deep convolutional neural networks. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pages 300–308. Springer, 2018.
[43].↵
E. Khvedchenya V. I. Iglovikov A. Buslaev, A. Parinov and A. A. Kalinin. Albumentations: fast and flexible image augmentations. ArXiv e-prints, 2018.
[44].↵
Laura E. Boucheron, B Manjunath, and Neal Harvey. Use of imperfectly segmented nuclei in the classification of histopathology images of breast cancer. pages 666–669, 03 2010.

View the discussion thread.

Posted May 05, 2019.

Download PDF

Citation Tools

Subject Area

Pathology

Subject Areas

All Articles

Animal Behavior and Cognition (5215)
Biochemistry (11745)
Bioengineering (8752)
Bioinformatics (29200)
Biophysics (14972)
Cancer Biology (12096)
Cell Biology (17411)
Clinical Trials (138)
Developmental Biology (9421)
Ecology (14182)
Epidemiology (2067)
Evolutionary Biology (18308)
Genetics (12245)
Genomics (16803)
Immunology (11869)
Microbiology (28085)
Molecular Biology (11592)
Neuroscience (60969)
Paleontology (451)
Pathology (1871)
Pharmacology and Toxicology (3238)
Physiology (4959)
Plant Biology (10427)
Scientific Communication and Education (1683)
Synthetic Biology (2885)
Systems Biology (7340)
Zoology (1651)

[1] [1].↵
Rebecca L. Siegel, Kimberly D. Miller, and Ahmedin Jemal. Cancer statistics, 2018. CA: A Cancer Journal for Clinicians, 68(1):7–30, 2018.
OpenUrl CrossRef PubMed

[2] [2].↵
Detailed pathology methods for using residual cancer burden. https://www.mdanderson.org/education-and-research/resources-for-professionals/clinical-tools-and-resources/clinical-calculators/calculators-rcb-pathology-protocol2.pdf.

[3] [3].↵
Radhika Rajan, Anna Poniecka, Terry L Smith, Ying Yang, Deborah Frye, Lajos Pusztai, Derek J Fiterman, Eva Gal-Gombos, Gary Whitman, Roman Rouzier, et al. Change in tumor cellularity of breast carcinoma after neoadjuvant chemotherapy as a variable in the pathologic assessment of response. Cancer: Interdis-ciplinary International Journal of the American Cancer Society, 100(7):1365–1373, 2004.
OpenUrl

[4] [4].↵
W Fraser Symmans, Florentia Peintinger, Christos Hatzis, Radhika Rajan, Henry Kuerer, Vicente Valero, Lina Assad, Anna Poniecka, Bryan Hennessy, Marjorie Green, et al. Measurement of residual breast cancer burden to predict survival after neoad-juvant chemotherapy. Journal of Clinical Oncology, 25(28):4414–4422, 2007.
OpenUrl Abstract/FREE Full Text

[5] [5].↵
John S Meyer, Consuelo Alvarez, Clara Milikowski, Neal Olson, Irma Russo, Jose Russo, Andrew Glass, Barbara A Zehnbauer, Karen Lister, and Reza Par- waresch. Breast carcinoma malignancy grading by bloom–richardson system vs proliferation index: reproducibility of grade and advantages of proliferation index. Modern pathology, 18(8):1067, 2005.
OpenUrl CrossRef PubMed Web of Science

[6] [6].↵
Joann G Elmore, Gary M Longton, Patricia A Carney, Berta M Geller, Tracy Onega, Anna NA Tosteson, Heidi D Nelson, Margaret S Pepe, Kimberly H Allison, Stuart J Schnitt, et al. Diagnostic concordance among pathologists interpreting breast biopsy specimens. Jama, 313(11):1122–1132, 2015.
OpenUrl CrossRef PubMed

[7] [7].↵
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. nature, 521(7553):436, 2015.
OpenUrl CrossRef PubMed

[8] [8].↵
Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural networks, 61:85–117, 2015.
OpenUrl CrossRef PubMed

[9] [9].↵
Travers Ching, Daniel S Himmelstein, Brett K Beaulieu-Jones, Alexandr A Kalinin, Brian T Do, Gregory P Way, Enrico Ferrero, Paul-Michael Agapow, Michael Zietz, Michael M Hoffman, et al. Opportunities and obstacles for deep learning in biology and medicine. Journal of The Royal Society Interface, 15(141), 2018.

[10] [10].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.

[11] [11].↵
Tsung-Yi Lin, Piotr Dollàr, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2117–2125, 2017.

[12] [12].↵
Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in neurorobotics, 7:21, 2013.
OpenUrl

[13] [13].↵
Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Petri Lehenkari, and Simo Saarakkala. Automatic knee osteoarthritis diagnosis from plain radiographs: A deep learning-based approach. Scientific Reports, 8:1727, 2018.
OpenUrl

[14] [14].↵
Alexey Shvets, Alexander Rakhlin, Alexandr A Kalinin, and Vladimir Iglovikov. Automatic instrument segmentation in robot-assisted surgery using deep learning. arXiv preprint arXiv:1803.01207, 2018.

[15] [15].↵
Fabio Alexandre Spanhol, Luiz S Oliveira, Caroline Petitjean, and Laurent Heutte. Breast cancer histopathological image classification using convolutional neural networks. In Neural Networks (IJCNN), 2016 International Joint Conference on, pages 2560–2567. IEEE, 2016.

[16] [16].↵
Teresa Araújo, Guilherme Aresta, Eduardo Castro, José Rouco, Paulo Aguiar, Catarina Eloy, António Polónia, and Aurélio Campilho. Classification of breast cancer histology images using convolutional neural networks. PloS one, 12(6):e0177544, 2017.
OpenUrl

[17] [17].↵
Babak Ehteshami Bejnordi, Mitko Veta, Paul Johannes van Diest, Bram van Ginneken, Nico Karsse- meijer, Geert Litjens, Jeroen AWM van der Laak, Meyke Hermsen, Quirine F Manson, Maschenka Balkenhol, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. Jama, 318(22):2199–2210, 2017.
OpenUrl CrossRef

[18] [18].↵
Stephanie Robertson, Hossein Azizpour, Kevin Smith, and Johan Hartman. Digital image analysis in breast pathology—from image processing techniques to artificial intelligence. Translational Research, 2017.

[19] [19].↵
Mohammad Peikari, Sherine Salama, Sharon Nofech- Mozes, and Anne L Martel. Automatic cellularity assessment from post-treated breast surgical specimens. Cytometry Part A, 91(11):1078–1087, 2017.
OpenUrl

[20] [20].↵
F. Xing and L. Yang. Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: A comprehensive review. IEEE Reviews in Biomedical Engineering, 9:234–263, 2016.
OpenUrl

[21] [21].↵
Daisuke Komura and Shumpei Ishikawa. Machine learning methods for histopathological image analysis. Computational and structural biotechnology journal, 16:34–42, 2018.
OpenUrl

[22] [22].↵
Alexandr A Kalinin, Ari Allyn-Feuer, Alex Ade, Gordon-Victor Fon, Walter Meixner, David Dilworth, Syed S Husain, Jeffrey R de Wett, Gerald A Higgins, Gen Zheng, et al. 3D shape modeling for cell nuclear morphological analysis and classification. Scientific Reports, 8, 2018.

[23] [23].↵
Mitko Veta, Paul J Van Diest, and Josien PW Pluim. Cutting out the middleman: measuring nuclear area in histopathology slides without segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 632–639. Springer, 2016.

[24] [24].↵
Shazia Akbar, Mohammed Peikari, Sherine Salama, Azadeh Y Panah, Sharon Nofech-Momes, and Anne L Martel. Automated and manual quantification of tumour cellularity in digital slides for tumour burden assessment. bioRxiv, page 571190, 2019.

[25] [25].↵
Shazia Akbar, Mohammad Peikari, Sherine Salama, Sharon Nofech-Mozes, and Anne L Martel. Determining tumor cellularity in digital slides using resnet. In Medical Imaging 2018: Digital Pathology, volume 10581, page 105810U, 2018.
OpenUrl

[26] [26].↵
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826, 2016.

[27] [27].↵
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015.

[28] [28].↵
Vladimir Iglovikov and Alexey Shvets. Ternausnet: U-net with vgg11 encoder pre-trained on imagenet for image segmentation. arXiv preprint arXiv:1801.05746, 2018.

[29] [29].↵
Vladimir Iglovikov, Selim Seferbekov, Alexander Buslaev, and Alexey Shvets. Ternausnetv2: Fully convolutional network for instance segmentation. arXiv preprint arXiv:1806.00844, 2018.

[30] [30].↵
Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456, 2015.

[31] [31].↵
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.

[32] [32].↵
Yu Liu, Duc Minh Nguyen, Nikos Deligiannis, Wenrui Ding, and Adrian Munteanu. Hourglass-shapenetwork based semantic segmentation for high resolution aerial imagery. Remote Sensing, 9(6):522, 2017.
OpenUrl

[33] [33].↵
Alexander Rakhlin, Alex Davydow, and Sergey Nikolenko. Land cover classification from satellite imagery with u-net and lovàsz-softmax loss. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.

[34] [34].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.

[35] [35].↵
Maxim Berman, Amal Rannen Triki, and Matthew B. Blaschko. The lovàsz-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

[36] [36].↵
Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. Averaging weights leads to wider optima and better generalization. arXiv preprint arXiv:1803.05407, 2018.

[37] [37].↵
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pages 1026–1034, 2015.

[38] [38].↵
Tsung-Yi Lin, Piotr Dollàr, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie. Feature pyramid networks for object detection. CoRR, abs/1612.03144, 2016.

[39] [39].↵
I. Guyon,
U. V. Luxburg,
S. Bengio H. Wallach,
R. Fergus,
S. Vishwanathan, and
R. Garnett
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree. In I. Guyon, U. V. Luxburg, S. Bengio H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3146–3154. Curran Associates, Inc., 2017.
OpenUrl

[40] I. Guyon,

[41] U. V. Luxburg,

[42] S. Bengio H. Wallach,

[43] R. Fergus,

[44] S. Vishwanathan, and

[45] R. Garnett

[46] [40].↵
Konstantin Lopuhin. Noaa fisheries steller sea lion population count. https://www.kaggle.com/c/noaa-fisheries-steller-sea-lion-population-count/discussion/35422, 2017, online; accessed April 18, 2019.

[47] [41].↵
G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.

[48] [42].↵
Vladimir I Iglovikov, Alexander Rakhlin, Alexandr A Kalinin, and Alexey A Shvets. Paediatric bone age assessment using deep convolutional neural networks. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pages 300–308. Springer, 2018.

[49] [43].↵
E. Khvedchenya V. I. Iglovikov A. Buslaev, A. Parinov and A. A. Kalinin. Albumentations: fast and flexible image augmentations. ArXiv e-prints, 2018.

[50] [44].↵
Laura E. Boucheron, B Manjunath, and Neal Harvey. Use of imperfectly segmented nuclei in the classification of histopathology images of breast cancer. pages 666–669, 03 2010.

Breast Tumor Cellularity Assessment using Deep Neural Networks

Abstract

1. Introduction

2. Related work

3. Methods

3.1. Segmentation

Network Architecture

Loss functions

3.2. Cellularity estimation from segmented cells

3.3. Feature extraction-based cellularity estimation

3.4. Cellularity estimation from the raw images

3.5. Evaluation metrics

4. Experiments and results

4.1. Data

4.2. Weakly-supervised cell segmentation

4.3. Segmentation-based cellularity assessment

Prediction from the segmented cells

Feature extraction-based method

4.4. Direct cellularity assessment from the raw images

4.5. Discussion

5. Conclusion

Footnotes

References

Citation Manager Formats

Subject Area