Elsevier

Pattern Recognition

Volume 42, Issue 6, June 2009, Pages 1093-1103

Computer-aided prognosis of neuroblastoma on whole-slide images: Classification of stromal development

https://doi.org/10.1016/j.patcog.2008.08.027

Abstract

We are developing a computer-aided prognosis system for neuroblastoma (NB), a cancer of the nervous system and one of the most malignant tumors affecting children. Histopathological examination is an important stage for further treatment planning in routine clinical diagnosis of NB. According to the International Neuroblastoma Pathology Classification (the Shimada system), NB patients are classified into favorable and unfavorable histology based on the tissue morphology. In this study, we propose an image analysis system that operates on digitized H&E stained whole-slide NB tissue samples and classifies each slide as either stroma-rich or stroma-poor based on the degree of Schwannian stromal development. Our statistical framework performs the classification based on texture features extracted using co-occurrence statistics and local binary patterns. Due to the high resolution of digitized whole-slide images, we propose a multi-resolution approach that mimics the evaluation of a pathologist such that the image analysis starts from the lowest resolution and switches to higher resolutions when necessary. We employ an offline feature selection step, which determines the most discriminative features at each resolution level during the training step. A modified k-nearest neighbor classifier is used to determine the confidence level of the classification to make the decision at a particular resolution level. The proposed approach was independently tested on 43 whole-slide samples and provided an overall classification accuracy of 88.4%.

Introduction

Neuroblastoma (NB) is a cancer of nerve cell origin and it commonly affects infants and children. Based on the American Cancer Society's statistics, it is by far the most common cancer in infants and the third most common type of cancer in children [1]. Every year approximately 650 patients are diagnosed with this disease in the United States. As in most cancer types, histopathological examinations are required to characterize the histology of the tumor for further treatment planning. The World Health Organization recommends the use of the International Neuroblastoma Pathology Classification (the Shimada system) for categorization of the patients into different prognostic groups [2], [3]. It is an age-linked classification system based on morphological characteristics of the tissue. Fig. 1 shows a relevant summary of this classification system as a tree diagram, where the grade of neuroblastic differentiation, the degree of Schwannian stromal development, and the mitosis and karyorrhexis index are the most salient features that contribute to the final tissue classification as favorable and unfavorable histology.

The histopathological examination guides oncologists in making decisions on timing and therapy; hence the accuracy of the classification is important to prevent under- or over-treatment. Unfortunately, the qualitative visual examination performed by pathologists under the microscope is tedious and prone to error due to several factors. First, it is not practical to examine every region of the tissue slide under the microscope at high magnifications (e.g., 40×). For NB diagnosis, pathologists typically pick some representative regions at lower magnifications (e.g., 2×, 4×) and examine only those regions. The final decision about the entire slide is based on these sampled regions. Although this approach provides accurate decisions, it may be misleading for heterogeneous tumors. Second, the resulting diagnosis varies considerably between different readers. Experience and fatigue may cause significant inter- and intra-reader variations among pathologists. A recent study by Teot et al. shows that for NB diagnosis, this variation can be up to 20% between central and institutional reviewers [4]. No study has specifically examined such variations in classifying the degree of Schwannian stromal development in NB diagnosis. However, as shown in Fig. 1, this analysis is the first, and hence a critical, step in NB prognosis.

To address these drawbacks, we are developing a computer-aided diagnosis (CAD) system for NB. The use of computers to assist physicians in their evaluations of medical images is not new. There are several commercially available CAD systems that have been proven to improve the clinical diagnosis of radiology images for several modalities such as mammography and computed tomography (CT) [5], [6]. However, research on the development of similar systems for whole-slide histopathological image analysis is relatively new. This is mostly due to the challenges in both the acquisition and the processing of histopathological images, which are much larger than radiology images. In parallel with recent developments in whole-slide digital scanners, research on quantitative histopathological image analysis has accelerated. By providing computational tools to extract measurable features, histopathological image analysis systems help extract more objective and more accurate diagnostic clues that might not be easily observed in the qualitative analysis performed by pathologists.

Research efforts on histopathological image analysis can be categorized into two main groups based on their objectives: (1) content-based image retrieval (CBIR) and (2) CAD systems. CBIR systems aim to retrieve, from a database of previously diagnosed representative cases, the clinical cases similar to a query image, so that pathologists can make use of the established knowledge in their decision-making process [7], [8]. On the other hand, CAD systems directly focus on making the diagnostic classifications of the tissues (e.g., malignant or benign; grade of differentiation) [9]. However, due to variations of tissue structures and different prognosis procedures in different disease types, it is impractical to develop a universal system that would work for all disease types, even if they show similar characteristics. Recently, Petushi et al. proposed an image analysis system to determine the grade of differentiation for breast cancer [10]. Their study provides automatic segmentation and classification of distinct cell nuclei that are used for identification of the grade of differentiation to indicate the degree of malignancy. Similar studies have been conducted to automate the Gleason grading system for prostate cancer. Khouzani et al. proposed a multi-wavelet-based image analysis approach to characterize the texture of the samples associated with different grades [11]. In addition to the textural features, Doyle et al. introduced the use of architectural features based on the spatial organization of cells, as well as their morphologies [12]. Tabesh et al. incorporated color, texture, and morphometric image features at the global and histological object levels for prostate tissue grading [13].

Similarly, several research studies have been conducted on quantitative analysis of NB. Gurcan et al. proposed a method for segmenting cells in H&E stained pathology slides using morphological reconstruction followed by hysteresis thresholding [14]. Most recently, Kong et al. proposed a classification approach using texture and color information to determine the grade of differentiation for NB diagnosis [15], [16]. Both studies showed promising results toward an automated framework that could assist clinical practice. However, to the best of our knowledge, no research has been conducted to analyze Schwannian stromal development for NB diagnosis.

In this study, our goal is to develop a CAD system to determine the degree of Schwannian stromal development as either stroma-rich or stroma-poor from digitized whole-slide NB images. In Fig. 2 we show sample NB tissue images cropped at 40× magnification. Figs. 2(a) and (b) correspond to stroma-rich tissue samples that can be characterized by an extensive growth of Schwannian and other supporting elements. On the contrary, Figs. 2(c) and (d) demonstrate stroma-poor tissue samples that can be characterized by a diffuse growth of neuroblastic cells with various degrees of differentiation randomly distributed by thin septa of fibrovascular tissue and neuropil meshwork.

Using computer vision and pattern recognition techniques, we introduce a multi-resolution image analysis approach to identify image regions associated with different histopathological components (i.e., stroma-rich and stroma-poor). The proposed multi-resolution approach mimics the way pathologists examine tissue slides under the microscope: the image analysis starts from the lowest resolution, which corresponds to the lower magnification levels in a microscope, and uses the higher resolution representations only for the regions where the classification decision requires more detailed information. We propose a texture-based approach to differentiate Schwannian stroma-rich tissue from other cytological structures. We employ rotation-invariant co-occurrence statistics and local binary patterns (LBP) to characterize the stroma septa with different organizations. Using representative samples, we constructed a training dataset and extracted textural features. We further employ an automated feature selection step in which the most discriminating subset of the features is determined at each resolution level to improve the classification performance. Finally, the classification is performed using a statistical classifier.
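The rotation-invariant LBP idea mentioned above can be illustrated with a minimal sketch. It assumes the basic 8-neighbor, radius-1 operator with a minimum-over-rotations mapping; the text does not state which LBP variant, neighborhood, or radius was used, so the function names and parameters below are illustrative only and are not the authors' implementation.

```python
# Illustrative sketch only: basic 8-neighbor LBP with a rotation-invariant
# mapping. The operator variant is an assumption, not taken from the paper.

# Neighbors visited in clockwise order, starting at the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the 8-bit LBP code of pixel (r, c) in a grayscale image.

    Bit i is set when neighbor i is at least as bright as the center pixel.
    """
    center = img[r][c]
    code = 0
    for i, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << i
    return code


def rotation_invariant(code, bits=8):
    """Map a code to the smallest value among all its circular bit rotations."""
    mask = (1 << bits) - 1
    best = code
    for _ in range(bits - 1):
        code = ((code >> 1) | (code << (bits - 1))) & mask
        best = min(best, code)
    return best


def lbp_histogram(img):
    """Normalized histogram of rotation-invariant codes over interior pixels."""
    counts = {}
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            key = rotation_invariant(lbp_code(img, r, c))
            counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}
```

Because the eight neighbors are spaced 45° apart, rotating a tile by 90° shifts every code by two bit positions, so the histogram of mapped codes is unchanged. Such a histogram could then serve as one texture feature vector per tile at a given resolution level.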

In our study, we used 45 whole-slide tissue samples collected from Nationwide Children's Hospital. Tissue slides were obtained retrospectively according to an Institutional Review Board (IRB) protocol. Each tissue sample was embedded in paraffin and cut at a thickness of 5 μm according to commonly used Children's Oncology Group protocols. After being stained with hematoxylin and eosin (H&E), each tissue slice was fixed on a slide and digitized using a ScanScope T2 digitizer (Aperio, San Diego, CA) at 40× magnification. The digitized whole-slide tissue images typically have a spatial resolution of up to 100k×120k pixels, with a disk size of up to 40 GB. Therefore, at the time of processing, the whole-slide images are decomposed into smaller non-overlapping image tiles. Tiling not only makes it practical to process the whole-slide images, but also allows each image tile to be processed independently and hence in parallel. Experimentally, we determined the image tile size as 896×896 pixels. The average resolution of the tissue slides used in our study is approximately 71,623×100,348 pixels, with an average disk size of 20±8 GB; hence the average number of image tiles in a whole-slide image is approximately 8900, which still requires significant computation time.
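The tile count quoted above can be checked with a short calculation. The following is a sketch under stated assumptions: the helper names are hypothetical, and partial tiles at the right and bottom edges are counted by ceiling division, since the text does not say whether edge tiles are padded or discarded.

```python
import math


def tile_grid(width, height, tile=896):
    """Return (columns, rows, total) of non-overlapping tiles covering an image.

    Partial tiles at the edges are counted (assumption: ceiling division).
    """
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows


def tile_origins(width, height, tile=896):
    """Yield the (x, y) top-left corner of every tile in row-major order."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y


# Average slide dimensions reported in the text.
cols, rows, total = tile_grid(71_623, 100_348)
print(cols, rows, total)
```

For the average dimensions this gives an 80 × 112 grid, or 8960 tiles, which is consistent with the reported figure of approximately 8900.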

One representative whole-slide sample for each subtype (i.e., stroma-rich and stroma-poor) was used to generate the training image tiles, and the remaining 43 were used for whole-slide independent testing. Thirty-two of the 43 whole-slide samples are associated with the stroma-poor subtype and the remaining 11 with the stroma-rich subtype, as determined by an expert pathologist. Five of the stroma-rich cases correspond to Ganglioneuroblastoma and six to Ganglioneuroma. The 32 stroma-poor cases correspond to NB. Their morphological characteristics in terms of mitotic-karyorrhectic index (MKI) and the grade of differentiation, according to the International Neuroblastoma Pathology Classification (the Shimada system), are summarized in Table 1.

The image analysis routines were implemented using MATLAB (version 7.1.0.246, Natick, MA) and the experimental studies were carried out on a 64-node PC cluster owned by the Department of Biomedical Informatics, The Ohio State University. Each computation node on the cluster is equipped with dual 2.4 GHz Opteron 250 processors and 8 GB of RAM. This parallel system uses software developed in-house, which distributes the processing of image tiles to the nodes, applies the required MATLAB routines to each image tile locally, collects the classification outputs, and stitches them together to create the final classification map. Fig. 3 shows the computational infrastructure used to analyze the whole-slide images in parallel.
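The distribute, classify, collect, and stitch pattern described above can be sketched in a few lines. This is not the in-house software: it uses a local thread pool as a stand-in for cluster nodes, and `classify_tile` is a hypothetical toy placeholder (a mean-intensity threshold) standing in for the MATLAB routines, whose details are not given here.

```python
from concurrent.futures import ThreadPoolExecutor


def classify_tile(tile):
    """Hypothetical placeholder for the per-tile routine.

    Labels a tile by thresholding its mean intensity; the real system
    would run the texture-based classifier instead.
    """
    flat = [v for row in tile for v in row]
    return "stroma-rich" if sum(flat) / len(flat) > 0.5 else "stroma-poor"


def classify_slide(tiles, workers=4):
    """Classify tiles in parallel and stitch the labels into a 2-D map.

    `tiles` maps (row, col) grid positions to tile pixel data.
    """
    positions = sorted(tiles)
    # Each tile is independent, so the work can be spread across workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        labels = list(pool.map(classify_tile, (tiles[p] for p in positions)))
    n_rows = max(r for r, _ in positions) + 1
    n_cols = max(c for _, c in positions) + 1
    # Stitch: place each label back at its grid position.
    label_map = [[None] * n_cols for _ in range(n_rows)]
    for (r, c), label in zip(positions, labels):
        label_map[r][c] = label
    return label_map
```

Because `pool.map` returns results in input order, the labels can be paired back with their grid positions without extra bookkeeping. On a real cluster the pool would be replaced by whatever job distribution mechanism is available.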

An additional extension of this software is a grid-based infrastructure where image analysis modules (i.e., MATLAB or C/C++ files) could be uploaded to the system by multiple developers and pipelined to be applied to the available whole-slide images stored in the common repository using a grid interface [17].

Classification of stromal development using texture information

We formulate the classification of digitized microscopy images of NB as a statistical pattern recognition problem [18]. The flowchart of the proposed whole-slide image analysis system is given in Fig. 4. Our system decomposes each whole-slide image into smaller image tiles and each image is processed independently. Given an image tile, we first check whether it contains sufficient amount of tissue component. To test this, we count the number of white pixels. We considered any pixel with

Experimental results

Our training dataset consists of 500 image tiles cropped from two whole-slide samples associated with a stroma-rich and a stroma-poor subtype, respectively. During the training step, we first applied the multi-resolution decomposition to the training images using the Gaussian pyramid approach. This is followed by color space conversion and texture feature construction at each resolution level, as shown in Fig. 7. Finally, we applied feature selection and stored the corresponding feature space

Conclusions

The evaluation of the entire tissue sample is critically important for NB prognosis where the stromal development is determined as the ratio of stroma-poor and stroma-rich regions. For practical reasons, pathologists typically examine representative regions in each slide before they come up with a prognostic decision (i.e., sampling). However, sampling may lead to prognostic errors, particularly in tumors that show heterogeneity. Since the computer analyzes the whole slide, it can potentially

Acknowledgments

This work is supported in part by the young investigator award from the Children's Neuroblastoma Cancer Foundation, the US National Science Foundation (#CNS-0643969, #CNS-0403342), and the NIH NIBIB BISTI (#P20EB000591).

References (28)

  • W. Chen et al., Image mining for investigative pathology using optimized feature extraction and data fusion, Comput. Methods Programs Biomedicine (2005)
  • P. Pudil et al., Floating search methods in feature selection, Pattern Recognition Lett. (1994)
  • American Cancer Society 〈http://www.cancer.org〉...
  • H. Shimada et al., The international neuroblastoma pathology classification (the Shimada system), Cancer (1999)
  • H. Shimada et al., Terminology and morphologic criteria of neuroblastic tumors: recommendation by the international neuroblastoma pathology committee, Cancer (1999)
  • L.A. Teot et al., The problem and promise of central pathology review: development of a standardized procedure for the children's oncology group, Pediatr. Dev. Pathol. (2007)
  • L. Burhenne et al., Potential contribution of computer-aided detection to the sensitivity of screening mammography, Radiology (2000)
  • M.N. Gurcan et al., Lung nodule detection on thoracic computed tomography images: preliminary evaluation of a computer-aided diagnosis system, Med. Phys. (2002)
  • D. Comaniciu et al., Image-guided decision support system for pathology, Mach. Vision Appl. (1999)
  • S. Cross et al., Image analysis of low magnification images of fine needle aspirates of the breast produces useful discrimination between benign and malignant cases, Cytopathology (1997)
  • S. Petushi et al., Large-scale computations on histology images reveal grade-differentiating parameters for breast cancer, BMC Med. Imaging (2006)
  • K. Jafari-Khouzani et al., Multiwavelet grading of pathological images of prostate, IEEE Trans. Biomedical Eng. (2003)
  • S. Doyle, M. Hwang, K. Shah, A. Madabhushi, M. Feldman, J. Tomaszewski, Automated grading of prostate cancer using...
  • A. Tabesh et al., Multifeature prostate cancer diagnosis and Gleason grading of histological images, IEEE Trans. Med. Imaging (2007)

About the Author—OLCAY SERTEL received his BS from Yildiz Technical University, Istanbul, Turkey, and his MSc from Yeditepe University, Istanbul, Turkey, in 2004 and 2006, respectively, both in Computer Engineering. He is currently a PhD student at the Department of Electrical and Computer Engineering and a Graduate Research Associate at the Department of Biomedical Informatics at The Ohio State University, Columbus, OH. His research interests include computer vision, image processing and pattern recognition with applications in medicine.

About the Author—JUN KONG received his BS in Information and Control and his MS in Electrical Engineering from Shanghai Jiao Tong University, Shanghai, China, in 2001 and 2004, respectively. He is currently a PhD student in the Department of Electrical and Computer Engineering at The Ohio State University, Columbus, OH. His research interests include computer vision, machine learning, and pathological image analysis.

About the Author—HIROYUKI SHIMADA is currently an Associate Professor of clinical in Children's Hospital Los Angeles and Department of Pathology and Laboratory Medicine at The University of Southern California. He is the Pathologist-of-Record for the Children's Cancer Group (CCG) Neuroblastoma Studies. He is also a core member of the International Neuroblastoma Pathology Committee and chairs its workshops and other activities. In 1999 this Committee established the International Neuroblastoma Pathology Classification (the Shimada System) by adopting his original classification published in 1984. His research interests include investigating morphological characteristics of pediatric tumors.

About the Author—UMIT V. CATALYUREK is an Associate Professor in the Department of Biomedical Informatics at The Ohio State University. His research interests include graph and hypergraph partitioning algorithms, grid computing, and runtime systems and algorithms for high-performance and data-intensive computing. He received his PhD, MS and BS in computer engineering and information science from Bilkent University, Turkey, in 2000, 1994 and 1992, respectively.

About the Author—JOEL H. SALTZ is Professor and Chair of the Department of Biomedical Informatics, Professor in the Department of Computer Science and Engineering at The Ohio State University (OSU), Davis Endowed Chair of Cancer at OSU, and a Senior Fellow of the Ohio Supercomputer Center. He received his MD-PhD degrees in Computer Science at Duke University. Dr. Saltz has developed a rich set of middleware optimization and runtime compilation methods that target irregular, adaptive, and multi-resolution applications. He is also heavily involved in the development of ambitious biomedical applications for high-end computers, very large-scale storage systems, and grid environments. He has played a pioneering role in the development of pathology virtual slide technology and has made major contributions to informatics applications that support point-of-care testing.

About the Author—METIN N. GURCAN is an Assistant Professor at the Department of Biomedical Informatics at The Ohio State University (OSU). He received his MSc degree in Digital Systems Engineering from UMIST, England and his BSc and PhD degrees in Electrical and Electronics Engineering from Bilkent University, Turkey. He was a postdoctoral research fellow and later a research investigator in the Department of Radiology at the University of Michigan, Ann Arbor. Prior to joining OSU in October 2005, he worked as a senior researcher and product director at a high-tech company, specializing in computer-aided detection and diagnosis of cancer from radiological images. Dr. Gurcan's research interests include image analysis and understanding, and computer vision with applications to medicine. He is the recipient of the young investigator award from the Children's Neuroblastoma Cancer Foundation, is the author of over 50 peer-reviewed publications, and holds a patent in computer-aided diagnosis in volumetric imagery.