Automatic cervical cell segmentation and classification in Pap smears
Introduction
Cervical cancer is the fourth leading cause of cancer death in women worldwide [1]. The prognosis depends on the stage of the cancer at the time of detection, and the disease can be cured if it is diagnosed at the pre-cancerous lesion stage or earlier. The Papanicolaou test, or Pap test, is a screening technique widely used to prevent cervical cancer by finding cells that have the potential to turn cancerous. It was estimated in 2006 that systematic screening can reduce mortality rates from cervical cancer by 70% or more [2].
In Thailand, cervical cancer is the second most common cancer among women [3], with a high mortality rate of nearly 300,000 per year. The screening program administered by the Ministry of Public Health and the National Health Security Office recommends that women aged 35–60 years undergo a Pap smear examination every five years [4]. With an overall population of 67 million in Thailand [5], the number of samples to be examined per year is large compared with the number of cytologists available to perform the screening. Automatic cervical cell classification has been studied for over 40 years to cope with the labor-intensive task of manual screening and to reduce screening errors. A number of commercial automated screening systems have been approved by the FDA for quality control, including PAPNET (Neuromedical Systems Inc.), FocalPoint Slide Profiler™ (formerly AUTOPAP; BD TriPath), ThinPrep Pap Test, ThinPrep Imaging System (Hologic Inc.) and Imager™ (Cytyc). Several studies have shown that these automated systems do improve the accuracy of the screening result and reduce the false-negative rate [6], [7], [8]. However, cost effectiveness is a major drawback of these systems; the cost of a PAPNET test, for example, far exceeds that of manual screening.
Uncertainty in diagnostic capability was also reported [9]. It is therefore suggested that automated systems be used as an aid in conjunction with the expert's opinion rather than as a primary screening and diagnostic tool [10], [11]. Over the past few years, the trend of research in automated cervical cancer screening has shifted from cytology screening to histology images [12], [13] and colposcopic images [14], [15]. Histology images are used not only in cervical cancer screening but also in screening for other kinds of cancer [16], [17], [18]. However, cytology screening is still the default screening method in most countries because of its relatively low cost and its effectiveness in cervical cancer prevention when screening is performed routinely.
The screening process normally starts with gathering cervical cell samples from the uterine cervix and mounting them on a glass slide. The collected sample is visually inspected under a microscope to identify target cells or to classify each cell into categories. The basic characteristics used to classify the stage of a cell are mainly those of the nucleus and cytoplasm, such as shape, size, texture, and the ratio of nucleus to cytoplasm. From an image processing point of view, the first step in extracting information from cell components is to correctly identify the region of each component (nucleus, cytoplasm, and non-cell components) through segmentation. Several research works address nucleus segmentation [19], [20], [21], [22]. However, classifying cervical cells into categories using nucleus information alone might not yield good performance. Hence, segmenting the whole cell is more desirable [23], [24], [25], [26], although no classification results are reported in these works. After the segmentation step, each cell is classified either by specific classifiers based on features extracted from the cell components mentioned above, or by filters that discriminate classes without a feature extraction process [27]. However, the classification performance reported in Ref. [27] is not very high.
In this research, we propose a method for automatic segmentation and classification of cervical cell images. In the segmentation process, we use a patch-based fuzzy C-means (FCM) clustering technique. A cell image is segmented into nucleus, cytoplasm, and background using the over-segment FCM technique. For comparison, we also use the hard C-means clustering technique and the watershed segmentation technique in the segmentation step. Features extracted from the segmented image are then used as the input to classifiers. Five classifiers are considered: Bayesian classifier, linear discriminant analysis, K-nearest neighbor, artificial neural network, and support vector machine. The usefulness of features based on the nucleus alone and on the entire cell is also investigated.
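As a rough illustration of how features can be read off a segmented image, the sketch below computes region areas and a nucleus-to-cytoplasm ratio from a label map. The label encoding (0 = background, 1 = cytoplasm, 2 = nucleus), the function name `area_features`, and the definition of the ratio as nucleus area divided by cytoplasm area are assumptions made for this example; the paper's actual feature set is not reproduced here.

```python
def area_features(label_map, nucleus=2, cytoplasm=1):
    """Compute simple size features from a segmented cell image.

    label_map is a list of rows of integer labels. The encoding
    (0 = background, 1 = cytoplasm, 2 = nucleus) is an assumption
    of this sketch, not something specified by the paper.
    """
    nucleus_area = sum(row.count(nucleus) for row in label_map)
    cytoplasm_area = sum(row.count(cytoplasm) for row in label_map)
    # Ratio defined here as nucleus area / cytoplasm area (assumed definition).
    if cytoplasm_area:
        ratio = nucleus_area / cytoplasm_area
    else:
        ratio = float("inf")
    return {
        "nucleus_area": nucleus_area,
        "cytoplasm_area": cytoplasm_area,
        "nc_ratio": ratio,
    }


# Toy 5x5 segmentation: one nucleus pixel surrounded by cytoplasm.
toy_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 2, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toy_features = area_features(toy_map)
print(toy_features)
```

Shape and texture features would need the region contours and the underlying intensities as well, which this sketch leaves out.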
This paper is organized as follows. The following section describes the basic knowledge of the segmentation technique, mathematical morphology, feature extraction, and classifiers, followed by the details of the cell datasets used in this research. Section 3 provides the experimental results and discussion. The conclusion is drawn in Section 4.
Section snippets
Cervical cell segmentation using patch-based fuzzy C-means clustering
We previously developed the patch-based fuzzy C-means (FCM) clustering method to segment the nuclei and cytoplasm of white blood cells [28]. It was later applied to segment the nuclei of cervical cells from the conventional Pap test [29]. FCM is well suited to clustering data with uncertainty; we therefore chose it to cluster the uncertain intensity data of cell images. In this research, the segmentation method is tested with cervical cells not only from the conventional Pap test but also from the ThinPrep® Pap test.
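To make the clustering step concrete, here is a minimal sketch of the standard fuzzy C-means iteration applied to one-dimensional intensity values. It is plain FCM, not the patch-based or over-segment variant used in this work, whose details are not given in this excerpt. The function names, the evenly spaced initial centres, the fuzzifier m = 2, and the toy intensities are all assumptions of the example.

```python
def fuzzy_c_means(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy C-means on 1-D data (e.g. grey-level intensities).

    Returns (centers, memberships); memberships[i][k] is the degree to
    which values[i] belongs to cluster k, and each row sums to 1.
    """
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centres spread evenly over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # A point lying exactly on a centre belongs fully to it.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((dk / dj) ** exponent for dj in dists)
                       for dk in dists]
            memberships.append(row)
        # Centre update: mean of the data weighted by u_ik ** m.
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def hard_labels(memberships):
    """Defuzzify by assigning each sample to its highest-membership cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Toy intensities: dark (nucleus-like), mid (cytoplasm-like), bright (background-like).
intensities = [40, 42, 45, 48, 120, 125, 128, 130, 220, 225, 228, 230]
centers, memberships = fuzzy_c_means(intensities, n_clusters=3)
labels = hard_labels(memberships)
```

Unlike hard C-means, every pixel keeps a graded membership in all clusters until the final defuzzification step, which is the property that makes FCM attractive for ambiguous intensity data.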
Results and discussion
To assess how well the classification results generalize, leave-one-out cross validation (LOOCV) was applied throughout the experiments. All results shown in this section are for the validation data in the LOOCV.
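The LOOCV protocol can be sketched as follows: each sample is held out once, a classifier is built from the remaining samples, and the held-out sample is used for validation. The 1-nearest-neighbor rule, the function names, and the toy feature vectors below are assumptions chosen to keep the example self-contained; they do not reproduce the classifiers or data of the experiments.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Predict the label of the training sample closest to query (1-NN)."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    best = min(range(len(train_x)), key=squared_distance)
    return train_y[best]


def leave_one_out_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Hold out each sample in turn and validate on it."""
    correct = 0
    for i in range(len(features)):
        # Training set is everything except sample i.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical two-feature samples (e.g. a size ratio and an intensity feature).
toy_features = [
    [0.10, 10.0], [0.12, 11.0], [0.11, 9.0],
    [0.60, 30.0], [0.65, 32.0], [0.62, 29.0],
]
toy_labels = ["normal"] * 3 + ["abnormal"] * 3
accuracy = leave_one_out_accuracy(toy_features, toy_labels)
```

Because every sample serves as validation data exactly once, the reported accuracy uses all of the data without ever testing on a sample the classifier was trained on.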
Conclusions
This research proposes a method of automatic cervical cell image segmentation and classification. We used the over-segment fuzzy C-means clustering technique to segment each cell into 2 or 3 regions. Three Pap smear datasets, i.e., ERUDIT, LCH, and Herlev, were tested. The 4-class problem consists of 4 cell classes, i.e., normal, low grade squamous intraepithelial lesion (LSIL), high grade squamous intraepithelial lesion (HSIL), and squamous cell carcinoma (SCC). When the last 3 classes are
Conflict of interest statement
The authors declare no conflict of interest.
Acknowledgements
Financial support from the Thailand Research Fund through the Royal Golden Jubilee Ph.D. Program (Grant No. PHD/0238/2550) to Thanatip Chankong and Nipon Theera-Umpon is acknowledged. We thank Dr. Taweethong Koanantakool, Department of Medical Services, Ministry of Public Health, Thailand, for introducing us to the cervical cancer classification problem. We would like to thank Dr. Jan Jantzen for providing the ERUDIT and Herlev Pap smear datasets. We are thankful to Lampang Cancer Hospital,
References (47)
- et al., Is PAPNET suitable for primary screening?, Lancet (1999)
- Evaluation of automated systems for the primary screening of cervical smears, Current Diagnostic Pathology (1998)
- et al., A survey on histological image analysis-based assessment of three major biological factors influencing radiotherapy: proliferation, hypoxia, and vasculature, Computer Methods and Programs in Biomedicine (2004)
- et al., Histology image analysis for carcinoma detection and grading, Computer Methods and Programs in Biomedicine (2012)
- et al., Computer-aided evaluation of neuroblastoma on whole-slide histology images: classifying grade of neuroblastic differentiation, Pattern Recognition (2009)
- et al., Segmentation of cervical cell nuclei in high-resolution microscopic images: a new algorithm and a web-based software framework, Computer Methods and Programs in Biomedicine (2012)
- et al., Debris removal in Pap-smear images, Computer Methods and Programs in Biomedicine (2013)
- et al., Combining shape, texture and intensity features for cell nuclei extraction in Pap smear images, Pattern Recognition Letters (2011)
- et al., Nucleus and cytoplast contour detector of cervical smear image, Pattern Recognition Letters (2008)
- et al., Cytoplasm and nucleus segmentation in cervical smear images using radiating GVF snake, Pattern Recognition (2012)