
© 2013 IEEE


SECTION I

Pathology is the microscopic study of cell morphology, supplemented with in situ molecular information. The tissue sample is removed from the body and prepared for viewing under the microscope by placing it in a fixative, which stabilizes the tissue to prevent decay. Staining techniques are then applied, each dyeing specific tissue components so that they become visible under the microscope.

The pathologist plays a central role in therapeutic decision making [1], [2]. Accordingly, diagnosis from pathology images remains the “gold standard” in diagnosing a number of diseases including most cancers [3]. Diagnosing a disease by manually analyzing numerous biopsy slides is labor-intensive work for pathologists. Thanks to recent advances in digital pathology, the automated recognition of pathology patterns in a high-content whole slide image (WSI) has the potential to provide valuable assistance to the pathologist in daily practice.

Researchers in pathology have long recognized the importance of quantitative analysis of pathological images. Quantitative analysis can be used to support pathologists' decisions about the presence or absence of a disease, and also to help in evaluating disease progression. In addition, quantitative characterization is important not only for clinical usage (e.g., to increase the diagnostic reliability), but also for research applications (e.g., drug discovery [4] and biological mechanisms of disease [5]). As a consequence, the use of computer-aided diagnosis (CAD) in pathology can substantially enhance the efficiency and accuracy of pathologists' decisions, and overall benefit the patient.

The earliest works in the field date back to the early 1990s [6]–[8] but are relatively few in number, presumably due to the limited penetration of digital equipment in pathology. Thanks to recent advances in digital pathology, numerous cancer detection and grading applications have been proposed, including brain [9]–[11], breast [12]–[21], cervix [22], [23], liver [24], lung [25], and prostate [26]–[28] cancer grading.

Among the various studies, automated nuclei segmentation and classification is a recurring task that is particularly difficult on histopathology images. In cytopathology images, the detection and segmentation of nuclei are generally easier because the nuclei are well separated and complicated tissue structures are absent. In contrast, the segmentation of nuclei in histopathological images (tissue preserving its original structure) is more difficult since most of the nuclei are part of histological structures presenting complex and irregular visual aspects.

A review on automated cancer diagnosis based on histopathological images [29], [30] and a review on histopathological image analysis [5] already exist in the literature, addressing different types of problems associated with different image modalities. This paper is intended as a comprehensive state-of-the-art survey on the particular issues of nuclei detection, segmentation, and classification methods restricted to two widely available types of image modalities: hematoxylin-eosin (H&E) and immunohistochemical (IHC). A list of symbols and notation commonly used in this paper is shown in Table I.

This paper is organized in six sections. Section II introduces the different image modalities in histopathology. In Section III, we highlight the challenges in nuclei detection, segmentation, and classification. Section IV illustrates the recent advances in nuclei detection, segmentation, and classification methods used in histopathology. Section V addresses the challenges in nuclei detection, segmentation, and classification, and suggests ways to overcome them. We conclude with a discussion, pointing to future research directions and open problems related to nuclei detection, segmentation, and classification.

SECTION II

Digital pathology is the microscopic investigation of a biopsy or surgical specimen that is chemically processed and sectioned onto glass slides to study cancer expression, genetic progression, and cellular morphology for cancer diagnosis and prognosis. For tissue components visualization under a microscope, the sections are dyed with one or more stains.

H&E staining is a widespread staining protocol in pathology. It has been used by pathologists for over a hundred years [31] and is still widely used for observing morphological features of the tissue under white light microscopes. Hematoxylin stains nuclei dark blue, while eosin stains other structures (cytoplasm, stroma, etc.) pink [see Fig. 1(a)]. Nuclei can exhibit a wide variety of patterns (related to the distribution of chromatin, prominent nucleolus) that are diagnostically significant.

IHC is a technique used for diagnosing whether a cancer is benign or malignant and for determining the stage of a tumor. By revealing the presence or absence of specific proteins in the observed tissue, IHC helps in determining which cell type is at the origin of a tumor. According to the proteins revealed by IHC, specific therapeutic treatments adapted for this type of cancer are selected. Fig. 1(b) shows an example of IHC under light microscopy.

After staining, fast slide scanners are used to generate digital images that contain relevant information about the specimen at a microscopic level. They carry one or more lenses to magnify the sample and capture digital images with a camera. They are capable of digitizing complete slides, usually at $\times \hbox{20}$ or $\times \hbox{40}$ magnification. The output of the digital scanners is a set of multilayered images, stored in a format that enables fast zooming and panning.

For illumination, a uniform light spectrum is used to light the tissue slide. The microscope setup, sample thickness, appearance, and staining may cause uneven illumination. In addition, most camera technologies have a low response to short wavelength (blue) illumination and a high sensitivity in long wavelength (red to infrared) regions. To reduce these differences in illumination, most slide scanners provide standard packages to normalize and correct spectral and spatial illumination variations. To address the problem of color nonstandardness, Monaco et al. [32] presented a robust Bayesian color segmentation algorithm that dynamically estimates the probability density functions describing the color and spatial properties of salient objects.

SECTION III

Among the different types of nuclei, two types are usually the object of particular interest: lymphocyte and epithelial nuclei. Nuclei may look very different according to a number of factors such as nuclei type, malignancy of the disease, and nuclei life cycle. A lymphocyte is a type of white blood cell that has a major role in the immune system. Lymphocyte nuclei (LN) are inflammatory nuclei having a regular shape and a smaller size than epithelial nuclei (EN) [see Fig. 2(a)]. Nonpathological EN have a nearly uniform chromatin distribution with a smooth boundary [see Fig. 2(b)]. In high-grade cancer tissue, EN are larger, may have heterogeneous chromatin distribution and irregular boundaries (referred to as nuclear pleomorphism), and have clearly visible nucleoli as compared to normal EN [see Fig. 2(c)]. The variation in nuclei shape, size, and texture during the nuclei life cycle, as seen in mitotic nuclei (MN), is another factor of complexity [see Fig. 2(d)].

Automated nuclei segmentation is now a well-studied topic for which a large number of methods have been described in the literature and new methodologies continue to be investigated. Detection, segmentation, and classification of nuclei in routinely stained histopathological images pose a difficult computer vision problem due to high variability in images caused by a number of factors including differences in slide preparation (dyes concentration, evenness of the cut, presence of foreign artifacts or damage to the tissue sample, etc.) and image acquisition (artifacts introduced by the compression of the image, presence of digital noise, specific features of the slide scanner, etc.). Furthermore, nuclei are often organized in overlapping clusters and have heterogeneous aspects. All these problems (highlighted in Fig. 3) make the nuclei detection, segmentation, and classification a challenging problem. A successful image processing approach will have to overcome these issues in a robust way in order to maintain a high level in the quality and accuracy in all situations.

SECTION IV

Nuclei detection and segmentation are important steps in cancer diagnosis and grading. The appearance of nuclei is critical for evaluating the existence of disease and its severity. For example, infiltration of LN in breast cancer is related to patient survival and outcome [33]. Similarly, nuclei pleomorphism has diagnostic value for cancer grading [34]–[36]. Furthermore, mitotic count is also an important prognostic parameter in breast cancer grading [34]. In Section IV-A, we introduce the most commonly used image processing methods. Numerous works, described in Sections IV-B, IV-C, IV-D, and IV-E, use one or a combination of these image processing methods for preprocessing, detection, segmentation, and separation, respectively.

We begin with basic definitions. An image $I$ is a function $$I: {\cal U} \longrightarrow \left[0, 1 \right]^{c} \eqno{\hbox{(1)}}$$ where ${\cal U} = [[0;m-1]] \times [[0;n-1]]$ is the set of pixels, $m$ and $n$ are the numbers of rows and columns, and $c$ is the number of channels (also called colors), usually $c \in \{1,3\}$. $I(i)$ is the value of pixel $i$ in the image $I$, where $i \in {\cal U}$. A part of image $I$, denoted $I_j$, is a restriction of $I$ to a connected subset of pixels.

Thresholding converts an intensity image $I$ into a binary image ${I}^{\prime }$ by assigning each pixel the value one if its intensity is at or above some threshold $T$ and zero otherwise. The threshold $T$ can be global or local. If $T$ is a global threshold, then the binary image ${I}^{\prime }$ of $I$ is $${I}^{\prime } (i) = \left\{\matrix{1, \hfill & \hbox{if }\, I(i) \ge T\hfill \cr 0, \hfill & \hbox{otherwise.}\hfill}\right. \eqno{\hbox{(2)}}$$

A threshold value can be estimated using computational methods like the Otsu method which determines an optimal threshold by minimizing the intraclass variance [37]. Another thresholding technique is local (adaptive) thresholding that handles nonuniform illumination. It can be determined by either splitting an image into subimages and calculating thresholds for each subimage or examining the image intensity in the pixel's neighborhood [38].
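As an illustration of global thresholding with an automatically selected $T$, the following is a minimal sketch of the Otsu criterion. It assumes integer gray levels in `[0, levels)` rather than the $[0,1]$ range used in (1), and it maximizes the between-class variance, which is equivalent to minimizing the intraclass variance. The function names `otsu_threshold` and `binarize` are illustrative and not taken from [37].

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level T that maximizes the between-class variance.

    `pixels` is a flat list of integers in [0, levels). Pixels with value < T
    form class 0; pixels with value >= T form class 1, matching (2).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * h for g, h in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels strictly below the candidate threshold
    sum0 = 0  # sum of their intensities
    for t in range(1, levels):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_between:
            best_between, best_t = between, t
    return best_t


def binarize(pixels, T):
    """Apply the global threshold of (2) to a flat list of pixels."""
    return [1 if p >= T else 0 for p in pixels]
```

For a clearly bimodal input such as `[10, 12, 11, 200, 205, 198]`, any threshold between the two modes separates them; this sketch returns the smallest such level.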

Morphology is a set-theoretic approach that considers an image as the elements of a set [39] and processes images as geometrical shapes [40]. The basic idea is to probe an image $I$ with a simple, predefined shape, drawing conclusions on how this shape fits or misses the shapes in the image. This simple probe is called the structuring element and is a subset of the image. The typically used binary structuring elements are crosses, squares, and open disks.

The two basic morphological operators are the erosion $\ominus$ and the dilation $\oplus$. Let $I:{\cal U} \longrightarrow \{0,1\}$ be a binary image and ${\cal U}_f = I^{-1}(\{1\})$ be the set of foreground pixels. The erosion and dilation of the binary image $I$ by the structuring element $S \subset \mathbb{Z} \times \mathbb{Z}$ are defined as $$\eqalignno{\hbox{Erosion: } {\cal U}_f \ominus S &= \{x \vert \forall s \in S, x+s \in {\cal U}_f\} \cr \hbox{Dilation: } {\cal U}_f \oplus S &= \{x + s \vert x \in {\cal U}_f \wedge s \in S \} . &\hbox{(3)}}$$

The basic effect of the erosion (dilation) operator on an image is to shrink (enlarge) the boundaries of foreground pixels. Two other major operations in morphology are opening $\circ$ and closing $\bullet$. Opening is an erosion of an image followed by a dilation; it eliminates small objects and sharpens peaks in the object. Opening is mathematically defined as $${\cal U}_f \circ S = [{\cal U}_f \ominus S] \oplus S. \eqno{\hbox{(4)}}$$

Closing is a dilation of an image followed by an erosion; it fuses narrow breaks and fills small holes and gaps in the image. Closing is mathematically defined as $${\cal U}_f \bullet S = [{\cal U}_f \oplus S] \ominus S. \eqno{\hbox{(5)}}$$

White and black top-hat transforms are two other operations derived from morphology. They extract small elements and details from given images. The white top-hat transform is defined as the difference between image $I$ and its opening as $$T_w(I) = {\cal U}_f - [{\cal U}_f \circ S]. \eqno{\hbox{(6)}}$$

The black top-hat transform is defined as the difference between the closing of image $I$ and the image itself as $$T_b(I) = [{\cal U}_f \bullet S] - {\cal U}_f. \eqno{\hbox{(7)}}$$

In addition, the morphological gradient, which is the difference between the dilation and the erosion of a given image, is useful for edge detection. It is defined as $$G(I) = [{\cal U}_f \oplus S] - [{\cal U}_f \ominus S]. \eqno{\hbox{(8)}}$$
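The set-theoretic definitions above translate almost directly into operations on sets of pixel coordinates. The sketch below is one possible illustration under stated assumptions: the foreground ${\cal U}_f$ and the structuring element $S$ are Python sets of `(row, col)` tuples, results are not clipped to the image domain, the black top-hat is taken as closing minus image (the usual convention), and the gradient as dilation minus erosion. All function names are illustrative.

```python
def erode(fg, S):
    """Erosion: all x such that x + s is foreground for every s in S."""
    s0 = next(iter(S))
    # Any valid x must satisfy x + s0 in fg, which bounds the candidates.
    candidates = {(p[0] - s0[0], p[1] - s0[1]) for p in fg}
    return {x for x in candidates
            if all((x[0] + s[0], x[1] + s[1]) in fg for s in S)}


def dilate(fg, S):
    """Dilation: all x + s with x in the foreground and s in S."""
    return {(x[0] + s[0], x[1] + s[1]) for x in fg for s in S}


def opening(fg, S):
    return dilate(erode(fg, S), S)


def closing(fg, S):
    return erode(dilate(fg, S), S)


def white_top_hat(fg, S):
    return fg - opening(fg, S)      # set difference


def black_top_hat(fg, S):
    return closing(fg, S) - fg


def gradient(fg, S):
    return dilate(fg, S) - erode(fg, S)


# A 3x3 cross, one of the typical structuring elements mentioned above.
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}
```

With a 3x3 foreground square plus one isolated pixel, erosion by `CROSS` keeps only the center of the square, so the opening removes the isolated pixel and the white top-hat recovers it.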

Region growing [41] is an image segmentation method consisting of two steps. The first step is the selection of seed points, and the second step is a classification of neighboring pixels to determine, by minimizing a cost function, whether those pixels should be added to the region. Let $Pr(I_i)$ be a logical predicate that measures the similarity of a region $I_i$. The segmentation results in a partition of $I$ into regions ($I_1,I_2, \dots, I_n$), so that the following conditions hold:

- $Pr(I_i) =$ TRUE for all $i = 1, 2, \dots, n$;
- $Pr(I_i \cup I_j)=$ FALSE, $\forall I_i, I_j (i\ne j)$ adjacent regions.

The $Pr$ that are often used are gray level (average intensity and variance), color, texture, and shape related.
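A minimal sketch of the two steps is given below for a single seed. It assumes a gray-level predicate (a pixel joins the region when its intensity is within `tol` of the current region mean), 4-connectivity, and an image stored as a list of rows; these choices and the name `region_grow` are illustrative assumptions, not the specific method of [41].

```python
from collections import deque


def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed` using a mean-intensity predicate."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = image[seed[0]][seed[1]]   # running sum of region intensities
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region:
                continue
            mean = total / len(region)
            # Predicate: the candidate must be similar to the region so far.
            if abs(image[nr][nc] - mean) <= tol:
                region.add((nr, nc))
                total += image[nr][nc]
                frontier.append((nr, nc))
    return region
```

Because the predicate is evaluated against a running mean, the result can depend on the order in which neighbors are visited; a full implementation would also iterate over several seeds to obtain the partition described above.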

Watershed is a segmentation method that usually starts from specific pixels called markers and gradually floods the surrounding regions of the markers, called catchment basins, by treating pixel values as a local topography. Catchment basins are separated topographically from adjacent catchment basins by maximum altitude lines called watershed lines. The method classifies every point of a topographic surface as belonging either to the catchment basin associated with one of the local minima or to the watershed line. Details about watershed can be found in [42]. The basic mathematical definition contains the lower slope $\hbox{LS}(i)$, which is the maximum slope connecting pixel $i$ in the image $I$ to its neighbors of lower altitude, as $$\hbox{LS}(i) = \max_{j \in N(i)} \left({I(i) - I(j)\over d(i,j)} \right) \eqno{\hbox{(9)}}$$ where $N(i)$ is the set of neighbors of pixel $i$ and $d(i,j)$ is the Euclidean distance between pixels $i$ and $j$. In case of $i=j$, the lower slope is forced to be zero. The cost of moving from pixel $i$ to $j$ is defined as $$\hbox{cost}(i,j) = \left\{\matrix{\hbox{LS}(i) \cdot d(i,j), \hfill & \hbox{if } I(i) > I(j)\hfill \cr \hbox{LS}(j) \cdot d(i,j), \hfill & \hbox{if } I(i) < I(j) \hfill \cr {1\over 2} (\hbox{LS}(i) + \hbox{LS}(j)) \cdot d(i,j), \hfill & \hbox{if } I(i) = I(j).\hfill } \right. \eqno{\hbox{(10)}}$$

The topographical distance between the two pixels $i$ and $j$ is expressed as $$\min_{(i_0, \dots, i_t) \in \Pi } \sum_{k=0}^{t-1} d(i_k, i_{k+1}) \cdot \hbox{cost} \left(i_k, i_{k+1}\right) \eqno{\hbox{(11)}}$$ where $\Pi$ is the set of all paths from $i$ to $j$. The watershed transformation is usually computed on the gradient image instead of the intensity image.
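To make (9) and (10) concrete, the following sketch evaluates the lower slope and the step cost on a small image. It assumes 8-connected neighborhoods, an image stored as a list of rows, and a lower slope of zero when no neighbor has a lower altitude; the function names are illustrative, and the sketch covers only these two building blocks, not a full watershed transform.

```python
import math


def lower_slope(I, i):
    """LS(i) of (9): steepest descent from pixel i to an 8-connected neighbor."""
    r, c = i
    best = 0.0  # stays zero when i has no neighbor of lower altitude
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr or dc) and 0 <= nr < len(I) and 0 <= nc < len(I[0]):
                slope = (I[r][c] - I[nr][nc]) / math.hypot(dr, dc)
                best = max(best, slope)
    return best


def cost(I, i, j):
    """cost(i, j) of (10) for two neighboring pixels i and j."""
    d = math.hypot(i[0] - j[0], i[1] - j[1])
    a, b = I[i[0]][i[1]], I[j[0]][j[1]]
    if a > b:
        return lower_slope(I, i) * d
    if a < b:
        return lower_slope(I, j) * d
    return 0.5 * (lower_slope(I, i) + lower_slope(I, j)) * d
```

Note that the cost is symmetric in its two arguments, which is what allows (11) to be read as a distance along a path.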

Active contour models (ACMs) or deformable models, widely used in image segmentation, are deformable splines that can be used to depict the contour of objects in an image using gradient information by seeking to minimize an energy function [43]. In the case of nuclei segmentation, the contour points that yield the minimum energy level form the boundary of nuclei. The energy function is often defined to penalize discontinuity in the curve shape and gray-level discontinuity along the contour [12]. The general ACM is defined using the energy function $\mathbb{E}$ over the contour points $c$ as $$\mathbb{E} = \oint_{c} \left(\alpha \mathbb{E}_{\rm Int} (c) + \beta \mathbb{E}_{\rm Img} (c) + \gamma \mathbb{E}_{\rm Ext} (c) \right) dc \eqno{\hbox{(12)}}$$ where $\mathbb{E}_{\rm Int}$ controls the shape and length of the contour (often called internal energy), $\mathbb{E}_{\rm Img}$ influences adjustment of local parts of the contour to the image values regardless of the contour geometry (referred to as image energy), and $\mathbb{E}_{\rm Ext}$ is the user-defined force or prior knowledge of the object used to control the contour (referred to as external energy). $\alpha$, $\beta$, and $\gamma$ are empirically derived constants.

There are two main forms of ACMs. An explicit parametric representation of the contour, called snakes, is robust to image noise and boundary gaps as it constrains the extracted boundaries to be smooth. However, snakes have limited topological adaptability: they cannot easily handle the splitting or merging of contours. Alternatively, the implicit ACM, called level sets, is specifically designed to handle topological changes, but it is not robust to boundary gaps and has other deficiencies as well [44]. The basic idea of level sets is to determine level curves from a potential function.

The K-means clustering [45] is an iterative method used to partition an image into $K$ clusters. The basic algorithm is as follows.

- Pick $K$ cluster centers, either randomly or based on some heuristic.
- Assign cluster label to each pixel in the image that minimizes the distance between the pixel and the cluster center.
- Recompute the cluster centers by averaging all the pixels in the cluster.
- Repeat steps 2) and 3) until convergence is attained or no pixel changes its cluster.

The distance is typically based on the pixel value, texture, and location, or a weighted combination of these factors. The robustness of K-means depends mainly on the initialization of clusters.
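The four steps above can be sketched as follows for scalar pixel intensities. This is a minimal illustration that assumes the absolute intensity difference as the distance and takes the initial centers as an argument, so the result is deterministic; the name `kmeans_1d` is illustrative.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Cluster scalar pixel values into len(centers) groups."""
    centers = list(centers)              # step 1: initial centers supplied by caller
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each pixel to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        if new_labels == labels:         # step 4: stop when no pixel changes cluster
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:                  # keep the old center for an empty cluster
                centers[k] = sum(members) / len(members)
    return labels, centers
```

Supplying different initial centers can change the final partition, which is the sensitivity to initialization noted above.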

Probabilistic models can be viewed as an extension of K-means clustering. Gaussian mixture models (GMMs) are a popular parametric probabilistic model represented as a weighted sum of Gaussian cluster densities. The image is modeled according to the probability distribution $$P(I(i)) = \sum_{k=1}^K w_k\ {\cal N}(I(i) \vert \mu_k,\sigma^2_k) \eqno{\hbox{(13)}}$$ where $K$ is the number of clusters (objects in the image), and $\mu_k$, $\sigma^2_k$, and $w_k$ are the mean, variance, and weight of cluster $k$, respectively. The $w_k$ are positive real values such that $\sum_{k=1}^K w_k=1$.

The parameters of a GMM are estimated from training data using a computational method such as expectation maximization (EM) [46], which iteratively searches for the maximum likelihood estimate. EM is based on the following four steps.

- Initialization: The parameters $\mu^{(0)}_k$, $\sigma^{2^{(0)}}_k$, and $w^{(0)}_k$ are randomly initialized for each cluster $C_k$.
- Expectation: For each pixel $I(i)$ and cluster $C_k$, the conditional probability $P(C_k \vert I(i))$ is computed as $$P(C_k\vert I(i))^{(t)} = {w^{(t)}_k {\cal N}(I(i) \vert \mu^{(t)}_k, \sigma^{2^{(t)}}_k) \over \sum^K_{j=1} w^{(t)}_j {\cal N}(I(i) \vert \mu^{(t)}_j, \sigma^{2^{(t)}}_j) } . \eqno{\hbox{(14)}}$$
- Maximization: The parameters $\mu^{(t)}_k$, $\sigma^{2^{(t)}}_k$, and $w^{(t)}_k$ of each cluster $C_k$ are updated to maximize the likelihood, using all pixels and the probabilities $P(C_k \vert I(i))^{(t)}$ computed in the expectation step, as $$\eqalignno{\mu^{(t+1)}_k &= {\sum_{i \in {\cal U}} P(C_k \vert I(i))^{(t)} \cdot I(i)\over \sum_{i \in {\cal U}} P(C_k \vert I(i))^{(t)} } &\hbox{(15)}\cr \sigma^{2^{(t+1)}}_k &= {\sum_{i \in {\cal U}} P(C_k \vert I(i))^{(t)} \cdot (I(i) - \mu^{(t+1)}_k)^2\over \sum_{i \in {\cal U}} P(C_k \vert I(i))^{(t)}} \quad &\hbox{(16)}\cr w_k^{(t+1)} &= {\sum_{i \in {\cal U}} P(C_k \vert I(i))^{(t)}\over \vert {\cal U} \vert } &\hbox{(17)}}$$ where $\vert {\cal U} \vert$ is the total number of pixels in $I$.
- Termination: Steps 2) and 3) are repeated until parameters converge.
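The EM steps above can be sketched for a single-channel image as follows. This is a minimal illustration under stated assumptions: pixel values are scalars, the initial parameters are passed in rather than drawn at random (to keep the example reproducible), a fixed number of iterations replaces the convergence test, and a small variance floor `min_var` is added for numerical safety. The names `gauss` and `em_gmm_1d` are illustrative.

```python
import math


def gauss(x, mu, var):
    """Univariate normal density N(x | mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(x, mu, var, w, iters=50, min_var=1e-6):
    """Fit a K-component 1-D Gaussian mixture to the pixel values in x."""
    K = len(mu)
    mu, var, w = list(mu), list(var), list(w)
    for _ in range(iters):
        # Expectation step, as in (14): responsibilities P(C_k | I(i)).
        resp = []
        for xi in x:
            p = [w[k] * gauss(xi, mu[k], var[k]) for k in range(K)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # Maximization step, as in (15)-(17).
        for k in range(K):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * xi for r, xi in zip(resp, x)) / nk
            v = sum(r[k] * (xi - mu[k]) ** 2 for r, xi in zip(resp, x)) / nk
            var[k] = max(v, min_var)   # floor avoids a degenerate zero variance
            w[k] = nk / len(x)
    return mu, var, w
```

The sketch does not guard against a component that receives no responsibility at all; a production implementation would typically re-initialize such a component.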

Instead of pixel values, other features such as texture can be used. Carson et al. [47] described the use of a new set of texture features: polarity, anisotropy, and contrast. Polarity is a measure of a gradient vector for all neighborhood pixels, anisotropy is a ratio of the eigenvalues of the second moment matrix, and contrast is a measure of homogeneity of pixels.

Graph cuts (Gcuts) refers to a wide family of algorithms in which an image is conceptualized as a weighted undirected graph $G(V,E)$, with pixels as the nodes $V$ and with edges $E$ weighted by a similarity (affinity) measure between nodes, $w:V^2 \longrightarrow \mathbb{R}^+$. The similarity measure is computed from intensity, spatial distribution, or any other features of the two pixels. The Gcuts method partitions the graph into disjoint subgraphs so that similarity is high within the subgraphs and low across different subgraphs. The degree of dissimilarity between two subgraphs $A$ and $B$ can be computed as the sum of the weights of the edges that must be removed to separate $A(V_A, E_A)$ and $B(V_B, E_B)$. This total weight is called a cut $$\hbox{cut}(A,B) = \sum_{u \in V_A, v \in V_B} w(u,v). \eqno{\hbox{(18)}}$$

An intuitive way is to look for the minimum cut in the graph. However, the minimum cut criterion favors small isolated regions, which are not useful in finding large uniform regions. The normalized cut (Ncut) solves this problem by computing the cut cost as a fraction of the total edge connections to all the nodes in the graph. It is mathematically defined as $$\hbox{Ncut}(A,B) = {\hbox{cut}(A,B)\over \sum_{u\in V_A, t \in V} w(u,t)} + {\hbox{cut}(A,B)\over \sum_{v\in V_B, t \in V} w(v,t)} . \eqno{\hbox{(19)}}$$

The Ncut value is not small for a cut that isolates a few points, because the cut value is then a large percentage of the total connection from that small set to all other nodes. The basic procedure used to find the minimum Ncut is explained in [48].
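The cut of (18) and the normalized cut of (19) can be evaluated directly on a small graph. The sketch below assumes the graph is given as a dictionary mapping an unordered node pair to its weight, with missing pairs (including self-pairs) treated as weight zero; the helper names are illustrative. It only evaluates the criterion for a given partition and does not search for the minimizing partition described in [48].

```python
def weight(w, u, v):
    """Symmetric edge weight lookup; absent edges have weight 0."""
    return w.get((u, v), w.get((v, u), 0.0))


def cut(w, A, B):
    """cut(A, B) of (18): total weight of edges between node sets A and B."""
    return sum(weight(w, u, v) for u in A for v in B)


def ncut(w, A, B):
    """Ncut(A, B) of (19) for a bipartition of the node set V = A | B."""
    V = A | B
    c = cut(w, A, B)
    # cut(w, A, V) is the total connection from A to all nodes in the graph.
    return c / cut(w, A, V) + c / cut(w, B, V)


# Two tightly connected pairs joined by one weak edge.
EDGES = {(0, 1): 1.0, (1, 2): 0.1, (2, 3): 1.0}
```

On `EDGES`, splitting along the weak edge gives a much lower Ncut than isolating node 0, which illustrates why the normalization penalizes cuts around small sets.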

These image processing methods are extensively used in recently proposed frameworks for preprocessing, nuclei detection, segmentation, separation, and classification. Based on these image processing methods, we compiled a list of existing frameworks for nuclei detection, segmentation, separation, and classification in histopathology as shown in Table II. In the following sections, we discuss how different image processing methods have been used.

Preprocessing can be performed to compensate for adverse conditions such as the presence of batch effects. Batch effect refers to unevenness in illumination, color, or other image parameters recurring across multiple images. Noise reduction and artifacts elimination can also be performed prior to detection and segmentation. Additionally, region of interest (ROI) detection can also be performed in order to reduce the processing time.

The illumination can be corrected either by using white shading correction or by estimating the illumination pattern from a series of images. In white shading correction, a blank (empty) image is captured and used to correct images pixel by pixel [73]. A common equation is $$\hbox{Transmittance} = {\hbox{Specimen value} - \hbox{Background value}\over \hbox{White Reference value} - \hbox{Background value}} . \eqno{\hbox{(20)}}$$

A downside of this method is that a blank image must be acquired for each lens magnification whenever the microscope illumination settings are altered.
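Applied pixel by pixel, (20) can be sketched as below. The example assumes that the specimen, white reference, and background images are same-sized lists of rows of floats; the small `eps` guard against a zero denominator and the name `shading_correct` are additions for illustration and are not part of [73].

```python
def shading_correct(specimen, white, background, eps=1e-12):
    """Per-pixel transmittance of (20) for three same-sized images."""
    corrected = []
    for row_s, row_w, row_b in zip(specimen, white, background):
        corrected.append([
            (s - b) / max(wv - b, eps)   # eps avoids division by zero
            for s, wv, b in zip(row_s, row_w, row_b)
        ])
    return corrected
```

Two pixels that receive different illumination but have the same true transmittance should map to the same corrected value.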

An alternative normalization method is based upon the intrinsic properties of the image, which are revealed through Gaussian smoothing [74]. Another possible way is to estimate the background by exploiting the images of the specimen directly, even in the presence of the object [75], [76]. Can et al. [77] introduced a method to correct nonuniform illumination variation by modeling the observed image $I(i)$ as the product of the excitation pattern, $E(i)$, and the emission pattern, $M(i)$, as $$I(i) = E(i) \times M(i). \eqno{\hbox{(21)}}$$

While the emission pattern captures the tissue-dependent staining, the excitation pattern captures the illumination. From a set of $J$ images, $I_j(i)$ denotes an ordered set of pixels. Assuming that a certain percentage, $g$, of the image is formed from stained tissue (nonzero background), a trimmed average of the brightest pixels can be used to estimate the excitation pattern $$E^{{\prime}}_{{\rm AVE}}(i) = {1\over J-K+1} \sum^J_{j=K} I_j(i) \eqno{\hbox{(22)}}$$ where $K$ is set to the integer closest to $J(1-g)+1$.
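A minimal sketch of the trimmed average in (22) follows. It assumes that the ordering is ascending in intensity at each pixel location across the $J$ images (so that indices $K$ to $J$ select the brightest values), that $g$ is given as a fraction in $[0,1]$, and that each image is a flat list of pixel values; the clamping of $K$ to $[1, J]$ and the name `excitation_pattern` are illustrative additions.

```python
def excitation_pattern(images, g):
    """Estimate E'_AVE(i) of (22) from J flattened images of equal length."""
    J = len(images)
    K = int(round(J * (1 - g) + 1))   # integer closest to J(1 - g) + 1
    K = min(max(K, 1), J)             # clamp so that at least one value is averaged
    estimate = []
    for i in range(len(images[0])):
        ordered = sorted(img[i] for img in images)   # ascending: I_1(i) <= ... <= I_J(i)
        brightest = ordered[K - 1:]                  # the terms j = K, ..., J
        estimate.append(sum(brightest) / (J - K + 1))
    return estimate
```

With four images and `g = 0.5`, `K` is 3, so the two brightest values at each pixel are averaged.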

Many color normalization techniques have been proposed [78]–[81], including histogram or quantile normalization in which the distributions of the three color channels are normalized separately. Kothari et al. [81] used histogram-based normalization in histopathological images. They proposed a rank function which maps the intensity ranges across all pixels. Alternatively, Reinhard et al. [82] proposed a method for matching the color distribution of an image to that of a reference image by use of a linear transform in a perceptual color model (Lab color space). Magee et al. [83] extended Reinhard's normalization approach to multiple pixel classes by using a probabilistic (GMM) color segmentation method. It applies a separate linear normalization for each pixel class, where class membership is defined by a pixel being colored by a particular chemical stain or being uncolored, i.e., background.

In order to deal with stain colocalization, a very common phenomenon in histopathological images, color deconvolution is effective in separating the stains [84]. Ruifrok et al. [84] explain how virtually every set of three colors can be separated by color deconvolution and reconstructed for each stain separately. The method requires prior knowledge of the color vectors (RGB) of each specific stain. Later, Macenko et al. [80] proposed the automatic derivation of these color vectors, a method further refined by Niethammer et al. [85] and Magee et al. [83]. Several nuclei detection and segmentation methods [25], [49], [59], [67], [86] use color deconvolution-based separation of stains in histopathological images.

Different color models can be used. Most detection and segmentation methods [9]– [10] [17] [24] [25] [50][64] use the RGB color model, although the RGB model is not a perceptually uniform color model. Other more perceptual color models such as HSV, Lab, and Luv are sometimes used [11]– [18] [19] [27] [51] [70] [72] [86] [87] [88][89].

Thresholding is used for noise reduction, usually after filtering and background correction, in order to minimize random noise and artifacts [22], [90]. Pixels that lie outside the threshold values, which are often determined from the intensity histogram, are considered noisy. Alternatively, applying the threshold function to a group of pixels instead of an individual pixel eliminates a noisy region. While such techniques succeed in eliminating small spots of noise, they fail to eliminate large artifacts [91].

Alternatively, morphological operations can also be used for noise reduction. Noise and artifacts are eliminated using morphological operations like closings and openings [59]. Morphological gray-scale reconstruction methods are used to eliminate noise while preserving the nuclei shape [24], [54], [55], [70]. While thresholding and filtering reduce noise according to pixel intensities, morphology reduces noise based on the shape characteristics of the input image, as characterized by a structuring element. Morphology cannot distinguish between nuclei areas and artifacts that have a nuclear-like shape but different intensity values. Thresholding (prior or subsequent to applying the morphological operations) removes such artifacts.

Adaptive filters [92], Gamma correction [17], and histogram equalization [52] have been used to increase the contrast between foreground (nuclei) and background regions. Anisotropic diffusion is used to smooth nuclei information without degrading nuclei edges [52], [86]. Gaussian filtering is also used to smooth nuclei regions [18], [26], [61].

In some frameworks, noise reduction and ROI detection are performed simultaneously. For example, for tissue level feature computation, the preprocessing step selects the ROI by excluding regions with little content and noise [91]. For nuclei level feature computation, noise reduction is followed by ROI detection to determine the nuclei region [70], [86].

Thresholding is popular for ROI detection. Sertel et al. [52] introduced the nuclei and cytological components as ROI for grading of follicular lymphoma (FL). Red blood cells (RBCs) and background regions show uniform patterns as compared to other nuclei in FL tissue; thus, thresholding is performed in RGB color model for elimination of RBCs and background. Similarly, Dalle et al. [17] selected neoplasm ROI for nuclei pleomorphism in breast cancer images by using Otsu thresholding along with morphological operations.

Clustering is another method that is commonly used for ROI detection. Cataldo et al. [25] performed automated separation of cancer from noncancerous regions (stroma, blood vessels) using unsupervised clustering. Then, cancerous and noncancerous regions are refined using morphological operations. Dundar et al. [19] proposed a framework for classification of intraductal breast lesions as benign or malignant using the cellular component. The intraductal breast lesions contain four components: cellular, extra cellular, regions with hues of red, and illumina. The H&E-stained image data are modeled into four components using GMM. Parameters of the GMM model are estimated using EM [46]. The resulting mixture distribution is used to classify pixels into four categories. Those classified as the cellular component are further clustered by dynamic thresholding to eliminate blue–purple pixels with relatively less luminance. The remaining pixels are considered cellular region and are used in lesion classification.

Using textural information, Khan et al. [70] proposed a novel and unsupervised approach to segment breast cancer histopathology images into two regions: hypo-cellular stroma (HypoCS) and hyper-cellular stroma (HyperCS). This approach employs magnitude and phase spectrum in the Gabor frequency domain to segment HypoCS and HyperCS regions, respectively. For MN detection in breast cancer histopathology images, the false positive rate (FPR) is reduced by a factor of four by using this technique [86].

The identification of initial markers or seed points, usually one per nucleus and close to its center, is a prerequisite for most nuclei segmentation methods. The accuracy of segmentation methods depends critically on the reliability of the seed points. Initial works in this field rely upon the peaks of the Euclidean distance map [17]. The H-maxima transform detects local maxima as seed points [26]– [53] [54][55], but it is highly sensitive to texture and often results in overseeding. The Hough transform detects seed points for circular-shaped nuclei but requires heavy computation [49]. The Centroid transform also detects seeds, but it is useful only for binarized images and cannot exploit additional cues.

The Euclidean distance map is commonly used for seed detection and Laplacian of Gaussian ($\hbox{LoG}$) is a generic blob detection method. Using multiscale $\hbox{LoG}$ filter with a Euclidean distance map offers important advantages, including computational efficiency and ability to exploit shape and size information. Al-kofahi et al. [58] proposed a distance-constrained multiscale $\hbox{LoG}$ filtering method to identify the center of nuclei by exploiting shape and size cues available in the Euclidean distance map of the binarized image. The main steps of this methodology are as follows.

- Initially, compute the response of the scale-normalized $\hbox{LoG}$ filter $(\hbox{LoG}_{{\rm norm}}(i;\xi) = \xi^2\ \hbox{LoG}(i;\xi))$ at multiple scales $\xi = [\xi_{{\rm min}}, \ldots, \xi_{{\rm max}}]$.
- Use the Euclidean distance map $D_N(i)$ to constrain the maximum scale values when combining the $\hbox{LoG}$ filtering results across scales to compute a single response surface $R_N(i)$ as $$R_N(i) = \mathop{\hbox{arg max}}\limits_{\xi \in [\xi_{\rm min}, \xi_{\rm MAX}]} \{\hbox{LoG}_{{\rm norm}}(i;\xi) \times I_N(i) \} \eqno{\hbox{(23)}}$$ where $\xi_{{\rm MAX}} = \hbox{max}\{\xi_{{\rm min}}, \hbox{min}\{\xi_{{\rm max}}, 2 \times D_N(i)\} \}$, and $I_N(i)$ is the nuclear channel image extracted by separating the foreground pixels from the background pixels using automatic binarization.
- Identify the local maxima of $R_N(i)$ and impose a minimum region size to filter out irrelevant minima.

This methodology improves the accuracy of seed locations. Its main disadvantage is its sensitivity to even minor peaks in the distance map, which results in oversegmentation and the detection of tiny regions as nuclei.
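The scale constraint used in (23) can be illustrated with a minimal sketch. It assumes that the scale-normalized $\hbox{LoG}$ responses at one pixel have already been computed and are passed in as a dictionary keyed by scale; the function names and this input format are hypothetical and are not taken from [58].

```python
def constrained_max_scale(xi_min, xi_max, distance):
    """xi_MAX = max{xi_min, min{xi_max, 2 * D_N(i)}}, as defined below (23)."""
    return max(xi_min, min(xi_max, 2.0 * distance))


def combine_across_scales(responses, xi_min, xi_max, distance, foreground=1):
    """Pick the strongest response among the scales allowed at one pixel.

    responses  -- dict: scale xi -> LoG_norm(i; xi) (hypothetical input format)
    distance   -- distance-map value D_N(i) at the pixel
    foreground -- binarized nuclear channel value I_N(i) at the pixel
    Assumes at least one key of `responses` lies in [xi_min, xi_MAX].
    """
    upper = constrained_max_scale(xi_min, xi_max, distance)
    allowed = {xi: r * foreground
               for xi, r in responses.items() if xi_min <= xi <= upper}
    best_scale = max(allowed, key=allowed.get)
    return best_scale, allowed[best_scale]
```

In this sketch, a pixel close to the background (small $D_N(i)$) is only allowed to respond at small scales, which is how the distance map limits the blob size considered at that location.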

The radial symmetry transform (RST) is also used for seed detection. Loy and Zelinsky [93] proposed a fast gradient-based interest operator for detecting seed points with high radial symmetry. Although this approach is inspired by the generalized symmetry transform, it determines the contribution each pixel makes to the symmetry of the pixels around it, rather than considering the contribution of a local neighborhood to a central pixel. Veta et al. [59] also employed RST for seed detection.

Recently, several other approaches have been proposed to detect the seed points. Qi et al. [64] proposed a novel and fast algorithm for seed detection that uses single-path voting with a shifted Gaussian kernel. The shifted Gaussian kernel is specifically designed to amplify the voting at the center of the targeted object, which results in a low occurrence of false seeds in overlapping regions. First, a cone shape $(r_{{\rm min}}, r_{{\rm max}}, \Delta)$ with its vertex at $(x,y)$ is used to define the voting area $A(x, y; r_{{\rm min}}, r_{{\rm max}}, \Delta)$, where $r_{{\rm min}}$ is a minimum radius, $r_{{\rm max}}$ is a maximum radius, and $\Delta$ is the aperture angle of the cone. The voting direction $\alpha (x,y)$ is computed using the negative gradient direction $-(\hbox{cos}(\theta (x,y)), \hbox{sin}(\theta (x,y)))$, where $\theta$ is the angle of the gradient direction with respect to the $x$-axis. The voting image $V(x,y; r_{{\rm min}}, r_{{\rm max}}, \Delta)$ is generated using the shifted Gaussian kernel, with mean $(\mu_x,\mu_y)$ and standard deviation $\sigma$, located at the center $(x,y)$ of the voting area $A$ and oriented in the voting direction $\alpha$, using a single-path approach as $$V(x,y;r_{{\rm min}}, r_{{\rm max}}, \Delta) = \sum_{(u,v) \in A}\Vert \triangledown I(x,y)\Vert {\cal N}(x,y,\mu_x,\mu_y,\sigma) \eqno{\hbox{(24)}}$$ where $\Vert \triangledown I(x,y)\Vert$ is the magnitude of the gradient image and ${\cal N}(x,y,\mu_x,\mu_y,\sigma)$ is a 2-D shifted Gaussian kernel defined as $${\cal N}(x, y, \mu_x, \mu_y, \sigma) = {1\over 2 \pi \sigma^2} \exp \left(- {(x-\mu_x)^2 + (y - \mu_y)^2\over 2 \sigma^2} \right), \eqno{\hbox{(25)}}$$ where $\mu_x = x + {\cos \theta \over 2} (r_{{\rm max}} + r_{{\rm min}})$ and $\mu_y = y - {\sin \theta \over 2} (r_{{\rm max}} + r_{{\rm min}})$. Later, the seed points are determined by executing mean shift on the sum of voting images. The authors compared their results with the iterative voting method in [94].
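As an illustration of the kernel in (25), the following sketch computes the shifted mean $(\mu_x, \mu_y)$ and evaluates the Gaussian at an arbitrary position. The function names and the evaluation point `(u, v)` are assumptions introduced here for illustration; this is not the implementation of [64].

```python
import math


def shifted_gaussian_mean(x, y, theta, r_min, r_max):
    """Kernel mean (mu_x, mu_y) as given after (25)."""
    half_span = (r_max + r_min) / 2.0
    return x + math.cos(theta) * half_span, y - math.sin(theta) * half_span


def shifted_gaussian(u, v, mu_x, mu_y, sigma):
    """Isotropic 2-D Gaussian of (25), evaluated at the position (u, v)."""
    sq_dist = (u - mu_x) ** 2 + (v - mu_y) ** 2
    return math.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
```

The mean sits halfway between $r_{{\rm min}}$ and $r_{{\rm max}}$ along the voting direction, so the largest votes fall near the expected nucleus center rather than at the boundary pixel that casts them.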

Counting nuclei by type is highly important for grading purposes. However, manual counting of nuclei is tedious and subject to considerable inter- and intrareader variation. Fuchs and Buhmann [95] reported 42% disagreement among five pathologists on the classification of nuclei as normal or atypical. They also reported an intrapathologist error of 21.2%. This shows the high potential added value of automatic counting tools.

MN count provides clues to estimate the proliferation and the aggressiveness of the tumor [62]. Anari et al. [88] proposed the fuzzy c-means (FCM) clustering method along with the ultraerosion operation in the Lab color model for detection of MN in IHC images of meningioma. They reported a detection accuracy nearly equal to that of manual annotation. The FCM clustering method is based on the minimization of the following objective function: $$J_m(V,C)= \sum_{k=1}^c \sum_{i=1}^{{\cal U}} v_{ki}^m \left\Vert I(i) - C_k\right\Vert^2 \eqno{\hbox{(26)}}$$ with $m > 1 \, (m \in {\bb R})$, where ${{\cal U}}$ is the total number of pixels in $I$, $C=\{C_1, C_2, \dots, C_c \}$ are the cluster centers, and $V = [v_{ki}]$ is a $c \times {{\cal U}}$ matrix in which $v_{ki}$ is the membership value of the $i\hbox{th}$ pixel in the $k\hbox{th}$ cluster, such that $\sum_{k=1}^{c} v_{ki} = 1$. The membership function $v_{ki}$ is $$v_{ki} = {1\over \sum_{j=1}^{c} \left({\left\Vert I(i) - C_k \right\Vert \over \left\Vert I(i) - C_j \right\Vert } \right)^{{2\over m-1}}} \eqno{\hbox{(27)}}$$ with the cluster center $$C_k = {\sum_{i=1}^{\cal U} v_{ki}^m \cdot I(i)\over \sum_{i=1}^{\cal U} v_{ki}^m} . \eqno{\hbox{(28)}}$$
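The alternating updates (27) and (28) can be sketched as follows for scalar pixel intensities. This is a minimal illustration that follows the standard FCM formulation, in which the memberships of each pixel are normalized over the $c$ cluster centers; the function names, the fixed iteration count, and the small constant `eps` are assumptions made here and are not taken from [88].

```python
def fcm_memberships(pixels, centers, m=2.0):
    """Membership update (27): v[k][i] for cluster k and pixel i."""
    eps = 1e-12  # avoids division by zero when a pixel coincides with a center
    exponent = 2.0 / (m - 1.0)
    v = [[0.0] * len(pixels) for _ in centers]
    for i, x in enumerate(pixels):
        dist = [abs(x - c) + eps for c in centers]
        for k, dk in enumerate(dist):
            v[k][i] = 1.0 / sum((dk / dj) ** exponent for dj in dist)
    return v


def fcm_centers(pixels, v, m=2.0):
    """Center update (28): membership-weighted mean of the pixels."""
    centers = []
    for row in v:
        weights = [vk ** m for vk in row]
        centers.append(sum(w * x for w, x in zip(weights, pixels)) / sum(weights))
    return centers


def fcm(pixels, centers, m=2.0, iterations=50):
    """Alternate (27) and (28) for a fixed number of iterations."""
    for _ in range(iterations):
        v = fcm_memberships(pixels, centers, m)
        centers = fcm_centers(pixels, v, m)
    return centers, fcm_memberships(pixels, centers, m)
```

For example, `fcm([0.0, 0.1, 0.2, 0.9, 1.0, 1.1], [0.0, 1.0])` separates the dark and bright intensities into two clusters. A practical implementation would typically stop when $J_m$ in (26) no longer decreases, rather than after a fixed number of iterations.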

Recently, Roullier et al. [62] proposed a graph-based multiresolution framework for MN detection in breast cancer IHC images. This approach consists of unsupervised clustering at low resolution followed by refinements at a higher resolution. At the multiresolution level, mitotic regions are initially segmented by using the following discrete label regularization function: $$\min_{f \in {\cal H}(V)} \left\{R(f)+{\lambda \over 2} \left\Vert f-f^0 \right\Vert^2 \right\} \eqno{\hbox{(29)}}$$ where the first term $R(f)$ is the regularizer, defined as the discrete Dirichlet form of the function $f \in {\cal H}(V) : R_w(f) = {1\over 2} \sum_{u \in V} [\sum_{v \sim u} w(u,v) (f(v)-f(u))^2]^{{1\over 2} }$, and ${\cal H}(V)$ is the Hilbert space of real-valued functions defined on the vertices $V$ of a graph. The second term is a fitting term, and $\lambda \ge 0$ is a fidelity parameter, called the Lagrange multiplier, which specifies the tradeoff between the two competing terms. The Gauss–Jacobi method is used to approximate the solution of the minimization in (29) by the following iterative algorithm: $$\left\{\matrix{f^{(0)}(u) = f^0(u)\hfill\cr \displaystyle f^{(t+1)}(u) = {\lambda f^0(u)+\sum_{v \sim u} w(u,v) f^{(t)}(v)\over \lambda + \sum_{v \sim u}w(u,v)}, \forall u \in V\hfill }\right. \eqno{\hbox{(30)}}$$ where $f^{(t)}$ is the function at iteration step $t$. More details on these definitions can be found in [62]. This discrete regularization is adapted for labeling the mitotic regions at higher resolution. The authors reported more than 70% TPR and 80% TNR.
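The iteration in (30) is straightforward to sketch on a small weighted graph. The dictionary-based graph representation and the function name below are assumptions chosen for illustration; they are not the data structures of [62].

```python
def gauss_jacobi(f0, weights, lam, iterations=100):
    """Iterate (30) on a weighted graph.

    f0      -- dict: vertex -> initial value f^0(u)
    weights -- dict: vertex -> {neighbor: w(u, v)} (hypothetical adjacency
               format; every vertex of f0 must appear as a key)
    lam     -- fidelity parameter lambda >= 0
    """
    f = dict(f0)
    for _ in range(iterations):
        updated = {}
        for u, neighbors in weights.items():
            numerator = lam * f0[u] + sum(w * f[v] for v, w in neighbors.items())
            denominator = lam + sum(neighbors.values())
            # An isolated vertex with lam == 0 keeps its current value.
            updated[u] = numerator / denominator if denominator > 0 else f[u]
        f = updated  # Jacobi update: all vertices use values from step t
    return f
```

Each new value is a weighted average of the initial value $f^0(u)$ and the current values of the neighbors, so a large $\lambda$ keeps $f$ close to $f^0$ while a small $\lambda$ smooths $f$ more strongly along heavily weighted edges.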

The use of EM to fit a mixture model was recently proposed by Khan et al. [86] for the detection of MN in breast cancer histopathological images. In this framework, the pixel intensities of mitotic and nonmitotic regions are modeled by a Gamma–Gaussian mixture model as $$f(I(i);\theta) = \rho_1 \Gamma (I(i);\;\psi,\xi) + \rho_2 {\cal N}(I(i); \mu,\sigma) \eqno{\hbox{(31)}}$$ where $\rho_1$ and $\rho_2$ represent the mixing proportions (priors) of the intensities belonging to mitotic and nonmitotic regions, respectively. $\Gamma (I(i);\psi,\xi)$ represents the Gamma density function for mitotic regions; it is parameterized by shape ($\psi$) and scale ($\xi$) parameters. ${\cal N}(I(i);\mu,\sigma)$ represents the Gaussian density function for nonmitotic regions; it is parameterized by $\mu$ and $\sigma$. In order to estimate the unknown parameter vector $\theta$, the EM method is employed for maximum likelihood estimation. The log-likelihood function $\varrho$ of the parameter vector $\theta$ is defined as $$\varrho (\theta) = \sum_{i=1}^{{\cal U}} \hbox{log} f(I(i); \theta) \eqno{\hbox{(32)}}$$ where $f(I(i); \theta)$ is the mixture density function in (31). The EM method finds the maximum likelihood estimate by iteratively applying expectation and maximization steps as $$\eqalignno{\varrho^c (\theta) &= \sum_{i=1}^{{\cal U}}\sum_{k=1}^{2} {w}_{ik} \hbox{log} \rho_k + \sum_{i=1}^{{\cal U}}\{{w}_{i1} \hbox{log} [\Gamma (I(i);\psi,\xi)] \} \cr &\quad + \sum_{i=1}^{{\cal U}}\{{w}_{i2} \hbox{log} [{\cal N}(I(i); \mu,\sigma)] \} &\hbox{(33)}\cr \hat{\theta } &= \mathop{\hbox{argmax}}\limits_{\theta }\,\, \varrho (\theta) &\hbox{(34)}}$$ where ${w}_{ik}, k=1,2$ are indicator variables showing the component membership of each pixel $I(i)$ in the mixture model (31). This method achieved a $\hbox{51}\%$ F-score in the ICPR 2012 contest [96].
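To make (31) and (32) concrete, the sketch below evaluates the Gamma–Gaussian mixture density, its log-likelihood, and the expected component memberships used in an expectation step. It assumes $\rho_2 = 1 - \rho_1$ and strictly positive intensities; the function names are hypothetical, and the maximization step of [86] is not reproduced here.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma density with shape psi and scale xi; zero for x <= 0."""
    if x <= 0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(x) - x / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)


def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, rho1, psi, xi, mu, sigma):
    """Mixture density (31), assuming rho2 = 1 - rho1."""
    return rho1 * gamma_pdf(x, psi, xi) + (1.0 - rho1) * normal_pdf(x, mu, sigma)


def log_likelihood(pixels, rho1, psi, xi, mu, sigma):
    """Log-likelihood (32); assumes every pixel has nonzero mixture density."""
    return sum(math.log(mixture_pdf(x, rho1, psi, xi, mu, sigma)) for x in pixels)


def expected_memberships(pixels, rho1, psi, xi, mu, sigma):
    """Posterior probability that each pixel belongs to the Gamma component."""
    result = []
    for x in pixels:
        g = rho1 * gamma_pdf(x, psi, xi)
        n = (1.0 - rho1) * normal_pdf(x, mu, sigma)
        result.append(g / (g + n))
    return result
```

In an EM loop, these posterior probabilities would take the place of the indicator variables $w_{ik}$ in (33) before the parameters are re-estimated.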

Cireşan et al. [97] used deep max-pooling convolutional neural networks (CNNs) to detect MN and achieved the highest F-score (78%) in the ICPR 2012 contest [96]. A training dataset consisting of image patches centered on ground-truth mitoses is used to train a CNN. The trained CNN is then used to compute a map of mitosis probabilities over the whole image. Their approach proved to be very efficient and produced far fewer false positives (FPs) than the other contestants.

Grading of lymphocytic infiltration based on the detection of a large number of LN in IHC HER2+ breast cancer histopathology was reported by Basavanhally et al. [18]. In this framework, LN are automatically detected by a region growing method which uses contrast measures to find the optimal boundary. High detection sensitivity has been reported for this framework, but it results in a large number of nuclei other than lymphocytes being detected. In order to reduce the number of FPs, maximum a posteriori (MAP) estimation based on size and luminance information is applied to temporarily label candidates as either LN or CN. Later, Markov random field (MRF) theory with spatial proximity is used to finalize the labels. This framework was evaluated on 41 HER2+ WSI and reported 90.41% detection accuracy, compared with 94.59% for manual detection.

Nuclei features such as size, texture, shape, and other morphological appearance are important indicators for grading and prognosis of cancer. Consequently, classification and grading of cancer are highly dependent on the quality of nuclei segmentation. The choice of the nuclei segmentation method is correlated with the feature computation method. For instance, some feature computation methods require the exact boundary points of nuclei to compute the nuclei morphology. In this case, high magnification images are required to capture the exact details of nuclei. Other feature computation methods require only the coarse location of nuclei to compute topology features. A large number of publications on nuclei segmentation in histopathology use state-of-the-art image segmentation methods based on thresholding, morphology, region growing, watershed, ACMs, clustering, and Gcuts, separately or in combination.

The simplest way to detect and segment nuclei in histopathological images is based on thresholding and morphological operations [9]– [10][15], [89], [98]. This methodology performs best on a well-defined, preferably uniform background. The main parameters to tune are the threshold level and the size and shape of the structuring elements. The difference between nuclei and background regions may be diffuse, making it harder to find a reliable threshold level. Even though this methodology is usually defined only on gray-scale images, it can be extended to color images or stacks of images using multidimensional kernels. This methodology suffers from its simplicity because it includes little object knowledge. In addition, it lacks robustness to size and shape variations, as well as to texture variations, which are very frequent in histopathological images. This methodology is not meant to segment clustered or overlapping nuclei.

Several authors have used the watershed transform for nuclei segmentation [26] [54], [99]. The main advantage of watershed is that it requires no parameter tuning. However, it requires the prior detection of seed points. The edge map and distance transform are used for seed detection [26], [54]. The reported results are suboptimal for ring-shaped nuclei having clear homogeneous regions. Furthermore, the watershed transform does not include any prior knowledge to improve its robustness.

ACMs can combine both shape characteristics (smoothness and shape model) with image features (image gradient and intensity distribution). However, the resulting segmentation is strongly dependent upon the initial seed points. Cosatto et al. [49] described an automated method for accurately and robustly measuring the size of neoplastic nuclei and providing an objective basis for pleomorphism grading. First, a difference of Gaussian (DoG) filter is used to detect nuclei. Then, the Hough transform is used to pick up radially symmetric shapes. Finally, an ACM with shape, texture, and fitness parameters is used to extract nuclei boundaries. The authors claimed 90% TPR.

Huang and Lai [24] proposed a watershed- and ACM-based framework for nuclei segmentation in hepatocellular carcinoma biopsy images. Initially, a dual morphological gray-scale reconstruction method is employed to remove noise and accentuate the shapes of nuclei. Then, a marker-controlled watershed transform is performed to find the edges of nuclei. Finally, an ACM is applied to generate smooth and accurate contours for nuclei. This framework performs poorly in cases of low contrast, noisy background, and damaged or irregular nuclei.

Dalle et al. proposed gradient in polar space (GiPS), a novel nuclei segmentation method [17]. Initially, nuclei are detected using thresholding and morphological operations. Then, each patch is transformed into a polar coordinate system with the center of mass of the nucleus as the origin. Finally, biquadratic filtering is used to produce a gradient image from which nuclei boundaries are delineated. GiPS reports an overall accuracy error of 7.84%.

Ta et al. [53] proposed a method based on graph-based regularization. The distinctive feature of this framework is its use of graphs as a discrete model of images at different levels (pixels or regions) and with different component relationships (grid graph, proximity graph, etc.). Based on Voronoi diagrams, a novel image partition (graph reduction) algorithm is proposed for segmentation of nuclei in serous cytological and breast cancer histopathological images. A pseudometric $\delta : \hbox{V} \times \hbox{V} \rightarrow {\bb R}$ is defined as $$\delta (u,v) = \min_{\rho \in P_G(u,v)} \sum_{i=1}^{m-1}\sqrt{w(u_i,u_{i+1})}\left(f(u_{i+1}) - f(u_i) \right) \eqno{\hbox{(35)}}$$ where $w(u_i,u_{i+1})$ is a weight function between two pixels and $P_G(u,v)$ is the set of paths connecting two vertices. Given a set of $K$ seeds $S=\left(s_i \subseteq \hbox{V} \right)$, where $i=1,2, \dots, K$, the energy $\delta_S : \hbox{V} \rightarrow {\bb R}$ induced by the metric $\delta$ for all the seeds of $S$ can be expressed as $$\delta_S(u) = \min_{s_i \in S}\delta (s_i,u), \qquad \forall u \in \hbox{V}. \eqno{\hbox{(36)}}$$

The influence zone $z$ (also called Voronoi cell) of a given seed $s_i \in S$ is the set of vertices which are closer to $s_i$ than to any other seed with respect to the metric $\delta$. It can be defined, $\forall j= 1,2, \ldots, K$ and $j \ne i$, as $$z(s_i) = \left\{u \in \hbox{V} : \delta (s_i,u) \le \delta (s_j,u) \right\} . \eqno{\hbox{(37)}}$$

Then, the energy partition of the graph, for a given set of seeds $S$ and a metric $\delta$, is the set of influence zones $Z(S,\delta)=\left\{z(s_i), \forall s_i \in S \right\}$. The authors compared this method with k-means clustering and Bayesian classification methods in [100]. This method reported 95.73% segmentation accuracy, whereas the k-means clustering and Bayesian classification methods reported 93.67% and 96.47% accuracy, respectively.
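The influence zones of (36) and (37) can be computed with a multi-source shortest-path search. The sketch below assumes that each edge already carries a nonnegative cost (for instance, a cost derived from the summand of (35)) and uses Dijkstra's algorithm started from all seeds at once; the graph format and the function name are hypothetical, and this is not claimed to be the algorithm of [53].

```python
import heapq
from itertools import count


def influence_zones(graph, seeds):
    """Assign every reachable vertex to its closest seed.

    graph -- dict: vertex -> {neighbor: nonnegative edge cost} (assumed format)
    seeds -- list of seed vertices
    Returns (label, dist): label[u] is the seed whose zone contains u and
    dist[u] is the energy delta_S(u) of (36).
    """
    tie_breaker = count()  # keeps heap entries comparable for any vertex type
    dist = {s: 0.0 for s in seeds}
    label = {s: s for s in seeds}
    heap = [(0.0, next(tie_breaker), s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, cost in graph[u].items():
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                label[v] = label[u]
                heapq.heappush(heap, (candidate, next(tie_breaker), v))
    return label, dist
```

Grouping the vertices by `label` then gives one influence zone per seed. Ties in (37) are resolved here in favor of the seed that reaches the vertex first.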

Al-Kofahi et al. [58] proposed another Gcuts-based method for segmentation of breast CN. Initially, the foreground is extracted using Gcut-based binarization. The pixel labeling $I^{\prime }(i)$ is done by minimizing the following energy function: $$\eqalignno{{\bb E}(I^{\prime }(i)) &= - \ln {\cal P}(I(i)) + \sum_{i} \sum_{j\in N(i)} \eta (I^{\prime }(i), I^{\prime }(j)) \cr &\quad\times \exp \left(- {I(i)-I(j)\over 2 \sigma_{I^{\prime}}^2} \right) &\hbox{(38)}}$$ where ${\cal P}(I(i)\vert k), k= 0,1$ is a Poisson distribution, $N(i)$ is a spatial neighborhood of pixel $i$, and $$\eta (I^{\prime }(i), I^{\prime }(j)) = \left\{\matrix{1, \hfill & \hbox{if}\; I^{\prime }(i) \ne I^{\prime }(j)\hfill\cr 0, \hfill & \hbox{otherwise.}\hfill}\right. \eqno{\hbox{(39)}}$$ In (38), the first term is a data term that represents the cost of assigning a label to a pixel and the second term is a pixel continuity term that penalizes different labels for neighboring pixels when $\vert I(i)-I(j)\vert < \sigma_{I^{\prime}}$. After binarization, nuclear seed points are detected by multiscale LoG filtering constrained by distance-map-based adaptive scale selection (23). These detected seed points are used to perform an initial segmentation, which is refined later using a second Gcuts-based method that combines alpha expansion and graph coloring to reduce computational complexity. The authors reported 86% accuracy on 25 histopathological images containing 7400 nuclei. The framework often causes oversegmentation when chromatin is highly textured and the shape of nuclei is extremely elongated. In the case of highly clustered nuclei with weak borders between nuclei, undersegmentation may occur.

For nuclei segmentation in glioblastoma histopathology images, Chang et al. [66] proposed a multireference Gcuts framework for solving the problem of technical and biological variations by incorporating geodesic constraints. During labeling, a unique label $L(v)$ is assigned to each vertex $v \in \hbox{V}$ and the image cutout is performed by minimizing the energy $$\eqalignno{{\bb E} &= \sum_{v \in \hbox{V}} ({\bb E}_{gf}(L(v)) + {\bb E}_{lf}(L(v)))\cr &\quad + \sum_{(v,u)\in E} {\bb E}_{{\rm smoothness}}(L(v),L(u)) &\hbox{(40)}}$$ where ${\bb E}_{gf}$ and ${\bb E}_{lf}$ are the global and local data fitness terms applying the fitness cost for assigning $L(v)$ to $v$, and ${\bb E}_{{\rm smoothness}}(L(v),L(u))$ is the prior energy, denoting the cost when the labels of adjacent vertices $v$ and $u$ are $L(v)$ and $L(u)$, respectively. The authors reported 85% TPR and 75% PPV on the TCGA dataset [101] of 440 WSI.

Vink et al. introduced a deterministic approach using a machine learning technique to segment EN, LN, and fibroblast nuclei in IHC breast cancer images [69]. The authors report that a single detector cannot cover the whole range of nuclei because their diversity in appearance is too large. They therefore formulate two detectors (pixel-based and line-based) using modified AdaBoost. The first detector focuses on the inner structure of nuclei and the second detector covers the line structure at the border of nuclei. The outputs of these two detectors are merged using an ACM to refine the border of the detected nuclei. The authors report 95% accuracy with a computational cost of one second per field-of-view image.

These nuclei segmentation frameworks have reported good segmentation accuracy on LN, MC, and EN that have a regular shape, homogeneous chromatin distribution, and smooth boundaries, and that occur in isolation. However, these frameworks have poor segmentation accuracy for CN, especially when CN are clustered and overlapping. Furthermore, they are intolerant of chromatin variations, which are very common in CN.

A second generation of nuclei segmentation frameworks tackles the challenges of heterogeneous, overlapping, and clustered nuclei by using machine learning algorithms together with classical segmentation methods. In addition, statistical and shape models are used to separate overlapping and clustered nuclei. Compared with the earlier nuclei segmentation methods, these methods are more tolerant to variations in the shape of nuclei, partial occlusion, and differences in staining.

The watershed transform is employed to address the problem of overlapping nuclei by defining a group of basins in the image domain, where ridges in-between basins are borders that isolate nuclei from each other [9] [19], [25], [54], [60]. Wahlby et al. [26] addressed the problem of clustered nuclei and proposed a methodology that combines intensity and gradient information with shape parameters for improved segmentation. Morphological filtering is used to find nuclei seeds. Then, seeded watershed segmentation is applied to the gradient magnitude image to create the region borders. Later, the result of the initial segmentation is refined using the gradient magnitude along the boundaries separating neighboring objects, which removes poorly contrasted objects. In the final step, distance transform and shape-based cluster separation methodologies are applied, keeping only the separation lines that pass through deep valleys in the distance map. The authors reported 90% accuracy for overlapping nuclei. Cloppet and Boucher [99] presented a scheme for segmentation of overlapping nuclei in immunofluorescence images by providing a specific set of markers to the watershed algorithm. They defined markers as the split between overlapping structures and achieved 77.59% accuracy for overlapping nuclei and 95.83% overall accuracy. In [102], a similar approach is used for segmentation of clustered and overlapping nuclei in tissue micro array (TMA) and WSI of colorectal cancers. First, combined global and local thresholding is used to select foreground regions. Then, morphological filtering is applied to detect seed points. Region growing from the seed points produces the initial segmented nuclei. Finally, clustered nuclei are separated using watershed and ellipse approximation. The authors claimed 80.3% accuracy.

The main problem with most ACMs is their sensitivity to initialization. To solve this initialization problem, Fatakdawala et al. [57] proposed an EM-driven geodesic ACM with overlap resolution for segmentation of LN in breast cancer histopathology and reported 86% TPR and 64% PPV. EM-based ACM initialization allows the model to focus on relevant objects of interest. The magnetostatic active contour model [103] is used as a force $F$ guiding the contour toward the boundary. Based on contours enclosing multiple objects, high concavity points are detected on the contours and used in the construction of an edge-path graph. Then, a scheme based on high concavity points and a size heuristic is used to resolve overlapping nuclei. The degree of concavity/convexity is proportional to the angle $\theta (c_w)$ between contour points. It is computed as follows: $$\theta (c_w) = \pi - \hbox{arccos}\left({(c_w - c_{w-1}) \cdot (c_{w+1}-c_w)\over \vert c_w-c_{w-1}\vert \vert c_{w+1}-c_w\vert } \right) \eqno{\hbox{(41)}}$$ where $c_w$ is a point on the contour.
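The angle in (41) can be evaluated directly from three consecutive contour points. The following sketch uses a hypothetical function name and assumes 2-D points given as coordinate pairs, with no two consecutive points coinciding.

```python
import math


def contour_angle(prev_pt, pt, next_pt):
    """Angle theta(c_w) of (41) at contour point `pt`, in radians."""
    ax, ay = pt[0] - prev_pt[0], pt[1] - prev_pt[1]      # c_w - c_{w-1}
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]      # c_{w+1} - c_w
    cos_turn = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    cos_turn = max(-1.0, min(1.0, cos_turn))  # clamp rounding error before acos
    return math.pi - math.acos(cos_turn)
```

A locally straight contour gives $\theta = \pi$, and sharper turns give smaller angles. Since (41) does not by itself distinguish concave from convex turns, an implementation would need an additional orientation test, for example the sign of the 2-D cross product of the two difference vectors; that test is an assumption here, not a detail taken from [57].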

Yang et al. [51] proposed a nuclei separation methodology that uses a concave vertex graph and the Ncut algorithm. Initially, the outer boundary is delineated via robust estimation and a color active model, and a concave vertex graph is constructed from automatically detected concave points on boundaries (41) and inner edges. By minimizing a morphological-based cost function, the optimal path in the graph is recursively calculated to separate the touching nuclei.

Mouelhi et al. proposed an automatic separation method for clustered nuclei in breast cancer histopathology [61]. First, a modified GAC with the Chan–Vese energy model is used to detect the nuclei region [104]. Second, high concavity points along touching nuclei regions are detected (41). Third, the inner edges are extracted by applying the watershed transform on a hybrid distance transform image, which combines the geometric distance and color gradient information. Fourth, the concave vertex graph using high concavity points and inner edges is constructed. Last, the optimal separating curve is selected by computing the shortest path in the graph.

Moreover, for the recognition of single nuclei in nuclei clusters, Kong et al. [60] integrated a framework consisting of a novel supervised nuclei segmentation method and a touching nuclei splitting method. For the initial segmentation of nuclei, each pixel is classified as nucleus or background by utilizing color-texture in the most discriminant color model. Clustered and separated nuclei are differentiated using the distance between the radial symmetry center and the geometrical center of the connected component. For the splitting of clustered nuclei, the boundaries of touching clumps are smoothed by a Fourier shape descriptor and then concave point detection is carried out. The authors evaluated this framework on FL images and achieved an average 77% TPR and a 5.55% splitting ER.

Another adaptive AC scheme that combines shape, boundary, region homogeneity, and mutual occlusion terms in a multilevel set formulation was proposed by Ali et al. [28], [63]. The segmentation of $K$ overlapping nuclei with respect to the shape prior $\psi$ is solved by minimizing the following energy functional of the level set functions $\phi_k$: $$\eqalignno{&{\bb E}(\Phi,\Psi,I_{\rm F},I_{\rm B}) = \matrix{\beta_s \underbrace{\sum_{k=1}^{K=2} \int_{\varpi } (\phi_k(I)-\psi (I))^2 \vert \triangledown \phi_k\vert \delta (\phi_k)d_I } \cr {\rm Shape + boundary\,\ energy}} \cr &\quad + \matrix{\underbrace{\beta_r \int_\varpi (\Theta_{\rm F} H_{\chi_1 \vee \chi_2}) d_I + \int_\varpi (\Theta_{\rm B} - H_{\chi_1 \vee \chi_2}) d_I } \cr {\rm Region\,\ energy}} \cr &\quad + \matrix{\underbrace{\omega \int_\varpi H_{\chi_1 \wedge \chi_2} d_I + \sum_{k=1}^{K=2} \int_\varpi (\phi_k-\psi_k)^2 d_I } \cr {\rm Mutual\,\ occlusion\,\ energy}} &\hbox{(42)}}$$ where $\Phi =(\phi_1,\phi_2)$, $\Psi =(\psi_1,\psi_2)$, $I_{\rm F}$ and $I_{\rm B}$ are foreground and background regions, $\beta_s,\beta_r,\omega > 0$ are constants that balance the contributions of the shape and boundary, region, and mutual occlusion terms, respectively, $\delta (\cdot)$ is the Dirac delta function, $\delta (\phi_k)$ is the contour measure on $\{\phi =0\}$, $H(\cdot)$ is the Heaviside function, $H_{\chi_1 \vee \chi_2} = (H_{\psi_1}+H_{\psi_2}-H_{\psi_1}H_{\psi_2})$, $H_{\chi_1 \wedge \chi_2} = H_{\psi_1} H_{\psi_2}$, and $\Theta_{\rm j} = \vert I-I_{\rm j}\vert^2 + \mu \vert \triangledown I_{\rm j}\vert^2$ with $j \in \{\hbox{F},\hbox{B}\}$. The watershed transform is used for model initialization. The authors evaluated this framework on overlapping nuclei in prostate and breast cancer images and reported 86% TPR and 91% OR on breast images and 87% TPR and 90% OR on prostate images.

Qi et al. [64] proposed a two-step method for the segmentation of overlapping nuclei in hematoxylin-stained breast TMA specimens that requires very little prior knowledge. First, seed points are computed by executing mean shift on the sum of the voting images (24). Second, the following level set representation of the contours is used: $$\eqalignno{{\bb E} &= \alpha_N \sum^K_{k=1} \int_{\Lambda_k} \vert I-\mu_k\vert^2 di + \alpha_B \sum^K_{k=1} \int_{\Lambda_B} \vert I-\mu_b\vert^2 di \cr &\quad + \beta \sum^K_{k=1} \int^1_0 g(\vert \triangledown I (\varpi_k(z))\vert) \vert {\varpi }^{\prime }_{k}(z)\vert dz\cr &\quad + \lambda \sum^K_{k=1} \sum^K_{j=1,j\ne k}\Lambda_k\cap \Lambda_j &\hbox{(43)}}$$ where $\alpha_N, \alpha_B, \beta > 0$ are constants that balance the contributions of each term, $\varpi_k (k=1, \dots, K)$ are the nuclei contours that evolve toward the boundaries, $K$ is the number of nuclei, $\Lambda_k$ is the region inside each contour $\varpi_k$, $\Lambda_B$ is the background, which represents the regions outside all the nuclei, $\mu_k$ and $\mu_b$ are the mean intensities of the nuclei and background regions, and $g$ is a sigmoid function $g(x) = (1+e^{({x-\nu \over \zeta })})^{-1}$, where $\nu$ controls the slope of the output curve and $\zeta$ controls the window size. The last term in (43) is the repulsion term, which represents the repulsion energy between touching nuclei, and $\lambda$ is a regulation parameter. The repulsion term separates the touching nuclei to create a smooth and complete contour for each nucleus. The authors claimed 78% TPR and 90% PPV in the case of touching nuclei.
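Two ingredients of (43) are easy to illustrate in isolation: the edge-stopping function $g$ and the repulsion term. The sketch below assumes that $g$ takes the reciprocal (sigmoid) form $1/(1+e^{(x-\nu)/\zeta})$ and represents each region $\Lambda_k$ as a set of pixel coordinates, with the overlap measured by pixel count; both choices, and the function names, are assumptions made for illustration only.

```python
import math


def edge_indicator(x, nu, zeta):
    """Assumed sigmoid form of g: close to 1 for weak gradients (x << nu)
    and close to 0 for strong gradients (x >> nu)."""
    return 1.0 / (1.0 + math.exp((x - nu) / zeta))


def repulsion(regions):
    """Repulsion term of (43) without the factor lambda.

    regions -- list of sets of pixel coordinates, one set per nucleus
    Sums the overlap area over all ordered pairs (k, j) with j != k.
    """
    total = 0
    for k, region_k in enumerate(regions):
        for j, region_j in enumerate(regions):
            if j != k:
                total += len(region_k & region_j)
    return total
```

Because the repulsion value grows with the overlap between regions, minimizing the energy pushes contours of touching nuclei apart, while the edge indicator lets contours settle where the image gradient is strong.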

To overcome the initialization sensitivity of ACMs, Kulikova et al. [65] proposed a method based on marked point processes (MPPs). This methodology, a type of high-order ACM, is able to segment overlapping nuclei as several individual objects. There is no need to initialize the process with seed points giving the location of the nuclei to be segmented. A shape prior term is used for handling overlapping nuclei. Fig. 4 shows a comparison of nuclei segmentation results using MPP, GiPS [17], and level set [105].

Recently, Veillard et al. [67] proposed a method based on the creation of a new image modality consisting of a gray-scale map in which the value of each pixel indicates its probability of belonging to a nucleus. This probability map is calculated from texture, scale information, and simple pixel color intensities. The resulting modality has a strong object-background contrast and smooths out the irregularities within the nuclei and background. Later, segmentation is performed using an ACM with a nuclei shape prior [65] to solve the problem of overlapping nuclei. Fig. 5(a) shows the result of ACM segmentation on the probability map image and on the hematoxylin-stained image produced after color deconvolution [84].

In general, model-based approaches segment nuclei using a priori shape information, which may introduce a bias favoring the segmentation of nuclei with certain characteristics. To address this problem, Wienert et al. [68] proposed a novel contour-based minimum model for nuclei segmentation using minimal a priori information. This minimum model-based segmentation framework consists of six internal processing steps. First, all possible closed contours are computed regardless of shape and size. Second, all initially generated contours are ranked using gradient fit. Third, nonoverlapping segmentation is performed with ranked labeling in a 2-D map. Fourth, segmentation is improved using contour optimization. Fifth, clustered nuclei are separated using concavity point detection (41). Last, segmented regions are classified as nuclei or background using stain-related information. This framework avoids a segmentation bias with respect to shape features. The authors achieved 86% TPR and 91% PPV on a dataset of 7931 nuclei.

RST is an iterative algorithm that attributes votes to pixels inside a region [93]. After the final iteration, the maxima are used as markers for a nuclei segmentation algorithm such as watershed. Each boundary point contributes votes to a region defined by oriented cone-shaped kernels as $$\eqalignno{A(x,y; r_{{\rm min}},r_{{\rm max}},\Delta) &= \bigg\{(x+r\cos \phi, y+r\sin \phi) \cr & \vert r_{{\rm min}} \le r \le r_{{\rm max}}, \cr &\theta (x,y)-{\Delta \over 2} \le \phi \le \theta (x,y)+{\Delta \over 2} \bigg\} \qquad &\hbox{(44)}}$$ where the radial range is parameterized by $r_{\rm min}$ and $r_{\rm max}$, and the angular range by $\Delta$. $\theta (x,y)$ is the angle between the positive $x$-axis and the voting direction. These parameters are updated using votes from the previous iterations.
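The voting area in (44) can be approximated by sampling the cone on a regular grid in $(r, \phi)$. The sketch below is illustrative only: the function name and the sampling densities `n_r` and `n_phi` are assumptions, and a pixel-based implementation would instead round the positions to image coordinates.

```python
import math


def voting_area(x, y, theta, r_min, r_max, delta, n_r=5, n_phi=9):
    """Sample points of the cone A(x, y; r_min, r_max, Delta) in (44).

    theta -- voting direction at (x, y), in radians
    delta -- aperture angle of the cone, in radians
    n_r, n_phi -- number of radial and angular samples (both >= 2)
    """
    points = []
    for a in range(n_r):
        r = r_min + (r_max - r_min) * a / (n_r - 1)
        for b in range(n_phi):
            phi = theta - delta / 2.0 + delta * b / (n_phi - 1)
            points.append((x + r * math.cos(phi), y + r * math.sin(phi)))
    return points
```

Narrowing `delta` over successive iterations concentrates the votes along the voting direction, which corresponds to the kernel refinement described in the text.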

Schmitt and Hasse [106] separated clustered nuclei using RST, based on the idea that the center of mass of a nucleus is a basic perceptual event that supports the separation of clustered nuclei. They initialized iterative voting along the gradient direction where, at each iteration, the voting direction and the shape of the kernel are refined. The voting area can be regulated by selecting the number of steps in the evolution of the kernel shape. Too few steps result in the fragmentation of the center of mass, whereas a large number of steps increases the computational cost. They also proposed a way to deal with holes and subholes in the region by processing boundaries iteratively.

One limitation of RST is that it requires prior knowledge of scale, which cannot be generalized. To overcome this limitation, a multiscale extension of the RST seems reasonable. A method similar to [106] is used in [50] to decompose regions of clustered nuclei in H&E-stained prostate cancer biopsy images. The regions of clustered nuclei are initially obtained by clustering and level-set segmentation. Recently, Veta et al. [59] proposed a method similar to [24] for nuclei segmentation in H&E-stained breast cancer biopsy images, applying the fast RST [93] to produce markers for the watershed segmentation. Sertel et al. [56] proposed adaptive likelihood-based nuclei segmentation for FL centroblasts. Initially, nuclear components are clustered using GMM with EM. Using fast RST, the spatial voting matrix is computed along the gradient direction. Finally, local maxima locations associated with individual nuclei are determined.
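The marker-extraction step shared by these methods, locating local maxima of a voting matrix, can be sketched as follows. This is a minimal illustration rather than the procedure of any cited work: the window radius, the vote threshold `min_votes`, and the function name `local_maxima` are assumptions.

```python
import numpy as np

def local_maxima(votes, min_votes, radius=1):
    """Coordinates (row, col) of pixels that are maximal in their window.

    A pixel is kept when its value is at least min_votes and not smaller
    than any neighbor in the (2 * radius + 1)^2 window. Plateaus of equal
    values yield several adjacent maxima; merging them is left out here.
    """
    votes = np.asarray(votes, dtype=float)
    h, w = votes.shape
    # Pad with -inf so that border pixels can still be maxima.
    padded = np.pad(votes, radius, mode="constant", constant_values=-np.inf)
    is_max = votes >= min_votes
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[radius + dr:radius + dr + h,
                              radius + dc:radius + dc + w]
            is_max &= votes >= neighbor
    return np.argwhere(is_max)

# Toy voting matrix with two isolated peaks.
toy_votes = np.zeros((7, 7))
toy_votes[2, 2] = 5.0
toy_votes[4, 5] = 3.0
markers = local_maxima(toy_votes, min_votes=1.0)
```

Each returned coordinate would serve as one seed for a marker-controlled watershed, so spurious maxima translate directly into oversegmentation.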

Alternatively, an EM- and GMM-based unsupervised Bayesian classification scheme was used for segmentation of overlapping nuclei in IHC images [55]. The separation of overlapping nuclei is formulated as a cluster analysis problem. This approach primarily involves applying the distance transform to generate a topographic surface, which is viewed as a mixture of Gaussians. Then, a parametric EM algorithm is employed to learn the distribution of the topographic surface (GMM). On the basis of extracted regional maxima, cluster validation is performed to evaluate the optimal number of nuclei. The cluster validity index consists of a compactness measure $\varphi$ (a smaller value means more compact) and a separation measure $\varepsilon$ between the clusters. The main idea is to have nuclei as compact and as well separated as possible. Thus, cluster parameters are chosen to maximize ${\varepsilon \over \varphi }$. A priori knowledge about the overlapping nuclei is incorporated to obtain a separation line without jaggedness, as well as to reconstruct occluded contours in the overlapping region. They achieved improvements of up to 6.80%, 5.70%, and 3.43% with respect to classical watershed, conditional erosion, and adaptive H-minima transform schemes in terms of separation accuracy. Overall, they achieved 93.48% segmentation accuracy for overlapping nuclei on specimens of cervical nuclei and breast invasive ductal carcinomas.
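The selection rule based on ${\varepsilon \over \varphi}$ can be illustrated with a small sketch. The exact definitions used in [55] are not reproduced here; as an assumption, $\varphi$ is taken as the mean squared distance of points to their cluster center and $\varepsilon$ as the smallest squared distance between two cluster centers. The function name `validity_ratio` is hypothetical.

```python
import numpy as np

def validity_ratio(points, labels):
    """Separation / compactness ratio for one candidate partition.

    Assumed definitions (not taken from [55]):
      phi = mean squared distance of each point to its cluster center
      eps = minimum squared distance between two distinct cluster centers
    Requires at least two clusters and a nonzero phi.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([points[labels == k].mean(axis=0) for k in ids])
    # Map each label to the row of its center.
    index = np.searchsorted(ids, labels)
    phi = np.mean(np.sum((points - centers[index]) ** 2, axis=1))
    diffs = centers[:, None, :] - centers[None, :, :]
    dist2 = np.sum(diffs ** 2, axis=2)
    eps = dist2[np.triu_indices(len(ids), k=1)].min()
    return eps / phi

# Four pixel coordinates forming two tight groups.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = validity_ratio(pts, [0, 0, 1, 1])  # groups match the geometry
bad = validity_ratio(pts, [0, 1, 0, 1])   # groups cut across the geometry
```

Under these assumed definitions, the partition that follows the two spatial groups scores far higher than the one that cuts across them, which is the behavior the validity index is meant to reward when choosing the number of nuclei.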

The novelty of these approaches lies in the use of machine learning and statistical methods to eliminate malformed nuclear outlines and thus to allow robust nuclei segmentation. These methods depend mainly on the availability of expert annotations. Furthermore, these models may not be generalizable and have limited application due to the manual training step, sensitivity to initialization, and limited ability to segment multiple overlapping objects.

Features computed from segmented nuclei are usually a prerequisite to nuclei classification, which generates higher level information regarding the state of the disease. The classifiers use nuclei features, which capture the deviations in the nuclei structures, to learn how to classify nuclei into different classes. In order to extract features, there are two types of information available in the image: 1) the intensity values of pixels; and 2) their spatial interdependency [29].

We found a compilation of features for cytopathology imagery [107], but found relatively little such work for histopathology imagery. In histopathology, these features can be categorized into the following four categories: 1) cytological; 2) intensity; 3) morphological; and 4) texture features. A summary of nuclei features is listed in Table III. Definitions for all listed features can be found in [29], [72], and [108].
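To make the intensity and morphological categories concrete, the following sketch computes a handful of such features for one segmented nucleus. The selection of features, the 4-neighbor boundary-pixel estimate of the perimeter, and the function name `nucleus_features` are illustrative assumptions; they do not reproduce the definitions of Table III or of [29], [72], and [108].

```python
import numpy as np

def nucleus_features(gray, mask):
    """Simple intensity and morphological features of one nucleus.

    gray: 2-D array of intensities; mask: 2-D boolean array marking the
    nucleus. The perimeter is estimated as the number of mask pixels with
    at least one 4-connected background neighbor (an assumption).
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask is empty")
    values = np.asarray(gray, dtype=float)[mask]
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # Pixels whose up, down, left, and right neighbors are all foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~interior
    rows, cols = np.nonzero(mask)
    bbox_area = (rows.max() - rows.min() + 1) * (cols.max() - cols.min() + 1)
    area = int(mask.sum())
    return {
        "area": area,
        "perimeter": int(boundary.sum()),
        "extent": area / bbox_area,  # fraction of bounding box covered
        "centroid": (float(rows.mean()), float(cols.mean())),
        "mean_intensity": float(values.mean()),
        "std_intensity": float(values.std()),
    }

# Toy 3x3 nucleus with a darker rim and a brighter center.
toy_mask = np.zeros((5, 5), dtype=bool)
toy_mask[1:4, 1:4] = True
toy_gray = np.full((5, 5), 10.0)
toy_gray[2, 2] = 100.0
features = nucleus_features(toy_gray, toy_mask)
```

A feature vector of this kind, computed per nucleus, is what the classifiers discussed in this section consume.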

In some frameworks, the computed features, like intensity and texture features, are explicitly used for segmentation of nuclei with K-means clustering [56], [57]. To address the problem of heterogeneity in CN, Veillard et al. [67] used intensity and textural features with a support vector machine (SVM) classifier for the creation of a new image modality to segment CN. Recently, Vink et al. [69] constructed a large set of features and modified AdaBoost to create two detectors that solved the problem of variations in nuclei segmentation. The first detector is formulated with intensity features; the second detector is constructed using Haar-like features.

In addition to the morphological features computed from cytological regions, Huang and Lai [24] extracted intensity and cooccurrence (CO) features. They extracted a total of 14 features (intensity, morphological, and texture features) from segmented nuclei in biopsy images, which comprise both local and global characteristics so that benignancy and different degrees of malignancy can be distinguished effectively. An SVM-based decision graph classifier with feature subset selection at each decision node is compared with k-nearest neighbor and a simple SVM; the classification accuracy improved from 92.88% to 94.54% with the SVM-based decision graph classifier.

Intensity and morphological features are extensively used for nuclei classification as epithelial and CN in [17], [19], and [49]. An exhaustive set of features including morphological and texture features is explored to determine the optimal features for nuclei classification [109]. The feature selection results demonstrated that Zernike moments, Daubechies wavelets, and Gabor wavelets are the most important features for nuclei classification in microscopy images. Recently, Irshad et al. [89], [98], [110] used intensity, morphology, CO, and run-length (RL) features in selective color channels from different color models with decision tree and SVM classifiers for mitosis detection on the MITOS dataset of breast cancer histopathology images and ranked second with a 72% F-score in the ICPR 2012 contest [96]. Similarly, Malon et al. [72] computed intensity, texture, and morphological features and used these features with an SVM for the classification of segmented candidate regions into mitotic and nonmitotic regions. This method reported a 66% F-score during the ICPR 2012 contest [96].

According to Al-Kadi [10], the combination of several texture measures instead of using just one might improve the overall accuracy. Different texture measures tend to extract different features, each capturing alternative characteristics of the examined structure. Four different texture features were computed; two of them are model-based: Gaussian Markov random field (GMRF) and fractal dimension (FD); the other two are statistically based: CO and RL features. Using the features retained after excluding highly correlated ones, a Bayesian classifier was trained for meningioma subtype classification. The variation of the texture measures was studied as the number of nuclei increased; the GMRF was nearly uniform, while the RL and FD performed better in the high frequencies. The response of the texture measures to additive texture distortion noise was also studied while varying nuclei shape densities. The GMRF was the least affected, yet the RL and FD performed better in high and low shape frequency, respectively. The combination of GMRF and RL improved the overall accuracy up to 92.50%, with none of the classified meningioma subtypes achieving below 90%.
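As an example of a statistically based texture measure, the sketch below builds a normalized gray-level cooccurrence matrix for one pixel offset and derives three commonly used descriptors from it. It is a generic illustration, not the feature set of [10]: the single non-symmetric offset, the choice of contrast, energy, and homogeneity, and the function names are assumptions.

```python
import numpy as np

def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Normalized, non-symmetric cooccurrence matrix for one offset.

    image: 2-D integer array with values in [0, levels);
    offset: (row step, col step) from the reference pixel to its neighbor.
    """
    image = np.asarray(image)
    dr, dc = offset
    h, w = image.shape
    # Restrict to reference pixels whose neighbor lies inside the image.
    r0, r1 = max(0, -dr), min(h, h - dr)
    c0, c1 = max(0, -dc), min(w, w - dc)
    src = image[r0:r1, c0:c1]
    dst = image[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    return counts / counts.sum()

def co_features(p):
    """Contrast, energy, and homogeneity of a normalized matrix p."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Small 4-level toy patch; horizontal neighbor at distance 1.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = cooccurrence_matrix(patch, levels=4, offset=(0, 1))
texture = co_features(p)
```

In practice such descriptors are usually computed for several offsets and directions and then averaged or concatenated, which is one way highly correlated features arise and why a selection step is needed before classification.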

By observing the cancer detection procedure adopted by pathologists, Nguyen et al. [27] developed a novel approach for cancer detection in prostate using cytological (nuclear) and textural features. Prominent nucleoli (a cytological feature) inside the nuclei region are used to classify nuclei as cancerous or not. In addition, prostate cancer is detected using cytological, intensity, morphological, and textural features, achieving 78% TPR on a dataset including six WSI for training and 11 for testing.

SECTION V

Over the last decade, a significant number of articles have been published in the field of histopathology, focusing on nuclei segmentation and classification in different image modalities. Still, there are some open research areas where little study has been done. These open research areas have unique challenges, which should be covered in future research. One of these challenges is the lack of unified benchmarks. Studies cited in this review have been performed using their own private datasets. Moreover, it is not straightforward to evaluate and numerically compare different studies solely based on their reported results since they use different datasets, various evaluation methods, and multiple performance metrics. For numerical comparison of the studies, it is necessary to build benchmark datasets. These datasets should be medically validated, comprise samples coming from a large number of patients, and be annotated by different pathologists to accommodate subjective variations in annotation. Such an effort would make possible the numerical comparison of the results obtained by different studies and the identification of distinguishing features. To the best of our knowledge, only a few benchmark datasets exist: UCSB Bio-Segmentation [111], the MITOS mitosis detection benchmark [71], and a recent similar initiative, AMIDA [112].

The UCSB Bio-Segmentation Benchmark dataset consists of 2-D/3-D images and time-lapse sequences that can be used for evaluating the performance of novel state-of-the-art computer vision methods. The data cover the subcellular, cellular, and tissue levels. Tasks include segmentation, classification, and tracking.

The MITOS benchmark has been set up to provide a database of mitoses freely available to the research community. Mitotic count is an important parameter in breast cancer grading as it gives an evaluation of the aggressiveness of the tumor. Detection of mitoses is a very challenging task, since mitoses are small objects with a large variety of shape configurations; however, it has not been addressed well in the literature, mainly because of the lack of available data. The MITOS benchmark has been set up as an international contest of mitosis detection in the framework of the ICPR 2012 conference. In 2013, the AMIDA benchmark repeated the same type of mitosis detection challenge as MITOS did in 2012.

Most of these benchmarks highlighted the fact that, despite the promising results, there is still progress to be made to reach clinically acceptable results. For instance, the overall best results on mitosis detection presented during the recent MITOS and AMIDA contests achieved an F-score of 78.21% for MITOS [71] and 61.1% for AMIDA [112], which would not be considered accurate detection in medical terms.
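For reference, the F-score combines precision (PPV) and recall (TPR) as their harmonic mean. The sketch below computes the three quantities from detection counts; the counts in the example are hypothetical and are not taken from the MITOS or AMIDA contests, and the function name `detection_scores` is an assumption.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f_score) from detection counts.

    tp: detections matched to a ground-truth object
    fp: detections with no matching ground-truth object
    fn: ground-truth objects that were missed
    Ratios with a zero denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Hypothetical counts for a mitosis detector on a test set.
precision, recall, f_score = detection_scores(tp=70, fp=30, fn=30)
```

Because the harmonic mean is dominated by the smaller of the two terms, a detector cannot reach a high F-score by trading many false positives for a few additional true detections.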

The issue of inter- and intrapathologist disagreements is also to be taken into account. Fuchs and Buhmann [95] reported 42% disagreement between five pathologists on nuclei classification as normal or atypical. They also reported an intrapathologist error of 21.2%. A conclusion of this study is that self-assessment is not a reliable validation method. A similar study by Malon et al. [113] reported a moderate agreement between three pathologists for identifying MN on H&E-stained breast cancer slides. Although these seemingly large figures must be interpreted in the specific context of each study, they show that validation by medical experts is not a straightforward issue.
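Agreement between two annotators can be quantified in several ways. The sketch below computes the raw disagreement rate and Cohen's kappa, a chance-corrected agreement statistic. Whether [95] or [113] used these exact statistics is not stated here, so their use is an assumption, and the labels in the example are hypothetical.

```python
import numpy as np

def disagreement_rate(a, b):
    """Fraction of items on which two annotators assign different labels."""
    a, b = np.asarray(a), np.asarray(b)
    return float(np.mean(a != b))

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance from the label frequencies.
    """
    a, b = np.asarray(a), np.asarray(b)
    labels = np.unique(np.concatenate([a, b]))
    p_o = np.mean(a == b)
    p_e = sum(np.mean(a == k) * np.mean(b == k) for k in labels)
    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return float((p_o - p_e) / (1.0 - p_e))

# Hypothetical labels (1 = atypical, 0 = normal) from two pathologists.
rater_a = [1, 1, 0, 0]
rater_b = [1, 0, 0, 0]
rate = disagreement_rate(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
```

The same functions applied to two annotation sessions by a single pathologist give an intrapathologist measure. With more than two annotators, pairwise values can be averaged, or a multi-rater statistic can be used instead.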

It is also important to address the issue of robustness to varying clinical/technical conditions including: 1) different scanners used for image acquisition, 2) different staining characteristics, 3) different lighting conditions, and 4) magnification.

Segmentation methods like thresholding, region growing, and watershed can locate the nuclei region but problems arise when they try to segment the touching and overlapping nuclei. They employ only local intensity information without any prior knowledge about the object to be segmented and produce inaccurate nuclei boundaries.

Dealing with overlapping and clustered nuclei is still a major challenge in the field of nuclei segmentation. While different methods have been developed with various levels of success in the literature for the problem of overlapping and clustered nuclei, the problem has not yet been completely solved. A variety of schemes taking into account concavity point detection [28], [51], [57], [60], [61], [68], distance transform [11], [26], [54], marker-controlled watershed [9], [19], [24], [25], [59], adaptive ACM with shape and curvature information [63]–[65], [67], GMM and EM [55], and graphs [51], [58], [61] have been investigated to separate overlapping and clustered/touching nuclei. These methods give good results for nuclei that are slightly touching or overlapping each other, but they are not suitable for specimens containing larger numbers of nuclei with extensive overlapping and touching. These methods suffer from dependencies inducing instability. For instance, the computation of curvature is highly dependent on the concavity point detection algorithm, region growing tends to rely on the shape and size of nuclei, marker-controlled watershed needs true nuclei markers, and ellipse-fitting techniques are unable to accommodate the shape of most nuclei. Most of these methods also require prior knowledge. In spite of the availability of a few methods, like clustering, GMM and EM, and a new image modality [67], that are able to deal with heterogeneity, accurate segmentation of touching or overlapping nuclei is still an open research area.

To the best of our knowledge, only a few supervised machine-learning techniques, like Bayesian [18], [55], SVM [67], and AdaBoost [69], are used for nuclei segmentation. The basic philosophy of the machine-learning approach is that a human provides examples of the desired segmentation and leaves the optimization and parameter tuning tasks to the learning algorithm. The two main avenues to be explored in terms of supervised machine-learning algorithms are the use of more domain-specific features and the limitation of overfitting issues.
