Fuzzy C-Means Clustering with Histogram based Cluster Selection for Skin Lesion Segmentation using Non-Dermoscopic Images

Abstract
Purpose: Pre-screening of skin lesions for malignancy is in high demand because melanoma is a life-threatening skin cancer caused by unrepaired DNA damage. This paper proposes lesion segmentation based on Fuzzy C-Means (FCM) clustering using non-dermoscopic images.
Design/methodology/approach: The proposed methodology performs automatic cluster selection for FCM using a property of the image histogram. The system uses the local maxima of the histogram together with the Euclidean distance between them to detect the binomial distribution property of the histogram and thereby separate melanoma from normal skin. Because the Value channel of the HSV color image gives a better and more distinct histogram distribution, as measured by entropy, it is used for segmentation.
Findings: The proposed system segments the lesion region from normal skin automatically, with a segmentation accuracy of 95.69%. A comparative analysis against various segmentation methods has been performed.
Originality/Value: This paper suggests a new approach to skin lesion segmentation based on FCM with automatic cluster selection. Different color channels have also been analyzed using entropy to select the better channel for segmentation. In future, the classification of melanoma from benign naevi can be performed.


I. Introduction
Melanoma is one of the most threatening skin cancers in the human body. It develops mainly from unrepaired DNA damage caused by ultraviolet radiation from sunlight, which triggers mutations that make skin cells multiply rapidly and form malignant tumors. The death rate due to melanoma is increasing more rapidly than that of other cancers except lung cancer. In the United States, 178,560 cases of melanoma were reported in 2018, of which 87,290 were noninvasive and 91,270 were invasive. Melanoma is curable at an early stage, but if it is not caught early it becomes fatal and hard to treat. About 9,320 people died of melanoma in the US (men: 5,990; women: 3,330).
There are two possible ways to lower the death rate from melanoma: identifying and eliminating tumors early, or preventing melanoma from developing. Researchers have developed many systems for these purposes, but their performance is still not satisfactory.
Classification of melanoma from benign melanocytic naevi is not an easy task. Hence, the importance of digital image analysis techniques is increasing [1]. The analysis can be done using two different kinds of images, i.e. dermoscopic (microscopic) and non-dermoscopic (clinical) images. Non-dermoscopic images are captured with conventional consumer-grade cameras and show what is visible to the naked eye [2]. They are preferred over dermoscopic images because they are easily available and accessible, whereas dermoscopic images are captured with a special instrument called a dermatoscope, which is not easily available. In the US, dermoscopic images are used by 50% of dermatologists for cancer examination. Non-dermoscopic images also have some issues, such as non-uniform illumination and low contrast, which make lesion segmentation difficult.

II. Literature Review
Image segmentation is widely used in the medical field for computer-aided detection of different types of cancer, such as breast cancer [3] or skin cancer, which is targeted in this research. Researchers have developed various computerized techniques for skin lesion segmentation, using approaches such as region merging, active contours and thresholding [4] [5]. Friedman et al. [6] proposed a diagnostic technique for melanoma skin cancer based on the criteria of Asymmetry, Border irregularity, Color variation and a Diameter greater than 6 mm (the ABCD acronym). Rosado et al. [7] compared the diagnostic results of computer diagnostic systems with those of trained clinical experts; the computer systems showed higher sensitivity but lower specificity (i.e. more false positives than humans). An unsupervised pigmented skin lesion border detection method based on the statistical region merging algorithm has been proposed; it comprises black frame removal, image smoothing and skin lesion segmentation by region merging [8]. P. G. Cavalcanti et al. [9] used a discriminating 3-channel space obtained by principal component analysis to segment the skin lesion. Sadri et al. proposed a wavelet-network-based system for dermoscopic images, in which a fixed-grid network is formed without training and the network weights are calculated and optimized using the orthogonal least squares algorithm; the experimental analysis used 30 dermoscopic images [10]. An automated lesion segmentation framework consisting of multi-stage illumination correction and texture-based segmentation has been proposed [11] [12]. High-level intuitive features have also been proposed to model the ABCD criteria used for melanoma detection, mainly to overcome the issues associated with low-level features [13]. Smartphone-based melanoma detection has also been proposed, in which a set of features is extracted for better assessment of the ABCD rule [14]. For non-dermoscopic images, the red color channel together with Otsu thresholding has been used to segment the lesion region [15]. Further, principal component analysis has been applied to the RGB color space to discriminate the lesion region [16] [17]. Convolutional neural network based lesion region extraction has also been discussed [18]. Skin lesion segmentation remains a challenging task because of issues such as the scarcity of image databases [19], non-uniform illumination, and system performance that varies between images.
Various approaches, such as supervised machine learning based segmentation [18] and unsupervised approaches (Fuzzy C-means, PCA, etc.), have been introduced to overcome these lesion segmentation issues.
Recently, Fuzzy C-means (FCM) clustering has been used for medical image segmentation. Ali et al. [20] proposed fuzzy c-means clustering combined with mathematical morphology for melanoma segmentation. The aim of this paper is to develop unsupervised skin lesion segmentation for non-dermoscopic images. One problem with FCM is that the number of clusters must be predefined, so an automatic segmentation system based on fuzzy c-means clustering requires an automatic way of selecting the number of clusters. In this paper, we develop an automatic skin lesion segmentation system using fuzzy c-means clustering together with a histogram property that selects the number of clusters automatically. The contributions of the paper are:
1. The number of clusters for FCM is selected automatically using the histogram property. The histogram is analysed to detect the distribution model (i.e. binomial or normal). First, the two most prominent local maxima of the pixel counts are located and the Euclidean distance between them is computed. If this distance is greater than a threshold value, the histogram shows a normal distribution. The threshold value is set by analysing the histograms of three different cases of skin, i.e. normal skin, melanoma skin and benign naevus skin.
2. The entropy of the different color channels (Hue, Saturation, Value) is evaluated to choose the better color channel for fuzzy c-means clustering.
3. Fuzzy c-means is applied to the value channel image to segment the skin lesion. Morphological filtering is then used to remove unwanted artefacts that have been segmented as the lesion region.
4. A comparative analysis is performed both qualitatively and quantitatively.

III. Proposed Method
The complete architecture of melanoma segmentation consists of five stages, i.e. color space conversion, channel image extraction, number of cluster selection, contrast enhancement and segmentation, as shown in Fig. 1.

A. Image Database
For the experimental analysis, the image dataset of the Department of Dermatology of the University Medical Center Groningen (UMCG) has been used [21] [22]. The images were captured using a Nikon D3 or Nikon D1x body and a Nikkor 2.8/105 mm micro lens maintaining a distance of 33 cm between the lens and the lesion. A total of 170 images (melanoma: 70, naevus: 100) have been used for the experimental purpose. Some of the sample images of the dataset of UMCG are shown in Fig. 2. The sample contains various melanoma cases.

B. Color Space Conversion
HSV (Hue, Saturation, Value) describes color in familiar terms such as color, vibrancy and brightness, with the Value channel being similar to a grayscale image [23]. It is preferred over the RGB color model because it represents color in a way similar to human perception. The Hue (H), Saturation (S) and Value (V) channels are derived from the R, G and B components of each pixel.
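As an illustration of the conversion, the following minimal sketch uses the standard RGB-to-HSV mapping from Python's `colorsys` module. It is not the authors' MATLAB code; the helper names `rgb_pixel_to_hsv` and `value_channel`, and the representation of an image as nested lists of 8-bit (r, g, b) tuples, are assumptions made for this example.

```python
import colorsys


def rgb_pixel_to_hsv(r, g, b):
    """Convert one 8-bit RGB pixel to (H, S, V), each scaled to [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def value_channel(rgb_image):
    """Return the V channel (rescaled to 0..255) of an image.

    rgb_image is assumed to be a list of rows, each row a list of
    (r, g, b) tuples with 8-bit components.
    """
    return [[round(rgb_pixel_to_hsv(*px)[2] * 255) for px in row]
            for row in rgb_image]


if __name__ == "__main__":
    image = [[(255, 0, 0), (10, 20, 30)]]
    print(value_channel(image))
```

In this mapping the Value of a pixel is simply the largest of its three RGB components, which is why the Value channel resembles a grayscale image.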

C. Channel Image Extraction
One of the major issues in lesion segmentation is the poor contrast between normal skin and the lesion region. It is therefore necessary to examine the contrast in the different color channels and select the best one. Here, the color channel for segmentation is selected by entropy [23] [25]. Entropy measures the information content of an image; higher entropy indicates the presence of more information. The entropy results for the color channels (i.e. hue, saturation and value) of some sample images are listed in Table I. Entropy is defined in eq. (4):

$$E = -\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} P_{ij}\,\log P_{ij} \qquad (4)$$

where $P_{ij}$ represents the probability density function of the image at gray level (i, j) and M represents the total number of gray levels.
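The channel comparison can be sketched as follows. This example computes the first-order Shannon entropy of a channel's gray-level histogram, which is a simplifying assumption: the text defines the probability over gray-level pairs (i, j), and the logarithm base (base 2 here) is also assumed. The function name `channel_entropy` is hypothetical.

```python
import math
from collections import Counter


def channel_entropy(channel):
    """Shannon entropy, in bits, of the gray-level histogram of a channel.

    channel is assumed to be a list of rows of integer gray levels.
    """
    pixels = [p for row in channel for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    # Sum -p * log2(p) over the gray levels that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [[5, 5], [5, 5]]          # one gray level: no information
    two_level = [[0, 0], [255, 255]]  # two equally likely gray levels
    print(channel_entropy(flat), channel_entropy(two_level))
```

Under this sketch, the channel with the largest entropy would be the one passed on to segmentation.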

D. Number of Cluster Selection
In Fuzzy c-means, the data are clustered into a number of clusters that must be given in advance. Here, the number of clusters is selected automatically using the histogram of the color channel image. The algorithm used for selecting the number of clusters is shown below:

Algorithm for Histogram based Cluster Selection:
Step 1: Find the two most prominent local maxima of the histogram
Step 2: Calculate the distance (dist) between the two prominent local maxima
Step 3: if dist <= threshold
            Out = Melanoma Skin
        else
            Out = Normal Skin
The threshold value is selected by calculating the Euclidean distance between the two prominent local maxima of the histogram of the value channel image. Experimental analysis has been conducted to select the threshold that best separates normal skin from melanoma skin. From this analysis, the threshold value is set to Th = 5671 (no. of pixels). If the Euclidean distance between the two prominent local maxima is greater than Th, the image is considered a normal skin image. If it is less than Th, the non-dermoscopic image contains a melanoma lesion.
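The three steps can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: "prominent" is taken to mean the two local maxima with the largest pixel counts, and the Euclidean distance is measured between the peak points (gray level, pixel count); both readings are assumptions. The names `two_prominent_peaks` and `classify_histogram` are hypothetical, and only the default threshold of 5671 comes from the text.

```python
import math


def two_prominent_peaks(hist):
    """Return the bin indices of the two local maxima with the largest counts."""
    peaks = [i for i in range(1, len(hist) - 1)
             if hist[i] > hist[i - 1] and hist[i] >= hist[i + 1]]
    if len(peaks) < 2:
        raise ValueError("histogram has fewer than two local maxima")
    peaks.sort(key=lambda i: hist[i], reverse=True)
    return peaks[0], peaks[1]


def classify_histogram(hist, threshold=5671):
    """Label an image from its histogram; returns (label, distance)."""
    a, b = two_prominent_peaks(hist)
    # Assumed distance: between the points (bin, count) of the two peaks.
    dist = math.hypot(a - b, hist[a] - hist[b])
    label = "normal skin" if dist > threshold else "lesion"
    return label, dist


if __name__ == "__main__":
    toy_hist = [0, 10, 0, 0, 7, 0]
    print(classify_histogram(toy_hist, threshold=4))
```

A real value-channel histogram would usually be smoothed first so that noise does not create spurious local maxima; that step is omitted here.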

E. Contrast Enhancement
Contrast enhancement maps the intensities of an image to a specific range, which helps to improve the quality of the image. The contrast of an image can be decreased or increased depending on the range of the data [22] [24].
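The text does not name the exact enhancement technique, so the sketch below assumes a simple linear (min-max) stretch of the channel to a target range. The function name `stretch_contrast` and the default range 0..255 are illustrative choices.

```python
def stretch_contrast(channel, lo=0, hi=255):
    """Linearly map the intensities of a channel onto the range [lo, hi].

    channel is assumed to be a list of rows of numeric gray levels.
    """
    pixels = [p for row in channel for p in row]
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        # A constant image has no contrast to stretch.
        return [[lo for _ in row] for row in channel]
    scale = (hi - lo) / (p_max - p_min)
    return [[round(lo + (p - p_min) * scale) for p in row] for row in channel]


if __name__ == "__main__":
    print(stretch_contrast([[50, 100], [150, 200]]))
```

Choosing a target range narrower than the input range would reduce the contrast instead of increasing it.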

F. Fuzzy C-Means Clustering
Unsupervised Fuzzy C-Means (FCM) clustering has been used extensively in medical imaging systems, mainly for segmentation [26] [27] [28]. Here, FCM is used to segment melanoma skin from non-dermoscopic images. The FCM algorithm consists of two activities. In the first, the number of clusters is provided; here it is selected automatically based on the histogram analysis. In the second, data clustering is performed by iteratively searching for a set of fuzzy clusters. A membership value $v_{ij}$ indicates the degree of membership of the ith data point in the jth cluster. The main aim of FCM is to minimize the cost function J:

$$J = \sum_{i=1}^{m}\sum_{j=1}^{C} v_{ij}^{\,n}\, d_{ij}^{2}, \qquad n \in [1, \infty)$$

subject to the constraints

$$\sum_{j=1}^{C} v_{ij} = 1, \qquad 0 \le v_{ij} \le 1$$

where m is the number of data points, $d_{ij}$ is the Euclidean distance between the ith data point and the jth cluster center, and C is the number of clusters (C ≤ m). The memberships $v_{ij}$ and the cluster centroids are updated until the minimum of J is reached. The FCM algorithm is shown below:
Step 1: Select the cluster centers randomly
Step 2: Calculate the fuzzy membership $v_{ij}$
Step 3: Compute the fuzzy centers
Step 4: Repeat steps 2 and 3 until the minimum of the objective function is obtained
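The four steps can be illustrated with a minimal FCM sketch for one-dimensional data such as value-channel intensities. It uses the standard FCM membership and center update rules with a fuzzifier greater than 1; the function name `fuzzy_c_means`, the optional `init_centers` argument (added so that runs can be made reproducible), the stopping tolerance and the small constant guarding against division by zero are all assumptions of this example rather than details given in the text.

```python
import random


def fuzzy_c_means(data, n_clusters=2, fuzzifier=2.0, max_iter=100,
                  tol=1e-6, init_centers=None, seed=0):
    """Cluster 1-D data with FCM; returns (centers, memberships)."""
    rng = random.Random(seed)
    # Step 1: random initial centers, unless the caller supplies them.
    if init_centers is not None:
        centers = list(init_centers)
    else:
        centers = rng.sample(sorted(set(data)), n_clusters)
    power = 2.0 / (fuzzifier - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Step 2: membership of each point in each cluster.
        memberships = []
        for x in data:
            dists = [abs(x - c) + 1e-12 for c in centers]
            memberships.append(
                [1.0 / sum((d_j / d_k) ** power for d_k in dists)
                 for d_j in dists])
        # Step 3: centers as membership-weighted means of the data.
        new_centers = []
        for j in range(n_clusters):
            weights = [u[j] ** fuzzifier for u in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, data)) / sum(weights))
        # Step 4: repeat until the centers stop moving.
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


if __name__ == "__main__":
    intensities = [10, 12, 11, 200, 205, 198]
    centers, u = fuzzy_c_means(intensities, init_centers=[0.0, 255.0])
    print([round(c, 1) for c in centers])
```

A hard segmentation can then be obtained by assigning each pixel to the cluster in which its membership is largest.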

G. Morphological Filtering
Morphological filtering is a non-linear operation related to the shape (morphology) of features in an image. It helps to remove unwanted portions, or to recover missed portions, of the regions segmented as foreground or background in a binary image. Applied to a binary image, a morphological operation creates a new binary image in which a pixel has a non-zero value only where the structuring-element test succeeds. Some of the basic morphological operations are erosion, dilation, opening and closing. Here, morphological opening is used to recover the lesion region segmented as background [23].
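As a small illustration of morphological opening (erosion followed by dilation), the sketch below works on a binary mask stored as nested lists of 0 and 1. The 3x3 square structuring element and the border handling (pixels outside the image are ignored) are assumptions; the text does not specify either.

```python
def _apply(mask, test):
    """Apply a 3x3 neighborhood test (all or any) to every pixel of a mask."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[rr][cc]
                      for rr in range(r - 1, r + 2)
                      for cc in range(c - 1, c + 2)
                      if 0 <= rr < rows and 0 <= cc < cols]
            out[r][c] = 1 if test(window) else 0
    return out


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is foreground."""
    return _apply(mask, all)


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is foreground."""
    return _apply(mask, any)


def opening(mask):
    """Erosion followed by dilation: removes small isolated foreground specks."""
    return dilate(erode(mask))


if __name__ == "__main__":
    mask = [[0, 0, 0, 0, 0, 0],
            [0, 1, 1, 1, 0, 0],
            [0, 1, 1, 1, 0, 0],
            [0, 1, 1, 1, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 1]]
    for row in opening(mask):
        print(row)
```

In this toy mask the 3x3 block survives the opening, while the isolated pixel in the corner is removed.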

IV. Experimental Results and Discussion
Lesion segmentation has been evaluated both quantitatively and qualitatively. The analysis has been performed using the publicly available UMCG dataset, which consists of melanoma (70 images) and benign naevus (100 images) cases. The proposed system has been implemented in MATLAB R2017b on a computer with an Intel Core i3 processor and 4 GB of RAM. To reduce the segmentation runtime, each image has been resized to 250x250 pixels. The proposed system takes around 1.7 s to segment the lesion region from a 250x250-pixel image.
The proposed system consists of color space conversion, channel image extraction, number of cluster selection, contrast enhancement, unsupervised clustering and morphological filtering. The RGB image is converted into an HSV image because the HSV color space represents color in a way similar to human perception. From the HSV color space, the most suitable color channel is chosen based on the entropy of the individual channels. The Value channel has higher entropy than the other color channels, so it is used for lesion segmentation. Further, the number of clusters is selected based on the histogram property, as shown in Fig. 3.
From the analysis of the histograms of the various skin conditions, it has been observed that the distance between the two most prominent local maxima of the histogram is smaller for skin with a lesion than for normal skin. The threshold value is used to decide whether the histogram is binomial or not, and it has been set based on the analysis of the histograms of various skin conditions. Moreover, the non-uniform illumination and low contrast of the non-dermoscopic image are corrected using a contrast enhancement technique. The preprocessed image is then segmented using FCM with the automatically selected number of clusters to separate the lesion from normal skin. The performance of the proposed method is evaluated with metrics such as sensitivity, specificity and accuracy. Segmentation is treated as a classification problem in which the pixels of the image are classified as lesion or normal skin pixels.
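The pixel-wise evaluation can be sketched as follows, using the usual definitions of sensitivity (true positive rate), specificity (true negative rate) and accuracy with lesion pixels as the positive class. The function name `segmentation_metrics` and the nested-list mask representation are assumptions made for this example.

```python
def segmentation_metrics(predicted, ground_truth):
    """Return (sensitivity, specificity, accuracy) for two binary masks.

    Both masks are lists of rows; non-zero means lesion, zero means normal skin.
    """
    tp = tn = fp = fn = 0
    for p_row, g_row in zip(predicted, ground_truth):
        for p, g in zip(p_row, g_row):
            if p and g:
                tp += 1      # lesion pixel correctly segmented
            elif not p and not g:
                tn += 1      # normal skin correctly left out
            elif p and not g:
                fp += 1      # normal skin wrongly marked as lesion
            else:
                fn += 1      # lesion pixel missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    pred = [[1, 1, 0, 0]]
    truth = [[1, 0, 0, 0]]
    print(segmentation_metrics(pred, truth))
```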
The ground truth of the segmented lesion region has been created as shown in the left column of Fig. 4. Fig. 4 shows a qualitative comparison of the various segmentation results. The segmentation result of the proposed method has been compared with the Otsu_R method [15]

V. Conclusion and Future Work
Computer-assisted skin lesion segmentation using non-dermoscopic images is of great importance. Due to factors such as non-uniform illumination and low contrast in non-dermoscopic images, skin lesion segmentation is a challenging task. In the proposed method, the information contained in the different color channel images is analyzed using entropy. From this analysis, the value channel image provides the best results, so it is further preprocessed using contrast enhancement to segment the lesion region. Pixel clustering is then done using FCM. For FCM, the number of clusters is selected using the local maxima and Euclidean distance property to detect the histogram distribution. Experimental results show that the proposed segmentation method outperforms existing methods, providing an accuracy of 95.69% and a sensitivity of 90.02%. In future, lesion classification based on various supervised classifiers can be performed, and the importance of feature extraction can also be analyzed. The proposed system is mainly designed for non-dermoscopic images. As the two image types (i.e. non-dermoscopic and dermoscopic) have different properties, the proposed system may not be applicable to dermoscopic images in its present form. With certain customization, the system may also be used for lesion segmentation in dermoscopic images.

[Fig. 4: columns labelled Ground Truth, Otsu_R [15] and Proposed method.]