Deep Learning-Based System for Automatic Melanoma Detection

Melanoma is the deadliest form of skin cancer, yet distinguishing melanoma lesions from non-melanoma lesions remains a challenging task. Many Computer Aided Diagnosis and Detection Systems have been developed for this task, but their performance has been limited by the complex visual characteristics of skin lesion images, which consist of inhomogeneous features and fuzzy boundaries. In this paper, we propose a deep learning-based method that overcomes these limitations for automatic melanoma lesion detection and segmentation. We propose an enhanced encoder-decoder network whose encoder and decoder sub-networks are connected through a series of skip pathways, which bring the semantic level of the encoder feature maps closer to that of the decoder feature maps, for efficient learning and feature extraction. The system employs a multi-stage and multi-scale approach and utilizes a softmax classifier for pixel-wise classification of melanoma lesions. We devise a new method called Lesion-classifier that classifies skin lesions into melanoma and non-melanoma based on the results of the pixel-wise classification. Our experiments on two well-established public benchmark skin lesion datasets, International Symposium on Biomedical Imaging (ISBI) 2017 and Hospital Pedro Hispano (PH2), demonstrate that our method is more effective than some state-of-the-art methods. We achieved an accuracy and Dice coefficient of 95% and 92% on the ISIC 2017 dataset and an accuracy and Dice coefficient of 95% and 93% on the PH2 dataset.


I. INTRODUCTION
Melanoma is a malignant tumour which develops from the pigment-containing cells known as melanocytes [1]. It has the most rapidly increasing mortality rate among skin cancers. The American Cancer Society [2] estimates that about 7,230 people will die of melanoma and about 96,480 new melanomas will be diagnosed in the United States in 2019. According to these statistics [2], the lifetime risk of getting melanoma is about 2.6% for whites, 0.1% for blacks, and 0.6% for Hispanics. According to Garbe et al. [3], cutaneous melanoma is the most dangerous form of skin tumor and accounts for 90% of the deaths associated with cutaneous tumors. They also report an incidence rate of around 25 new melanoma cases per 100,000 inhabitants in Europe, around 30 per 100,000 in the United States of America (USA), and around 60 per 100,000 in Australia, where the highest incidence rate is observed. The major and most important exogenous factor causing melanoma is exposure to UV irradiation through sun exposure [4]-[6]. Melanoma can, however, be cured with prompt excision [7], [8] if it is detected and diagnosed early. Identification of melanoma from skin lesions using methods such as visual inspection, clinical screening, dermoscopic analysis, biopsy and histopathological examination can be inaccurate and laborious even for experienced dermatologists [9]-[11]. This is due to the complex visual characteristics of skin lesions, such as multiple sizes, multiple shapes, fuzzy boundaries, low contrast with the surrounding skin, and the presence of noise such as skin hair, oils, air and bubbles. The development of an efficient Computer Aided Diagnosis (CAD) system for the detection and diagnosis of melanoma is thus required. This will improve the diagnosis rate of melanoma and enable early detection, which can facilitate treatment and reduce the mortality rate of the disease.
These methods, termed hand-crafted, are limited by the presence of noise on the skin lesion and by the low contrast and irregular borders of skin lesions [25]. These methods [26] lack deep supervision, which leads to a loss of detailed information during training, and they therefore have difficulty analyzing the complex visual characteristics of skin lesions. Intelligent systems possess features such as adaptability, fault tolerance and optimal performance for better analysis of skin lesions [25]. Developing an efficient system will reduce the cost and time required for dermatologists and doctors to diagnose all patients for melanoma [27]. Codella et al. [28] proposed a system that combines recent developments in deep learning with established machine learning approaches to create ensembles of methods that are capable of segmenting skin lesions for melanoma detection.
The associate editor coordinating the review of this manuscript and approving it for publication was Naveed Akhtar.
Even though those methods have achieved great success, several challenges remain in the skin cancer segmentation task due to the complex nature of skin lesion images. Skin lesion images are characterized by fuzzy borders, low contrast between lesions and the background, variability in size and resolution, and the possible presence of noise and artifacts.
In this paper, we propose an intelligent system based on deep learning techniques to detect and distinguish melanoma from non-melanoma lesions using a single DCNN for all the processes. Several challenges must be addressed. Firstly, dermoscopy images may include hair, blood vessels, and other factors that interfere with segmentation. Moreover, the low contrast between the lesion area and the surrounding skin causes blurry boundaries, which makes it difficult to segment the lesion accurately. Finally, melanoma usually has different sizes, shapes, and colors depending on the skin condition, which can hamper high segmentation accuracy. To tackle these challenges, we propose a novel CNN-based approach with an enhanced deeply supervised encoder-decoder network to extract strong and robust features from skin lesion images. This network is able to extract complex features from the lesion images through its multi-stage approach, in which the encoder stage of the network learns the general appearance, including the possible influence of hair on the lesion region, and localization information, while the decoder stage learns the characteristics of the lesion boundaries. After extracting the features, a new method called Lesion-classifier is devised to classify skin lesions into melanoma and non-melanoma in a pixel-wise manner. Our network is distinguished from the existing methods in the following three aspects: (1) we connect the encoder and decoder sub-networks through a series of skip pathways, as shown in Figure 3, which brings the semantic level of the encoder feature maps closer to that of the decoder feature maps and enhances the feature learning and feature extraction ability of the network; (2) we design a multi-scale system at each skip pathway of the network to handle the various sizes of skin lesion images; (3) we devise a computationally efficient method called Lesion-classifier that classifies skin lesions into melanoma and non-melanoma in a pixel-wise manner. The key innovation behind our proposed algorithm is that the melanoma detection task is structured as a point object detection task, where the region of interest (ROI) occupies only a tiny fraction of the total number of pixels of the skin lesion image. Our aim is to develop an efficient system that can extract the most suitable features from a limited training image dataset and detect melanoma with reduced computing resources, so that it can meet the requirements of real-time clinical practice.
Our last contribution is experimental. We present a more detailed study, with better visualizations of the results and outputs. The proposed system obtains state-of-the-art results on the ISIC 2017 and PH2 datasets and shows better results than the existing techniques on the major performance metrics for recognition and localization. The proposed methodology achieves encouraging results, with 96% accuracy. Finally, we experimentally evaluate the training computation time per epoch and the test time per single dermoscopy image for the proposed method, and compare the results with the performance of some popular deep learning methods such as SegNet, UNet and FCN on the ISBI 2017 dataset.

A. OUR APPROACH
We propose a new fully automated method for detecting melanoma in dermoscopic images. Experiments show that the proposed method improves on the state-of-the-art and is specifically optimized for melanoma detection. The system employs an end-to-end, pixel-by-pixel supervised learning approach using Deep Convolutional Networks combined with a softmax classifier and a Dice loss function. In addition, we devise a method to classify skin lesions into melanoma and non-melanoma based on the pixel-wise result from the softmax module. This model combines the challenging tasks of segmentation, feature extraction and classification in a manner that requires no extra computing cost. Our research aims at improving the rate and accuracy of identifying and classifying skin lesions. The following contributions are made to the present state-of-the-art:

1) DEEP CONVOLUTIONAL ENCODER-DECODER ARCHITECTURE
We propose a Deep Convolutional Architecture whose encoder and decoder are interconnected through a series of skip pathways, as shown in Figure 3. This brings the semantic level of the encoder feature maps closer to that of the decoder feature maps and enhances the feature learning and feature extraction ability of the network.

2) MULTISTAGE AND MULTI-SCALE APPROACH
The Encoder-Decoder network is kept at a moderate size and enhanced with a multi-stage and multi-scale approach to improve the learning of complex features and to handle the various sizes of skin lesion images.

3) LESION-CLASSIFIER
We devise a new, computationally efficient predictive method called Lesion-classifier that uses the output of the softmax modules to classify skin lesions into melanoma and non-melanoma in a pixel-wise manner.
Our method is particularly effective for analysing challenging skin lesions, which usually have fuzzy boundaries and heterogeneous textures, for melanoma detection.
The remainder of this work is organized as follows: Section II discusses the related works, and Section III describes the materials and methods. Section IV discusses the experiments and results. The paper is concluded in Section V.

II. RELATED WORKS
An efficient automatic melanoma detection system requires a reliable feature extraction mechanism. This process is of vital importance when detecting melanoma using CAD systems. Over the past decade, various methods have been proposed to extract and analyse features from skin lesion images. Extracted features include, but are not limited to, colour, texture, wavelet, gray-level co-occurrence matrix (GLCM) and shape features. The performance of any CAD system is highly dependent on the efficiency of the extraction of these features. Many semi-automatic and fully automatic algorithms have been proposed for melanoma detection from skin lesion images. This section describes both the semi-automatic techniques and the automatic deep learning methods for feature extraction and melanoma detection.

A. SEMI-AUTOMATIC TECHNIQUES
Warsi et al. [29] presented a technique based on D-optimality orthogonal matching pursuit (DOOMP) to perform image enhancement, segmentation, and classification on skin lesions using a fixed wavelet grid network (FWGN). The system achieved an accuracy of 91.82%. Barata et al. [30] used a technique to extract both color and texture features for the detection of melanoma and non-melanoma images. The method showed that color features outperform texture features. Schaefer et al. [31] also proposed a method to extract color and texture features from the lesion component of dermoscopic images. The proposed method used a combination of SVM, SMOTE, and an ensemble of classifiers. The proposed ensemble classifier system achieved an accuracy of 93.83%. Waheed et al. [32] extracted both color and texture features, using the GLCM method for texture feature extraction and an SVM classifier for classification. The system obtained an accuracy of 96%. Sivaraj et al. [33] used a Firefly with K-Nearest Neighbor (FKNN) classifier to predict and classify skin cancer, along with threshold-based segmentation and the ABCD feature extraction algorithm. Pennisi et al. [34] presented a method based on Delaunay triangulation, known as ASLM, to extract a binary mask of the lesion. This method combines two parallel processes for the detection of skin and lesion. Warsi et al. designed a method termed multi-direction 3D color-texture feature (CTF) for feature extraction from dermoscopic images. They used a back-propagation multilayer neural network (NN) classifier for the detection and classification of melanoma [29]. The shortcomings of these methods include the need for elaborate image pre-processing steps and careful initialization by a human expert, and they are also too slow for real-time analysis and diagnosis. Unlike approaches based on deep learning algorithms, these techniques do not include prior knowledge of the image characteristics in the algorithm.

B. AUTOMATIC DEEP LEARNING TECHNIQUES
Recently, Convolutional Neural Network (CNN) and deep learning-based approaches have been used for cancer detection. Bi et al. [35] proposed an automatic melanoma detection technique for dermoscopic images using multi-scale lesion-biased representation (MLR) and joint reverse classification (JRC). The JRC model was used for classification, and it provided additional information for melanoma detection. The PH2 public database was used for evaluation and testing of the proposed method. The results gave 92% accuracy. Yıldız [36] designed a deep neural network model named C4Net for melanoma detection. The proposed model classified skin lesions into malignant and benign. Abbes and Sellami [37] proposed a model that used the ABCD rule for feature extraction, Fuzzy C-Means (FCM) to determine the membership degree, and finally a deep neural network classifier for decision making. The model gave 87.5% accuracy. A pre-trained deep learning network and transfer learning were utilized for skin lesion classification by Hosny et al. [38]. Transfer learning was applied to AlexNet by replacing the last layer with a softmax layer to classify the lesions. Finally, a single CNN was trained end-to-end from images using only pixels and disease labels as inputs [39]. This was used for binary classification of keratinocyte carcinomas versus benign seborrheic keratoses, and of malignant melanomas versus benign nevi. The deep learning CNN outperformed the average of the dermatologists at skin cancer classification using photographic and dermoscopic images; however, its performance in a real-world clinical setting is yet to be evaluated [39]. The computation cost of these approaches is a major barrier to clinical application [40]-[42]. Al-Masni et al. [43] proposed a method that learns the full-resolution features of each individual pixel of the input data directly. The system was evaluated using two publicly available databases, the ISBI 2017 Challenge and PH2 datasets. He et al. [44] presented a skin lesion segmentation network using a very deep dense deconvolution network. They employed a combination of deep dense layers and the generic multi-path Deep RefineNet. Esteva et al. [39] developed a CNN architecture using GoogleNet Inception v3 that was pre-trained on approximately 1.28 million images for melanoma detection. Goyal and Yap [45] presented an end-to-end solution using fully convolutional networks (FCNs) for multi-class semantic segmentation. The system automatically segmented the melanoma into keratoses and benign lesions. Ramachandram and Devries [46] proposed a semantic segmentation architecture that utilized atrous convolutions and sub-pixel super-resolution upsampling of predictions. A deep learning framework consisting of two fully convolutional residual networks (FCRN) was developed to simultaneously produce the segmentation result and the coarse classification result of a skin lesion [47]. A lesion index calculation unit (LICU) was then developed to refine the coarse classification results by calculating the distance heat-map.
Most of the systems discussed above employ larger and more complex deep learning architectures. Some of these methods are also too slow and require a huge amount of computing resources for real-time medical analysis and diagnosis. Our proposed system aims at lowering the number of trainable parameters to reduce computational resources and time and to make the system feasible for real-time medical diagnosis. It is able to perform both segmentation and pixel-wise classification of melanoma lesion pixels using a moderate-size deep convolutional network.

III. MATERIALS AND METHODS
A. MATERIALS
In this work, two publicly available dermoscopy datasets were utilized for training, testing and evaluating our proposed method. The first dataset is the ISIC 2017 challenge dataset, which is extracted from the International Skin Imaging Collaboration (ISIC) archive. The dataset contains dermoscopy images of different sizes, with a highest resolution of 1022 × 767 pixels. 2000 dermoscopy images and 600 dermoscopy images were used for training and testing respectively. These images are provided with their respective ground truth labels for supervised training. The second dataset is the PH2 dataset. It contains dermoscopy images of different sizes, with a highest resolution of 765 × 574 pixels. The images were collected from the Dermatology centre of Hospital Pedro Hispano. 200 dermoscopy images and 60 dermoscopy images were used for training and testing respectively. They are also provided with their corresponding ground truth labels, based on manual delineations by clinical experts.

B. OVERVIEW OF THE PROPOSED METHOD
The diagram in Figure 2 shows all the major stages, from image pre-processing to feature extraction, pixel-wise classification and, finally, lesion classification. The major components include the skin lesion image datasets, the Encoder-Decoder Network, the softmax classifier and the Lesion-classifier. The image datasets are first pre-processed before being sent into the Encoder-Decoder Network. The training dermoscopic images, together with the ground truth labels (annotations), are used to train the Deep Convolutional Encoder-Decoder Network. The input is first sent into the encoder sub-network and then to the decoder for feature extraction. An additional module that combines the Dice loss function with the softmax classifier is then used for pixel-wise classification of the images and identification of the region of interest for melanoma. During the testing stage, we also apply the Deep Convolutional Encoder-Decoder Network to the input dermoscopic image dataset. The full architectural diagram of the Encoder-Decoder Network is shown in Figure 3.

C. DATA PRE-PROCESSING AND IMAGE AUGMENTATION
Skin lesion images are often characterized by noise and artefacts such as air, hair and bubbles. They also vary in size, scale and resolution. During the pre-processing stage, noise is suppressed with a Gaussian filter, which in its standard two-dimensional form is

$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

where x and y are the pixel offsets from the kernel centre and σ is the standard deviation of the Gaussian. This smooths the images and prepares them for further processing. The images are also brought to the same scale and resolution via cropping, resizing and resampling. In this work, we use a relatively small image size of 224 × 224, which reasonably reduces the input feature map size for the network. The images are normalized by computing the mean pixel value and the standard deviation for data centring and data normalization. The system applies elastic deformations through random displacements to augment the dataset. Elastic deformation utilizes local distortion and random affine transformations for high-quality output. In addition, simple random rotation is adopted in the augmentation process to improve performance. The system centres the pixel intensities around zero by remapping them linearly to the range [−0.5, 0.5], which provides numerical stability during training.
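As a rough illustration of two of the pre-processing steps described above, the following sketch builds a normalized Gaussian smoothing kernel and remaps pixel intensities linearly to [−0.5, 0.5]. The helper names `gaussian_kernel` and `remap_intensity`, the 3 × 3 kernel size, σ = 1 and the 0 to 255 input range are assumptions made for this example; the paper does not state these values.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    # Dividing by the total makes the weights sum to 1, so smoothing
    # does not change the overall brightness of the image.
    return [[v / total for v in row] for row in kernel]


def remap_intensity(image, lo=0.0, hi=255.0):
    """Linearly map pixel values from [lo, hi] to [-0.5, 0.5]."""
    return [[(p - lo) / (hi - lo) - 0.5 for p in row] for row in image]


# Assumed example values, not taken from the paper.
kernel = gaussian_kernel(3, sigma=1.0)
image = [[0, 128, 255], [64, 192, 32]]
scaled = remap_intensity(image)
```

In practice a library routine would normally replace the hand-written kernel; the sketch only makes the arithmetic explicit.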

D. ROI IDENTIFICATION AND FEATURES EXTRACTION
In the process of detecting melanoma, identification of the region of interest (ROI) and feature extraction are very important tasks. A lesion is characterized by different features, including colour and texture. In this work, an efficient deep learning framework based on a medium-size encoder-decoder network with fewer trainable weights is trained and adapted for feature extraction. The encoding part and the decoding part are each made up of five blocks.
Each block in the encoding part is composed of two 3 × 3 convolutional layers, a ReLU non-linear activation function and one max-pooling layer for feature extraction. The max-pooling module breaks down the input feature maps into pooling sections and computes the maximum of each section. It pools over every pixel within a 2 × 2 area of the feature map and reduces the feature map's size and resolution. This eliminates feature redundancy, minimizes computation time and facilitates the learning process in the network. The convolutional layer computes

$$F_i = W * F_{i-1} + b_i$$

where $F_i$ is the feature map, $F_{i-1}$ is the feature map of the previous layer, W is the filter kernel and $b_i$ is the bias applied to each feature map of each layer. The ReLU activation function is stated as:

$$f(y) = \max(0, y)$$

where y is the resulting feature map. The encoder learns the lesion image pattern via the convolutional layers and the ReLU activation function during an end-to-end, pixel-by-pixel training process. The first convolution layer extracts feature maps from the training dataset. The encoder captures the semantic and contextual information of the lesions by learning the general appearance and localization information of the input image. Our architecture is a deeply supervised encoder-decoder network in which the encoder and decoder sub-networks are connected through a series of skip pathways. This series of skip pathways is composed of convolutional networks and a short skip network. It brings the semantic level of the encoder feature maps closer to that of the feature maps awaiting in the decoder, for efficient and faster processing.
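The encoder block described above (convolution with a bias, ReLU, then 2 × 2 max pooling) can be sketched on a single-channel feature map as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names are hypothetical, the convolution is the unpadded cross-correlation used in most CNN libraries, and the identity-like kernel and bias of −10 are assumed values chosen only to make the result easy to follow.

```python
def conv2d(fmap, kernel, bias=0.0):
    """Unpadded 2D cross-correlation of one feature map with one kernel, plus bias."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(fmap) - kh + 1
    out_w = len(fmap[0]) - kw + 1
    return [
        [sum(kernel[a][b] * fmap[i + a][j + b]
             for a in range(kh) for b in range(kw)) + bias
         for j in range(out_w)]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise max(0, y)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Keep the maximum of every non-overlapping 2 x 2 section."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Assumed toy input: a 6 x 6 map holding the values 0..35.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
# Assumed 3 x 3 kernel that simply picks the centre pixel.
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

conv_out = conv2d(fmap, kernel, bias=-10.0)   # 4 x 4 output
activated = relu(conv_out)                    # negative responses clipped to 0
pooled = max_pool_2x2(activated)              # 2 x 2 output
```

Each pooling step halves the height and width, which is the reduction in feature map size and resolution that the text attributes to the max-pooling module.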
The decoder part also consists of five units, each composed of two convolutional layers and one upsampling layer. In the decoding part, the output of the previous block is upsampled with nearest-neighbour interpolation followed by a 2 × 2 convolutional layer. It is then concatenated with the output from the encoder part at the corresponding level. An increase in the size of a deep learning network automatically increases the computational cost and reduces the feasibility of the system in medical diagnosis, so in this work we limited the size of the encoder-decoder network. The network has, however, been optimized by replacing the usual skip connection between the encoder sub-network and the decoder sub-network with a series of skip pathways, as illustrated in Figure 3. The output of the up-sampling layer is concatenated with the corresponding feature map from the convolutional layer in the encoder part. Feature maps are convolved with the decoder filters in the convolutional layers to produce dense feature maps. The localisation information of the lesion boundaries is learned in the decoder section, which learns the characteristics of the lesion boundaries while recovering spatial information. The decoders restore the feature maps to the original size using the upsampling layers. In the upsampling function, x is the feature map from the encoder and n is the upsampling layer input. The upsampling module is an advanced unpooling technique that reverts the max-pooling operation by using the values and locations of the maxima in the max-pooling layers to restore the feature maps.
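The decoder step of upsampling followed by concatenation with the encoder output can be sketched as below. This is a minimal sketch under stated assumptions: it covers only the nearest-neighbour upsampling variant mentioned in the text (not unpooling with stored maxima locations), feature maps are represented as a list of 2D channels, and the names `upsample_nearest_2x` and `concat_channels` are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Double height and width by repeating every value in a 2 x 2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def concat_channels(decoder_maps, encoder_maps):
    """Stack decoder and encoder feature maps along the channel axis."""
    return decoder_maps + encoder_maps


# Assumed toy values: one 2 x 2 decoder channel and one 4 x 4 encoder channel.
low_res = [[1, 2], [3, 4]]
up = upsample_nearest_2x(low_res)
skip = [[0] * 4 for _ in range(4)]
merged = concat_channels([up], [skip])
```

After upsampling, the decoder channel has the same spatial size as the encoder channel at that level, which is what makes the concatenation possible.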

1) SYSTEM ALGORITHM
Algorithm 1 shows the steps of the proposed model implementation. The algorithm first takes an input image X; feature maps are generated and sent into the encoder-decoder system before pixel-wise classification with the softmax classifier, and the classification result is finally obtained via the Lesion-Classifier. In the encoder unit, the generated feature maps, F_map, are first sent to the convolutional layers as Conv(F_map) and then to the ReLU activation function as Relu(F_ci). The result is then down-sampled with the max-pooling function, Pooling(F_ri). This is done through the for-loop structure until all the images are processed. The output from the encoder is then sent into the decoder. It passes into the for-loop structure and goes through feature upsampling using the upsampling function Upsamp(F_pi). The result is also sent into the ReLU activation function and the convolutional layers.

E. PIXEL-WISE CLASSIFICATION
The pixel-based classification presented by Wu et al. [48] is employed for the pixel-wise classification of skin lesions. The encoder-decoder is trained end-to-end and pixel-to-pixel on skin lesion images, using pixels and disease labels, to produce pixel-wise predictions. The output of the encoder-decoder network, a high-dimensional feature representation, is sent into a trainable softmax classifier. The softmax classifier predicts the class of each pixel as melanoma or non-melanoma. The Dice loss function utilizes the Dice similarity coefficient [49] to measure the overlap between the predicted output and the corresponding ground truth. These are explained by the equations below.

$$P(y = j \mid x) = \frac{e^{x^{T} w_j}}{\sum_{k=1}^{n} e^{x^{T} w_k}}$$

where n represents the number of classes, $x^{T}w$ represents the product of x and w, x is the feature map and w is the kernel operator.

Algorithm 1
    Extract the feature map F_map from the input image
    for i = 0 to M − 1 do
        F_ci = Conv(F_map)
        F_ri = Relu(F_ci)
        F_pi = Pooling(F_ri)
        if i < M − 1 then
            F_p(i+1) = F_pi
        else
            return F_pi
        end if
    end for
    procedure Decoder(F_pi)        (F_pi is the downsampled feature maps)
    for i = M − 1 to 0 do
        F_di = Upsamp(F_pi)
        F_ri = Relu(F_di)
        F_ci = Conv(F_ri)
        if i <= M AND i != 0 then
            F_p(i−1) = F_ci
        else
            return F_ci
        end if
    end for
    Predictedpixels = softmaxclassifier(F_ci)        (F_ci, the output from the decoder, is sent to the softmax classifier function for pixel-wise prediction)
    Pi = cluster(Predictedpixels)        (the predicted pixels are clustered into the segmented output)
    Finalresults = Pi        (final segmented output)
    Display Finalresults
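To make the pixel-wise softmax step concrete, the sketch below turns per-pixel class scores (the products of x and w for each class) into probabilities and picks the most probable class for every pixel. It is an illustrative sketch only: the function names are hypothetical, and the convention that index 0 means non-melanoma and index 1 means melanoma is an assumption of this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)                      # subtracting the max avoids overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_pixels(score_map):
    """Return, for each pixel, the index of the class with the highest probability."""
    labels = []
    for row in score_map:
        out_row = []
        for pixel_scores in row:
            probs = softmax(pixel_scores)
            out_row.append(probs.index(max(probs)))
        labels.append(out_row)
    return labels


# Assumed scores for a 1 x 2 image with n = 2 classes per pixel.
score_map = [[[2.0, 0.5], [0.1, 3.0]]]
labels = classify_pixels(score_map)
```

With n = 2 the softmax reduces to a two-way decision per pixel, which matches the melanoma versus non-melanoma labelling described in the text.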
Applying the Dice loss function in this system results in better performance of the proposed model. This loss function is effective for skin lesion images because of their low contrast: it helps differentiate the background from the lesion itself. The Dice loss function computes the loss for each class separately and then averages the losses [50] to yield a final score. It returns the Dice loss between the predictions made and the training targets, and this difference is used to improve the performance.
In the Dice loss, i indexes the pixels, $Y_t$ denotes the ground truth label and $Y_p$ denotes the predicted image output. In its further expression, Y is the output of the previous layer and T represents the training targets.
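A commonly used soft Dice loss can be sketched as follows. Note the assumptions: the exact formulation used in the paper is not recoverable from the text, so this example uses the widely used form 1 − 2·Σ(Y_p·Y_t) / (ΣY_p + ΣY_t) with a small smoothing constant `eps`, and averages the per-class losses as the text describes. The function names are hypothetical.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss between flat lists of predicted probabilities and binary targets."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # eps keeps the ratio defined when both pred and target are all zeros.
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def mean_dice_loss(preds_by_class, targets_by_class):
    """Compute the Dice loss for each class separately, then average."""
    losses = [dice_loss(p, t) for p, t in zip(preds_by_class, targets_by_class)]
    return sum(losses) / len(losses)


# Assumed toy example with four pixels.
perfect = dice_loss([1, 0, 1, 0], [1, 0, 1, 0])     # full overlap
disjoint = dice_loss([1, 1, 0, 0], [0, 0, 1, 1])    # no overlap
partial = dice_loss([1, 1, 0, 0], [1, 0, 0, 0])     # one of two predicted pixels correct
```

Because the loss depends on the overlap ratio rather than on the number of background pixels, it is less affected by the lesion occupying only a small part of the image.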
The softmax classifier computes probabilities with a direct interpretation for all labels. Our results are presented using a confusion matrix, as illustrated in Figure 4. True positives (TP) are pixels for which both the prediction and the actual value are melanoma. True negatives (TN) are pixels for which both the actual and the predicted value are non-melanoma. False negatives (FN) are pixels for which the prediction is non-melanoma and the actual category is melanoma. Finally, false positives (FP) are pixels for which the prediction is melanoma and the actual category is non-melanoma.
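The four confusion-matrix entries can be counted directly from a predicted mask and a ground truth mask. The sketch below assumes binary masks in which 1 marks a melanoma pixel and 0 a non-melanoma pixel; the function name `confusion_counts` and the toy masks are illustrative only.

```python
def confusion_counts(pred, truth, positive=1):
    """Count pixel-wise TP, TN, FP and FN for two masks of equal shape."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == positive and t == positive:
                tp += 1      # predicted melanoma, actually melanoma
            elif p != positive and t != positive:
                tn += 1      # predicted non-melanoma, actually non-melanoma
            elif p == positive:
                fp += 1      # predicted melanoma, actually non-melanoma
            else:
                fn += 1      # predicted non-melanoma, actually melanoma
    return tp, tn, fp, fn


# Assumed 2 x 3 toy masks.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
counts = confusion_counts(pred, truth)
```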

F. SKIN LESION CLASSIFICATION
Algorithm 2 shows the steps of the proposed model implementation. The algorithm first takes an input image X; feature maps are generated and sent into the encoder-decoder system before pixel-wise classification with the softmax classifier, and the classification result is finally obtained via the Lesion-Classifier. In the encoder unit, the generated feature maps, F_map, are first sent to the convolutional layers as Conv(F_map) and then to the ReLU activation function as Relu(F_ci).

IV. EXPERIMENTS AND RESULTS
Various experiments were carried out to evaluate the performance of our proposed segmentation approach. Two publicly available datasets were used to demonstrate our proposed method. The method was also compared with existing algorithms in Tables 1 and 2.

A. DATASETS
The two well-established publicly available datasets used in the evaluation of the proposed segmentation method are from the ISIC challenge in skin lesion segmentation and the PH2 data repository. They are described below:

1) PH2
PH2 [51] contains 200 skin lesion images with a highest resolution of 765 × 574 pixels. They were obtained at the Dermatology Service of Hospital Pedro Hispano. This dataset was divided into training and testing image sets, each comprising images and their ground truth labels. The input data are skin lesion images in JPEG format, while the ground truth labels are mask images in PNG format.

2) ISIC
ISIC 2017 [52] contains 2000 training images with the ground truth provided by experts. The images have a highest resolution of 1022 × 767 pixels. This dataset was provided from the ISIC Dermoscopic Archive [52]. It was divided into training and testing image sets, each comprising images and their ground truth labels. The input data are skin lesion images in JPEG format, while the ground truth labels are mask images in PNG format. The ground truth labels are provided for training and for evaluating the validation and test phase data using the performance evaluation metrics.

B. MODEL IMPLEMENTATION
The experiments were run on the LENGAU cluster hosted on the CHPC supercomputer servers at chpc.ac.za. The hardware resources utilized were an Intel Core i7 processor with ten (10) 3.4 GHz cores, an NVIDIA Tesla K40c GPU and 148.5 TB of shared memory.
Training time taken for the experiments: 4 hrs.
The software used for the model implementation includes:
The most common skin lesion segmentation evaluation metrics were used for comparison, including the Dice similarity coefficient (DSC), sensitivity, specificity and accuracy. These metrics, which were used for the evaluation of the model, are defined below.
Dice Similarity Coefficient: It measures the similarity or overlap between the ground truth and the automatic segmentation. It is defined as

$$DSC = \frac{2\,TP}{2\,TP + FP + FN}$$

Sensitivity: It measures the proportion of actual positives that are correctly identified as positive.

$$Sensitivity = \frac{TP}{TP + FN}$$

Specificity: It measures the proportion of actual negatives that are correctly identified as negative.

$$Specificity = \frac{TN}{TN + FP}$$

Accuracy: It measures the proportion of true results (both true positives and true negatives) among the total number of cases examined.

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \quad (9)$$

where FP is the number of false positive pixels, FN is the number of false negative pixels, TP is the number of true positive pixels and TN is the number of true negative pixels.
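The four evaluation metrics described above follow directly from the pixel counts TP, TN, FP and FN. The sketch below uses the usual confusion-count forms of the Dice similarity coefficient, sensitivity, specificity and accuracy; the function name and the example counts are assumptions for illustration and are not results from the paper.

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Compute DSC, sensitivity, specificity and accuracy from pixel counts.

    Assumes every denominator is non-zero, i.e. the masks contain both
    lesion and background pixels.
    """
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Assumed counts for a 1000-pixel image, chosen only for illustration.
metrics = segmentation_metrics(tp=90, tn=880, fp=20, fn=10)
```

With these assumed counts the accuracy (0.97) is much higher than the Dice score (about 0.857), because the many true negative background pixels count towards accuracy but not towards the Dice coefficient. This is why both metrics are reported for lesion segmentation.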

D. RESULTS AND DISCUSSION
For melanoma lesion detection and segmentation, the proposed system was evaluated on two publicly available databases. First, the proposed model was trained on the ISIC 2017 dermoscopic dataset with 2000 training skin lesion images. It was then tested on 200 skin lesion images. The model achieved an accuracy and Dice coefficient of 95% and 92% respectively after 35 training epochs, as shown in Figures 5 and 6. Training the proposed model on the PH2 skin lesion image dataset requires more training steps because the dataset contains only 200 images. The model was tested on 50 images. As shown in Figures 5 and 6, it achieved an accuracy and Dice coefficient of 95% and 93% after 250 training epochs on PH2.
The results displayed in Figures 5 and 6 for both the ISIC 2017 and PH2 datasets also suggest that the accuracy and the Dice score could improve further with more training steps and a larger dataset. The learning ability of the proposed model on the two datasets was evaluated with the accuracy curve in Figure 5. The curve shows that, on the fairly large ISIC 2017 dataset, the model reached an accuracy of 95%. This improvement may be due to the adopted Dice loss function with the softmax classifier.
We compared the performance of the proposed model with that of some state-of-the-art methods from the recent literature, namely FrCN, CDNN, FCN and UNet. The comparison was carried out on the two datasets, ISIC 2017 and PH2, and the results are given in Table 1 and Table 2. The results show that the proposed model outperforms some of the state-of-the-art methods. From the results in Table 1, the proposed model records a higher accuracy and Dice score, 95% and 92%, on the ISIC 2017 dataset than the other methods. It also shows a higher sensitivity and specificity, 97% and 96%, than the other methods.
From the results in Table 2, the proposed model also gives a higher accuracy and Dice score, 95% and 93%, on the PH2 dataset than the other methods. It also shows a high sensitivity and specificity of 93% and 95% compared with some of the methods. These results again indicate that the proposed system is able to identify and differentiate a higher number of affected skin lesions from healthy tissue on the PH2 skin lesion dataset. Tables 1 and 2 use the same comparison algorithms on different datasets: Table 1 reports the performance of the proposed method and the other state-of-the-art methods on the ISIC 2017 dataset, while Table 2 reports the same comparison on the PH2 dataset. Table 3 compares the performance of the proposed method on the ISIC 2017 dataset with the three best models in the ISIC 2017 challenge.
The robustness and effectiveness of the proposed method are further shown in Figure 7, where the Deep Convolutional Encoder-Decoder Network (DCEDN) architecture produces segmentation outputs with well-defined lesion boundaries and contours for eight sample melanoma lesion images. Figure 8 shows the pixel-wise results as a confusion matrix for a sample melanoma lesion image, in which 6957 melanoma pixels are correctly predicted as melanoma and 27195 non-melanoma pixels are correctly predicted as non-melanoma.
We also investigated the training and test times of the proposed model, measuring the training computation time per epoch and the test time per dermoscopy image. The proposed system was evaluated on ISIC 2017 with 2000 training images over a total of 35 epochs. Training takes approximately 350 s per epoch and testing takes around 5 s per image. The proposed method is therefore feasible for medical practice, since each dermoscopy image can be processed in about 5 s on average. This result was compared with the performance of other segmentation methods, namely SegNet, UNet and FCN. The evaluation was done under the same hardware conditions, on the same dataset and with the same number of epochs, and the results are presented in Table 4. Training of the proposed method was faster than that of the other segmentation methods.
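As a rough consistency check on the figures quoted above, and assuming that the per-epoch and per-image times stay constant throughout (an idealisation rather than a separately measured result), the totals for ISIC 2017 work out to:

$$T_{\text{train}} \approx 35 \times 350\,\text{s} = 12250\,\text{s} \approx 3.4\,\text{h}, \qquad T_{\text{test}} \approx 200 \times 5\,\text{s} = 1000\,\text{s} \approx 16.7\,\text{min}$$

The estimated training total is of the same order as the 4 hrs training time reported for the experiments.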

V. CONCLUSION
In this paper, a deep convolutional network-based architecture has been proposed for robust detection and segmentation of melanoma lesions. The architecture adopts an enhanced deep convolutional network whose sub-networks are interconnected through a series of skip pathways. It also employs a reduced-size encoder-decoder network that aims to minimize the consumption of computational resources. The multi-stage approach overcomes the limitation of some deep convolutional networks that produce coarsely segmented outputs when processing challenging skin lesion images. In this approach, the whole network is divided into stages, with each stage handling a different part of feature learning and extraction. A new method is devised to classify lesions as melanoma or non-melanoma based on the results from the softmax classifier. The system adopts a Dice loss function, which computes the loss from the overlap between the predicted output and the ground truth label, together with the softmax classifier for pixel-wise classification. This loss function consumes fewer system resources since, unlike some other loss functions, it does not perform sample re-weighting. The system aims to reduce the complexity of deep learning architectures for detecting melanoma and to provide an efficient system that can meet the demands of real-time medical diagnosis of melanoma. The proposed method is feasible for medical practice, with an average processing time of about 5 s per dermoscopy image. The system was evaluated on two publicly available skin lesion image datasets. It achieved an overall accuracy and Dice coefficient of 95% and 92% on the ISIC 2017 dataset and 95% and 93% on the PH2 dataset. Our proposed method outperforms some existing state-of-the-art methods.
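The overlap-based loss described above can be illustrated with a minimal sketch. It assumes a common soft Dice formulation, one minus twice the overlap divided by the sum of the predicted and ground-truth lesion mass, applied to the lesion-class softmax probabilities. The function names, the smoothing constant `eps` and the toy logits are illustrative assumptions and not the exact implementation used in this work.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss for a flat list of lesion probabilities and 0/1 labels.

    eps is an assumed smoothing term that avoids division by zero
    when both the prediction and the ground truth are empty.
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    mass = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (mass + eps)


# Hypothetical per-pixel logits in the order (background, lesion).
logits = [(-10.0, 10.0), (10.0, -10.0)]
lesion_probs = [softmax(pixel)[1] for pixel in logits]

good_loss = soft_dice_loss(lesion_probs, [1, 0])  # prediction matches the labels
bad_loss = soft_dice_loss(lesion_probs, [0, 1])   # prediction is inverted
print(good_loss, bad_loss)
```

In this sketch the loss is close to 0 when the predicted lesion pixels coincide with the ground truth and close to 1 when they do not overlap, and every pixel contributes without any per-sample weighting.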