Introduction
In medical imaging, the automatic detection and classification of cancers such as skin cancer [1], [2], lung cancer [3], brain tumor [4], and stomach cancer [5] have been among the most important research topics of the last few decades [6], [7]. Among these, gastric and colon cancers are the most common. The most frequent stomach infections are ulcers, bleeding, and polyps, and these gastric infections have become a major cause of human deaths. A worldwide survey reports that colon cancer has caused 525,000 deaths and stomach cancer 765,000 deaths since 2017. In the United States, about 1.6 million people currently suffer from bowel infections, and 0.2 million new cases are reported every year [8]. In the developing countries of the world, 694,000 deaths occurred due to colorectal cancer, also known as bowel cancer [9]. According to the American Cancer Society, 132,000 new cases of bowel cancer were reported in 2015 [10]. Among the common cancers worldwide, esophageal cancer is at
WCE is a medical imaging technique for examining the gastrointestinal (GI) tract. It is extensively used in hospitals for the detection of gastric abnormalities such as ulcers and bleeding, as shown in Figure 1 [13], [14]. A recent report shows that about one million patients have been successfully treated with the help of WCE [15]. A small camera is used to capture images of the human gastrointestinal tract, which are then analyzed by gastroenterologists; this is a time-consuming procedure. More than 50,000 images are produced during a single WCE examination, a physician needs about two hours on average to analyze them, and a risk of false detection is also present [16].
Many image processing researchers have developed automated systems for the recognition of stomach infections from endoscopic images. These systems help in the early detection of stomach diseases; the survival rate can be improved by diagnosing gastric infections at an early stage. The fundamental steps of such automated detection systems are feature extraction, feature selection, and classification. The feature extraction methods utilized by researchers include point features [17], texture features [15], HOG features [18], and color features [19], [20]. Convolutional Neural Networks (CNNs) are also combined with handcrafted features to enhance system performance, and CNN models such as AlexNet [21], VGG-16 [22], and ResNet [23] are used for deep feature extraction. The most important step in this pipeline is to extract and select the best features for classification, as the most appropriate features produce highly accurate infection detection and classification results.
Related Work
Researchers have developed many automated detection and recognition systems. These are mainly supervised learning approaches that use handcrafted and CNN features for the detection and recognition of abnormalities in the gastrointestinal tract. An esophageal cancer detection method based on Gabor features and a Faster Region-Based CNN (Faster R-CNN) combines handcrafted Gabor features with CNN descriptors [11]. Gabor features become more effective when combined with CNN features [24], and various studies have shown the effectiveness of combining handcrafted Gabor and deep features [25]–[28]. A CNN-based model was developed for the recognition of ulcer, polyp, and erosion, in which CNN features were used together with SVM for the detection of gastric infections; this technique achieved 80% accuracy [29]. Another system utilized the fire modules of SqueezeNet, which reduces the network size, and achieved an accuracy of 88.90%. Billah et al. [30] combined color wavelet features with CNN features and used SVM in the classification phase. In [8], geometric features are extracted from the segmented region of GI images and combined with the features of VGG-19 and VGG-16, where the deep features of the two networks are fused using the Euclidean Fisher Vector method.
A color transformation based technique is presented in [31]. HSI and YIQ transformations are applied to RGB images to calculate the maximum and minimum pixel values. Then Local Binary Pattern (LBP) and Gray Level Co-occurrence Matrix (GLCM) features are extracted, fused with the color-based features, and the final vector is fed to a multi-layer perceptron to detect and classify stomach infections. A model for ulcer detection based on the YIQ color transformation is proposed in [32]; it utilizes the Y plane, with SVM in the classification phase. Suman et al. [19] developed a statistical color feature based technique for the automatic detection of gastric bleeding. A two-phase model was introduced for automated ulcer detection [33]: in the first step, a superpixel-based saliency method identifies the infected region; in the second step, a saliency-based max pooling (SMP) technique is combined with locality-constrained linear coding (LLC), achieving 92.65% classification accuracy. For the classification of bleeding, polyps, and ulcers, a K-means clustering technique was utilized and achieved 88.61% accuracy. A texture feature based method for the classification of ulcer and non-ulcer images fed the final feature vector to an SVM classifier and achieved 94.16% accuracy [10]. Fan et al. [16] introduced a stomach disease recognition system based on LBP and Scale-Invariant Feature Transform (SIFT) features.
Discrete Wavelet Transform (DWT), variance, and LBP features were extracted and classified using SVM for the detection of colon infections; texture information is calculated from these features, and the SVM classifier produces the classification results [34]. A Bag of Visual Words (BoW) was generated from features extracted from different color spaces and color histograms for bleeding detection [35]. Features of pre-trained networks such as Inception-V3 and VGGNet were extracted and fused with baseline features; this method achieved 96.1% classification accuracy with SVM [36]. A similar method utilizes ResNet50 features, feeding the feature vector to a logistic model tree (LMT) for classification and achieving 95.7% accuracy [37]. A technique based on color and statistical texture features was introduced for the detection of GI tract infections [38]. HSI and LAB color transformations, which are helpful in bleeding detection, were used to locate the bleeding area in WCE images, and the classification results were enhanced using a multilayer perceptron [39]. Shape, texture, and color features were utilized in different studies to detect abnormal regions [40], [41]. The Fisher scoring method was applied to select the most informative subset of features extracted from the HSV color transformation and texture descriptors, and the ulcer and bleeding images were classified using a multilayered neural network [42]. Two SVM classifiers based on the RGB and HSV color spaces were fused to build an automated detection system [43]; this classifier fusion technique achieved a classification accuracy of 95%.
Challenges and Contributions
From the techniques listed above, it is observed that most recent methods follow a fusion of handcrafted and CNN features. However, this process increases the overall execution time of the system. Moreover, classification accuracy decreases when features are extracted directly from raw images: for example, the pixel values of ulcer and healthy images are almost similar except in the infected region. A better solution is therefore to first extract the ulcer regions from the original frames and then extract features from them. Other challenges are the inconsistency of ulcer regions and the selection of irrelevant features, both of which hinder accurate infection classification. In this article, a new method is proposed for automated gastrointestinal infection recognition using the WCE imaging modality. The major contributions are:
A dark channel along with a de-correlation formulation based approach is designed to improve the pixel range of the ulcer region.
An optimized saliency based method, along with a few morphological operations, is adopted for ulcer detection.
A pre-trained deep learning model named VGG16 is re-trained and features are extracted using transfer learning. Features are computed from two sequential layers and fused using an array-based method.
The best features are selected through the PSO-GM meta-heuristic approach, and the selected features are classified using Cubic SVM. The results of both the fusion and selection processes are computed and analyzed in terms of confusion matrices and graphs.
Proposed Methodology
In this article, a new automated system is proposed for gastrointestinal infection recognition from the WCE imaging modality. The proposed system comprises several well-known steps: preprocessing of ulcer frames through dark channel prior and decorrelation based visibility improvement, segmentation of the ulcer using an optimized saliency based method along with morphological operations, deep learning feature extraction, selection of the best features, and finally classification of the selected features. These steps are illustrated in Figure 2, and the detail of each step is given below.
A. Data Acquisition
Image acquisition provides the images used for validation of the proposed method. In this work, the WCE imaging modality is employed for the detection and recognition of stomach infections. The images are obtained from the CUI Wah database [31], which includes a total of 6000 RGB WCE images; a few samples are illustrated in Figure 1. The images cover ulcer, bleeding, and normal classes, where each category includes 2000 images. In this dataset, the ulcer images are separated manually and further utilized for segmentation, while the bleeding and normal images are passed directly to the feature extraction step as shown in Figure 2.
B. Ulcer Detection
In the ulcer detection step, the following process is followed: i) dark channel based contrast enhancement; ii) application of the decorrelation formulation to the dark channel enhanced image; iii) application of an existing saliency method to the decorrelated image; and iv) morphological operations for final refinement. Improving the visibility of an image at the initial stage is an important step towards better detection and more relevant features of an infected region. As shown in Figure 3 (a), the original WCE images have dark effects on the ulcer regions, which means that the pixel range of the infected part tends towards 0. For this purpose, we implement a haze reduction based approach [44].
Dark channel enhancement and decorrelation formulation effects: a) original WCE image; b) dark channel enhanced image; c) decorrelation formulation effects.
Let the original WCE frame be denoted by $\tilde{\varphi}(f)$. Following the haze imaging model, the observed frame is expressed as:\begin{equation*} \tilde {\varphi }\left ({f }\right)=J\left ({f }\right)t\left ({f }\right)+A(1-t(f))\tag{1}\end{equation*} where $J(f)$ is the scene radiance, $t(f)$ is the transmission map, and $A$ is the atmospheric light. The enhanced (haze-free) frame is recovered as:\begin{equation*} J\left ({f }\right)=\frac {\left ({\tilde {\varphi }\left ({f }\right)-A }\right)}{\left ({\max \left ({t\left ({f }\right),t_{0} }\right) }\right)}+A\tag{2}\end{equation*} where $t_{0}$ is a lower bound that keeps the transmission away from zero.

Later, the decorrelation formulation is employed on the enhanced frame $J(f)$. Each input pixel vector $\alpha$ is transformed as:\begin{equation*} \beta =\tau \times \left ({\alpha -\mu }\right)+\mu _{t}\tag{3}\end{equation*} where $\mu$ is the channel mean, $\mu_{t}$ is the target mean, and $\tau$ is the stretch transformation. The correlation matrix is computed from the covariance matrix $Cov$ and the channel standard deviations $\sigma$ as:\begin{equation*} Cor=inv\left ({\sigma }\right)\times Cov\times inv~(\sigma)\tag{4}\end{equation*} Using the eigenvalues $\lambda$ and the eigenvector matrix $O_{m}$ of $Cor$, the scaling matrix and the stretch transformation are defined as:\begin{align*} S\left ({k,k }\right)=&\frac {1}{\sqrt {\lambda (k,k)}} \tag{5}\\ \tau=&\sigma _{t}O_{m}S{(O_{m})}^{\prime }\tag{6}\end{align*} where $\sigma_{t}$ is the target standard deviation. The complete decorrelation transformation thus becomes:\begin{equation*} \beta =\mu _{t}+\sigma _{t}O_{m}S\left ({O_{m} }\right)^{\prime }inv\left ({\sigma }\right)\times (\alpha -\mu)\tag{7}\end{equation*}

Afterwards, an existing saliency method is applied to the decorrelated frame. The saliency of a pixel $\beta_{q}$ is defined as its accumulated color distance from all other pixels:\begin{equation*} Sal\left ({\beta _{q} }\right)=\sum \nolimits _{\forall \beta _{i}\in \beta } {d(\beta _{q},\beta _{i})}\tag{8}\end{equation*} which is accelerated by clustering the colors, so that the saliency of a cluster $cl$ is computed against the $K$ cluster centers $c_{k}$ with occurrence probabilities $p_{k}$:\begin{equation*} Sal\left ({cl }\right)=\sum \nolimits _{k=1}^{K} {p_{k}d(cl,c_{k})}\tag{9}\end{equation*} Finally, the saliency map is binarized through a thresholding function:\begin{align*} Th\left ({x,y }\right)=\begin{cases} 1&if~Sal\left ({cl }\right)\ge Thr \\ 0&Otherwise \\ \end{cases}\tag{10}\end{align*} where $Thr$ denotes the selected threshold. The resulting binary mask is then refined through morphological operations.
Ulcer detection results: (a) decorrelation formulation output used as input; (b) saliency estimation; (c) binary image through the thresholding function; (d) effects of the refinement function and morphological operations; and (e) mapped results.
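To make these detection steps concrete, the following is a minimal Python sketch of Eqs. (1)–(10) using OpenCV and NumPy. It is an illustration under stated assumptions rather than the exact implementation of this work: the patch size, lower bound $t_{0}$, stretch target, saliency threshold, structuring element, and file path are all assumed values, and the pixel-wise saliency of Eq. (8) is approximated in linear time by each pixel's color distance from the image mean.

```python
import cv2
import numpy as np

def dark_channel_enhance(img, patch=15, t0=0.1, omega=0.95):
    """Recover J(f) from the haze model (Eqs. 1-2)."""
    kernel = np.ones((patch, patch), np.uint8)
    dark = cv2.erode(img.min(axis=2), kernel)
    # Atmospheric light A: mean color of the brightest dark-channel pixels.
    idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission estimate t(f), bounded below by t0 (Eq. 2).
    t = 1.0 - omega * cv2.erode((img / A).min(axis=2), kernel)
    t = np.maximum(t, t0)[..., None]
    return np.clip((img - A) / t + A, 0, 255)

def decorrelation_stretch(img, sigma_t=50.0):
    """Decorrelation stretch (Eqs. 3-7); the target mean is kept equal to
    the input mean and sigma_t is a scalar target deviation (assumption)."""
    x = img.reshape(-1, 3)
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    inv_sigma = np.diag(1.0 / np.sqrt(np.diag(cov)))
    cor = inv_sigma @ cov @ inv_sigma                      # Eq. 4
    lam, Om = np.linalg.eigh(cor)
    S = np.diag(1.0 / np.sqrt(np.maximum(lam, 1e-12)))     # Eq. 5
    tau = sigma_t * Om @ S @ Om.T @ inv_sigma              # Eqs. 6-7
    beta = (x - mu) @ tau.T + mu                           # Eq. 3
    return np.clip(beta, 0, 255).reshape(img.shape)

def saliency_map(img):
    """Linear-time approximation of Eq. 8: distance from the mean color."""
    return np.linalg.norm(img - img.reshape(-1, 3).mean(axis=0), axis=2)

img = cv2.imread("wce_frame.png").astype(np.float64)   # hypothetical path
enhanced = dark_channel_enhance(img)
decorr = decorrelation_stretch(enhanced)
sal = saliency_map(decorr)
mask = (sal >= 0.7 * sal.max()).astype(np.uint8)       # Eq. 10, assumed Thr
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((7, 7), np.uint8))
```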
C. Deep Learning Features
Classification is a key challenge in machine learning, and its performance always depends on the nature of the input data, i.e., the features [47], [48]. The power of ML depends on the amount of training data; however, many samples are noisy and irrelevant, which generates noisy and inappropriate features [49]. Such features decrease the performance of a system, which is a key issue in this area [50], [51]. In medical imaging, the selection of the best features is especially important for classification [52]; the key challenge in the classification phase is how to select the most discriminant features. In this work, we use deep learning features. A pre-trained deep learning model named VGG16 is employed and re-trained with the help of transfer learning on the collected WCE dataset. Later, important texture features are concatenated with the deep features, and feature optimization is applied using the PSO-GM approach. The selected features are finally classified using the Cubic SVM classifier. A flow diagram is shown in Figure 5.
1) Pre-Trained Deep Learning Model
A pre-trained CNN model named VGG16 [22] is employed in this work for deep learning features. Originally, this model consists of five convolutional blocks and three fully connected (FC) layers, along with a Softmax layer for final classification; max pooling and ReLU layers lie between them, and a dropout layer with a rate of 0.5 is added between the FC layers. The model takes an input image of size $224\times 224\times 3$. The response of the $k$-th convolutional layer is formulated as:\begin{equation*} \lambda _{(t)}^{(k)}=\sigma \left ({\psi ^{\left ({k }\right)}\lambda ^{\left ({k-1 }\right)}\left ({t }\right)-\beta ^{(k)} }\right)\tag{11}\end{equation*} where $\psi^{(k)}$ and $\beta^{(k)}$ denote the layer weights and bias, and $\sigma$ is the activation function.
The ReLU activation used throughout the network is defined as:\begin{equation*} \sigma \left ({k }\right)=\max \left ({\mathrm {k,0} }\right),\quad k\in \mathbb {R}\tag{12}\end{equation*}
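As a small illustration, Eqs. (11) and (12) correspond to a convolution followed by the ReLU non-linearity; in PyTorch notation this is (channel counts here are illustrative, not VGG16-specific):

```python
import torch.nn as nn

# Eq. (11): convolution with weights psi and bias beta, followed by
# Eq. (12): the ReLU activation sigma(k) = max(k, 0).
block = nn.Sequential(nn.Conv2d(64, 128, kernel_size=3, padding=1),
                      nn.ReLU())
```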
2) Transfer Learning Based Feature Extraction
In this article, we employ transfer learning (TL) to re-train this model on WCE images. In TL, the parameters of the original model are reused to initialize the new model; the main purpose is to avoid the high cost of training a new model from scratch, so that a model can be trained in less computational time. After re-training on the WCE images, activations are extracted from the last two FC layers as features, giving a resultant vector of size $1\times 4096$ per layer.
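A hedged PyTorch sketch of this step is shown below; the paper's experiments use MATLAB/MatConvNet, so the layer indices, the 3-class head, and the omitted fine-tuning loop are illustrative assumptions only.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained VGG16 and replace the final FC layer with a
# 3-class head (ulcer, bleeding, normal) before fine-tuning on WCE images.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 3)

# ... fine-tune on the WCE training set (optimizer and epochs omitted) ...
model.eval()

@torch.no_grad()
def extract_fc_features(batch):
    """Return activations of the two sequential FC layers (4096-D each)."""
    x = model.avgpool(model.features(batch)).flatten(1)
    fc6 = model.classifier[0](x)                 # first FC layer
    fc7 = model.classifier[3](torch.relu(fc6))   # second FC layer
    return fc6, fc7
```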
3) GRAY Level Difference Matrix (GLDM)
The GLDM features [53] represent the absolute differences between pairs of gray-level pixels of an image. In this method, three core parameters are required: difference, distance, and angle. Mathematically, it can be formulated as:\begin{align*} D_{v}=&(\nu _{k},\nu _{l}) \tag{13}\\ I_{n}^{\prime }\left ({k,l }\right)=&\left |{ I_{n}\left ({k,l }\right)-I_{n}(k+\nu _{k},l+\nu _{l}) }\right | \tag{14}\\ P\left ({k,D_{v} }\right)=&Prob\left ({I_{n}^{\prime }\left ({k,l }\right)=k }\right)\tag{15}\end{align*} where $D_{v}$ is the displacement vector and $P(k,D_{v})$ is the probability that the difference image $I_{n}^{\prime}$ takes the gray-level value $k$.
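A minimal NumPy sketch of Eqs. (13)–(15) is given below; the displacement $(\nu_{k},\nu_{l})=(0,1)$ and 256 gray levels are assumed values. Summary statistics of the resulting distribution (e.g., mean, contrast, entropy) typically serve as the final GLDM descriptors.

```python
import numpy as np

def gldm(gray, vk=0, vl=1, levels=256):
    """Probability of each absolute gray-level difference (Eqs. 13-15)
    for the displacement vector D_v = (vk, vl)."""
    h, w = gray.shape
    a = gray[max(0, -vk):h - max(0, vk), max(0, -vl):w - max(0, vl)]
    b = gray[max(0, vk):h - max(0, -vk), max(0, vl):w - max(0, -vl)]
    diff = np.abs(a.astype(np.int32) - b.astype(np.int32))   # Eq. (14)
    hist = np.bincount(diff.ravel(), minlength=levels)
    return hist / hist.sum()                                 # Eq. (15)
```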
D. Features Fusion and Selection
After feature extraction, a simple array based method is proposed to combine the extracted feature vectors into one matrix. The main purpose of the fusion process is to obtain a more informative feature vector for the best classification. Mathematically, this process is explained below:
Let $\Delta X_{1}$, $\Delta X_{2}$, and $\Delta X_{3}$ denote the three extracted feature vectors for $N$ images. The total fused dimension is:\begin{equation*} \sum \left ({\Delta }\right)=\sum \nolimits _{i=1}^{3} \sum \nolimits _{j=1}^{N} \left ({\Delta X_{i}^{j} }\right)\tag{16}\end{equation*} and the serial fusion stacks the vectors as:\begin{align*} \Delta X_{N}=&(\Delta X_{1},\Delta X_{2},\Delta X_{3}) \tag{17}\\ \Delta X_{N}=&\left ({{{\begin{array}{l} \Delta X_{1}\\ \Delta X_{2} \\ \Delta X_{3} \\ \end{array}}} }\right)_{N\times \sum \left ({\Delta }\right)}\tag{18}\end{align*}
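In code, this serial fusion reduces to a column-wise concatenation; the dimensions below are placeholders for illustration:

```python
import numpy as np

n = 100                                # number of images (placeholder)
fc6 = np.random.rand(n, 4096)          # first FC-layer features
fc7 = np.random.rand(n, 4096)          # second FC-layer features
texture = np.random.rand(n, 256)       # GLDM texture features
fused = np.concatenate([fc6, fc7, texture], axis=1)   # Eq. (18)
```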
For feature selection, a Particle Swarm Optimization (PSO) based approach is employed on the fused vector. At each iteration, the velocity of the $i$-th particle in the $j$-th dimension is updated as:\begin{equation*} {Vl}_{i,j}\!=\!{Vl}_{i,j}\!+\!c_{1}r_{1,j} \left ({\varphi _{i,j}^{pbt}\!-\!\varphi _{i,j} }\right)+c_{2}r_{2,j}\left ({\varphi _{j}^{gbt}-\varphi _{i,j} }\right)\tag{19}\end{equation*} where $c_{1}$ and $c_{2}$ are acceleration constants and $r_{1,j}$, $r_{2,j}$ are uniform random numbers. The position and velocity of the $i$-th particle are:\begin{align*} \varphi _{i}=&\left ({\varphi _{i,1},\varphi _{i,2},\ldots,\varphi _{i,n} }\right)^{T} \tag{20}\\ {Vl}_{i}=&\left ({{Vl}_{i,1},{Vl}_{i,2},\ldots,{Vl}_{i,n} }\right)^{T}\tag{21}\end{align*} and the personal best and global best positions are:\begin{align*} \varphi _{i}^{pbt}=&\left ({\varphi _{i,1}^{pbt},\varphi _{i,2}^{pbt},\ldots,\varphi _{i,n}^{pbt} }\right)^{T} \tag{22}\\ \varphi ^{gbt}=&\left ({\varphi _{1}^{gbt},\varphi _{2}^{gbt},\ldots,\varphi _{n}^{gbt} }\right)^{T}\tag{23}\end{align*} The fitness of a particle is evaluated through the separation between the $C$ class means $\mu_{i}$ and the overall mean $\mu_{0}$ of the selected features:\begin{equation*} Fitness=\sqrt {\sum \nolimits _{i=1}^{C} \left \{{ \left ({\mu _{i}-\mu _{0} }\right)^{t}\left ({\mu _{i}-\mu _{0} }\right) }\right \}}\tag{24}\end{equation*}
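The sketch below illustrates PSO-driven feature selection in the spirit of Eqs. (19)–(24); the swarm size, iteration count, acceleration constants, the 0.5 keep-threshold, and taking $\mu_{0}$ as the overall mean are assumptions, and the exact PSO-GM configuration of this work may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    """Eq. (24), negated for minimization: separation between the class
    means and the overall mean mu_0 of the currently selected features."""
    if not mask.any():
        return np.inf
    Xs = X[:, mask]
    mu0 = Xs.mean(axis=0)
    score = sum(float(d @ d) for d in
                (Xs[y == c].mean(axis=0) - mu0 for c in np.unique(y)))
    return -np.sqrt(score)

def pso_select(X, y, n_particles=20, iters=50, c1=2.0, c2=2.0):
    n_feat = X.shape[1]
    pos = rng.random((n_particles, n_feat))    # particle positions, Eq. (20)
    vel = np.zeros((n_particles, n_feat))      # particle velocities, Eq. (21)
    pbest, pbest_fit = pos.copy(), np.full(n_particles, np.inf)   # Eq. (22)
    gbest, gbest_fit = pos[0].copy(), np.inf                      # Eq. (23)
    for _ in range(iters):
        for i in range(n_particles):
            f = fitness(pos[i] > 0.5, X, y)    # keep features above 0.5
            if f < pbest_fit[i]:
                pbest_fit[i], pbest[i] = f, pos[i].copy()
            if f < gbest_fit:
                gbest_fit, gbest = f, pos[i].copy()
        r1, r2 = rng.random((2, n_particles, n_feat))
        vel += c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)  # Eq. (19)
        pos = np.clip(pos + vel, 0.0, 1.0)
    return gbest > 0.5   # boolean mask of the selected features
```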
Results and Analysis
In the experimental process, a privately collected dataset is employed; a detailed description is provided in Section 4.1. The images in this dataset are challenging due to the low brightness of ulcer regions and the similarity of pixel values. In the evaluation step, the performance of Cubic SVM is compared with a few other classification techniques, as illustrated in Figure 7. A 10-fold cross validation is performed [55], with a 70:30 training-to-testing ratio. For performance analysis, standard parameters are employed: sensitivity (Sen), precision (Pre), F1 score (F1-S), area under the curve (AUC), false positive rate (FPR), and accuracy. All simulations were performed in MATLAB with the MatConvNet deep learning toolbox, which was employed for CNN feature extraction.
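For reference, the classification stage can be reproduced in a few lines; the sketch below uses scikit-learn's cubic-kernel SVM rather than the MATLAB toolchain of this work, and the regularization constant and placeholder data are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_sel = np.random.rand(300, 512)       # placeholder selected features
y = np.random.randint(0, 3, 300)       # ulcer / bleeding / normal labels

# Cubic SVM = polynomial kernel of degree 3; the C value is assumed.
clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, C=1.0))
scores = cross_val_score(clf, X_sel, y, cv=10)
print(f"10-fold accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```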
A. Numerical and Visual Results
The numerical results of this work are presented in this section. The results are acquired in two different steps. In the first step, the experiment was performed on the fused features, and the results are given in Table 1. The table shows that the recognition results of CSVM are the best in terms of the standard computed parameters: Sen (96.42%), Pre (96.20%), F1-S (96.31%), AUC (0.992), FPR (0.019), and accuracy (96.50%). In Figure 8, the performance of CSVM is confirmed by the true positive rates (TPR); the figure indicates that the normal class reaches a maximum TPR of 97%. Compared with the other classification techniques, Table 1 shows that the second best accuracy in the fusion experiment is 96%, achieved by MGSVM, while the lowest accuracy is 86.80%, obtained by the EBT classifier. The remaining classifiers LSVM, QSVM, Co-SVM, FKNN, EBT, and DT achieve accuracies of 93.40%, 94.92%, 90.14%, 91.40%, 86.80%, and 87.42%, respectively. Overall, the fusion of all computed features performed well.
In the second step, the best features selected by the PSO-GM approach are employed, and the results are given in Table 2. The table shows that the recognition results of CSVM are again the best: Sen (98.33%), Pre (98.36%), F1-S (98.34%), AUC (1.00), FPR (0.007), and accuracy (98.40%). In Figure 9, the performance of CSVM is confirmed by the true positive rates (TPR). Compared with the other classification techniques, Table 2 shows that the second best accuracy is 98.20%, achieved by MGSVM, while the lowest accuracy is 91.20%, obtained by the EBT classifier. The remaining classifiers LSVM, QSVM, Co-SVM, FKNN, EBT, and DT achieve accuracies of 96.60%, 97.90%, 93.80%, 93.00%, 91.20%, and 91.60%, respectively. Overall, the selection of the best features through the proposed approach yields sufficient accuracy on all listed classifiers.
B. Discussion and Comparison
First, the numerical results presented in Tables 1 and 2 are analyzed. From these tables, it is observed that the selection of the best features through the proposed PSO-GM based method gives better results than the fusion process. The comparison between fusion-based and selection-based accuracy is plotted in Figure 10, which shows that the selection accuracy is almost 3% to 4% higher than that of the fused vector. Furthermore, the visual effects of each step listed in Figure 2 are also shown: Figure 3 illustrates the effects of contrast improvement and the de-correlation formulation. Later, saliency based ulcer segmentation is performed and further refined through morphological operations, as shown in Figure 4. The final mapped output, rather than the whole ulcer image, is passed to the feature extraction step; the purpose is to obtain more relevant features that are dissimilar from normal image features. The overall feature based recognition architecture is shown in Figure 5. Overall, it is observed that recognition based on the selected best features gives better accuracy on CSVM than the other classifiers. Moreover, the results in Tables 1 and 2 show that the accuracy of the fused vector is lower than that of the selected vector.
In addition, we compare the feature fusion and selection results with a few deep learning models. In this comparison, we extract features from the original models and perform classification without any selection approach. The results are plotted in Figure 11, which shows that the feature selection process gives improved performance compared to the other deep learning models. This figure also justifies the choice of the pre-trained VGG16 model for deep feature extraction from WCE images.
Comparison of proposed results with existing deep learning models for WCE images.
The visualization of the selected deep features is illustrated in Figure 12. Finally, we compare the proposed fusion and selection results with a few recent techniques to further validate the performance of the proposed architecture. In [20], the authors achieved an accuracy of 97.89%. Liaqat et al. [14] attained an accuracy of 98.49% on the same dataset, while our method achieved an accuracy of 98.40% on an updated dataset containing more images than that of [14]. These results clearly illustrate the achievement of the proposed scheme.
Conclusion
A fully automated CAD system is proposed in this work for stomach infection diagnosis and classification using deep learning. In the proposed design, an ulcer is first detected through a saliency-based method, and then deep learning and GLDM features are extracted. An array based approach is employed to fuse these features, and the resultant vector is optimized using the PSO-GM evolutionary approach. The selected features are classified using a multi-class Cubic SVM, achieving an accuracy of 98.40%. Based on the results, it is concluded that detecting the ulcer region before feature extraction gives more discriminative features than computing features directly from the original images. However, it is also noted that this process increases the system's computational time, which is a key limitation of this work. Besides, it is concluded that reducing the fused features through a meta-heuristic approach returns more informative features, which helps achieve better recognition performance. The main limitations of this work are: (i) incorrect segmentation of ulcer regions leads to false training of the deep learning model and to the extraction of irrelevant features; (ii) feature selection using evolutionary techniques consumes more time than heuristic techniques. In future studies, our focus will be on minimizing the computational time.