Detecting Pork Adulteration in Beef for Halal Authentication Using an Optimized Electronic Nose System

Abstract:

Recently, the issue of food authentication has gained attention, especially halal authentication, because of cases of pork adulteration in beef. Many studies have developed rapid detection for adulterated meat. However, these studies have not yet provided practical and economical methods and instruments with a fast analysis process. In this context, this paper proposes the Optimized Electronic Nose System (OENS) for more accurately detecting pork adulteration in beef. OENS has advantages such as proper noise filtering, an optimized sensor array, and optimized support vector machine (SVM) parameters. Noise filtering is carried out by cross-validation with different mother wavelets, i.e., Haar, dmey, coiflet, symlet, and Daubechies. The sensor array was optimized by dimension reduction using principal component analysis (PCA). An algorithm is proposed for the optimization of the SVM parameters. An experiment was conducted by analyzing seven classes of meat, comprising seven different mixtures of beef and pork. The first and seventh classes were 100% beef and 100% pork, respectively, while the second, third, fourth, fifth, and sixth classes contained 10%, 25%, 50%, 75%, and 90% of beef in a sample of 100 grams, respectively. Sample testing was carried out for 15 minutes for each sample. The classification test to detect beef and pork achieved an accuracy of 98.10% using the optimized support vector machine. Thus, OENS performs favorably in detecting pork adulteration in beef for halal authentication.
Published in: IEEE Access ( Volume: 8)
Page(s): 221700 - 221711
Date of Publication: 09 December 2020
Electronic ISSN: 2169-3536

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION I.

Introduction

The issue of food authentication has recently attracted the attention of consumers because of religious or lifestyle reasons [1]–[4]. Especially for Muslims, food authentication regarding halal food is essential [5]. Pork is food that Muslims cannot eat (The Holy Quran, 1:173; 5:3; 6:145; 16:115). However, pork adulteration in beef has been discovered in the market [3], [6]. Beef is sometimes mixed with pork for economic reasons [7], [8]; sellers adulterate beef with pork because pork is cheaper than beef [9].

Recent research has discussed meat authentication using visual detection. The procedure includes DNA isolation from fresh meat samples, amplification of specific DNA sequences, and detection using lateral flow assays. This research can authenticate horse meat and pork meat with high selectivity and reproducibility values. However, this process still takes quite a long time, namely, 25-30 minutes [10]. Another recent study used lateral flow sensing (LFS) and polymerase chain reaction (PCR) for the rapid visual detection of adulterated meat [11]. The samples used in that study were adulterated beef samples prepared by mixing beef with duck meat in proportions of 0%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 5%, 10%, 50%, and 100%. This research took less than 2 hours to process. Various scientific methods have been developed to identify mixed meats, including gas chromatography (GC) and mass spectrometry (MS) [12], high-performance liquid chromatography (HPLC), nuclear magnetic resonance (NMR) spectroscopy [13], and Fourier transform infrared (FTIR) spectroscopy [14]. However, several things have to be considered when using these tools, such as cost, time, and experience [15], [16]. The price of a GC-MS instrument was around USD 120,000 in 2017 [17], while the cost of testing a sample is about USD 50. In addition, the testing process for one sample can take about 1 to 2 days, depending on the complexity of the gases. Another consideration is the need for a person who has experience operating GC-MS instruments.

A solution is needed that meets these considerations with more practical and economical methods and instruments, and a faster analysis process with reliable results. This paper proposes the Optimized Electronic Nose System (OENS). An electronic nose (e-nose) is the main instrument in OENS. E-noses are devices with several advantages over other techniques for analyzing food smell, for example, the small amount of sample required, fast performance, simple usage, high sensitivity, and good correlation between the data from sensor analysis. E-noses can be used in five main categories of food analysis: monitoring, expiry checking, freshness evaluation, purity testing, and other food quality control investigations. Hence, the motivation for this study can be formulated as follows:

  1. Several types of research have used an e-nose to identify pork adulteration in beef for food quality control. However, most of them were focused on the differentiation and classification of species of meat. Only a few researchers have tried to determine different gas contents, which can be used for halal authentication in food.

  2. In the existing studies, e-nose systems have been developed for halal authentication. They show the potential of e-noses for halal authentication, even though their experiments were quite limited and did not perform classification or regression tasks. For example, an e-nose with PCA was used to differentiate pure lard, pure chicken fats, beef fats, mutton fats, and adulterated samples [18]. Moreover, an e-nose with PCA was employed to discriminate four meat samples and three types of sausage [12]. Furthermore, another study attempted to perform binary classification to differentiate beef and pork using a Naïve Bayes classifier [19].

According to these motivations, the main contribution of this study is to propose OENS for performing multiclass classification to differentiate seven mixtures of beef and pork. This study therefore brings e-nose implementation closer to the practical application of halal authentication. In addition, an e-nose produces signals that are sent to a computer for processing and analysis. The proposed OENS can prevent the distortion of e-nose signal analysis by: (i) proper noise filtering, (ii) optimizing the sensor array, and (iii) optimizing the support vector machine (SVM) parameters.

The rest of this paper is organized as follows. Section 2 discusses previous works related to the topic of this study. Section 3 explains the details of OENS, including a specification of the materials and methods used in the experiment, such as the classification method and the discrete wavelet transform for signal processing. Section 4 describes the results of the experiment. Section 5 is the conclusion.

SECTION II.

Related Works

E-noses can be used for food authentication and adulteration assessment, as summarized in TABLE 1. Research on detecting meat adulteration using an electronic nose is still developing. The latest research can detect a mixture of minced mutton and pork [20]. That study made six mixed combinations, namely mixing minced pork at 0%, 20%, 40%, 60%, 80%, and 100% by weight with minced mutton. To build the predictive model, the study used multiple linear regression (MLR), partial least square analysis (PLS), and a backpropagation neural network (BPNN). The predictive $R^{2}$ result for the six classes is 0.97.

TABLE 1 Application of Electronic Noses for Food Assessment in the Last Five Years

An electronic nose for detecting adulteration levels in tomato juices is discussed in [21]. That research compared six previous methods with a more recent popular one, spectral clustering, using three measures of clustering performance, i.e., mutual information criteria (MI), precision, and Rand index (RI). Spectral clustering gave statistically significant results (alpha = 0.05) and thus outperformed the other methods. Rodriguez [22] studied two food adulteration cases (a pure variety of green coffee beans and pure cayenne adulterated with bell pepper powder). That work aimed to report improvements achieved in differentiating aroma samples with minimal differences in odor pattern.

Moreover, wine traceability and authenticity can be used to prevent outlawed adulteration practices, such as (i) adding ethanol, coloring, and flavoring compounds; (ii) diluting wine with water; and (iii) replacing wine with a cheaper one. The combination of an e-nose and multivariate statistical methods improved the traceability and the classification of grapes and wine (especially the varieties and the geographical origin of grapes) [23]. An e-nose also succeeded in detecting adulteration of mutton, which led to the development of a model capable of detecting and estimating the adulteration of minced mutton with pork [24]. The volatile compounds occurring in the samples were collected using a MOS-based e-nose. An optimal data matrix was then obtained using feature extraction methods, PCA, loading analysis, and SLDA.

Most of the studies using e-noses only distinguished between two or more products and did not consider possible noise contamination of the gas sensor signals from the e-nose. However, in certain conditions, noise can affect the raw signal by 20% [25], which influences the classification performance. While being sent to the computer, the signals can be interrupted and mixed with unwanted signals, which creates noise [26]–[28]. This noise, caused for example by air contaminated with certain substances or smells, may interfere with the authenticity of the information. It should be removed to prevent the distortion of the analysis and the classification process. Several researchers have used the discrete wavelet transform (DWT) to reduce noise in data signals [29]–[35]. However, these studies only focused on the use of the DWT method without selecting suitable parameters, such as the mother wavelet and decomposition level, although these parameters could improve the performance based on the noise-filtered signal [36]–[38]. Apart from that, the number of sensors has also not been considered, even though using more sensors than necessary incurs extra costs. Some of the sensors provide no significant information on the samples, hence the costs can be decreased by eliminating unnecessary sensors. Several works also perform sensor array optimization to reduce data dimensions, electrical consumption, production cost, computational and traffic overhead, etc. [39]–[41]. For interested readers, recent developments and challenges in e-nose signal processing are summarized in [42].

SECTION III.

Materials and Method

A. Materials

In this study, an e-nose was built using nine MQ series gas sensors from Zhengzhou Winsen Electronics Technology Co., Ltd. The gas sensors were also used to detect different types of gases, as in our previous study [19]. The list of gas sensors is given in TABLE 2. These gas sensors were connected to an Arduino microcontroller. For data communication, a universal serial bus (USB) interface was used to transfer the signals from the microcontroller to the computer. The gas sensors were placed in a sample chamber made of transparent glass. FIGURE 1 depicts the components of the e-nose system.

TABLE 2 Gas Sensors in the Sensor Array
FIGURE 1. Schematic of the e-nose experiment for OENS.

The samples used were ground beef and ground pork bought in fresh condition from the same store on the same date. In the experiment, samples of seven combinations of beef and pork were used. Each sample weighed 100 g and combined ground beef and ground pork in various compositions, which were divided into seven classes: the first and seventh classes were 100% beef and 100% pork, respectively. The second, third, fourth, fifth, and sixth classes contained 10%, 25%, 50%, 75%, and 90% of beef from a total sample of 100 grams, respectively. A scale was used to ensure that the weight of the mixture was appropriate. The compositions of the respective samples can be seen in TABLE 3. The following steps were used to collect the data samples:

  1. the e-nose was turned on, and the sensors were warmed up for 15 minutes;

  2. the sample was placed in the sample chamber with the gas sensors;

  3. the processes of data retrieval and transfer to the computer using the USB interface took 15 to 20 minutes for each sample;

  4. the sample chamber was cleaned using a flashing fan for 5 minutes after every sampling, so the next sampling was not affected by gas residue from the previous sampling.

TABLE 3 Composition of Samples

As mentioned previously, the data were divided into seven classes, with 60 data for each class. Therefore, the total number of recorded data was 420. Each data had 10 digital outputs, i.e., S1, S2, S3, S4, S5, S6, S7, S8, S9 for temperature, and another S9 for humidity. In this paper, the digital output is called the raw signal. The data of all 7 classes are shown in TABLE 3. For interested readers, our dataset has also been uploaded here [43], [44].

B. Proposed Methods

After the dataset had been generated, the raw signals were analyzed through several steps, as shown in FIGURE 2. The first step is signal pre-processing, which cleans up the noise and produces output in the form of a reconstructed data signal. The next step is statistical parameter extraction, which takes the reconstructed data signal and extracts the characteristics of the signal. The third step is dimensional reduction, where the signal obtained is analyzed to select only the sensors that have the largest impact on pork adulteration detection. The final step is constructing the classification model for the 7 classes. The data obtained from the previous processes are divided into training data (70%) and testing data (30%) to be evaluated by the classification model. The data acquired from the e-nose are processed on a computer using scikit-learn, a Python-based machine-learning library [45].

FIGURE 2. Signal analysis steps for OENS.
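The 70%/30% division mentioned above can be sketched as follows. This is only an illustration: the feature matrix is synthetic placeholder data, and the use of `train_test_split` with `stratify` and a fixed `random_state` is an assumption, since the paper does not state how the split was drawn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic placeholder: 420 recordings x 10 features (not the paper's data).
X = rng.normal(size=(420, 10))
# Class labels 1..7 with 60 recordings per class, as described in the text.
y = np.repeat(np.arange(1, 8), 60)

# 70% training / 30% testing; stratify keeps the class balance in both parts
# (an assumption, not stated in the paper).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)
```

With 420 recordings this yields 294 training and 126 testing recordings.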

1) Signal Pre-Processing

Signal pre-processing is carried out to eliminate noise in the signals [46]. In this research, the noise was caused by the internal sensors, changes in ambient conditions such as humidity and temperature, and changes in electrical conditions such as voltage and current. The signals produced by an e-nose are usually non-stationary, i.e., the statistical properties of the signal change with time [46], which makes the noise reduction process more complicated. This study used the discrete wavelet transform (DWT) and compared several mother wavelets to determine the one best suited for noise filtering. This technique examines the data from various aspects of signal analysis: trends, breakdown points, discontinuities, and similarities. The data produced by the e-nose are then divided into 7 classes. The first step is to look at the shape of the signals. In the second step, the type of wavelet, the so-called mother wavelet, is determined; this choice is indispensable because mother wavelets vary and are grouped based on their respective basic wavelet functions. The most popular types of mother wavelets in signal processing are Haar, dmey, coiflet, symlet, and Daubechies, all of which were compared in our experiment, with several decomposition levels. The discrete wavelet transform process for a given signal $x(t)$ is expressed in Equation 1.
\begin{align*} dwt(m,n)=&\left\langle x(t),\omega_{m,n}(t)\right\rangle \\ =&\frac{1}{\sqrt{2^{m}}}\int_{-\infty}^{\infty} x(t)\,\omega^{*}\left(\frac{t-n2^{m}}{2^{m}}\right)dt\tag{1}\end{align*}
where $m$, $n$, and $\omega$ represent the scaling parameter, translation parameter, and mother wavelet, respectively. The wavelet transform process is explained as follows: the first step is transforming the data with Equation 2,
\begin{equation*} T(a,b)=\frac{1}{\sqrt{a}}\int_{-\infty}^{+\infty} x(t)\,\omega^{*}\left(\frac{t-b}{a}\right)dt\tag{2}\end{equation*}
where $\omega^{*}(t)$ is the complex conjugate of the analyzing wavelet function, $a$ is the wavelet dilation parameter, and $b$ is the location parameter. The wavelet function in discrete form is as follows:
\begin{equation*} \omega_{m,n}(t)=\frac{1}{\sqrt{a_{0}^{m}}}\,\omega\left(\frac{t-nb_{0}a_{0}^{m}}{a_{0}^{m}}\right)\tag{3}\end{equation*}
where $m$ and $n$ control the wavelet dilation and translation, respectively. $a_{0}$ is a constant dilation parameter with a value greater than one, and $b_{0}$ is the location parameter, which should be greater than 0. If $a_{0}=2$ and $b_{0}=1$ are substituted into Equation 3, the dyadic grid wavelet is written as follows:
\begin{equation*} \omega_{m,n}(t)=2^{-\frac{m}{2}}\,\omega(2^{-m}t-n)\tag{4}\end{equation*}
By using this discrete wavelet function, the discrete transformation is obtained:
\begin{equation*} T_{m,n}=\int_{-\infty}^{\infty} x(t)\,\omega_{m,n}(t)\,dt\tag{5}\end{equation*}
$T_{m,n}$ is known as the detail wavelet coefficient with scale index $m$ and location $n$. The discrete wavelet is related to the scaling function and its dilation equation. The scaling function is used to smooth the signal: it is convolved with the signal, which provides the approximation coefficients. In this experiment, PyWavelets was used [47].
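A minimal sketch of wavelet-based noise filtering with PyWavelets is given below. The db6 wavelet and decomposition level 1 follow the choices reported later in the paper; the soft-thresholding rule (universal threshold with a median-based noise estimate), the helper name `denoise`, and the synthetic test signal are assumptions for illustration, since the paper does not state its thresholding rule.

```python
import numpy as np
import pywt


def denoise(signal, wavelet="db6", level=1):
    """Denoise a 1-D signal by soft-thresholding its DWT detail coefficients."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise scale estimated from the finest detail coefficients
    # (median absolute deviation).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    # Universal threshold: an assumed rule, not taken from the paper.
    threshold = sigma * np.sqrt(2.0 * np.log(len(signal)))
    # Keep the approximation coefficients, shrink the detail coefficients.
    shrunk = [coeffs[0]] + [
        pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]
    ]
    # waverec can return one extra sample for odd-length inputs, so trim.
    return pywt.waverec(shrunk, wavelet)[: len(signal)]


# Synthetic stand-in for one sensor response: a slow rise plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 512)
clean = 1.0 - np.exp(-5.0 * t)
noisy = clean + rng.normal(scale=0.05, size=t.size)
smooth = denoise(noisy)

print(np.std(noisy - clean), np.std(smooth - clean))
```

Trying other mother wavelets only requires changing the `wavelet` argument (for example `"haar"`, `"dmey"`, `"coif1"`, or `"sym4"`), which is how a comparison across wavelet families could be organized.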

2) Statistical Parameter Extraction

In this step, parameter extraction is performed to extract the most relevant and informative values to represent the characteristics of the overall sensor response. The pre-processed values of the sensor responses are averaged to get a single value [48]. In this research, several statistical parameter extraction methods were used, namely standard deviation (ST), mean (M), kurtosis (K), and skewness (SK). This study also made several combinations of the main parameter extraction methods: mean with standard deviation (M + ST), mean with skewness (M + SK), mean with kurtosis (M + K), mean with standard deviation and skewness (M + ST + SK), mean with standard deviation and kurtosis (M + ST + K), and mean with all major parameter extractions (M + ST + SK + K). For the M parameter, the reconstructed signal is denoted by $y(t)$, and its mean $\overline{y(t)}$ is computed using Equation 6.
\begin{equation*} \overline{y(t)}=\frac{\sum y(t)}{N}\tag{6}\end{equation*}
where $\sum y(t)$ is the sum of the results of one sensor, and $N$ is the total number of data. If the standard deviation (ST) is used as the statistical parameter, Equation 7 is used.
\begin{equation*} \sigma=\sqrt{\frac{1}{N}\sum\limits_{i=1}^{N}(x_{i}-\overline{y(t)})^{2}}\tag{7}\end{equation*}
where $x_{i}$ is each value from the population. The skewness ($\alpha_{3}$) is given by Equation 8.
\begin{equation*} \alpha_{3}=\frac{1}{N\sigma^{3}}\sum\limits_{i=1}^{N}(x_{i}-\overline{y(t)})^{3}\tag{8}\end{equation*}
where $\sigma$ is the standard deviation from Equation 7. Using one statistical parameter yields 10 features; using two statistical parameters yields 20 features, and so on.
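The statistical parameters above can be sketched with NumPy as follows. The function name `extract_features` and the synthetic recording are hypothetical; the kurtosis is taken here as the fourth standardized moment, which is an assumed definition because the paper gives no kurtosis formula.

```python
import numpy as np


def extract_features(recording, stats=("M", "ST")):
    """Return one value per channel for each requested statistic.

    recording: array of shape (n_samples, n_channels).
    """
    x = np.asarray(recording, dtype=float)
    mean = x.mean(axis=0)            # Equation 6
    std = x.std(axis=0)              # Equation 7 (population form, 1/N)
    z = (x - mean) / std             # standardized values
    table = {
        "M": mean,
        "ST": std,
        "SK": (z ** 3).mean(axis=0),  # Equation 8
        "K": (z ** 4).mean(axis=0),   # fourth standardized moment (assumed)
    }
    return np.concatenate([table[name] for name in stats])


# Synthetic stand-in for one recording with 10 digital outputs.
rng = np.random.default_rng(0)
recording = rng.normal(loc=1.0, scale=0.2, size=(900, 10))

print(extract_features(recording, ("M",)).shape)                   # (10,)
print(extract_features(recording, ("M", "ST")).shape)              # (20,)
print(extract_features(recording, ("M", "ST", "SK", "K")).shape)   # (40,)
```

Each additional statistic contributes one value per digital output, which is why the feature count grows in steps of 10.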

3) Dimensional Reduction

The features generated can be spread across multiple dimensions; for this reason, dimension reduction is used to eliminate variables that do not have a significant role in detecting pork adulteration. Principal component analysis (PCA) is the dimensional reduction method that was used in this research. The eigenvector is used to consider the relationship between the variables. From the experimental results, the digital outputs are considered as PCA variables. The steps to perform principal component analysis are as follows:

  1. calculate the covariance (Cov) using Equation 9, where $x$ is the signal and $y$ is the class target from the signal.
\begin{equation*} Cov(x,y)=\frac{\sum xy}{n}-(\overline{x})(\overline{y})\tag{9}\end{equation*}

  2. calculate the eigenvalue using Equation 10,
\begin{equation*} \det(A-\lambda I)=0\tag{10}\end{equation*}
    where $A$ is a square matrix of size $n \times n$, $\lambda$ is a scalar, and $I$ is the identity matrix.

  3. calculate the eigenvector using Equation 11.
\begin{equation*} [A-\lambda I][X]=[0]\tag{11}\end{equation*}

  4. determine the new variable (component) by multiplying the original variable with the eigenvector. The proportion of variance of component $i$ is given by Equation 12,
\begin{equation*} \rho_{i}=\frac{\lambda_{i}}{\sum\limits_{j=1}^{D}\lambda_{j}}\times 100\%\tag{12}\end{equation*}
    where $\lambda_{i}$ is the eigenvalue of component $i$ and $D$ is the number of components.

If the resulting value from one component combined with another component is 0, then the correlation is considered low and can be interpreted as no relationship [49]. The variables that have a value of 0 are removed. After the number of dimensions has been reduced, the results are standardized so that the values are not too large or too small. The method used for the standardization process is Standard Scaler, which rescales each feature using the mean and standard deviation of the existing data.
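The PCA and standardization steps can be sketched with scikit-learn as below. The data are synthetic and deliberately built with only 8 independent directions so that two components carry no variance; the `1e-10` cut-off for "zero" variance is an assumption chosen for this illustration, not a value from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a 420 x 10 feature matrix with only 8 independent
# directions, so 2 of the 10 principal components carry no variance.
latent = rng.normal(size=(420, 8))
mixing = rng.normal(size=(8, 10))
X = latent @ mixing

# Proportion of variance per component (Equation 12, as fractions).
ratio = PCA(n_components=10).fit(X).explained_variance_ratio_
# Keep components whose variance is not numerically zero (assumed cut-off).
n_keep = int(np.sum(ratio > 1e-10))

components = PCA(n_components=n_keep).fit_transform(X)
# Z-score standardization: zero mean and unit variance per component.
scaled = StandardScaler().fit_transform(components)

print(n_keep, scaled.shape)
```

Printing `ratio.cumsum()` gives the cumulative variance, the quantity used when deciding how many components to retain.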

4) Optimizing the Classification Parameters

Classification is the process of dividing the variables into classes. The division of the classes should match the real condition, i.e., if the meat sample is beef, then the sample should be classified into the beef class by OENS. In this research, OENS used the optimized support vector machine (SVM) as the classification method since this method is capable of learning the data and generating the classification classes by itself [50]. SVM is based on the use of a hyperplane that separates objects belonging to different classes. SVM has two main parameters, which are $C$ and gamma ($\gamma$) [50]. Adjustment of these parameters can produce satisfactory performance [51].

$C$ is the regularization parameter in the SVM algorithm. It trades off maximization of the decision margin against correct classification of the training data to prevent overfitting. The gamma parameter belongs to the kernelized SVM with the radial basis function (RBF) kernel. It controls how far the influence of a single training sample reaches. Tuning these parameters can increase the accuracy as well as the performance of the algorithm.

Unfortunately, there are no exact parameter values for use in the classification process. Several researchers have tried several different value combinations for the parameters, but it takes a long time to execute this process [52]. Hence, this research developed an algorithm to find the best parameters, which can be seen in Algorithm 1. The values were determined based on an experiment with $C$ ranging from 0.01 to 1000 and $\gamma$ ranging from 0.001 to 100.

Algorithm 1 Optimized Parameters of SVM

c_param = [0.001, 0.01, 0.1, 1, 10, 100, 200, 1000]
gamma_param = [0.001, 0.01, 0.1, 1, 10, 100]
scores_list = []
for c in c_param:
    for g in gamma_param:
        cv_list = []
        for training, testing in dataset:  # one pair per cross-validation fold
            model = svm_train(training, c, g)
            score = svm_predict(testing, model)
            cv_list.append(score)
        scores_list.append((mean(cv_list), c, g))
print(max(scores_list))
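A runnable sketch of the same search is shown below, assuming scikit-learn's `SVC` with an RBF kernel in place of the `svm_train` and `svm_predict` helpers of Algorithm 1. The data are a synthetic 7-class stand-in generated with `make_classification`, and the 10-fold stratified split with a fixed seed is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic 7-class stand-in for the e-nose features (not the paper's data).
X, y = make_classification(
    n_samples=420, n_features=8, n_informative=6, n_redundant=0,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)

c_param = [0.001, 0.01, 0.1, 1, 10, 100, 200, 1000]
gamma_param = [0.001, 0.01, 0.1, 1, 10, 100]
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

scores_list = []
for c in c_param:
    for g in gamma_param:
        model = SVC(kernel="rbf", C=c, gamma=g)
        # Accuracy on each of the 10 folds for this (C, gamma) pair.
        cv_scores = cross_val_score(model, X, y, cv=folds)
        scores_list.append((cv_scores.mean(), c, g))

# Tuples compare by their first element, so max() picks the best mean score.
best_score, best_c, best_gamma = max(scores_list)
print(best_score, best_c, best_gamma)
```

scikit-learn's `GridSearchCV` performs the same exhaustive search in a single call; the explicit loops are kept here to mirror the structure of Algorithm 1.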

SECTION IV.

Results and Discussion

A sensor test was done to find out the response of the e-nose during sample testing [21]. The response generated by the e-nose sensors can be seen in FIGURE 3. Each class is indicated by a different color: Classes 1, 2, 3, 4, 5, 6, and 7 are shown in blue, green, red, cyan, magenta, yellow, and black, respectively. The response is plotted separately for each sensor. Different combinations of beef and pork lead to different gas sensor responses, because the response is influenced by the gas emitted from the meat sample, and different compositions of protein and lipid can produce different gases. The different drawing order of the classes indicates the different response values of each gas sensor, which is a good sign of the capability to detect beef adulteration. For example, FIGURE 3(a) is a graph of the signals generated by Sensor 1 for the 7 classes. In total, 420 signals were recorded, which are stacked by class. These stacks are difficult to tell apart in the images; for example, the grouping will be incorrect when the data from Class 1 are close to those of Class 2. There was also some interference in each signal caused by noise, as can be seen in FIGURE 3(b), 3(c), and 3(h). The most severe noise is found in Sensor 8.

FIGURE 3. Graphic of the raw signal before preprocessing.

Sensor 8 is selective to toluene, acetone, and ethanol. The volatility of these three compounds can cause unstable responses. Therefore, the raw signal has to be optimized by OENS to ensure that the result is appropriate.

A. Results of Proper Noise Filtering

This research used the discrete wavelet transform for noise reduction, using cross-validation to find the best parameters, i.e., the mother wavelet and the decomposition level. TABLE 4 shows that, based on a comparison of the mother wavelets, the db6 wavelet was compatible with the aims of this research. Over 20 experimental runs, db6 at decomposition level 1 gave a satisfactory result. Furthermore, this research also calculated the accuracy of the raw data signal. The result was 87.61%, which means that the accuracy was increased by 1% by employing proper noise filtering using DWT with wavelet db6.

TABLE 4 Wavelet Decomposition Level of Eleven Gas Sensors

The preprocessing result is shown in FIGURE 4. The signal looks smoother and the noise is reduced. As depicted in FIGURE 3(h), the original signal shows significant noise; it was reduced after signal reconstruction by DWT with db6, as can be seen in FIGURE 4(h). After the signal was reconstructed, features were extracted from it by statistical parameter extraction. This research made 10 combinations of statistical parameter extraction. These statistical parameters are used as features, as has been done in previous research [53]. Dimensional reduction is used in this study to see which features or variables affect the detection of pork mixed into beef.

FIGURE 4. Graphic of the raw signal after processing.

B. Optimized Sensor Array

Dimensional reduction is used in this study both to reduce the number of feature dimensions and to optimize the sensor array. In these experiments, the gas sensors produce ten digital outputs, which are considered as variables in PCA. However, before entering PCA, the 10 digital outputs were extracted using several statistical parameter methods. In this manuscript, an example is presented using the mean (M) as the statistical parameter extraction. Because only one extraction parameter is used, only 10 features result. These ten features are used as input to PCA.

This research tried to reduce the number of variables. The first step is to calculate the covariance to reduce the number of dimensions or components. TABLE 5 shows the eigenvalue, proportion of variance, and cumulative variance contributed by each component. The next step is choosing the principal components (PCs) that will be used. If a cumulative variance of 50% does not give significant accumulation, then a cumulative variance of more than 50% is the best option to get a significant result. PC 1 accounted for 57% of the variance. The cumulative variance was 75% with PC 2, 87% with PC 3, 92% with PC 4, 96% with PC 5, 98% with PC 6, 99% with PC 7, and 100% with PC 8. PC 9 and PC 10 were not selected because they did not show a significant contribution. Based on the proportion of variance obtained from the eigenvalues, 8 components had a substantial contribution (PC1, PC2, PC3, PC4, PC5, PC6, PC7, and PC8). The next step was calculating the eigenvector, as shown in TABLE 6. The eigenvector was calculated for each gas sensor based on the PCs obtained previously and sorted from the largest to the smallest. Based on the results of the PCA calculations, the e-nose data for detecting the adulteration of pork in beef used the 8 most dominant components. These eight components together had a proportion of variance of 100%; the highest and most dominant factor was MQ 135, with a proportion of variance of 57%, followed by the MQ 4 factor, with a proportion of variance of 19%, and the MQ 9 factor, with a proportion of variance of 12%. The total variance obtained from the 8 variables was 100%.
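The selection by cumulative variance can be illustrated with the values reported above. The helper `components_needed` and the 95% target are hypothetical and only show how a threshold would translate into a component count; the study itself retained all eight components.

```python
# Cumulative variance (%) for PC 1..PC 8 as reported in the text.
cumulative = [57, 75, 87, 92, 96, 98, 99, 100]


def components_needed(cumulative_percent, target):
    """Smallest number of leading components reaching the target (%)."""
    for k, value in enumerate(cumulative_percent, start=1):
        if value >= target:
            return k
    raise ValueError("target is never reached")


# Per-component proportions recovered by differencing the cumulative values.
proportions = [cumulative[0]] + [
    b - a for a, b in zip(cumulative, cumulative[1:])
]

print(components_needed(cumulative, 95))    # 5
print(components_needed(cumulative, 100))   # 8
print(proportions)                          # [57, 18, 12, 5, 4, 2, 1, 1]
```

Because the reported percentages are rounded, differencing gives 18% for the second component, whereas the text reports 19% for the MQ 4 factor.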

TABLE 5 Result of Eigenvalue Calculation
TABLE 6 Result of Eigenvector Calculation

Besides that, from the eight components that have been selected, this study determines which sensor is the most dominant factor in each component. TABLE 7 shows that in the first component, the dominant factor is S5 or MQ 135. The most significant factor in all components is S1 or MQ 2, which is in component 8. TABLE 8 shows the dimensional reduction results of the ten statistical parameter extraction combinations. For several feature extraction methods, some components can be removed. For example, with the M method, the dimensions can be reduced from 10 to 8 components using the SVM classifier. With the M + ST method, the dimensions can be reduced from 20 to 15. In contrast, the M + SK method does not reduce the dimensions with any of the four classifiers; 20 components are still used. The statistical parameter method that produces the most features is M + ST + SK + K with 40 features, which can be reduced using the ANN classifier. FIGURE 5 shows the data after dimensional reduction using PCA. FIGURE 5(a) and 5(b) show the data before and after feature scaling using Standard Scaler (Z-score) normalization, respectively. The standardization was used to gather the distributed data. It can be inferred from FIGURE 5 that the data from the first class became more clustered compared to the other classes.

TABLE 7 Result of Feature Selection With PCA
TABLE 8 Comparison of the Accuracy of the Reduced Features and Parameter Optimization
FIGURE 5. - Plot diagram of the dimensional reduction result using PCA: (a) the data before normalization; (b) the data after normalization using Standard Scaler.

C. Optimized Support Vector Machine (SVM) Parameters

The algorithm required 16 seconds of execution time to find the optimal SVM parameters from the 420 data samples. The data were divided into training data and testing data using cross-validation. To obtain fair results, this study compared three types of cross-validation, namely 3-fold, 5-fold, and 10-fold. The optimal values found for parameters C and γ were 100 and 0.1, respectively, using 10-fold cross-validation, as shown in TABLE 9. The tests were run 20 times to optimize the parameters. The final step was classification using SVM. As shown in FIGURE 6, all of the data from Classes 1, 6, and 7 were correctly predicted.
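The general shape of such a search, k-fold cross-validation wrapped in a grid over C and γ, can be sketched as follows. This is not the optimization algorithm proposed in the paper, whose details are not reproduced here. The helper names, the candidate grids, and in particular `toy_score` are hypothetical: `toy_score` is an arbitrary stand-in for "train an SVM on the training folds and return its accuracy on the held-out fold", and its optimum has no connection to the reported results.

```python
import math
from itertools import product


def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(score_fn, c_values, gamma_values, folds):
    """Return (mean score, C, gamma) for the best pair on the grid."""
    best = None
    for c, gamma in product(c_values, gamma_values):
        fold_scores = []
        for i, test_idx in enumerate(folds):
            # Every fold except fold i is used for training.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            fold_scores.append(score_fn(c, gamma, train_idx, test_idx))
        mean_score = sum(fold_scores) / len(fold_scores)
        if best is None or mean_score > best[0]:
            best = (mean_score, c, gamma)
    return best


def toy_score(c, gamma, train_idx, test_idx):
    """Hypothetical stand-in for SVM training and scoring; peaks at C=10, gamma=0.01."""
    return 1.0 / (1.0 + abs(math.log10(c) - 1) + abs(math.log10(gamma) + 2))


# Fold sizes for the 420 samples under the three schemes compared in the text.
fold_sizes = {k: [len(f) for f in kfold_indices(420, k)] for k in (3, 5, 10)}
best_score, best_c, best_gamma = grid_search(
    toy_score, [0.1, 1, 10, 100], [0.001, 0.01, 0.1, 1], kfold_indices(420, 10)
)
print(fold_sizes[10], best_c, best_gamma)
```

In practice the sample indices would be shuffled or stratified by class before splitting, since contiguous folds over class-ordered data would leave some classes out of the training set.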

TABLE 9 Comparison of Evaluation Results Using Cross-Validation of the Classification Method
FIGURE 6. - Confusion matrix from SVM classification with optimal parameters.

Meanwhile, for Class 2, 59 samples were predicted correctly and 1 incorrectly; for Class 3, 58 samples were predicted correctly and 2 incorrectly; and for Class 4, 4 samples were predicted incorrectly, 1 involving Class 7 and 3 involving Class 3. Lastly, for Class 5, 59 samples were predicted correctly and 1 incorrectly. In addition, TABLE 10 presents the evaluation results of the SVM with optimal parameters.
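The per-class counts above can be checked against the reported overall accuracy with a short calculation. The sketch assumes 60 samples per class (420 samples over 7 classes, consistent with the 59 correct plus 1 incorrect reported for Class 2), so that Class 4 with 4 errors has 56 correct predictions; the variable names are illustrative.

```python
# Correctly classified samples per class, taken from the description in the text.
# Assumption: 60 samples per class (420 / 7), so Class 4 has 60 - 4 = 56 correct.
SAMPLES_PER_CLASS = 60
correct = {1: 60, 2: 59, 3: 58, 4: 56, 5: 59, 6: 60, 7: 60}

total = SAMPLES_PER_CLASS * len(correct)
total_correct = sum(correct.values())
accuracy = 100.0 * total_correct / total

# Per-class recall: fraction of each class's samples that were predicted correctly.
recall = {cls: n / SAMPLES_PER_CLASS for cls, n in correct.items()}

print(f"{total_correct}/{total} correct, accuracy = {accuracy:.2f}%")
```

Under this assumption, 412 of 420 samples are classified correctly, which reproduces the reported accuracy of 98.10%.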

TABLE 10 Results of Evaluation SVM With Optimal Parameters

Furthermore, this research also compared several classification methods, i.e., artificial neural network (ANN) [54], linear discriminant analysis (LDA), K-nearest neighbors (KNN), and SVM. Without the parameter optimization algorithm, these methods achieved accuracies of 89%, 54%, 87%, and 91%, respectively. SVM with the parameter optimization algorithm, which set C and γ to 100 and 0.1, respectively, yielded the best result (98.10%). In comparison, ANN with the parameter optimization algorithm (relu as activation) achieved 95.48%, KNN with the parameter optimization algorithm (neighbors = 1 and distance as the weight) achieved 93.10%, and LDA with the parameter optimization algorithm achieved 92.86%. These results show that the optimized SVM outperforms the other classifiers. Optimizing the hyperparameter settings yields the best decision boundary for classifying the seven classes of beef and pork mixtures.

SECTION V.

Conclusion

In this study, an OENS was developed, employing 9 gas sensors and producing 10 digital outputs. The noise of the signals was reduced by reconstructing the signals using DWT with mother wavelet db6, which increased classification accuracy by 1%. Using the mean as the statistical parameter method generated 10 features spread over 10 dimensions. PCA successfully reduced the number of components/dimensions from 10 to 8. Together, these 8 components accounted for a proportion of variance of 100%. The highest and most dominant was the MQ 135 factor, with a proportion of variance of 57%, followed by the MQ 4 factor, with 19%, and the MQ 9 factor, with 12%. Finally, the optimization algorithm supported the efficiency of the SVM classification process in obtaining the best solution, which was an accuracy of 98.10% on average. This result indicates that OENS has the potential to be developed for halal authentication and brings it closer to practical application.

For future work, the fingerprint of pork adulteration in beef at smaller proportions will be developed.

This article includes datasets hosted on IEEE DataPort(TM), a data repository created by IEEE to facilitate research reproducibility, or on another IEEE-approved repository.
