Improving k-Nearest Neighbors Model using SMOTE with Bagging Ensemble | IEEE Conference Publication | IEEE Xplore


Abstract:

Water is a primary need, and the quality of drinking water greatly affects human life. The purpose of this research is to classify the quality, or feasibility, of drinking water using machine learning. A single k-Nearest Neighbors (k-NN) classifier often yields low accuracy for a variety of reasons, so this study uses SMOTE (Synthetic Minority Oversampling Technique) to handle imbalanced data and uses k-NN as the base estimator in a Bagging ensemble for classification. The research compares three models: k-NN alone, k-NN with SMOTE, and SMOTE integrated with k-NN as a Bagging estimator (SMOTE + k-NN as Bagging estimator). Performance is evaluated using accuracy, precision, recall, F1-score and ROC-AUC. In classifying the quality or feasibility of drinking water, the single k-NN algorithm produces an average AUC of only 0.75, while SMOTE + k-NN as Bagging estimator produces an average AUC of 0.958. In this case study, the proposed model increases the AUC by 0.208 (20.8 percentage points), so it can be said that the proposed model improves the ability to distinguish the classes of the target.
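
The pipeline named in the abstract (SMOTE oversampling, then k-NN as the base estimator of a Bagging ensemble, scored by ROC-AUC) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not state the library, the value of k, the number of estimators, or whether SMOTE is applied before or inside the bagging step. The sketch assumes binary labels, SMOTE applied once to the training set, k = 3, ten estimators, and a made-up two-dimensional toy dataset. All function names (`smote`, `fit_bagging`, `bagging_proba`, `roc_auc`, `demo`) are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def smote(X, y, minority=1, k=3, rng=None):
    """Balance a binary dataset by adding synthetic minority samples.

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours. Needs >= 2 minority samples.
    """
    rng = rng or random.Random(0)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = max(0, (len(y) - len(minority_idx)) - len(minority_idx))
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        neighbours = sorted((j for j in minority_idx if j != i),
                            key=lambda j: dist2(X[i], X[j]))[:k]
        j = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return X + synthetic, y + [minority] * len(synthetic)


def knn_proba(train_X, train_y, x, k=3):
    """Share of the k nearest training samples that carry label 1."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: dist2(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def fit_bagging(X, y, n_estimators=10, rng=None):
    """Draw one bootstrap sample (with replacement) per k-NN estimator."""
    rng = rng or random.Random(0)
    bags = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        bags.append(([X[i] for i in idx], [y[i] for i in idx]))
    return bags


def bagging_proba(bags, x, k=3):
    """Average the positive-class score over all base estimators."""
    return sum(knn_proba(bx, by, x, k) for bx, by in bags) / len(bags)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative
    (ties count as half a win)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def demo(seed=0):
    """Toy run on made-up 2-D data (not the paper's water-quality data)."""
    rng = random.Random(seed)

    def make(center, n):
        return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
                for _ in range(n)]

    X_train, y_train = make(0.0, 60) + make(2.0, 12), [0] * 60 + [1] * 12
    X_test, y_test = make(0.0, 30) + make(2.0, 30), [0] * 30 + [1] * 30
    # Oversample the training data only; the test set stays untouched.
    X_bal, y_bal = smote(X_train, y_train, rng=rng)
    bags = fit_bagging(X_bal, y_bal, n_estimators=10, rng=rng)
    scores = [bagging_proba(bags, x) for x in X_test]
    return roc_auc(y_test, scores)
```

Calling `demo()` returns the ROC-AUC of the toy ensemble. The sketch oversamples only the training split, so that no synthetic points, which are interpolated from training samples, end up in the evaluation data. In practice, library implementations of SMOTE, bagging and ROC-AUC would normally replace these hand-written helpers.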
Date of Conference: 10-12 October 2023
Date Added to IEEE Xplore: 28 December 2023
Conference Location: Bali, Indonesia

I. Introduction

Water is one of the primary human needs in daily life, used for washing, bathing and consumption. Water can come from many sources, such as rain, rivers and groundwater. In Indonesia, the suitability of water for consumption is regulated by the Regulation of the Minister of Health of the Republic of Indonesia Number 2 of 2023 (Permenkes). This regulation is important because water is easily contaminated, and consuming contaminated water can endanger health. It is therefore important to know the feasibility, or quality, of water before it is consumed [1]–[3]. Determining the feasibility or quality of water requires testing. Two manual methods are commonly used to classify the feasibility of drinking water based on its content: STORET (STOrage and RETrieval) and the Pollution Index. Both methods have weaknesses: they are inefficient in terms of time, manpower and cost, and their results lack accuracy [4], [5].
