I. Introduction
In machine learning and statistics, classification is the task of training a system on a labeled dataset so that it can assign new, unseen instances to the class to which they belong. The volume of available data has grown enormously in recent years, but high-quality labeled data remains scarce. Many traditional machine learning methods assume that the target classes are evenly distributed. This assumption does not hold in several applications, for example weather forecasting [1], disease diagnosis [2], and fraud detection [3], where most instances are labeled with one class and only a few with the other. As a result, models lean toward the majority class and tend to ignore the minority class, so they perform poorly on imbalanced datasets. This is called the class imbalance problem. In such situations, a model can achieve high accuracy while still scoring poorly on other evaluation metrics, such as precision, recall, F1-score [4], and ROC score.
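The gap between accuracy and the other metrics can be illustrated with a small sketch. The example below is hypothetical and not taken from any dataset cited here: it assumes 1000 instances with 10 positives (the minority class) and a trivial classifier that always predicts the majority class. The helper names `confusion_counts` and `metrics` are illustrative, and precision, recall, and F1 are taken to be 0 when their denominators are zero, which is one common convention.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Assumed convention: a metric with a zero denominator is reported as 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 10 minority (1) and 990 majority (0).
y_true = [1] * 10 + [0] * 990
# A trivial model that always predicts the majority class.
y_pred = [0] * 1000

scores = metrics(y_true, y_pred)
print(scores)
```

Under these assumptions the classifier is correct on 990 of 1000 instances, so its accuracy is 0.99, yet it never identifies a single minority instance, so its recall, precision, and F1-score on the minority class are all 0.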