There is currently a surge of interest in adaptive learning algorithms for applications such as predicting ozone level peaks, learning stock market indicators, and detecting smartphone usage patterns. In such scenarios, detecting change (or drift) in the concept being learned is important to ensure that correct, timely, and relevant models are constructed. In addition, such data are often imbalanced and, to further complicate the issue, we are frequently interested in learning the minority class. Ignoring either of these two aspects during learning may lead to unreliable, or even incorrect, models being built. In this research we discuss the interplay between concept drift detection and imbalanced data sets in order to ensure reliable results. We introduce a novel algorithm that, rather than relying on a single performance measure such as accuracy for change detection, considers all the components of a confusion matrix and employs the cosine similarity coefficient. We evaluate our algorithm on a real-world mobile phone database, as well as on benchmark datasets, and compare it with two other state-of-the-art methods. The results show that our approach detects concept drift reliably and is particularly sensitive to drifts occurring in imbalanced data sets. Further, our method performs very well compared to the other techniques, especially when the drift occurs in the minority class of a class imbalance problem.
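To make the central idea concrete, the following is a minimal sketch of how two confusion matrices might be compared with the cosine similarity coefficient. It is not the algorithm from the paper: the function names, the per-class rate normalisation, the windowing, and the threshold value are all assumptions made for illustration only.

```python
import math

def rate_vector(tp, fn, fp, tn):
    """Flatten a binary confusion matrix into per-class rates.

    Normalising each row by its class size is an assumption of this
    sketch, chosen so that the majority class does not dominate.
    """
    pos = tp + fn  # number of minority (positive) examples
    neg = fp + tn  # number of majority (negative) examples
    tpr = tp / pos if pos else 0.0
    fnr = fn / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    tnr = tn / neg if neg else 0.0
    return [tpr, fnr, fpr, tnr]

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # undefined for a zero vector; treated as dissimilar
    return dot / (norm_u * norm_v)

def drift_suspected(reference, current, threshold=0.9):
    """Flag drift when similarity drops below a hypothetical threshold."""
    return cosine_similarity(reference, current) < threshold

# Reference window: 10 minority examples (9 correct), 90 majority (all correct).
reference = rate_vector(tp=9, fn=1, fp=0, tn=90)
# Later window: minority-class recall collapses, majority class unchanged.
current = rate_vector(tp=1, fn=9, fp=0, tn=90)

similarity = cosine_similarity(reference, current)
flagged = drift_suspected(reference, current)
```

In this hypothetical example accuracy only falls from 99% to 91%, while the similarity between the two rate vectors drops well below 1, which illustrates why a measure built on every confusion-matrix component can react more strongly to minority-class drift than accuracy alone.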
Date of Conference: 10-10 Dec. 2012