Abstract:
Anomaly Detection (AD) is a machine learning and data mining technique for identifying patterns, behaviours, or instances in data that differ from most other data. The goal is to discover samples that are inconsistent with expected behaviour, which may be anomalies or outliers. Identifying anomalies in a timely manner can help organisations prevent losses and mitigate potential risks. In this paper, we propose a new algorithm that generates supplementary pseudo-anomalies to compensate for the limited number of labelled anomalies available alongside vast unlabelled data. Our proposed algorithm, called Enhanced Nearest Neighbour Gaussian Mixing (ENNG-Mix), efficiently integrates information from both labelled and unlabelled data to generate pseudo-anomalies. We compare the performance of this new algorithm with commonly used data augmentation techniques such as Mixup and Cutout. We evaluate ENNG-Mix by training various existing semi-supervised and supervised anomaly detection algorithms on the original training data together with the generated pseudo-anomalies. Through extensive experiments on 57 benchmark datasets in ADBench, reflecting different data types, we demonstrate that ENNG-Mix outperforms other data augmentation methods. It yields significant performance improvements compared to a baseline trained only on the original training data. Notably, in ADBench, ENNG-Mix improves performance by 17.6%, 10.8% and 9.2% on the Classical, CV and NLP datasets, respectively.
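As an illustration of the general idea of generating pseudo-anomalies from labelled and unlabelled data, the following is a minimal sketch and not the ENNG-Mix algorithm itself, whose details the abstract does not specify. It assumes, based only on the method's name, that a labelled anomaly is mixed Mixup-style with one of its nearest unlabelled neighbours and then perturbed with Gaussian noise. The function name `nn_gaussian_mix` and the parameters `k`, `alpha` and `sigma` are hypothetical.

```python
import numpy as np


def nn_gaussian_mix(x_anom, x_unlab, n_samples, k=5, alpha=0.2, sigma=0.01, seed=0):
    """Hypothetical sketch: mix labelled anomalies with nearby unlabelled points.

    Assumptions (not taken from the paper): Euclidean nearest neighbours,
    a Beta(alpha, alpha) mixing weight as in Mixup, and isotropic Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    x_anom = np.asarray(x_anom, dtype=float)
    x_unlab = np.asarray(x_unlab, dtype=float)
    k = min(k, len(x_unlab))  # cannot use more neighbours than available points

    out = np.empty((n_samples, x_anom.shape[1]))
    for i in range(n_samples):
        # Pick a labelled anomaly at random.
        a = x_anom[rng.integers(len(x_anom))]
        # Find its k nearest unlabelled points and pick one of them.
        dists = np.linalg.norm(x_unlab - a, axis=1)
        nn_idx = np.argsort(dists)[:k]
        b = x_unlab[rng.choice(nn_idx)]
        # Mixup-style convex combination plus Gaussian noise.
        lam = rng.beta(alpha, alpha)
        out[i] = lam * a + (1.0 - lam) * b + rng.normal(0.0, sigma, size=a.shape)
    return out
```

Under these assumptions, the returned array would be appended to the original training set with the anomaly label before training a semi-supervised or supervised detector.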
Published in: 2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP)
Date of Conference: 19-21 April 2024
Date Added to IEEE Xplore: 12 November 2024