Performance evaluation and anomaly detection in complex systems are time-consuming tasks that rely on the analysis, similarity comparison, and classification of many different data sets from real operations. This paper presents an original computational technology for unsupervised incremental classification of large data sets, built on a specially introduced similarity analysis method. First, so-called compressed data models are obtained from the original large data sets by a newly proposed sequential clustering algorithm. The data sets are then compared pairwise, not directly but through their respective compressed data models. Each pair is evaluated by a special similarity analysis method that uses so-called Intelligent Sensors (Agents) and data potentials. Finally, a classification decision is generated by applying a predefined similarity threshold. The applicability of the proposed computational scheme to anomaly detection across many available large data sets is demonstrated on an example of 18 synthetic data sets. Suggestions for further improving the overall computational technology and broadening its applicability are also discussed.