We present a scalable and effective algorithm, ApproxMGMSP (Approximate Mining of Global Multidimensional Sequential Patterns), for mining multidimensional sequential patterns from large databases in a distributed environment. Our method differs from previous work on mining multidimensional patterns in distributed systems. The main difference is that an approximate mining method is first applied to the large multidimensional sequence database. To reduce the mining of multidimensional sequential patterns to the mining of ordinary sequential patterns, the multidimensional information is embedded into the corresponding sequences. The sequences are then clustered, summarized, and analyzed at the distributed sites, and the local patterns are obtained with an effective approximate sequential pattern mining method. Finally, once all local patterns have been collected at one site, the global multidimensional sequential patterns are mined quickly using a high-vote sequential pattern model. Both theoretical analysis and experiments indicate that this method simplifies the problem of mining multidimensional sequential patterns and avoids mining redundant information. With the communication cost reduced, the scalable method obtains the global sequential patterns effectively.
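The first and last steps of this pipeline can be illustrated with a minimal Python sketch. The abstract does not give the encoding or the vote model in detail, so everything here is an assumption made for illustration: the hypothetical `embed_dimensions` prepends the dimension values to a sequence as an extra leading itemset, and the hypothetical `high_vote_patterns` keeps a pattern when the fraction of sites reporting it reaches a threshold `min_vote`. The clustering, summarization, and approximate local mining steps are not shown.

```python
from collections import Counter


def embed_dimensions(dims, sequence):
    """Embed dimension values into a sequence (assumed encoding).

    The dimension values become one extra leading itemset, so the
    result can be handled as an ordinary sequence of itemsets.
    """
    dim_itemset = frozenset(f"{name}={value}" for name, value in dims.items())
    return [dim_itemset] + [frozenset(itemset) for itemset in sequence]


def high_vote_patterns(local_patterns_per_site, min_vote):
    """Select global patterns by site votes (assumed vote model).

    Each site casts one vote per distinct local pattern. A pattern is
    kept when the fraction of sites voting for it is at least min_vote.
    Patterns must be hashable, e.g. tuples.
    """
    votes = Counter()
    for patterns in local_patterns_per_site:
        votes.update(set(patterns))  # one vote per site per pattern
    n_sites = len(local_patterns_per_site)
    return {p for p, count in votes.items() if count / n_sites >= min_vote}


# Illustrative data only, not taken from the paper.
record = embed_dimensions({"city": "NY", "age": "young"}, [["a"], ["b", "c"]])
site_patterns = [
    [("x", "y"), ("x", "z")],
    [("x", "y")],
    [("x", "y"), ("y", "z")],
]
global_patterns = high_vote_patterns(site_patterns, min_vote=0.5)
```

In this sketch only local patterns travel to the collecting site, never the raw sequences, which is one plausible reading of how the communication cost is kept low.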
Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), Volume 2
Date of Conference: 24-27 Aug. 2007