In this paper we introduce MLFPT (multiple local frequent pattern tree), a new parallel algorithm for mining frequent patterns. MLFPT is based on FP-growth mining: it uses only two full I/O scans of the database, eliminates the need to generate candidate items, and distributes the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near-optimal balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.
Published in:
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Date of Conference: 2001
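
The two-scan idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it simulates the processors sequentially, assumes a simple round-robin partitioning (the paper's actual partitioning strategies are not given in the abstract), and stops after building the local trees rather than mining them. All names (`Node`, `count_items`, `build_local_tree`, `build_local_trees`, `item_support`) are hypothetical.

```python
from collections import Counter


class Node:
    """One node of a frequent pattern tree (hypothetical minimal layout)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def count_items(partition):
    """Scan 1, per worker: count in how many transactions each item occurs."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts


def build_local_tree(partition, rank):
    """Scan 2, per worker: insert frequent items in global frequency order."""
    root = Node(None)
    for transaction in partition:
        items = sorted((i for i in set(transaction) if i in rank),
                       key=rank.__getitem__)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def build_local_trees(transactions, n_workers, min_support):
    """Simulate n_workers sequentially; returns one local tree per worker."""
    # Assumption: round-robin split, chosen only to keep the sketch short.
    partitions = [transactions[i::n_workers] for i in range(n_workers)]
    # Merge the local counts into global counts (end of scan 1).
    global_counts = sum((count_items(p) for p in partitions), Counter())
    frequent = sorted((i for i, c in global_counts.items() if c >= min_support),
                      key=lambda i: (-global_counts[i], i))
    rank = {item: position for position, item in enumerate(frequent)}
    return [build_local_tree(p, rank) for p in partitions]


def item_support(trees, item):
    """Support of an item = sum of its node counts over all local trees."""
    total = 0
    stack = list(trees)
    while stack:
        node = stack.pop()
        if node.item == item:
            total += node.count
        stack.extend(node.children.values())
    return total
```

Calling `build_local_trees(transactions, n_workers, min_support)` on a list of item lists returns one tree per simulated worker, and `item_support` recovers a global support count by summing over the local trees. No candidate sets are generated at any point; infrequent items are simply dropped during the second scan.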