FP-growth is the most famous algorithm for discovering frequent patterns. As the database size grows or the minimum support decreases, however, both the memory requirement and the execution time increase greatly. Many researchers have tried to solve this problem by using distributed computing techniques to improve scalability and execution efficiency. In this paper, we propose a method for discovering frequent patterns from very large databases in cloud computing environments. To build the whole FP-Tree, we use the disk as secondary memory. Because disk access is much slower than main-memory access, we also propose an efficient data structure for storing the FP-Tree on disk and retrieving it. Empirical evaluations under various simulation conditions show that the proposed method delivers excellent performance in terms of scalability and execution time.
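For readers unfamiliar with the FP-Tree mentioned above, the following is a minimal sketch of the standard in-memory, two-pass FP-Tree construction. It is not the disk-based structure proposed in the paper, whose layout the abstract does not describe. The names `FPNode` and `build_fp_tree` are hypothetical, and `min_support` is assumed to be an absolute transaction count.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    """Build an in-memory FP-tree from a list of item lists.

    Assumption: min_support is an absolute count of transactions.
    """
    # First pass: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction's frequent items,
    # most frequent first (ties broken by item order).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, frequent
```

Because transactions that share a prefix share nodes, the tree is usually much smaller than the raw database, but it can still outgrow main memory as the database grows or the minimum support drops, which is the situation the paper addresses.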
Date of Conference: 8-10 Nov. 2011