Previous studies of pattern discovery in storage systems focus on sequential pattern (SP) mining in lower-level traces, but they do not scale well to the application level. Patterns at the application level are mostly Contiguous Item Sequential Patterns (CISPs), which are much simpler than SPs, so mining them with heavyweight SP mining algorithms is inefficient. We propose a novel algorithm, FPG-Grow, that is better suited to mining application-level IO patterns. FPG-Grow scans the original sequence only once to construct a Frequent Pattern Graph (FPG), from which CISPs can be extracted by fetching the frequent sub-graphs at linear cost. Verification is also efficient because it avoids rescanning the original sequence. Furthermore, the grow method eliminates the information loss that sequence cutting introduces in C-Miner. Experimental results show that FPG-Grow significantly outperforms C-Miner when mining real application IO traces, and simulation results demonstrate the effectiveness of CISPs in application IO optimization.
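The one-pass graph construction and the extraction of contiguous patterns can be illustrated with a minimal sketch. The abstract does not specify the FPG's structure, so this sketch assumes it is a directed graph whose edges count how often one item immediately follows another. The function names `build_fpg` and `candidate_cisps`, the `min_support` threshold, and the greedy path-following rule are all hypothetical and are not taken from the paper. The returned paths are only candidates: frequent adjacent pairs do not guarantee that the whole path is frequent, so a verification step would still be needed.

```python
from collections import Counter


def build_fpg(sequence):
    """Scan the sequence once and count every adjacent (a, b) pair as a directed edge.

    Assumption: this edge-count graph stands in for the paper's FPG.
    """
    edges = Counter()
    for a, b in zip(sequence, sequence[1:]):
        edges[(a, b)] += 1
    return edges


def candidate_cisps(edges, min_support):
    """Follow frequent edges to collect candidate contiguous patterns.

    Hypothetical rule: start at items with no frequent predecessor and
    repeatedly follow the most frequent outgoing edge.
    """
    successors = {}
    has_predecessor = set()
    for (a, b), count in edges.items():
        if count >= min_support:
            successors.setdefault(a, []).append((count, b))
            has_predecessor.add(b)

    patterns = []
    for start in successors:
        if start in has_predecessor:
            continue  # not the head of a path
        path, node = [start], start
        while node in successors:
            _, nxt = max(successors[node], key=lambda pair: pair[0])
            if nxt in path:
                break  # stop on a cycle
            path.append(nxt)
            node = nxt
        patterns.append(path)
    return patterns


# Hypothetical trace of IO request IDs: 1, 2, 3 recur contiguously.
trace = [1, 2, 3, 9, 1, 2, 3, 8, 1, 2, 3, 7]
fpg = build_fpg(trace)
print(candidate_cisps(fpg, min_support=3))
```

Both steps touch each item or edge a bounded number of times, which is in the spirit of the single scan and linear extraction cost described above. Paths that consist only of a cycle have no head item and are skipped by this simplified rule.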