Scalable Order-Preserving Pattern Mining | IEEE Conference Publication | IEEE Xplore

Abstract:

Time series are ubiquitous in domains ranging from medicine to marketing and finance. Frequent Pattern Mining (FPM) from a time series has thus received much attention. This general problem has been studied under different matching relations determining whether two time series match or not. Recently, it has been studied under the order-preserving (OP) matching relation stating that a match occurs when two time series have the same relative order (i.e., ranks) on their elements. Thus, a frequent OP pattern captures a trend shared by sufficiently many parts of the input time series. Here, we propose exact, highly scalable algorithms for FPM in the OP setting. Our algorithms employ an OP suffix tree (OPST) as an index to store and query time series efficiently. Unfortunately, there are no practical algorithms for OPST construction. Thus, we first propose a novel and practical $\mathcal{O}(n\sigma\log\sigma)$-time and $\mathcal{O}(n)$-space algorithm for constructing the OPST of a length-$n$ time series over an alphabet of size $\sigma$. We also propose an alternative faster OPST construction algorithm running in $\mathcal{O}(n\log\sigma)$ time using $\mathcal{O}(n)$ space; this algorithm is mainly of theoretical interest. Then, we propose an exact $\mathcal{O}(n)$-time and $\mathcal{O}(n)$-space algorithm for mining all maximal frequent OP patterns, given an OPST. This significantly improves on the state of the art, which takes $\Omega(n^{3})$ time in the worst case. We also formalize the notion of closed frequent OP patterns and propose an exact $\mathcal{O}(n)$-time and $\mathcal{O}(n)$-space algorithm for mining all closed frequent OP patterns, given an OPST. We conducted experiments using real-world, multi-million letter time series showing that our $\mathcal{O}(n\sigma\log\sigma)$-time OPST construction algorithm runs in $\mathcal{O}(n)$ time on these datasets despite the $\mathcal{O}(n\sigma\log\sigma)$ bound; that our frequent pattern mining algorithms are...
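
To make the OP matching relation concrete, the following is a minimal Python sketch, not the OPST-based method of the paper. It checks whether two windows have the same relative order and counts frequent OP patterns of one fixed length by brute force. The names `rank_signature`, `op_match`, and `frequent_op_patterns` are hypothetical, and giving equal values the same rank is an assumption made here for illustration.

```python
from collections import Counter


def rank_signature(window):
    """Replace each element by the rank of its value within the window.

    Assumption for this sketch: equal values share the same rank.
    """
    rank_of = {value: rank for rank, value in enumerate(sorted(set(window)))}
    return tuple(rank_of[value] for value in window)


def op_match(x, y):
    """Return True if x and y have the same length and the same relative order."""
    return len(x) == len(y) and rank_signature(x) == rank_signature(y)


def frequent_op_patterns(series, length, min_support):
    """Brute-force baseline: count rank signatures of all windows of a fixed length.

    This rescans every window and is only meant to illustrate the problem;
    it is not the linear-time mining algorithm described in the abstract.
    """
    counts = Counter(
        rank_signature(series[i:i + length])
        for i in range(len(series) - length + 1)
    )
    return {pattern: count for pattern, count in counts.items() if count >= min_support}


if __name__ == "__main__":
    # Both windows go up and then partly back down, so they OP-match.
    print(op_match([10, 20, 15], [3, 9, 5]))
    print(frequent_op_patterns([1, 3, 2, 5, 4, 7, 6], length=3, min_support=3))
```

In this sketch each pattern is reported as its rank signature, so `(0, 2, 1)` stands for any window whose first element is the smallest and whose second is the largest. The sketch handles a single pattern length per call, and it does not distinguish maximal or closed patterns.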
Date of Conference: 09-12 December 2024
Date Added to IEEE Xplore: 21 February 2025
Conference Location: Abu Dhabi, United Arab Emirates