Abstract:
Time series are ubiquitous in domains ranging from medicine to marketing and finance. Frequent Pattern Mining (FPM) from a time series has thus received much attention. This general problem has been studied under different matching relations determining whether two time series match or not. Recently, it has been studied under the order-preserving (OP) matching relation, which states that a match occurs when two time series have the same relative order (i.e., ranks) on their elements. Thus, a frequent OP pattern captures a trend shared by sufficiently many parts of the input time series. Here, we propose exact, highly scalable algorithms for FPM in the OP setting. Our algorithms employ an OP suffix tree (OPST) as an index to store and query time series efficiently. Unfortunately, there are no practical algorithms for OPST construction. Thus, we first propose a novel and practical $\mathcal{O}(n\sigma\log\sigma)$-time and $\mathcal{O}(n)$-space algorithm for constructing the OPST of a length-$n$ time series over an alphabet of size $\sigma$. We also propose an alternative, faster OPST construction algorithm running in $\mathcal{O}(n\log\sigma)$ time using $\mathcal{O}(n)$ space; this algorithm is mainly of theoretical interest. Then, we propose an exact $\mathcal{O}(n)$-time and $\mathcal{O}(n)$-space algorithm for mining all maximal frequent OP patterns, given an OPST. This significantly improves on the state of the art, which takes $\Omega(n^{3})$ time in the worst case. We also formalize the notion of closed frequent OP patterns and propose an exact $\mathcal{O}(n)$-time and $\mathcal{O}(n)$-space algorithm for mining all closed frequent OP patterns, given an OPST. We conducted experiments using real-world, multi-million-letter time series showing that our $\mathcal{O}(n\sigma\log\sigma)$-time OPST construction algorithm runs in $\mathcal{O}(n)$ time on these datasets despite the $\mathcal{O}(n\sigma\log\sigma)$ bound; that our frequent pattern mining algorithms are...
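To make the OP matching relation concrete, the following is a minimal Python sketch of the definition only. It is not the OPST-based method proposed in the paper: it is a naive baseline that scans every window of one fixed length, and the function names (`rank_signature`, `op_match`, `frequent_op_patterns`) and the parameters `length` and `min_support` are hypothetical names chosen for illustration. It also assumes that the values inside each window are pairwise distinct.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rank_signature(window: Sequence[float]) -> Tuple[int, ...]:
    """Return the rank of each element within the window (0 = smallest).

    Assumption: values in the window are pairwise distinct.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return tuple(ranks)


def op_match(x: Sequence[float], y: Sequence[float]) -> bool:
    """Two series OP-match if they have equal length and the same ranks."""
    return len(x) == len(y) and rank_signature(x) == rank_signature(y)


def frequent_op_patterns(
    series: Sequence[float], length: int, min_support: int
) -> Dict[Tuple[int, ...], List[int]]:
    """Naive baseline (not the paper's algorithm): group all windows of a
    fixed length by rank signature and keep the groups that occur at
    least min_support times. Maps each signature to its start positions.
    """
    occurrences: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for start in range(len(series) - length + 1):
        signature = rank_signature(series[start:start + length])
        occurrences[signature].append(start)
    return {
        sig: starts
        for sig, starts in occurrences.items()
        if len(starts) >= min_support
    }


if __name__ == "__main__":
    # Both windows go "low, high, middle", so they OP-match.
    print(op_match([10, 30, 20], [1, 9, 5]))
    print(frequent_op_patterns([1, 3, 2, 5, 4, 7, 6], length=3, min_support=2))
```

In this sketch a pattern is identified by its rank tuple, so windows with very different absolute values but the same trend fall into the same group. Sorting every window separately is far slower than the linear-time mining described in the abstract; the sketch only illustrates what is being counted.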
Published in: 2024 IEEE International Conference on Data Mining (ICDM)
Date of Conference: 09-12 December 2024
Date Added to IEEE Xplore: 21 February 2025