Similarity search over time-series data using wavelets


Abstract:

Considers the use of wavelet transformations as a dimensionality reduction technique to permit efficient similarity searching over high-dimensional time-series data. While numerous transformations have been proposed and studied, the only wavelet that has been shown to be effective for this application is the Haar wavelet. In this work, we observe that a large class of wavelet transformations (not only orthonormal wavelets but also bi-orthonormal wavelets) can be used to support similarity searching. This class includes the most popular and most effective wavelets being used in image compression. We present a detailed performance study of the effects of using different wavelets on the performance of similarity searching for time-series data. We include several wavelets that outperform both the Haar wavelet and the best-known non-wavelet transformations for this application. To ensure our results are usable by an application engineer, we also show how to configure an indexing strategy for the best-performing transformations. Finally, we identify classes of data that can be indexed efficiently using these wavelet transformations.
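As an illustration of why a wavelet transform can serve as a dimensionality reduction step for similarity search, the sketch below uses the Haar wavelet named in the abstract. It is not the paper's implementation. The function names (`haar_transform`, `reduced_distance`), the coarse-to-fine coefficient ordering, and the random test data are assumptions made for this example. It relies on the standard property that an orthonormal transform preserves Euclidean distance, so a distance computed on only the first k coefficients can never exceed the true distance.

```python
import math
import random


def haar_transform(series):
    """Orthonormal Haar transform of a sequence whose length is a power of two.

    Coefficients are returned coarse to fine: the scaled overall average
    first, then detail coefficients at increasing resolution.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(x) for x in series]
    details = []
    scale = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Each pass yields a coarser level than the last, so its details go in front.
        details = [(a - b) / scale for a, b in pairs] + details
        approx = [(a + b) / scale for a, b in pairs]
    return approx + details


def euclidean(u, v):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def reduced_distance(coeffs_u, coeffs_v, k):
    """Distance computed from only the first k (coarsest) coefficients."""
    return euclidean(coeffs_u[:k], coeffs_v[:k])


# Synthetic data, for illustration only.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(16)]
y = [random.gauss(0, 1) for _ in range(16)]
cx, cy = haar_transform(x), haar_transform(y)
true_d = euclidean(x, y)

# The full transform preserves the distance between the two series.
print(abs(euclidean(cx, cy) - true_d) < 1e-9)
# Every truncated distance is a lower bound on the true distance.
print(all(reduced_distance(cx, cy, k) <= true_d + 1e-9 for k in range(1, 17)))
```

Because the truncated distance is a lower bound, an index built on the first k coefficients can discard candidates whose reduced distance already exceeds the query radius without missing any true match. The remaining candidates still have to be checked against the full series. This exact lower bound is demonstrated here only for the orthonormal Haar case.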
Date of Conference: 26 February 2002 - 01 March 2002
Date Added to IEEE Xplore: 07 August 2002
Print ISBN: 0-7695-1531-2
Print ISSN: 1063-6382
Conference Location: San Jose, CA, USA

1. Introduction

The quantity of data stored in computers is growing rapidly. Much of this data, particularly data collected automatically by sensing or monitoring applications, is time-series data. A time series is a real-valued sequence that represents the status of a single variable over time. The monitored activity can be a process defined by some human activity, like the fluctuations in Microsoft stock closing prices, or a natural process, like Lake Huron historical water levels. The presence of a time component in data is what unifies such diverse data sets and classifies them as time series. Therefore, it is hardly surprising that much research has been devoted recently to the efficient management of time-series data [1], [24], [16], [19], et al.
