A universal predictor based on pattern matching

Authors: Jacquet, P. (Inst. Nat. de Recherche en Inf. et Autom., Le Chesnay, France); Szpankowski, W.; Apostol, I.

We consider a universal predictor based on pattern matching. Given a sequence X_1, ..., X_n drawn from a stationary mixing source, it predicts the next symbol X_{n+1} based on selecting a context of X_{n+1}. The predictor, called the sampled pattern matching (SPM) predictor, is a modification of the Ehrenfeucht-Mycielski (1992) pseudorandom generator algorithm. It predicts the value of the most frequent symbol appearing at the so-called sampled positions. These positions follow the occurrences of a fraction of the longest suffix of the original sequence that has another copy inside X_1 X_2 ··· X_n; that is, in SPM, the context selection consists of taking a certain fraction of the longest match. The study of the longest match for lossless data compression was initiated by Wyner and Ziv in their seminal 1989 paper. Here, we estimate the redundancy of the SPM universal predictor; that is, we prove that the probability that the SPM predictor makes worse decisions than the optimal predictor is O(n^{-ν}) for some 0 < ν < 1/2 as n → ∞. As a matter of fact, we show that we can predict K = O(1) symbols with the same probability of error.
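The prediction rule described in the abstract can be illustrated with a minimal brute-force sketch. This is not the authors' implementation: the parameter name `alpha` (the fraction of the longest match kept as context), its default value, the rounding with `ceil`, and the fallback used when no suffix repeats are all assumptions made here for illustration, and the quadratic search is chosen for readability rather than efficiency.

```python
import math
from collections import Counter


def longest_repeated_suffix_length(seq):
    """Length of the longest suffix of seq that has another copy inside seq.

    The other copy may overlap the suffix but must start earlier.
    """
    n = len(seq)
    best = 0
    for k in range(1, n):
        suffix = seq[n - k:]
        if any(seq[i:i + k] == suffix for i in range(n - k)):
            best = k
        else:
            # If a suffix of length k has no other copy, no longer one does.
            break
    return best


def spm_predict(seq, alpha=0.75, default=None):
    """Sketch of sampled-pattern-matching prediction of the next symbol.

    alpha is a hypothetical name for the fraction of the longest match
    that is kept as the context (an assumption, not taken from the source).
    """
    n = len(seq)
    longest = longest_repeated_suffix_length(seq)
    k = math.ceil(alpha * longest)
    if k == 0:
        # Assumed fallback: no repeated suffix, so use the most frequent symbol.
        return Counter(seq).most_common(1)[0][0] if n else default
    context = seq[n - k:]
    # Sampled positions: the symbols that follow earlier copies of the context.
    followers = Counter(
        seq[i + k] for i in range(n - k) if seq[i:i + k] == context
    )
    return followers.most_common(1)[0][0]
```

For example, on the sequence `"abababab"` the longest suffix with another copy is `"ababab"`; with `alpha=0.75` the context becomes `"babab"`, whose only earlier copy is followed by `"a"`, so that symbol is predicted.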

Published in:

IEEE Transactions on Information Theory (Volume 48, Issue 6)