Consider the normalized cumulative loss of a predictor $F$ on the sequence $x^n=(x_1,\ldots,x_n)$, denoted $L_F(x^n)$. For a set of predictors $\mathcal{G}$, let $L(\mathcal{G},x^n)=\min_{F\in\mathcal{G}}L_F(x^n)$ denote the loss of the best predictor in the class on $x^n$. Given the stochastic process $\mathbf{X}=X_1,X_2,\ldots$, we look at $E\,L(\mathcal{G},X^n)$, termed the competitive predictability of $\mathcal{G}$ on $X^n$. Our interest is in the optimal predictor set of size $M$, i.e., the predictor set achieving $\min_{|\mathcal{G}|\le M}E\,L(\mathcal{G},X^n)$. When $M$ is subexponential in $n$, simple arguments show that $\min_{|\mathcal{G}|\le M}E\,L(\mathcal{G},X^n)$ coincides, for large $n$, with the Bayesian envelope $\min_F E\,L_F(X^n)$. We investigate the behavior, for large $n$, of $\min_{|\mathcal{G}|\le e^{nR}}E\,L(\mathcal{G},X^n)$, which we term the competitive predictability of $\mathbf{X}$ at rate $R$. We show that whenever $\mathbf{X}$ has an autoregressive representation via a predictor with an associated independent and identically distributed (i.i.d.) innovation process, its competitive predictability is given by the distortion-rate function of that innovation process. Indeed, it will be argued that, by viewing $\mathcal{G}$ as a rate-distortion codebook and the predictors in it as codewords allowed to base the reconstruction of each symbol on the past unquantized symbols, this result can be considered the source-coding analog of Shannon's classical result that feedback does not increase the capacity of a memoryless channel. For a general process $\mathbf{X}$, we show that the competitive predictability is lower-bounded by the Shannon lower bound (SLB) on the distortion-rate function of $\mathbf{X}$ and upper-bounded by the distortion-rate function of any (not necessarily memoryless) innovation process through which $\mathbf{X}$ has an autoregressive representation. Thus, the competitive predictability is also precisely characterized whenever $\mathbf{X}$ can be autoregressively represented via an innovation process for which the SLB is tight. The error exponent, i.e., the exponential behavior of $\min_{|\mathcal{G}|\le e^{nR}}\Pr\left(L(\mathcal{G},X^n)>d\right)$, is also characterized for processes that can be autoregressively represented with an i.i.d. innovation process.
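
As an illustrative special case (not stated in the abstract itself, but a standard consequence of the claims above under squared-error loss): if $\mathbf{X}$ admits an autoregressive representation $X_t = F(X^{t-1}) + N_t$ with i.i.d. Gaussian innovations $N_t\sim\mathcal{N}(0,\sigma^2)$, then the SLB is tight for the innovation process, so the competitive predictability at rate $R$ (rate in nats) should evaluate to the Gaussian distortion-rate function:

```latex
% Hedged worked example; the Gaussian-innovation assumption is ours, not the abstract's.
% SLB for squared error: D_{\mathrm{SLB}}(R) = \frac{1}{2\pi e}\, e^{2h(N)} e^{-2R}.
% For N \sim \mathcal{N}(0,\sigma^2): h(N) = \tfrac{1}{2}\ln(2\pi e \sigma^2), and the
% SLB is achieved, so
\min_{|\mathcal{G}|\le e^{nR}} E\, L(\mathcal{G},X^n)
  \;\longrightarrow\; D_N(R) \;=\; \sigma^2 e^{-2R}
  \qquad (n\to\infty).
```

In particular, at $R=0$ (a single predictor) this recovers the Bayesian envelope $\sigma^2$, the one-step prediction error of the optimal predictor.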