Sequential Prediction Over Hierarchical Structures


Abstract:

We study sequential compound decision problems in the context of sequential prediction of real-valued sequences. In particular, we consider finite state (FS) predictors that are constructed based on a hierarchical structure, such as the order-preserving patterns of the sequence history. We define hierarchical equivalence classes by tying certain models at a hierarchy level in a recursive manner in order to mitigate undertraining problems. These equivalence classes defined on a hierarchical structure are then used to construct a super-exponential number of sequential FS predictors based on their combinations and permutations. We then introduce truly sequential algorithms with computational complexity only linear in the pattern length that 1) asymptotically achieve the performance of the best FS predictor or the best linear combination of all the FS predictors in an individual-sequence manner, without any stochastic assumptions, over any data length n under a wide range of loss functions; and 2) achieve the mean square error of the best linear combination of all FS filters or predictors in the steady state for certain nonstationary models. We illustrate the superior convergence and tracking capabilities of our algorithm with respect to several state-of-the-art methods in the literature through simulations over synthetic and real benchmark data.
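
The construction itself is not included in this excerpt, but a minimal sketch may help make the state definition concrete. The Python below maps a window of past samples to its order-preserving pattern (its rank vector) and uses that pattern as the state of a single FS predictor; the names order_pattern and PatternPredictor, the window length h, and the running-mean per-state rule are illustrative assumptions, not the paper's construction.

```python
from collections import defaultdict
import numpy as np

def order_pattern(window):
    # Rank vector of the window, e.g. [3.1, 0.5, 2.0] -> (2, 0, 1):
    # windows whose samples arrive in the same relative order are
    # mapped to the same finite state (FS) predictor state.
    return tuple(int(r) for r in np.argsort(np.argsort(window)))

class PatternPredictor:
    """One FS predictor: the state is the order-preserving pattern of
    the last h samples, and the per-state prediction is a running mean
    of the samples that followed that pattern so far (a simple
    stand-in for the paper's per-state sequential predictors)."""

    def __init__(self, h):
        self.h = h
        self.sums = defaultdict(float)   # per-state sum of next samples
        self.counts = defaultdict(int)   # per-state visit counts

    def _state(self, past):
        return order_pattern(past[-self.h:])

    def predict(self, past):
        s = self._state(past)
        return self.sums[s] / self.counts[s] if self.counts[s] else 0.0

    def update(self, past, x):
        s = self._state(past)
        self.sums[s] += x
        self.counts[s] += 1
```

In this picture, the hierarchical tying described in the abstract would pool the statistics of fine patterns into coarser equivalence classes (e.g., patterns that agree on their most recent samples), so that rarely visited states are not undertrained; the combinations and permutations of such tied levels are what produce the super-exponential family of FS predictors.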
Published in: IEEE Transactions on Signal Processing (Volume: 64, Issue: 23, 01 December 2016)
Page(s): 6284 - 6298
Date of Publication: 08 September 2016

I. Introduction

We investigate sequential compound decision problems that arise in several different signal processing [1]–[5], information theory [6], [7], and machine learning applications [8]–[11]. In particular, we sequentially observe a real-valued sequence $\{x_t\}_{t \geq 1}$ and produce a decision (or an action) $\hat{x}_t$ at each time $t$ as our output based on the past $x_1, \ldots, x_{t-1}$. We then suffer a loss based on this output when the true $x_t$ is revealed, and our goal is to minimize the (weighted) accumulated or expected loss as much as possible while using a limited amount of information from the past. As an example, in the well-known sequential prediction problem under the square error loss, the output at time $t$ corresponds to an estimate $\hat{x}_t$ of the next data point, where the algorithm suffers the loss $(x_t - \hat{x}_t)^2$ after the true value, i.e., $x_t$, is revealed. The algorithm can then adjust itself in order to reduce the future losses. This generic setup models a wide range of problems in various different applications, ranging from adaptive filtering [12], channel equalization [13], and repeated game playing [9] to online compression by sequential probability assignment [14], [15].
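
To make this protocol concrete, the sketch below runs the sequential loop under the square error loss and combines a pool of experts with standard exponential weighting. It only illustrates the setup and is not the paper's linear-complexity algorithm; sequential_mixture, the learning rate eta, and the predict/update expert interface are assumptions introduced here.

```python
import numpy as np

def sequential_mixture(x, experts, eta=0.5):
    """Sequential prediction under square error loss: at each time t,
    every expert predicts x[t] from x[:t], the mixture outputs a
    weighted average, suffers (x[t] - xhat)^2 once x[t] is revealed,
    and reweights the experts exponentially in their losses."""
    w = np.ones(len(experts)) / len(experts)     # uniform initial weights
    total_loss = 0.0
    for t in range(1, len(x)):
        preds = np.array([e.predict(x[:t]) for e in experts])
        xhat = float(w @ preds)                  # combined output at time t
        total_loss += (x[t] - xhat) ** 2         # loss after x[t] is revealed
        w *= np.exp(-eta * (x[t] - preds) ** 2)  # penalize lossy experts
        w /= w.sum()                             # renormalize the weights
        for e in experts:
            e.update(x[:t], x[t])
    return total_loss

# Example, reusing the PatternPredictor sketched after the abstract:
# rng = np.random.default_rng(0)
# data = np.sin(np.arange(1000) / 10.0) + 0.1 * rng.standard_normal(1000)
# print(sequential_mixture(data, [PatternPredictor(h) for h in (1, 2, 3)]))
```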
