Classification With Finite Memory Revisited

Author(s):
Ziv, J.; Dept. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa

We consider the class of strong-mixing probability laws with positive transitions that are defined on doubly infinite sequences over a finite alphabet A. A device called the classifier (or discriminator) observes a training sequence whose probability law Q is unknown. The classifier's task is to consider a second probability law P and decide whether P = Q, or whether P and Q are sufficiently different according to some appropriate criterion Δ(Q,P) > Δ. If the classifier has an infinite amount of training data available, this is a simple matter. Here, however, we study the case where the amount of training data is limited to N letters. We define a function N_Δ(Q‖P), which quantifies the minimum sequence length needed to distinguish Q from P, and the class M(N_Δ) of all pairs of probability laws (Q,P) that satisfy N_Δ(Q‖P) ≤ N_Δ for some given positive number N_Δ. It is shown that every pair (Q,P) of probability laws that are sufficiently different according to the Δ criterion is contained in M(N_Δ). We demonstrate that for any universal classifier there exists some Q for which the probability of classification error λ(Q) = 1 for some N-sequence emerging from Q and some P : (Q,P) ∈ M°(N_Δ), Δ(Q,P) > Δ, if N < N_Δ. Conversely, we introduce a classification algorithm that is essentially optimal in the sense that for every (Q,P) ∈ M(N_Δ), the probability of classification error λ(Q) vanishes uniformly with N for every P : (Q,P) ∈ M°(N_Δ) if N ≥ N_Δ^(1 + O(log log N_Δ / log N_Δ)). The proposed algorithm finds the largest empirical conditional divergence over a set of contexts that appear in the tested N-sequence. The computational complexity of the classification algorithm is O(N²(log N)³). We also introduce a second, simplified context classification algorithm with a computational complexity of only O(N(log N)⁴) that is efficient in the sense that for every pair (Q,P) ∈ M(N_Δ), the pairwise probability of classification error λ(Q,P) vanishes with N if N ≥ N_Δ^(1 + O(log log N_Δ / log N_Δ)). Conversely, λ(Q,P) = 1 for at least some (Q,P) ∈ M(N_Δ) if N < N_Δ.
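As an illustration of the kind of statistic the algorithm maximizes, the following Python sketch computes the largest empirical conditional divergence between a training sequence and a candidate law P over fixed-length contexts. It is only a minimal sketch under stated assumptions: the function name, the P_cond dictionary interface, the fixed context_len, and the probability floor are hypothetical choices made here for illustration; the paper's algorithm selects its context set and thresholds far more carefully so as to achieve the stated complexity and error guarantees.

from collections import Counter, defaultdict
from math import log

def max_empirical_conditional_divergence(train_seq, P_cond, context_len=2):
    # Gather empirical conditional letter counts of the training sequence,
    # per context of length context_len (a hypothetical, fixed choice).
    counts = defaultdict(Counter)
    for i in range(context_len, len(train_seq)):
        ctx = tuple(train_seq[i - context_len:i])
        counts[ctx][train_seq[i]] += 1

    # Largest empirical conditional divergence D(Q_hat(.|ctx) || P(.|ctx))
    # over the contexts that actually appear in the training sequence.
    best = 0.0
    for ctx, letter_counts in counts.items():
        total = sum(letter_counts.values())
        d = 0.0
        for letter, c in letter_counts.items():
            q_hat = c / total
            p = P_cond.get((ctx, letter), 1e-12)  # floor to avoid division by zero
            d += q_hat * log(q_hat / p)
        best = max(best, d)
    return best

# Hypothetical usage: declare that P differs from Q when the statistic
# exceeds a threshold corresponding to the criterion Δ(Q,P) > Δ:
#   decision = max_empirical_conditional_divergence(train_seq, P_cond) > delta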

Published in:

IEEE Transactions on Information Theory (Volume: 53, Issue: 12)

Date of Publication:

Dec. 2007
