On the evaluation of independent binary features (Corresp.)

3 Author(s)

248 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-24, NO. 2, MARCH 1978

automaton that is close to optimal and eliminates the need for artificial randomization was also provided. This automaton is close to optimal in the sense that it requires at most 2 extra bits of memory, independent of m, to match the performance of the optimal randomized m-state automaton for all p_A and p_B.

Both the problems studied here, however, involve only 2 coins. How to extend the results of this paper to situations where more than 2 coins are involved is an open question. Some ad hoc expedient automata are available in the literature [6], [7]. Before an optimal solution to the many-armed bandit problem is possible, the problem of multiple hypothesis testing with finite memory needs to be solved. For some recent results concerning this problem, see [13].

Further, finite time finite memory solutions to these problems are of interest. Vasilev [14] and Witten [15] studied the finite time behavior of some solutions to the TABP. No optimal solution, however, is available. Some recent progress has been reported by Cover et al. [16].

ACKNOWLEDGMENT

The authors thank the referees for comments which helped to improve the paper.

APPENDIX

Denote by p(a; p_A, p_B) the asymptotic proportion of heads achieved, given the coins A and B and the automaton a. Even though p(a; p_A, p_B) is maximized over all m-state automata if and only if r(a; p_A, p_B) is maximized, maximizing {inf p(a; p_A, p_B)} is not necessarily equivalent to maximizing {inf r(a; p_A, p_B)}, where the infimum is over {(p_A, p_B)}. In fact, for the TABPO, where p_B is known precisely, an automaton that maximizes {inf p(a; p_A, p_B)} tosses coin B exclusively. Furthermore, this automaton is not even expedient, and thus in some sense this solution is unsatisfactory.

REFERENCES

[1] H. Robbins, "Some aspects of the sequential design of experiments," Bull. Am. Math. Soc., vol. 58, pp. 527-535, 1952.
[2] H. Robbins, "A sequential decision problem with a finite memory," Proc. Nat'l. Acad. Sci., vol. 42, pp. 920-923, 1956.
[3] I. H. Witten, "The apparent conflict between estimation and control--A survey of the two-armed bandit problem," J. Franklin Institute, vol. 301, no. 1-2, pp. 161-190, Jan.-Feb. 1976.
[4] T. Cover and M. E. Hellman, "The two-armed bandit problem with time-invariant finite memory," IEEE Trans. Inform. Theory, vol. IT-16, no. 2, pp. 185-195, Mar. 1970.
[5] M. E. Hellman and T. Cover, "Learning with finite memory," Ann. Math. Stat., vol. 41, pp. 765-782, June 1970.
[6] M. L. Tsetlin, Automaton Theory and Modeling of Biological Systems. New York: Academic, 1973.
[7] K. S. Fu and T. J. Li, "Formulation of learning automata and automata games," Information Sciences, vol. 1, no. 3, pp. 237-256, July 1969.
[8] H. Chernoff, "Approaches in sequential design of experiments," in Statistical Design and Linear Models, J. N. Srivastava (Ed.). New York: American Elsevier, 1975, pp. 67-90.
[9] S. J. Yakowitz, Mathematics of Adaptive Control Processes. New York: American Elsevier, 1969.
[10] M. H. DeGroot, Optimal Statistical Decisions. New York: McGraw-Hill, 1970, Ch. 14.
[11] K. B. Lakshmanan and B. Chandrasekaran, "Compound hypothesis testing with finite memory," submitted for publication.
[12] A. A. Milyutin, "On automata with optimal expedient behavior in stationary media," Automation and Remote Control, vol. 26, pp. 116-131, 1965.
[13] B. Chandrasekaran and K. B. Lakshmanan, "Multiple hypothesis testing with finite memory," Cybernetics and Information Science, 1977, to appear.
[14] N. B. Vasilev and I. I. Pyatetskii-Shapiro, "The time for an automaton to adapt to the external medium," Automation and Remote Control, pp. 1100-1103, 1967.
[15] I. H. Witten, "Finite time performance of some two-armed bandit controllers," IEEE Trans. Systems, Man, Cybern., vol. SMC-3, no. 3, pp. 194-197, Mar. 1973.
[16] T. Cover, M. A. Freedman, and M. E. Hellman, "Optimal finite memory learning al

Published in:

Information Theory, IEEE Transactions on (Volume: 24, Issue: 2)