Consider a controlled Markov chain whose transition probabilities depend upon an unknown parameter $\alpha$ taking values in a finite set $A$. To each $\alpha$ is associated a prespecified stationary control law $\phi(\alpha)$. The adaptive control law selects at each time $t$ the control action indicated by $\phi(\alpha_t)$, where $\alpha_t$ is the maximum likelihood estimate of $\alpha$. It is shown that $\alpha_t$ converges to a parameter $\alpha^*$ such that the "closed-loop" transition probabilities corresponding to $\alpha^*$ and $\phi(\alpha^*)$ are the same as those corresponding to $\alpha_0$ and $\phi(\alpha^*)$, where $\alpha_0$ is the true parameter. The situation when $\alpha_0$ does not belong to the model set $A$ is briefly discussed.
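The adaptive scheme described above can be illustrated with a minimal Python sketch. It is not the paper's construction, only a toy instance under stated assumptions: the two-state chain, the parameter labels `"a"` and `"b"`, the layout `P[param][action][state]`, the tie-breaking rule (the first parameter in the model set wins, which also serves as the initial estimate), and the function names are all hypothetical. The example is chosen so that the two parameters are indistinguishable under the control law associated with `"a"`; the maximum likelihood estimate then stays at `"a"` even though the true parameter is `"b"`, yet the closed-loop transition probabilities coincide, which is the kind of limit the convergence result allows.

```python
import math
import random


def simulate_ml_adaptive_control(P, phi, true_param, steps, x0=0, seed=0):
    """Run the maximum likelihood adaptive control loop on a finite model set.

    P[param][action][state] is the row of next-state probabilities.
    phi[param][state] is the action chosen by the stationary law for param.
    Returns the sequence of maximum likelihood estimates.
    """
    rng = random.Random(seed)
    params = list(P.keys())
    log_lik = {a: 0.0 for a in params}
    estimate = params[0]  # assumption: initial guess is the first parameter
    x = x0
    estimates = []
    for _ in range(steps):
        # Apply the action that the law for the current estimate prescribes.
        u = phi[estimate][x]
        # The chain actually evolves according to the true parameter.
        row = P[true_param][u][x]
        x_next = rng.choices(range(len(row)), weights=row)[0]
        # Update the log-likelihood of every candidate parameter.
        for a in params:
            p = P[a][u][x][x_next]
            log_lik[a] += math.log(p) if p > 0 else -math.inf
        # Maximum likelihood estimate; ties go to the earliest parameter.
        estimate = max(params, key=lambda a: log_lik[a])
        estimates.append(estimate)
        x = x_next
    return estimates


def closed_loop_matches(P, phi, param_star, param_0, tol=1e-12):
    """Check whether param_star and param_0 give the same transition
    probabilities when the control law associated with param_star is used."""
    for x, u in enumerate(phi[param_star]):
        for p_star, p_0 in zip(P[param_star][u][x], P[param_0][u][x]):
            if abs(p_star - p_0) > tol:
                return False
    return True


# Hypothetical model set: the two parameters agree under action 0
# and differ only under action 1.
P = {
    "a": {0: [[0.5, 0.5], [0.5, 0.5]], 1: [[0.9, 0.1], [0.9, 0.1]]},
    "b": {0: [[0.5, 0.5], [0.5, 0.5]], 1: [[0.1, 0.9], [0.1, 0.9]]},
}
# The law for "a" always plays action 0; the law for "b" always plays action 1.
phi = {"a": [0, 0], "b": [1, 1]}

estimates = simulate_ml_adaptive_control(P, phi, true_param="b", steps=200)
```

In this toy case only action 0 is ever applied, so both likelihoods stay equal and the estimate never leaves `"a"`. `closed_loop_matches(P, phi, "a", "b")` holds, while `closed_loop_matches(P, phi, "b", "a")` does not, so the estimate need not equal the true parameter even though the closed-loop behaviour is identical.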