We consider a controlled Markov chain with finite state space, whose transition probabilities are assumed to depend linearly upon an unknown real parameter α. In particular, we study the asymptotic behavior of the maximum likelihood estimate of α under an arbitrary realizable control when α is known to belong to a given bounded interval on the line. We show that the estimate converges with probability one and characterize those realizations for which convergence does not lead to the true value. We also suggest corrections to the control policy which guarantee almost sure convergence to the true value. For the adaptive situation, where the control depends only on the current estimate of α and the present state, we show that the maximum likelihood estimate converges to a value α* indistinguishable from the true one under the feedback law induced by α*.
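The setting can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a hypothetical two-state, two-control chain whose transition probabilities have the form P(u; α) = A[u] + α·B[u], with α restricted to the interval [0, 1]. The matrices `A` and `B`, the interval bounds, the randomized (non-adaptive) control, and the ternary search are all illustrative assumptions. The search relies only on the fact that the log-likelihood is a sum of logarithms of functions affine in α, and is therefore concave in α.

```python
import math
import random

# Hypothetical model (not from the paper): P(u; alpha) = A[u] + alpha * B[u].
# Rows of A[u] sum to 1 and rows of B[u] sum to 0, so each row of P sums to 1.
A = {0: [[0.5, 0.5], [0.5, 0.5]], 1: [[0.4, 0.6], [0.6, 0.4]]}
B = {0: [[0.3, -0.3], [-0.3, 0.3]], 1: [[0.2, -0.2], [-0.2, 0.2]]}
ALPHA_LO, ALPHA_HI = 0.0, 1.0  # assumed known bounded interval for alpha


def prob(i, u, j, alpha):
    """Probability of moving from state i to state j under control u."""
    return A[u][i][j] + alpha * B[u][i][j]


def log_likelihood(counts, alpha):
    """Log-likelihood of alpha given transition counts keyed by (i, u, j)."""
    return sum(n * math.log(prob(i, u, j, alpha))
               for (i, u, j), n in counts.items())


def mle(counts, lo=ALPHA_LO, hi=ALPHA_HI, iters=100):
    """Maximize the concave log-likelihood over [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(counts, m1) < log_likelihood(counts, m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def simulate(alpha, steps, seed=0):
    """Run the chain under a uniformly random control; return the counts."""
    rng = random.Random(seed)
    counts = {}
    x = 0
    for _ in range(steps):
        u = rng.randrange(2)  # illustrative control choice, not adaptive
        j = 0 if rng.random() < prob(x, u, 0, alpha) else 1
        counts[(x, u, j)] = counts.get((x, u, j), 0) + 1
        x = j
    return counts
```

Calling `mle(simulate(0.6, 20000))` returns an estimate of the parameter used in the simulation. Because the random control in this sketch keeps exciting both controls, the example does not exhibit the identifiability issue of the adaptive case, where the estimate may settle on a value α* that differs from the true one yet induces the same closed-loop transition probabilities.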
Date of Publication: Apr 1982