A hidden Markov model (HMM) consisting of a controlled Markov chain and a binary output is considered. The binary output reports the occurrence of a certain state (versus any other state) in the Markov chain. The transition matrix of the Markov chain depends on a control vector, which is to be chosen to minimize a discounted infinite-horizon cost function. The information available for decision making is the entire history of the binary output, which provides only incomplete knowledge of the Markov chain and, as a consequence, leads to a stochastic control problem with partial observations. The solution to such a problem is a control policy consisting of two components: an estimator that derives the posterior probability distribution of the Markov chain state from the observation history, and a nonlinear control law that maps this estimate into the control vector. This paper introduces a nonlinear state-space representation for the estimator, develops analytical expressions for the control law, and presents numerical methods for efficient computation of the optimal control. Application of these results to optimal management of a class of inventory systems is discussed, and the performance of the optimal control for this application is examined by numerical simulations.
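The estimator component described above is a recursive Bayesian filter: at each step the current posterior (belief) over the chain's states is propagated through the control-dependent transition matrix, then corrected by the binary output. The following sketch illustrates one such predict/correct step under simplifying assumptions not drawn from the paper: the state count, the particular transition matrix, and the assumption that the binary output is a noise-free indicator of the designated state are all hypothetical, chosen only to make the recursion concrete.

```python
def belief_update(belief, P, y, flagged=0):
    """One step of the recursive estimator (illustrative sketch).

    belief  -- current posterior over states (sums to 1)
    P       -- transition matrix for the chosen control, P[i][j] = Pr(i -> j)
    y       -- binary output: 1 iff the chain occupies the flagged state
    flagged -- index of the state the binary output reports (assumption)
    """
    n = len(belief)
    # Prediction step: push the belief through the controlled transition matrix.
    predicted = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction step: under the (assumed) noise-free binary output, states
    # inconsistent with the observation get zero likelihood.
    likelihood = [1.0 if (j == flagged) == (y == 1) else 0.0 for j in range(n)]
    unnorm = [likelihood[j] * predicted[j] for j in range(n)]
    z = sum(unnorm)  # normalizing constant
    return [w / z for w in unnorm]

# Illustrative 3-state chain; P_u stands in for the transition matrix
# induced by one particular choice of control vector.
P_u = [[0.7, 0.2, 0.1],
       [0.3, 0.5, 0.2],
       [0.2, 0.3, 0.5]]
b = [1 / 3, 1 / 3, 1 / 3]          # uniform prior
b = belief_update(b, P_u, y=0)     # output reports: not in the flagged state
```

A control law would then map the updated belief `b` (the information state) into the next control vector, which in turn selects the transition matrix used at the following step.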