Skip to Main Content
This technical note deals with the mean variance problem (known as the average variance (AV) minimization problem) for finite continuous time Markov decision processes. We first introduce a so called G-condition which is weaker than the well known ergodicity and unichain conditions and sufficient for the finiteness of the AV of a policy. Also, we present an example of a policy having infinite AV when the G-condition is not satisfied. Under the G-condition we prove that the AV criterion can be transformed into an equivalent mean (or expected) average criterion by using a martingale technique and an observation from the canonical form of a transition rate matrix, and thus the existence and calculation of an AV minimal policy over a class of mean optimal policies are obtained by a policy iteration algorithm in an finite number of iterations. As byproduct, we obtain some interesting new results about the mean average optimality.