A Markov decision process (MDP) is a framework for optimal sequential decision-making, based on the theory of Markov processes for stochastic dynamical systems. It is also a theoretical tool for studying multi-stage decision optimization problems in random environments. Because MDPs have a wide range of applications, developing a Markov decision process toolbox is of great significance for the scientific computing software SCILAB. MDPs are evaluated under three main optimality criteria: the expected total reward criterion, the discounted criterion, and the average criterion. Finally, the effectiveness of the method is tested using a toy manufacturer as an example.
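To make the discounted criterion concrete, the following is a minimal sketch of value iteration on a two-state, two-action problem in the spirit of a toy manufacturer. It is written in Python with NumPy rather than SCILAB. The function name `value_iteration`, the discount factor, and all transition probabilities and rewards are illustrative assumptions; none of them are taken from the toolbox or the experiment described here.

```python
import numpy as np

# Hypothetical toy-manufacturer data (illustrative numbers, not from the paper).
# States: 0 = successful toy, 1 = unsuccessful toy.
# P[a, s, s2] is the probability of moving from state s to s2 under action a.
P = np.array([
    [[0.5, 0.5], [0.4, 0.6]],   # action 0
    [[0.8, 0.2], [0.7, 0.3]],   # action 1
])
# R[a, s] is the expected immediate reward of taking action a in state s.
R = np.array([
    [6.0, -3.0],                # action 0
    [4.0, -5.0],                # action 1
])


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Approximate the optimal discounted value function and a greedy policy.

    P has shape (A, S, S), R has shape (A, S), and 0 <= gamma < 1.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy


V, policy = value_iteration(P, R, gamma=0.9)
print("values:", V)
print("policy:", policy)
```

The expected total reward criterion corresponds to dropping the discount (finite horizon or absorbing models), and the average criterion needs a different algorithm such as relative value iteration; neither is covered by this sketch.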