Abstract:
The one-step policy improvement (OSPI) method, which performs a single policy improvement step over a tractable base policy, is a well-known tool for designing effective, low-complexity heuristic policies for intractable queueing routing control models. The policy improvement step calls for solving the Poisson equations of the one-dimensional Markov chains arising from single-queue submodels, which often reduces to applying a simple recursion. This paper shows that the latter task may be hindered in practice by severe numerical instability, due to an extremely sensitive dependence on a key parameter. Such a phenomenon is identified in a model for dynamic control of admission and routing of jobs with firm deadlines to parallel queues. Further, an approach that partially overcomes this instability and approximately computes the OSPI policy is proposed and tested.
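The instability mechanism the abstract alludes to can be illustrated with a toy forward recursion (this is a hypothetical sketch, not the paper's model): when the recursion admits a geometrically growing homogeneous solution, a tiny perturbation in a parameter is amplified exponentially over the iterations.

```python
# Toy illustration (hypothetical, not the paper's model): the forward
# recursion x_{n+1} = a * x_n - b has a homogeneous solution growing
# like a**n when a > 1, so a tiny error in the parameter a is
# amplified exponentially -- the kind of sensitive parameter
# dependence that can destabilize recursive Poisson-equation solvers.

def run_recursion(a, x0=1.0, b=0.5, n=50):
    """Iterate x <- a*x - b for n steps from x0 and return the result."""
    x = x0
    for _ in range(n):
        x = a * x - b
    return x

base = run_recursion(1.10)
perturbed = run_recursion(1.10 + 1e-8)  # parameter off by only 1e-8
print(abs(perturbed - base))  # the 1e-8 perturbation is greatly magnified
```

With a = 1.1 and 50 steps, the 1e-8 parameter perturbation grows by a factor of roughly n * a**(n-1), several orders of magnitude, which is why such recursions can be unusable in practice without stabilization.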
Published in: 2014 7th International Conference on NETwork Games, COntrol and OPtimization (NetGCoop)
Date of Conference: 29-31 October 2014
Date Added to IEEE Xplore: 08 June 2017
Conference Location: Trento, Italy