Probabilistic Framework of Howard's Policy Iteration: BML Evaluation and Robust Convergence Analysis


Abstract:

This article aims to build a probabilistic framework for Howard's policy iteration algorithm using the language of forward–backward stochastic differential equations (FBSDEs). As opposed to conventional formulations based on partial differential equations, our FBSDE-based formulation can be easily implemented by optimizing criteria over sample data and is, therefore, less sensitive to the state dimension. In particular, both on-policy and off-policy evaluation methods are discussed by constructing different FBSDEs. The backward-measurability-loss criterion is then proposed for solving these equations. By choosing specific weight functions in the proposed criterion, we can recover the popular deep BSDE method or the martingale approach for BSDEs. The convergence results are established under both ideal and practical conditions, depending on whether the optimization criteria are decreased to zero. In the ideal case, we prove that the policy sequences produced by the proposed FBSDE-based algorithms and the standard policy iteration have the same performance and, thus, have the same convergence rate. In the practical case, the proposed algorithm is still proved to converge robustly under mild assumptions on optimization errors.
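For readers unfamiliar with the "standard policy iteration" that the abstract uses as its baseline, the following is a minimal sketch of Howard's policy iteration in its simplest setting: a discrete-time, finite-state, discounted Markov decision process. It is not the article's FBSDE-based algorithm, which concerns continuous-time problems and sample-based evaluation. The function name `policy_iteration`, the array layouts of `P` and `r`, the discount factor `gamma`, and the two-state toy problem are all assumptions made for illustration.

```python
import numpy as np


def policy_iteration(P, r, gamma=0.9, max_iter=100):
    """Howard's policy iteration for a finite discounted MDP (illustrative).

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    r: array of shape (S, A), r[s, a] = one-step reward.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # arbitrary initial policy
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system (I - gamma * P_pi) v = r_pi.
        P_pi = P[policy, states]   # row s is P[policy[s], s, :]
        r_pi = r[states, policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to the evaluated value.
        q = r + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = np.argmax(q, axis=1)
        if np.array_equal(new_policy, policy):
            break  # policy is stable, so v is its value function
        policy = new_policy
    return policy, v


# Hypothetical toy problem: action 0 stays put, action 1 switches state;
# only "stay in state 1" earns a reward of 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
r = np.array([[0.0, 0.0],
              [1.0, 0.0]])
policy, v = policy_iteration(P, r, gamma=0.9)
print(policy, v)
```

In this sketch the evaluation step is an exact linear solve. The article's contribution, as described in the abstract, lies in replacing the evaluation step with criteria optimized over sample data and in analyzing convergence when those criteria are not driven exactly to zero.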
Published in: IEEE Transactions on Automatic Control (Volume: 69, Issue: 8, August 2024)
Page(s): 5200 - 5215
Date of Publication: 20 December 2023