Probabilistic Framework of Howard's Policy Iteration: BML Evaluation and Robust Convergence Analysis

Abstract:

This article aims to build a probabilistic framework for Howard's policy iteration algorithm using the language of forward–backward stochastic differential equations (FBSDEs). As opposed to conventional formulations based on partial differential equations, our FBSDE-based formulation can be easily implemented by optimizing criteria over sample data and is, therefore, less sensitive to the state dimension. In particular, both on-policy and off-policy evaluation methods are discussed by constructing different FBSDEs. The backward-measurability-loss criterion is then proposed for solving these equations. By choosing specific weight functions in the proposed criterion, we can recover the popular deep BSDE method or the martingale approach for BSDEs. The convergence results are established under both ideal and practical conditions, depending on whether the optimization criteria are decreased to zero. In the ideal case, we prove that the policy sequences produced by the proposed FBSDE-based algorithms and the standard policy iteration have the same performance and, thus, have the same convergence rate. In the practical case, the proposed algorithm is still proved to converge robustly under mild assumptions on optimization errors.
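
For readers unfamiliar with the "standard policy iteration" that the abstract uses as its baseline, the following is a minimal sketch of Howard's evaluate-then-improve loop on a finite, discrete-time Markov decision process. It is only a discrete analogue for illustration, not the article's continuous-time FBSDE-based algorithm and not the BML criterion. The toy model (`P`, `R`, `gamma`) and all function names are assumptions introduced here.

```python
# Illustrative sketch only: classical policy iteration on a tiny finite MDP.
# The article itself works in continuous time with FBSDEs; nothing below is
# taken from it. P, R, gamma and all function names are hypothetical.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def evaluate(policy, P, R, gamma):
    """Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly."""
    n = len(policy)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][policy[i]][j]
          for j in range(n)] for i in range(n)]
    b = [R[i][policy[i]] for i in range(n)]
    return solve_linear(A, b)

def improve(v, P, R, gamma):
    """Policy improvement: act greedily with respect to the current values."""
    n = len(v)
    new_policy = []
    for s in range(n):
        q = [R[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
             for a in range(len(R[s]))]
        new_policy.append(max(range(len(q)), key=q.__getitem__))
    return new_policy

def policy_iteration(P, R, gamma, max_iter=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(R)
    for _ in range(max_iter):
        v = evaluate(policy, P, R, gamma)
        new_policy = improve(v, P, R, gamma)
        if new_policy == policy:
            break
        policy = new_policy
    return policy, v

# Toy model: P[s][a][t] is the probability of moving from state s to state t
# under action a; R[s][a] is the immediate reward.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]
R = [[0.0, 0.0],
     [0.0, 1.0]]
policy, values = policy_iteration(P, R, gamma=0.9)
print(policy, values)
```

In this toy model, action 1 always leads to state 1, and only staying in state 1 earns a reward, so the loop settles on choosing action 1 in both states. The exact linear solve in `evaluate` corresponds loosely to the "ideal" case of the abstract, where evaluation is error-free; replacing it with an approximate solver would be a rough counterpart of the "practical" case with optimization errors.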

Published in: IEEE Transactions on Automatic Control ( Volume: 69, Issue: 8, August 2024)

Page(s): 5200 - 5215

Date of Publication: 20 December 2023

DOI: 10.1109/TAC.2023.3344870
