On optimal and suboptimal policies in the choice of control forces for final-value systems

Author: Aoki, M., University of California, Los Angeles, CA, USA

A simple control system whose state variable $x_n$ is described by the difference equation (1) is considered: $x_{n+1} = \alpha x_n + r_n + v_n$, $n = 1, 2, \ldots, N$, $0 < \alpha < 1$ (1), where $v_n$ is the control force of the system and $r_n$ is Bernoulli random noise with probability parameter $p$. If $\phi(x_N)$ is the performance index of the final-value system, then the problem is to find a sequence of control forces $v_n$, $n = 1, 2, \ldots, N$, which minimizes the expected value of $\phi(x_N)$. An optimal sequence of $v_n$ is determined by solving a recurrence relation on $n$ of the criterion function $h_n(x;p)$, defined as $h_n(x;p) = \min_{v_1} \min_{v_2} \cdots \min_{v_n} E[\phi(x_N)]$, that is, the minimum of the expected value of $\phi(x_N)$ obtainable in the $n$-stage control process by employing an optimal policy, starting from the initial state variable $x$ when the probability parameter of the random disturbance is $p$. The recurrence relation is obtained by the usual application of the principle of optimality of dynamic programming. It is proved that $h_n(x;p) = h_n(-x;1-p)$ holds, and this fact is used to halve the amount of computation when the recurrence relation must be solved numerically. If the values of the control variable $v_n(x;p)$ are restricted to $1$ and $-1$, as in contactor servo systems, the boundaries between the $1$ and $-1$ control forces become too complicated to be determined analytically except for a few special cases. By solving for $v_n(x;p)$ computationally for the case $\phi(x_N) = x_N^2$, it is seen that $v_n(x;p)$ agrees with $v_1(x;p)$ fairly well for all $n > 1$, where $v_1(x;p)$ is the optimal control force when there remains only one chance of exerting control forces. Hence, the suboptimal policy of always applying control forces as if only one more error correction is possible may be expected to be close to the optimal policy in minimizing $E(x_N^2)$. The control force $v_1(x;p)$ is linear in $x$ and $p$ and is a much simpler function to mechanize.
This suboptimal policy is tried by the Monte Carlo method and found to be only slightly inferior to the optimal policy. The behavior of $x_n$ is investigated by assuming the control forces are given by the suboptimal policy. This approximate analysis should be good in view of the agreement between the optimal and suboptimal policies. It is important to realize that if the adoption of a suboptimal policy results in a simplified mechanization of the optimal control forces, with only a slight reduction in the system performance, then the suboptimal policy might be optimal in a certain enlarged performance index. What seems to be the most desirable approach to engineering problems is a unified or well-integrated one where the analytical and computational algorithms are used to supplement each other. In this paper, an attempt is made to illustrate the point, presenting at the same time a new approach to analysis and synthesis of a certain class of control systems.
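
The comparison described in the abstract can be illustrated with a small numerical sketch. It is not the paper's own computation, and several details are assumptions that the abstract does not state: the noise takes the values +1 with probability p and -1 with probability 1 - p, the decay factor is 0.5, the performance index is the squared final state, and n counts the control stages remaining. Under these assumptions the recurrence reads $h_n(x;p) = \min_{v \in \{-1, 1\}} [\, p\, h_{n-1}(\alpha x + v + 1; p) + (1-p)\, h_{n-1}(\alpha x + v - 1; p) \,]$ with $h_0 = \phi$, and the one-stage rule switches sign on the line $\alpha x + 2p - 1 = 0$. The suboptimal policy is scored by exact expectation rather than by the Monte Carlo method, so the output is deterministic. The names `h`, `g`, `v1`, and `step_values` are hypothetical.

```python
from functools import lru_cache

ALPHA = 0.5              # assumed decay factor, 0 < ALPHA < 1
CONTROLS = (-1.0, 1.0)   # contactor-type control forces


def phi(x):
    """Assumed performance index: squared final state."""
    return x * x


def step_values(x, v, p):
    """(probability, next state) pairs, assuming r = +1 w.p. p and -1 w.p. 1 - p."""
    base = ALPHA * x + v
    return ((p, base + 1.0), (1.0 - p, base - 1.0))


@lru_cache(maxsize=None)
def h(n, x, p):
    """Minimum expected cost with n control stages remaining (exact recursion)."""
    if n == 0:
        return phi(x)
    return min(
        sum(w * h(n - 1, nxt, p) for w, nxt in step_values(x, v, p))
        for v in CONTROLS
    )


def v1(x, p):
    """Control that is optimal when exactly one stage remains.

    It minimizes E[(ALPHA*x + r + v)^2] over v in {-1, +1}; under the assumed
    noise model the switching boundary is ALPHA*x + 2p - 1 = 0.
    """
    return -1.0 if ALPHA * x + 2.0 * p - 1.0 > 0.0 else 1.0


@lru_cache(maxsize=None)
def g(n, x, p):
    """Expected cost of always applying v1, with n stages remaining."""
    if n == 0:
        return phi(x)
    return sum(w * g(n - 1, nxt, p) for w, nxt in step_values(x, v1(x, p), p))


if __name__ == "__main__":
    n, p = 5, 0.25
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(
            f"x={x:+.1f}  optimal={h(n, x, p):.4f}  "
            f"mirrored={h(n, -x, 1 - p):.4f}  one-stage rule={g(n, x, p):.4f}"
        )
```

The `optimal` and `mirrored` columns show the symmetry $h_n(x;p) = h_n(-x;1-p)$ for the chosen values, and the gap between `optimal` and `one-stage rule` is the cost of the simpler policy under these assumptions. The exact recursion branches at every stage, so it is only practical for small n; a grid-based tabulation would be needed for longer horizons.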

Published in:

Automatic Control, IRE Transactions on (Volume: 5, Issue: 3)