
Exploitation of an opponent's imperfect information in a stochastic game with autonomous vehicle application

Authors: W.M. McEneaney (Dept. of Mechanical & Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA); R. Singh

We consider a finite state space, discrete-time stochastic game problem in which only one player has perfect information. In the notation employed here, only the "red" player has perfect state information; the "blue" player has access only to observation-based information, and to some degree the observations may be influenced by the controls of both players. A Markov chain model is used in which the transition probabilities depend on the controls of both players. The game is zero-sum. It is known that applying the optimal state-feedback control at a maximum-likelihood state estimate is not optimal for the blue player; under a saddle-point condition, a form of certainty equivalence does exist for blue, but its structure is more complex than the above approach. In this work, the point of view of the red player is considered. Simulation is used to demonstrate that the state-feedback optimal control is not optimal for red (even though red has perfect information). This is a significantly stronger statement than that certainty equivalence fails when the red player has imperfect information. A theory for the construction of red controls is presented. Against the simpler blue approach above, it yields "deceptive" controls that provide superior performance in this case. An open question is whether (and under what conditions) this approach yields superior performance for red, as compared with state feedback, when blue is allowed strategies including the more complex one above. Experimentation and theory are employed to address this question.
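To make the class of games concrete, the following is a minimal sketch (not the paper's model) of a finite-state, zero-sum stochastic game in which the transition probabilities depend on the controls of both players. The state and action counts, random transition tensor, stage payoffs, and discount factor are all illustrative assumptions. The per-state saddle point is computed as the value of a matrix game via linear programming, iterated Shapley-style over the states.

```python
# Illustrative sketch: finite-state, zero-sum stochastic game with
# control-dependent transitions. Red maximizes, blue minimizes. The
# saddle point of each per-state stage game is found by LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
nS, nR, nB = 4, 3, 3          # hypothetical sizes: states, red/blue actions
gamma = 0.9                   # assumed discount factor

# P[s, ur, ub, s']: transition probabilities depending on both controls
P = rng.random((nS, nR, nB, nS))
P /= P.sum(axis=-1, keepdims=True)
# r[s, ur, ub]: stage payoff to red (= cost to blue)
r = rng.random((nS, nR, nB))

def matrix_game_value(G):
    """Value and red's optimal mixed strategy for max_x min_y x^T G y."""
    m, n = G.shape
    # Variables: (x_1..x_m, v). Maximize v  <=>  minimize -v.
    c = np.concatenate([np.zeros(m), [-1.0]])
    # For each blue pure action j: v - sum_i G[i, j] x_i <= 0
    A_ub = np.hstack([-G.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]])[None, :]  # x on the simplex
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1], res.x[:m]

# Shapley-style value iteration over the game's states.
V = np.zeros(nS)
for _ in range(200):
    V_new = np.empty(nS)
    for s in range(nS):
        G = r[s] + gamma * P[s] @ V   # (nR, nB) stage game at state s
        V_new[s], _ = matrix_game_value(G)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
print("saddle-point value per state:", np.round(V, 4))
```

This computes the full-information saddle-point value, i.e. the baseline from which red's state-feedback control would be derived. The paper's point is that against an imperfectly informed blue player, red can outperform this baseline with "deceptive" controls that exploit blue's observation-based information.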

Published in:

Proceedings of the 43rd IEEE Conference on Decision and Control (CDC 2004), Vol. 5

Date of Conference:

14-17 Dec. 2004