A Reward Scheme for Production Systems with Overlapping Conflict Sets


1 Author(s)

The reward-allocation problem for production systems in delayed-payoff situations is formalized in a conceptual model in which the environment of the system is a finite automaton. The environment state and the state of the system's local memory determine which productions are in the current conflict set. Productions are selected from the conflict set with probabilities proportional to their activations. Each selected production updates the local memory and furnishes the next input symbol to the environment. A reward scheme examines the payoff that is output by the environment and adjusts the activations in an attempt to increase average payoff per unit time. A reward scheme is safe if it is generally biased towards improvement. The notion of reward scheme safety is formalized, an asymptotically safe reward scheme is exhibited, and its safety is demonstrated. The demonstration is an analog of the proof of Fisher's fundamental theorem of natural selection.
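The loop described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's construction: the names `Production`, `conflict_set`, `select`, `run`, and the two-state toy environment are hypothetical. In particular, `naive_reward` is a deliberately simple placeholder. It is not the asymptotically safe reward scheme the paper exhibits, which the abstract does not specify. The only parts taken directly from the abstract are that the conflict set depends on the environment state and local memory, that selection probability is proportional to activation, and that the fired production updates memory and supplies the next input symbol.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Production:
    name: str
    matches: Callable[[str, str], bool]  # condition on (env state, local memory)
    new_memory: str                      # value written to local memory when fired
    output_symbol: str                   # next input symbol given to the environment
    activation: float = 1.0


def conflict_set(productions: List[Production], env_state: str, memory: str) -> List[Production]:
    """Productions whose condition holds for the current env state and memory."""
    return [p for p in productions if p.matches(env_state, memory)]


def select(conflict: List[Production], rng: random.Random) -> Production:
    """Pick one production with probability proportional to its activation."""
    weights = [p.activation for p in conflict]
    return rng.choices(conflict, weights=weights, k=1)[0]


def naive_reward(fired: Production, conflict: List[Production], payoff: float) -> None:
    """Placeholder scheme (an assumption, NOT the paper's): credit the fired rule."""
    fired.activation += 0.1 * payoff


# Environment as a finite automaton: (state, input symbol) -> (next state, payoff).
Transition = Dict[Tuple[str, str], Tuple[str, float]]

TOY_TRANSITION: Transition = {
    ("A", "x"): ("B", 1.0),
    ("A", "y"): ("A", 0.0),
    ("B", "x"): ("A", 0.0),
    ("B", "y"): ("B", 1.0),
}


def toy_productions() -> List[Production]:
    """Four hypothetical rules; two compete in each environment state."""
    return [
        Production("A->x", lambda s, m: s == "A", "m0", "x"),
        Production("A->y", lambda s, m: s == "A", "m0", "y"),
        Production("B->x", lambda s, m: s == "B", "m0", "x"),
        Production("B->y", lambda s, m: s == "B", "m0", "y"),
    ]


def run(productions, transition, env_state, memory, reward_scheme, steps, rng) -> float:
    """Run the select/fire/reward loop and return average payoff per step."""
    total, executed = 0.0, 0
    for _ in range(steps):
        conflict = conflict_set(productions, env_state, memory)
        if not conflict:  # nothing applicable: stop early
            break
        fired = select(conflict, rng)
        memory = fired.new_memory
        env_state, payoff = transition[(env_state, fired.output_symbol)]
        reward_scheme(fired, conflict, payoff)  # adjust activations from payoff
        total += payoff
        executed += 1
    return total / executed if executed else 0.0


if __name__ == "__main__":
    rules = toy_productions()
    avg = run(rules, TOY_TRANSITION, "A", "m0", naive_reward, 1000, random.Random(0))
    print("average payoff per step:", avg)
    print({p.name: round(p.activation, 2) for p in rules})
```

The reward scheme is passed in as a function so that the selection loop stays fixed while different schemes are compared. The toy payoffs here are immediate, so the sketch does not exercise the delayed-payoff difficulty that motivates the paper.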

Published in:

IEEE Transactions on Systems, Man and Cybernetics (Volume: 16, Issue: 3)