Learning algorithms for an automaton operating in a multiteacher environment are considered. These algorithms are classified by the number of actions given as inputs to the environments and the number of responses (outputs) obtained from the environments. In this paper, we present a general class of learning algorithms for multi-input multi-output (MIMO) models. We show that the proposed learning algorithm is absolutely expedient and ε-optimal in the sense of average penalty. The proposed algorithm generalizes Baba's GAE algorithm and can be applied to solve, in parallel, multi-objective optimization problems in which each objective function is disturbed by noise.
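As a rough illustration of the setting, the following Python sketch shows a learning automaton that sends one action to several teachers and updates its action probabilities from the fraction of favorable responses. It is not the algorithm proposed in the paper and not Baba's GAE algorithm: it is a simplified linear reward-inaction style update for the single-action, multiple-response case, and the names `update`, `run`, `penalty_probs`, and `step` are hypothetical.

```python
import random


def update(p, chosen, favorable_fraction, step=0.05):
    """Shift probability mass toward the chosen action.

    Assumption: the shift is proportional to the fraction of teachers
    that responded favorably (a reward-inaction style rule, so nothing
    changes when every teacher penalizes the action).
    """
    gain = step * favorable_fraction
    return [
        pi + gain * (1.0 - pi) if i == chosen else pi * (1.0 - gain)
        for i, pi in enumerate(p)
    ]


def run(penalty_probs, steps=5000, step=0.05, seed=0):
    """Simulate the automaton against stationary random teachers.

    penalty_probs[j][i] is the (hypothetical) probability that
    teacher j penalizes action i.
    """
    rng = random.Random(seed)
    n_actions = len(penalty_probs[0])
    p = [1.0 / n_actions] * n_actions  # start from the uniform distribution
    for _ in range(steps):
        chosen = rng.choices(range(n_actions), weights=p)[0]
        # Each teacher independently rewards the action with probability 1 - c.
        rewards = sum(rng.random() >= teacher[chosen] for teacher in penalty_probs)
        p = update(p, chosen, rewards / len(penalty_probs), step)
    return p


if __name__ == "__main__":
    # Two teachers, two actions; action 0 has the lower penalty under both.
    print(run([[0.2, 0.8], [0.3, 0.7]]))
```

With a small `step`, an update of this kind tends to concentrate probability on the action with the lowest average penalty across teachers, which is the informal idea behind ε-optimality. The MIMO models in the paper, where several actions are supplied to the environments, are not covered by this sketch.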
Published in: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (Volume: 29, Issue: 5)
Date of Publication: Oct 1999