Reinforcement learning (RL) is a valuable learning method when a system must select control actions whose consequences emerge over long periods and for which input-output data are not available. In most combinations of fuzzy systems and RL, the environment is assumed to be deterministic. In many problems, however, the consequence of an action may be uncertain or stochastic in nature. In this paper, we propose a novel RL approach that combines the universal-function-approximation capability of fuzzy systems with consideration of probability distributions over the possible consequences of an action. The proposed generalized probabilistic fuzzy RL (GPFRL) method is a modified version of the actor-critic (AC) learning architecture. Learning is enhanced by introducing a probability measure into the learning structure, where an incremental gradient-descent weight-updating algorithm provides convergence. Our results show that the proposed approach is robust under probabilistic uncertainty while also offering enhanced learning speed and good overall performance.