Reinforcement learning is considered a strong method for learning in multiagent environments. However, it still has several drawbacks: other learning agents in the domain must be modeled as part of the state of the environment; some states are experienced far less often than others; and some state-action pairs are never visited during the learning phase. Further, until the learning process is completed, an agent cannot exhibit reliable behavior even in states that may be sufficiently experienced. Learning in a partially observable and dynamic multiagent environment therefore remains a difficult and major research problem that is worth further investigation. Motivated by this, this paper proposes a novel learning approach that integrates online analytical processing (OLAP)-based data mining into the learning process. First, a data cube OLAP architecture that facilitates effective storage and processing of the state information reported by agents is described. With this architecture, the action of another agent, even one outside the visual range of the agent under consideration, can be estimated simply by extracting online association rules, a well-known data mining technique, from the constructed data cube. Then, a new action selection model, also based on association rule mining, is presented. Finally, states that are not sufficiently experienced are generalized by mining multiple-level association rules from the proposed data cube. Experiments conducted on the well-known pursuit domain demonstrate the robustness and effectiveness of the proposed learning approach.
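To make the core idea concrete, the following is a minimal sketch (not the paper's implementation) of estimating another agent's action from logged state reports via association-rule confidence. The record format, the `min_support` threshold, and the pursuit-domain state labels are all assumptions introduced for illustration.

```python
from collections import Counter, defaultdict

def mine_action_rules(records, min_support=2):
    """records: list of (state, action) pairs reported by agents.
    Returns {state: {action: confidence}} for states observed at least
    min_support times, mimicking support/confidence-based rule mining."""
    state_counts = Counter(state for state, _ in records)
    pair_counts = Counter(records)
    rules = defaultdict(dict)
    for (state, action), n in pair_counts.items():
        if state_counts[state] >= min_support:
            # confidence of the rule "state => action"
            rules[state][action] = n / state_counts[state]
    return rules

def estimate_action(rules, state):
    """Pick the most confident action rule for the given state, or None
    if no mined rule covers it (e.g. the other agent was never reported
    in that state)."""
    if state not in rules:
        return None
    return max(rules[state], key=rules[state].get)

# Hypothetical pursuit-domain log: when the prey is reported to the
# north, the other hunter usually moves north.
log = [("prey_north", "move_north"), ("prey_north", "move_north"),
       ("prey_north", "move_east"), ("prey_south", "move_south"),
       ("prey_south", "move_south")]
rules = mine_action_rules(log)
print(estimate_action(rules, "prey_north"))  # move_north
```

In the full approach such rules would be extracted online from the data cube rather than from a flat log, and multiple-level rules over generalized state descriptions would cover states the flat rules miss.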