In this paper, we developed a self-organized approach to setting cooperation policies in a system of rational peers, each of which has only a partial view of the whole system, with the aim of improving overall welfare as a system-wide performance metric. The proposed approach is based on distributed reinforcement learning and sets the peers' cooperation policies through their self-organized interactions. Our analysis shows that the approach leads to Pareto optimality in the system by disseminating the peers' local value functions among their neighbors. We also verified experimentally that it outperforms other approaches commonly used in the literature in terms of system performance.
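As a rough illustration of the dissemination idea, the following minimal Python sketch lets each peer run a stateless Q-learning update whose target mixes its own reward with the average value estimate received from its neighbors. It is not the algorithm analyzed in the paper. The ring topology, the payoff rule, the mixing `weight`, and the names `neighbor_mean`, `update_peer`, and `simulate` are all assumptions made only for this example.

```python
import random


def neighbor_mean(values, neighbors, i):
    """Average of the value estimates peer i received from its neighbors."""
    received = [values[j] for j in neighbors[i]]
    return sum(received) / len(received) if received else 0.0


def update_peer(q, action, reward, shared_value, alpha=0.1, weight=0.5):
    """One stateless Q-learning step for a single peer.

    The target blends the peer's own reward with the value shared by its
    neighbors; `weight` is a hypothetical mixing parameter.
    """
    target = (1.0 - weight) * reward + weight * shared_value
    q[action] += alpha * (target - q[action])
    return q


def simulate(n=6, rounds=200, epsilon=0.1, seed=0):
    """Toy run on a ring of n peers; action 0 = defect, action 1 = cooperate."""
    rng = random.Random(seed)
    # Assumed topology: each peer only sees its two ring neighbors.
    neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    q_tables = [[0.0, 0.0] for _ in range(n)]
    for _ in range(rounds):
        # Epsilon-greedy action selection from each peer's local Q-values.
        actions = []
        for q in q_tables:
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))
            else:
                actions.append(0 if q[0] > q[1] else 1)
        # Assumed payoff: gain 2 per cooperating neighbor, pay 1 to cooperate.
        rewards = [
            2.0 * sum(actions[j] for j in neighbors[i]) - 1.0 * actions[i]
            for i in range(n)
        ]
        # Each peer publishes its local value estimate before anyone updates.
        values = [max(q) for q in q_tables]
        for i in range(n):
            shared = neighbor_mean(values, neighbors, i)
            update_peer(q_tables[i], actions[i], rewards[i], shared)
    return q_tables
```

Calling `simulate()` returns one two-entry Q-table per peer. The sketch makes no claim about convergence or Pareto optimality; it only shows where neighbor-shared values would enter a local update under the stated assumptions.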