Skip to Main Content
This paper presents and investigates the application of coevolutionary training techniques based on particle swarm optimization (PSO) to evolve playing strategies for the nonzero sum problem of the iterated prisoner's dilemma (IPD). Three different coevolutionary PSO techniques are used, differing in the way that IPD strategies are presented: A neural network (NN) approach in which the NN is used to predict the next action, a binary PSO approach in which the particle represents a complete playing strategy, and finally, a novel approach that exploits the symmetrical structure of man-made strategies. The last technique uses a PSO algorithm as a function approximator to evolve a function that characterizes the dynamics of the IPD. These different PSO approaches are compared experimentally with one another, and with popular man-made strategies. The performance of these approaches is evaluated in both clean and noisy environments. Results indicate that NNs cooperate well, but may develop weak strategies that can cause catastrophic collapses. The binary PSO technique does not have the same deficiency, instead resulting in an overall state of equilibrium in which some strategies are allowed to exploit the population, but never dominate. The symmetry approach is not as successful as the binary PSO approach in maintaining cooperation in both noisy and noiseless environments-exhibiting selfish behavior against the benchmark strategies and depriving them of receiving almost any payoff. Overall, the PSO techniques are successful at generating a variety of strategies for use in the IPD, duplicating and improving on existing evolutionary IPD population observations.