This paper proposes a novel mechanism that uses distributed learning automata to discover delay-optimal diverse paths for Voice-over-IP (VoIP) routing in service overlay networks. It also proposes a novel link failure detection method that detects and recovers from link failures, reducing the number of dropped voice sessions. The main contribution is a decentralized, scalable method that minimizes delay on both a primary and a secondary path between every pair of overlay nodes while keeping the two paths link-disjoint. Simulations of a 50-node model of AT&T's backbone network show that the proposed method improves the quality of voice calls from unsatisfactory to satisfactory, as measured by the R-factor. With the proposed link failure detection mechanism, the time to recover from a link failure is considerably reduced.
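
To make the notion of a link-disjoint primary and secondary path concrete, the following is a minimal centralized sketch. It is not the distributed learning automata method proposed in this paper: it swaps in a plain two-pass Dijkstra search that finds the minimum-delay path, removes its links, and searches again. This greedy approach is not guaranteed to find the jointly optimal pair in every topology. The function names, the toy four-node overlay, and its link delays in milliseconds are all hypothetical and chosen only for illustration.

```python
import heapq

# Hypothetical toy overlay: symmetric link delays in milliseconds.
TOY_OVERLAY = {
    "A": {"B": 10, "C": 15},
    "B": {"A": 10, "C": 5, "D": 10},
    "C": {"A": 15, "B": 5, "D": 20},
    "D": {"B": 10, "C": 20},
}


def shortest_path(adj, src, dst, banned=frozenset()):
    """Dijkstra on link delay, skipping undirected links listed in `banned`.

    Returns (path, total_delay), or None if dst is unreachable.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in adj.get(u, {}).items():
            if frozenset((u, v)) in banned:
                continue  # link already used by the primary path
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    # Walk predecessors back from dst to rebuild the path.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[dst]


def disjoint_pair(adj, src, dst):
    """Greedy primary/secondary search (an assumption, not the paper's method)."""
    primary = shortest_path(adj, src, dst)
    if primary is None:
        return None, None
    nodes = primary[0]
    used = frozenset(frozenset(e) for e in zip(nodes, nodes[1:]))
    secondary = shortest_path(adj, src, dst, banned=used)
    return primary, secondary


if __name__ == "__main__":
    primary, secondary = disjoint_pair(TOY_OVERLAY, "A", "D")
    print("primary:", primary)
    print("secondary:", secondary)
```

In this toy overlay the primary path from A to D runs through B and the secondary through C, so the two paths share no link and a single link failure cannot break both.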