Today's social networks are getting larger, and the need to analyze datasets with millions of nodes and billions of edges is no longer uncommon. As a network of social relationships evolves through the addition of new nodes and edges, fast algorithms are desirable for recomputing key network measures such as actor centrality. The distributed computing paradigm offers a scalable approach to this recomputation challenge. This paper develops a Map-Reduce implementation of an incremental All-Pairs Shortest Path (APSP) algorithm. The incremental nature of the approach keeps the work of updating centrality measures to a minimum, while the Map-Reduce implementation makes it scalable to large data. The key idea of the incremental APSP algorithm is the efficient reuse of previously computed shortest paths between every node and the neighbors of the newly added node. The parallelized version of the algorithm presented here relies on a three-step iterative execution of "map" and "reduce" jobs. Experience with the implementation is reported for a real-world dataset containing 7115 nodes. The experimental runs were performed on Amazon's EMR service.
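To illustrate the key idea of reusing past shortest-path information, the following is a minimal single-machine sketch in Python. It is not the paper's Map-Reduce implementation. It assumes an undirected, unweighted graph whose distance table already holds `dist[u][u] = 0` for every node, and the function name `add_node` and the dictionary layout are hypothetical choices made for illustration only.

```python
import math


def add_node(dist, new_node, neighbors):
    """Update an all-pairs hop-count table after adding one node.

    dist      : dict mapping node -> dict mapping node -> distance
                (assumed layout; unreachable pairs may be absent)
    new_node  : label of the node being added (hypothetical parameter)
    neighbors : existing nodes that the new node is linked to
    """
    # Distance from each existing node to the new node: one hop beyond
    # the closest of the new node's neighbors, using only old distances.
    to_new = {}
    for u, row in dist.items():
        to_new[u] = min(
            (row.get(n, math.inf) + 1 for n in neighbors),
            default=math.inf,
        )

    # An existing pair can only improve by routing through the new node.
    for s, row in dist.items():
        for t in dist:
            if s == t:
                continue
            via_new = to_new[s] + to_new[t]
            if via_new < row.get(t, math.inf):
                row[t] = via_new

    # Record the new node's own row and column.
    for u, d in to_new.items():
        dist[u][new_node] = d
    dist[new_node] = dict(to_new)
    dist[new_node][new_node] = 0
    return dist
```

In this sketch only the old distances to the new node's neighbors are consulted, so no shortest-path search is rerun from scratch. Under the stated assumptions, the cost is proportional to the number of node pairs plus the number of nodes times the number of neighbors.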
Date of Conference: 26-29 Aug. 2012