Maximum-flow algorithms are used to find spam sites, build content voting systems, discover communities, and more on graphs drawn from the Internet. Such graphs are now so large that they have outgrown conventional memory-resident algorithms. In this paper, we show how to effectively parallelize a max-flow algorithm based on the Ford-Fulkerson method on a cluster using the MapReduce framework. Our algorithm exploits the property that such graphs are small-world networks with low diameter, and it employs optimizations that improve the effectiveness of MapReduce and increase parallelism. We are able to compute max-flow on a subset of the Facebook social network graph with 411 million vertices and 31 billion edges using a cluster of 21 machines in reasonable time.
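For readers unfamiliar with the Ford-Fulkerson method, the following is a minimal single-machine sketch in Python. It is not the MapReduce algorithm of this paper: it is the conventional memory-resident form, using breadth-first search to find augmenting paths (the Edmonds-Karp variant). The function name `max_flow`, the edge-list input format, and the example graph are assumptions made for illustration.

```python
from collections import defaultdict, deque


def max_flow(edges, source, sink):
    """Ford-Fulkerson with BFS augmenting paths (Edmonds-Karp variant).

    `edges` is an iterable of (u, v, capacity) triples for a directed graph.
    This in-memory sketch is illustrative only, not the paper's method.
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # residual[u][v] is the remaining capacity on the residual edge u -> v.
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse residual edge exists

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)

        if sink not in parent:
            return total  # no augmenting path remains, so the flow is maximal

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow along the path and update the residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical five-edge example graph.
example_edges = [
    ("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3),
]
print(max_flow(example_edges, "s", "t"))
```

Each augmenting path found by breadth-first search is a shortest path in the residual graph, so its length is bounded by the graph diameter. This is one reason the low diameter of small-world networks matters for a method of this kind.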