Abstract:
Finding mutual X is an essential problem in the mining and analysis of complex social networks, where X can be any of a user's public data, such as friends or education information. Massive social networks make this problem challenging because they consist of billions of nodes and hundreds of billions of edges. This paper presents a high-performance and memory-efficient solution for finding mutual X in social networks with billions of users. It makes three main contributions: first, a distributed algorithm for finding mutual X; second, an intra-node optimization strategy that includes a pipelined workflow, NUMA-aware sub-partitioning, and a SIMD-based Dual Sliding Window set intersection algorithm; third, a semicircular computing and communication scheme that further improves inter-node performance and avoids load imbalance. Our design is validated on multiple real-world datasets, and it takes less than 10 minutes to find all mutual X in the WeChat social network. Compared with existing industrial solutions based on GraphX, we achieve 22-36× speedup and 36× memory reduction. Compared with PowerGraph, our solution achieves 12.7× speedup and 11× memory reduction.
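To make the core operation concrete, here is a minimal single-machine sketch of finding mutual friends by intersecting sorted adjacency lists. It is not the paper's method: it uses a plain scalar two-pointer merge rather than the SIMD-based Dual Sliding Window algorithm, and it ignores distribution, NUMA-aware partitioning, and the communication scheme. The function names `intersect_sorted` and `mutual_friends`, and the dictionary representation of the graph, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Intersect two ascending, duplicate-free lists with a two-pointer merge."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def mutual_friends(adj: Dict[int, List[int]]) -> Dict[Tuple[int, int], List[int]]:
    """For every friendship edge (u, v) with u < v, list the friends u and v share.

    Assumes (illustratively) an undirected graph stored as sorted adjacency
    lists, with every vertex present as a key of `adj`.
    """
    result: Dict[Tuple[int, int], List[int]] = {}
    for u, neighbours in adj.items():
        for v in neighbours:
            if u < v:  # visit each undirected edge once
                result[(u, v)] = intersect_sorted(neighbours, adj[v])
    return result
```

Each edge costs time linear in the two adjacency-list lengths, which is why the per-intersection cost and the balance of work across machines dominate at the scale the abstract describes.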
Date of Conference: 09-12 December 2019
Date Added to IEEE Xplore: 24 February 2020