Real-life graphs contain not only nodes and edges but also events that take place on them, e.g., product sales in social networks. Some events correlate strongly with the network structure, while others do not. Such structural correlations shed light on viral influence in the corresponding network. Unfortunately, traditional association mining does not apply to graphs, because it works only on homogeneous data sets such as transactions and baskets. We propose a novel measure for assessing structural correlations in heterogeneous graph data sets with events. The measure uses hitting time to aggregate the proximity among nodes that have the same event. To calculate correlation scores for many events in a large network, we develop a scalable framework, called gScore, that uses sampling and approximation. By comparing against the situation in which events are randomly distributed over the same network, our method discovers events that are highly correlated with the graph structure. We test gScore's effectiveness using synthetic events on the DBLP coauthor network and report interesting correlation results in a social network extracted from TaoBao.com, the largest online shopping network in China. The scalability of gScore is tested on the Twitter network. Since an event is essentially a temporal phenomenon, we also propose a dynamic measure, which reveals structural correlations at specific time steps and can be used to discover detailed evolutionary patterns.
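To make the idea of a hitting-time-based proximity score concrete, the following is a minimal sketch and not the gScore framework itself. It makes several assumptions that the abstract does not state: it uses a truncated hitting time computed exactly by dynamic programming (rather than sampling and approximation), it averages the hitting time from each event node to the remaining event nodes, and it summarizes the comparison against randomly placed events as a z-score. All function names (`truncated_hitting_time`, `event_proximity`, `structural_correlation_z`, `two_cliques`) and the toy graph are hypothetical.

```python
import random
from statistics import mean, pstdev


def truncated_hitting_time(adj, targets, max_steps):
    """Expected steps, capped at max_steps, for a random walk started at each
    node to first reach any node in `targets` (exact dynamic programming)."""
    targets = set(targets)
    h = {v: 0.0 for v in adj}  # with zero steps allowed, every value is 0
    for _ in range(max_steps):
        nxt = {}
        for v, nbrs in adj.items():
            if v in targets:
                nxt[v] = 0.0
            elif nbrs:
                nxt[v] = 1.0 + sum(h[u] for u in nbrs) / len(nbrs)
            else:
                nxt[v] = 1.0 + h[v]  # isolated node: the walk never moves
        h = nxt
    return h


def event_proximity(adj, event_nodes, max_steps=10):
    """Assumed aggregate: mean truncated hitting time from each event node
    to the other event nodes. Lower values mean the event is more clustered."""
    event_nodes = list(dict.fromkeys(event_nodes))  # drop duplicates
    times = []
    for v in event_nodes:
        others = [u for u in event_nodes if u != v]
        times.append(truncated_hitting_time(adj, others, max_steps)[v])
    return mean(times)


def structural_correlation_z(adj, event_nodes, max_steps=10, trials=200, seed=0):
    """Compare the observed proximity with events placed uniformly at random
    on the same graph. A positive score means closer than random."""
    rng = random.Random(seed)
    nodes = list(adj)
    k = len(set(event_nodes))
    observed = event_proximity(adj, event_nodes, max_steps)
    null = [event_proximity(adj, rng.sample(nodes, k), max_steps)
            for _ in range(trials)]
    sigma = pstdev(null)
    return (mean(null) - observed) / sigma if sigma > 0 else 0.0


def two_cliques(size=6):
    """Toy graph: two cliques of `size` nodes joined by a single bridge edge."""
    adj = {v: [] for v in range(2 * size)}
    for block in (range(size), range(size, 2 * size)):
        for u in block:
            adj[u] = [w for w in block if w != u]
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj


if __name__ == "__main__":
    graph = two_cliques()
    clustered_event = [0, 1, 2, 3]  # all event nodes inside one clique
    print(structural_correlation_z(graph, clustered_event))
```

The exact dynamic program costs one pass over all edges per step and per event node, so it only suits small graphs. On networks of the scale mentioned above, one would instead estimate hitting times from sampled random walks, which is the kind of approximation the abstract alludes to.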