In network analysis, characterizing nodes based on their attributes and surrounding network structure is a fundamental problem. In financial transaction networks, for example, it allows us to identify typical and anomalous behaviour, which is important for uncovering fraud. Egocentric network motif analysis is a counting algorithm that tackles this problem, but it is computationally expensive. Fortunately, it is inherently parallelizable: each node in the network can be characterized independently of all others. In this paper, we use the distributed stream-processing system Storm to perform node characterization in large dynamic networks. We report on the resources required within the Amazon Web Services (AWS) cloud computing platform to support this type of analysis on two real-world datasets from the financial domain. This approach allows us to analyze networks that are several orders of magnitude larger than could be tackled with alternative, non-distributed approaches. It also enables live analysis by treating datasets as streams rather than depending on an offline, batched analysis.
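To make the per-node independence concrete, the following is a minimal illustrative sketch in Python, not the Storm implementation described in this paper. It assumes an undirected graph and counts only two simple patterns in each node's 1-hop egocentric network (closed triangles and open wedges); the actual motif set, the function names (`build_adjacency`, `ego_signature`, `characterize_all`) and the use of a thread pool as a stand-in for distributed workers are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_adjacency(edges):
    """Build undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def ego_signature(adj, node):
    """Count simple patterns in the 1-hop egocentric network of `node`.

    Hypothetical feature set: degree, triangles through the node,
    and open wedges (neighbour pairs that are not connected).
    """
    neighbours = adj[node]
    degree = len(neighbours)
    # A triangle exists for every pair of neighbours that share an edge.
    triangles = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    open_wedges = degree * (degree - 1) // 2 - triangles
    return {"degree": degree, "triangles": triangles, "open_wedges": open_wedges}


def characterize_all(adj, workers=4):
    """Characterize every node; each call depends only on read-only `adj`."""
    nodes = sorted(adj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        signatures = pool.map(lambda n: ego_signature(adj, n), nodes)
        return dict(zip(nodes, signatures))


if __name__ == "__main__":
    # Toy transaction graph: a triangle a-b-c with a pendant node d on c.
    toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
    for node, sig in characterize_all(build_adjacency(toy_edges)).items():
        print(node, sig)
```

Because `ego_signature` reads the shared adjacency structure without modifying it, the calls can run in any order or on separate workers, which is the property that makes the analysis amenable to a distributed system.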