In this paper we apply two types of clustering, fuzzy (fuzzy c-Means) and crisp (k-Means), to graph statistical data in order to evaluate the information loss caused by perturbation during the anonymization process of a data privacy application. We place special emphasis on two major node types: hubs, which are nodes with a high relative degree, and bridges, which connect different regions of the graph. By clustering the graph's statistical data before and after perturbation, we can measure the change in its characteristics and therefore the information loss. We partition the nodes into three groups: hubs/global bridges, local bridges, and all other nodes. We suspect that these partitions are best represented in fuzzy form, especially for nodes in the frontier regions of the graph, whose assignment may be ambiguous.
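The before/after comparison described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes each node is summarized by two hypothetical normalized statistics (degree and betweenness), uses a plain textbook fuzzy c-means with fuzzifier m = 2, and starts both runs from the same hand-picked initial centers so that cluster indices stay comparable. The loss measure (mean absolute change in membership) and the toy data are illustrative assumptions, and the crisp labels are taken as the arg-max of the fuzzy memberships rather than from a separate k-Means run.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, iters=100, tol=1e-9):
    """Textbook fuzzy c-means; returns (centers, memberships)."""
    centers = [list(c) for c in centers]
    u = []
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        u = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if any(x == 0.0 for x in d):
                # Point coincides with a center: full membership there.
                w = [1.0 if x == 0.0 else 0.0 for x in d]
            else:
                w = [x ** (-2.0 / (m - 1.0)) for x in d]
            s = sum(w)
            u.append([v / s for v in w])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                new_centers.append(old)  # empty cluster: keep old center
                continue
            new_centers.append([
                sum(wk * p[dim] for wk, p in zip(w, points)) / total
                for dim in range(len(old))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

def membership_loss(u_before, u_after):
    """Assumed loss measure: mean absolute change in membership."""
    n, c = len(u_before), len(u_before[0])
    diff = sum(abs(a - b)
               for ra, rb in zip(u_before, u_after)
               for a, b in zip(ra, rb))
    return diff / (n * c)

def crisp_labels(u):
    """Crisp assignment: index of the largest membership per node."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]

# Hypothetical node statistics: (normalized degree, normalized betweenness).
original = [(0.90, 0.80), (0.85, 0.90),                 # hubs / global bridges
            (0.30, 0.60), (0.35, 0.55),                 # local bridges
            (0.10, 0.05), (0.15, 0.10), (0.12, 0.08)]   # all other nodes
# The same nodes after an assumed perturbation of the graph.
perturbed = [(0.80, 0.85), (0.90, 0.80),
             (0.40, 0.50), (0.30, 0.60),
             (0.20, 0.10), (0.10, 0.15), (0.15, 0.05)]
init = [(0.9, 0.9), (0.3, 0.6), (0.1, 0.1)]  # shared initial centers

_, u_orig = fuzzy_c_means(original, init)
_, u_pert = fuzzy_c_means(perturbed, init)

loss = membership_loss(u_orig, u_pert)
changed = sum(a != b for a, b in
              zip(crisp_labels(u_orig), crisp_labels(u_pert))) / len(original)
print(f"fuzzy membership loss: {loss:.4f}")
print(f"fraction of nodes with a changed crisp label: {changed:.2f}")
```

In this sketch the fuzzy measure can register a shift in a node's membership even when its crisp label is unchanged, which is the kind of sensitivity one would want for nodes in frontier regions.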
Date of Conference: 10-15 June 2012