In this article we present novel preprocessing techniques, based on topological measures of the network, for identifying clusters of proteins in protein-protein interaction (PPI) networks, in which each cluster corresponds to a group of functionally similar proteins. The two main problems in analyzing PPI networks are their scale-free property and the large number of false positive interactions that they contain. Our preprocessing techniques use a key transformation and separate weighting functions to eliminate suspect edges (potential false positives) from the graph. A useful side effect of this transformation is that the resulting graph is no longer scale-free. We then examine the application of two well-known clustering techniques, namely hierarchical clustering and multilevel graph partitioning, to the reduced network. We define suitable statistical metrics to evaluate our clusters meaningfully. From our study, we find that clustering the preprocessed network yields significantly improved, biologically relevant, and balanced clusters compared with clusters derived from the original network. We strongly believe that our strategies will prove invaluable to future studies on the prediction of protein function from PPI networks.