The Web harbors a large number of community structures. Detecting these structures early serves purposes such as reliable searching and selective advertising. In this paper we investigate the problem of extracting and relating Web community structures from a large collection of Web pages by means of hyperlink analysis. The proposed algorithm identifies potential community signatures by extracting the corresponding dense bipartite graph (DBG) structures from a given data set of Web pages. The same algorithm can also be used to relate the extracted community signatures to one another. We report experimental results on the 10 GB TREC (Text REtrieval Conference) data collection, which contains 1.7 million pages and 21.5 million links. The results demonstrate that the proposed approach extracts meaningful community signatures and relates them.
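
To make the idea of a dense bipartite graph more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes one common reading of a DBG: a set of linking pages ("fans") and a set of linked-to pages ("centers") in which every fan links to at least p centers and every center is linked from at least q fans. The function name `prune_to_dense_bipartite`, the thresholds `p` and `q`, and the iterative pruning strategy are all illustrative assumptions.

```python
def prune_to_dense_bipartite(links, p, q):
    """Illustrative sketch (assumed definition, not the paper's method).

    links: iterable of (source_page, target_page) hyperlinks.
    Returns (fans, centers) such that every remaining fan links to at
    least p remaining centers and every remaining center is linked from
    at least q remaining fans.
    """
    out_links = {}  # fan -> set of centers it links to
    in_links = {}   # center -> set of fans linking to it
    for src, dst in links:
        if src == dst:
            continue  # ignore self-links
        out_links.setdefault(src, set()).add(dst)
        in_links.setdefault(dst, set()).add(src)

    changed = True
    while changed:
        changed = False
        # Drop fans that point to too few surviving centers.
        weak_fans = [f for f, cs in out_links.items() if len(cs) < p]
        for fan in weak_fans:
            for center in out_links.pop(fan):
                in_links[center].discard(fan)
            changed = True
        # Drop centers that are cited by too few surviving fans.
        weak_centers = [c for c, fs in in_links.items() if len(fs) < q]
        for center in weak_centers:
            for fan in in_links.pop(center):
                out_links[fan].discard(center)
            changed = True

    return set(out_links), set(in_links)
```

Removing a weak fan can push a center below its threshold and vice versa, which is why the sketch repeats the pruning until nothing changes. What survives is only a candidate structure under the assumed thresholds; relating the extracted signatures to one another is outside the scope of this sketch.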