We describe a new model for evaluating similarities among a large number of web logs (blogs), and compare several algorithms using the model. Possible uses include isolating and tracking like-minded networks for surveillance, and improved categorization. The model combines similarity analysis with clustering. Experimental results show that our algorithm separates blogs into categories, consistently achieving a success rate of over 90%.
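
As a rough illustration of what "similarity analysis combined with clustering" can look like, the sketch below scores blog texts pairwise and groups those that are sufficiently alike. The abstract does not say which similarity measure or clustering algorithm is used, so everything here is an assumption made for illustration: cosine similarity over raw word counts, single-link grouping via union-find, the hypothetical function names `cosine_similarity` and `cluster_blogs`, and the `threshold` value of 0.3. It is not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def cluster_blogs(texts: list[str], threshold: float = 0.3) -> list[int]:
    """Assign a cluster label to each text (hypothetical helper).

    Two blogs share a cluster when a chain of pairwise similarities
    at or above `threshold` connects them (single-link grouping).
    The threshold is an illustrative assumption, not a tuned value.
    """
    # Similarity analysis: one bag-of-words vector per blog.
    vectors = [Counter(text.lower().split()) for text in texts]
    parent = list(range(len(texts)))  # union-find forest

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Clustering: merge every pair whose similarity clears the threshold.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    labels: dict[int, int] = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(texts))]
```

Calling `cluster_blogs` on a list of blog texts returns one integer label per text, with equal labels marking blogs placed in the same group. The pairwise comparison is quadratic in the number of blogs, so a sketch like this only suits small collections; a large corpus would need a more scalable similarity search.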