We propose a web clustering method that uses social bookmarking data and performs dimension reduction with respect to similarity. To realize this idea, we construct a similarity matrix between web pages based on their co-occurrence frequency. Since the similarity matrix contains various kinds of noise, we map it onto a lower-dimensional feature space to reduce the noise. In particular, we carry out the dimension reduction with respect to the similarity between web pages. This approach uses generalized eigenvectors and therefore differs from the usual eigenvalue problem. Using artificially generated data, we show that the feature space constructed by the proposed method emphasizes the essential relationships between web pages. Using real social bookmarking data, we show that the proposed method produces good clusters.
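The pipeline described above (co-occurrence similarity, then a generalized eigenvector embedding) can be illustrated with a minimal sketch. The abstract does not state the exact generalized eigenproblem, so the code below assumes a Laplacian-style formulation, L v = λ D v, where D is the diagonal degree matrix of the similarity matrix S and L = D - S. That formulation, the function names `cooccurrence_similarity` and `generalized_embedding`, and the toy bookmark data are all illustrative assumptions, not the authors' method or data.

```python
import numpy as np

def cooccurrence_similarity(bookmarks):
    """Count, for each pair of pages, how many users bookmarked both.

    bookmarks: list of sets of page indices, one set per user (hypothetical format).
    """
    n_pages = 1 + max(p for pages in bookmarks for p in pages)
    B = np.zeros((len(bookmarks), n_pages))
    for user, pages in enumerate(bookmarks):
        B[user, list(pages)] = 1.0
    S = B.T @ B                 # S[i, j] = number of users who bookmarked both i and j
    np.fill_diagonal(S, 0.0)    # ignore self-similarity
    return S

def generalized_embedding(S, k):
    """Solve L v = lambda D v (an assumed formulation) and keep k eigenvectors.

    Requires every page to co-occur with at least one other page (degree > 0).
    """
    d = S.sum(axis=1)
    L = np.diag(d) - S
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form: (D^-1/2 L D^-1/2) u = lambda u, with v = D^-1/2 u.
    M = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(M)      # eigenvalues in ascending order
    V = d_inv_sqrt[:, None] * U
    # Skip the trivial constant eigenvector (eigenvalue ~ 0).
    return eigvals[1:k + 1], V[:, 1:k + 1]

# Toy data: pages 0-2 and pages 3-5 form two groups; the last user is "noise"
# that links page 2 to page 3.
bookmarks = [{0, 1, 2}, {0, 1, 2}, {0, 1}, {0, 1},
             {3, 4, 5}, {3, 4, 5}, {4, 5}, {2, 3}]
S = cooccurrence_similarity(bookmarks)
eigvals, Y = generalized_embedding(S, k=1)
print(np.sign(Y[:, 0]))  # the sign of the first coordinate separates the two groups
```

In this sketch the one-dimensional feature space already separates the two groups despite the noisy link; on real data one would typically keep more dimensions and run a standard clustering algorithm such as k-means on the rows of `Y`.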
Date of Conference: 9-11 Aug. 2010