Web Document Clustering using Semantic Link Analysis

1 Author(s)
Arch-Int, S. ; Dept. of Comput. Sci., Khon Kaen Univ.

Searching for and discovering relevant information on the Web have always been challenging research areas. Web document clustering is a promising technique for preparing a huge collection of Web documents for use by Web search engines. This paper proposes a semantic document clustering approach that categorizes Web documents in a semantic manner. First, formal methods and algorithms are introduced as techniques for document extraction and clustering. The approach incorporates WordNet and ontology knowledge as assisting mechanisms, and the resulting set of concepts is used as a formal representation of the extracted documents. As a consequence, cluster scores are determined for the semantic-based clusters. Next, a semantic-based link analysis method is proposed for clustering Web documents into semantic clusters that are scored based on the notion of semantic-based concepts and documents. Finally, these document scores are used to evaluate semantic document similarity and document quality. The precision criterion is employed for evaluation, in comparison with a keyword-based search method. The experimental results show that the proposed method outperformed the TF/IDF method by up to 9% on average.
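
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a keyword-based TF/IDF search together with the precision measure. It is not the paper's method or its experimental setup: the whitespace tokenization, the raw-count term frequency, the log(N/df) weighting, and the names `build_idf`, `weigh`, `cosine`, `search`, and `precision` are all assumptions made for illustration, and the sample documents are invented.

```python
import math
from collections import Counter


def build_idf(tokenized_docs):
    """Inverse document frequency, log(N / df), for every corpus term."""
    n_docs = len(tokenized_docs)
    df = Counter(term for tokens in tokenized_docs for term in set(tokens))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def weigh(tokens, idf):
    """TF/IDF vector as a {term: weight} dict; unknown terms are dropped."""
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs, top_k):
    """Return the indices of the top_k documents most similar to the query."""
    tokenized = [doc.lower().split() for doc in docs]
    idf = build_idf(tokenized)
    query_vec = weigh(query.lower().split(), idf)
    scored = [(cosine(query_vec, weigh(tokens, idf)), index)
              for index, tokens in enumerate(tokenized)]
    # Highest score first; ties are broken by document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_k]]


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are actually relevant."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)


# Invented toy corpus; documents 0 and 3 are treated as the relevant ones.
docs = [
    "web document clustering",
    "semantic link analysis of web pages",
    "cooking pasta at home",
    "clustering web documents with ontology",
]
top_two = search("web clustering", docs, top_k=2)
print(top_two, precision(top_two, relevant=[0, 3]))
```

Because this baseline matches surface keywords only, "document" and "documents" count as unrelated terms. That limitation is the kind of gap a concept-based representation, such as the WordNet and ontology mapping the abstract describes, is intended to address.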

Published in:

International Conference on Computational Intelligence for Modelling, Control and Automation, 2005, and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (Volume 2)

Date of Conference:

28-30 Nov. 2005