A Methodology for Clustering XML Documents Based on Labeled Tree

Authors:
Lei Liu ; Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China ; Yongqing Zheng ; Baoshi Ding ; Haiyan Liu

The number of XML documents is increasing rapidly. To analyze the information represented in XML documents efficiently, research on XML document clustering is actively in progress. The key issue is how to devise the similarity measure between XML documents that the clustering will use. Because XML documents have a hierarchical structure, it is not appropriate to cluster them with a general document similarity measure. In this paper, we propose a novel similarity measure that reduces nesting and repetition across the whole XML document. We then propose an improved edge-set comparison algorithm to calculate the similarity of two XML documents. Our experiments show that the proposed method improves clustering accuracy compared with previous work.
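The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general edge-set idea, not the authors' improved method. It assumes that each document is reduced to the set of its distinct parent-to-child tag pairs (which collapses repeated sibling structures as a side effect) and that similarity is the Jaccard ratio of the two sets. The function names `edge_set` and `edge_similarity`, the sample documents, and the choice of Jaccard are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def edge_set(xml_text):
    """Return the set of distinct (parent_tag, child_tag) edges in a document.

    Using a set means repeated subtrees contribute each edge only once.
    This is an illustrative simplification, not the paper's reduction step.
    """
    root = ET.fromstring(xml_text)
    edges = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node:
            edges.add((node.tag, child.tag))
            stack.append(child)
    return edges


def edge_similarity(xml_a, xml_b):
    """Jaccard similarity of two edge sets (assumed measure, in [0, 1])."""
    edges_a, edges_b = edge_set(xml_a), edge_set(xml_b)
    if not edges_a and not edges_b:
        return 1.0  # two single-element documents: treat as identical
    return len(edges_a & edges_b) / len(edges_a | edges_b)


# Hypothetical sample documents for illustration only.
doc_a = "<lib><book><title/><author/></book><book><title/></book></lib>"
doc_b = "<lib><book><title/></book></lib>"
doc_c = "<shop><item/></shop>"

print(edge_similarity(doc_a, doc_b))
print(edge_similarity(doc_a, doc_c))
```

Here `doc_a` and `doc_b` share two of three distinct edges, while `doc_a` and `doc_c` share none. A pairwise similarity matrix built this way could then be passed to any standard clustering routine.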

Published in:

Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009 (FSKD '09), Volume 1

Date of Conference:

14-16 Aug. 2009