By Topic

Extended Subtree: A New Similarity Function for Tree Structured Data

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Shahbazi, A. ; Dept. of Electr. & Comput. Eng., Univ. of Alberta, Edmonton, AB, Canada ; Miller, J.

Although several distance or similarity functions for trees have been introduced, their performance is not always satisfactory in many applications, ranging from document clustering to natural language processing. This research proposes a new similarity function for trees, namely Extended Subtree (EST), where a new subtree mapping is proposed. EST generalizes the edit base distances by providing new rules for subtree mapping. Further, the new approach seeks to resolve the problems and limitations of previous approaches. Extensive evaluation frameworks are developed to evaluate the performance of the new approach against previous proposals. Clustering and classification case studies utilizing three real-world and one synthetic labeled data sets are performed to provide an unbiased evaluation where different distance functions are investigated. The experimental results demonstrate the superior performance of the proposed distance function. In addition, an empirical runtime analysis demonstrates that the new approach is one of the best tree distance functions in terms of runtime efficiency.

Published in:

Knowledge and Data Engineering, IEEE Transactions on  (Volume:26 ,  Issue: 4 )