Many natural language processing (NLP) applications require computing the similarity between pairs of syntactic or semantic trees. Tree edit distance (TED) is considered one of the most effective techniques for this task. Its main drawback, however, is that it supports only single-node operations. We therefore extend TED to handle subtree transformation operations in addition to single-node operations. This makes the extended TED more effective and flexible than the standard TED, especially for applications that are sensitive to relations among nodes (e.g., in linguistic trees, deleting a modifier subtree should be cheaper than the sum of deleting its components individually). In preliminary tests on several examples of dependency trees, the extended TED gave encouraging results compared with the standard TED.
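The contrast between node-only and subtree-aware edit costs can be illustrated with a small sketch. This is not the authors' algorithm: it is a plain memoized forest recursion for ordered trees with unit node costs, to which an optional whole-subtree delete/insert operation is added. The helper names (`tree`, `size`, `ted`, `discounted`), the discount formula, and the example dependency trees are all assumptions made for illustration.

```python
from functools import lru_cache

# A tree is a (label, children) pair; children is a tuple of trees.
def tree(label, *children):
    return (label, tuple(children))

def size(t):
    """Number of nodes in tree t."""
    return 1 + sum(size(c) for c in t[1])

def ted(t1, t2, subtree_cost=None):
    """Ordered tree edit distance with unit node costs.

    If subtree_cost is given (hypothetical extension), deleting or
    inserting a whole subtree t in one step costs subtree_cost(t).
    """
    @lru_cache(maxsize=None)
    def dist(f, g):
        # f and g are forests: tuples of trees.
        if not f and not g:
            return 0.0
        options = []
        if f:
            _, kids = f[-1]
            # Delete the rightmost root; its children take its place.
            options.append(dist(f[:-1] + kids, g) + 1.0)
            if subtree_cost is not None:
                # Delete the rightmost tree as a single operation.
                options.append(dist(f[:-1], g) + subtree_cost(f[-1]))
        if g:
            _, kids = g[-1]
            # Insert the rightmost root of g.
            options.append(dist(f, g[:-1] + kids) + 1.0)
            if subtree_cost is not None:
                # Insert the rightmost tree of g as a single operation.
                options.append(dist(f, g[:-1]) + subtree_cost(g[-1]))
        if f and g:
            (l1, k1), (l2, k2) = f[-1], g[-1]
            rename = 0.0 if l1 == l2 else 1.0
            # Match the two rightmost roots and align their children.
            options.append(dist(f[:-1], g[:-1]) + dist(k1, k2) + rename)
        return min(options)

    return dist((t1,), (t2,))

# Assumed discount: the first node costs 1, each further node costs 0.5.
def discounted(t):
    return 1.0 + 0.5 * (size(t) - 1)

# "John saw the man with a telescope" vs. "John saw the man"
t1 = tree("saw", tree("John"),
          tree("man", tree("the"), tree("with", tree("telescope", tree("a")))))
t2 = tree("saw", tree("John"), tree("man", tree("the")))

standard = ted(t1, t2)              # three single-node deletions
extended = ted(t1, t2, discounted)  # one subtree deletion of three nodes
print(standard, extended)           # 3.0 2.0
```

Under these assumed costs, removing the modifier subtree `with(telescope(a))` costs 2.0 as one operation instead of 3.0 as three node deletions, which is the kind of preference the abstract describes. The recursion is exponential in the worst case and is meant only for small examples.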
Published in: 2012 Federated Conference on Computer Science and Information Systems (FedCSIS)
Date of Conference: 9-12 Sept. 2012