As a core operation in twig query processing, finding all occurrences of a twig pattern in an XML document has attracted much attention. Although existing methods are efficient for queries with ancestor-descendant edges, our analysis shows that they all suffer from redundant CPU cost. Moreover, a large number of redundant path solutions may be produced when parent-child edges appear below branch nodes. We propose an optimized holistic twig join algorithm, OTS, for efficient processing of twig queries. By pre-checking at three steps, OTS not only reduces CPU cost but also eliminates redundant path solutions. As a result, it broadens the class of queries whose CPU complexity is linear in the sum of the sizes of the input and output lists. Experimental results on various datasets indicate that OTS performs significantly better than existing methods.
Date of Conference: 6-7 Jan. 2012