There has been a great deal of work in recent years on processing and optimizing queries against XML data. These works typically do not consider schema information, so that their evaluation techniques remain usable even when no schema is present. However, schema information is often available, and in this paper we show that, when it is, it can be exploited to great advantage in ways that complement "traditional" XML query optimization. For the approach to be usable in practice, we require that the aspects of the schema essential for our purposes be captured in a schema information graph (SIG). We exploit this metadata with a preprocessing enumeration phase that detects potentially interchangeable evaluation units, which we call alternate paths. Within an algebraic framework, we present methods that break a pattern tree down into elementary paths and replace them with one or more less costly alternate paths. This approach lets us present various rewritten forms of the XML query to the query optimizer, allowing the DBMS to explore a larger space of query evaluation plans. We assess the benefits of the proposed techniques experimentally on the XMark data set and show that SIG-based optimizations can yield significant performance improvements.
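The idea of enumerating interchangeable paths from schema knowledge can be illustrated with a minimal sketch. It is not the paper's SIG or algebra: the schema below is a hypothetical toy fragment, loosely modeled on XMark, stored as a plain parent-to-children mapping, and the function name `child_paths` is an assumption made for illustration. The sketch shows only one simple kind of rewriting, in which a descendant step such as `site//item` is expanded into the explicit child-only paths that the schema permits.

```python
from typing import Dict, List

# Hypothetical toy schema (an assumption, not the paper's SIG):
# each element name maps to the element names allowed as its children.
SCHEMA: Dict[str, List[str]] = {
    "site": ["regions", "people"],
    "regions": ["africa", "asia"],
    "africa": ["item"],
    "asia": ["item"],
    "people": ["person"],
    "item": ["name"],
    "person": ["name"],
    "name": [],
}


def child_paths(schema: Dict[str, List[str]], start: str, target: str) -> List[str]:
    """Enumerate every child-axis-only path from `start` down to `target`.

    Each returned path is a candidate replacement for the descendant
    step `start//target` under the given schema.
    """
    results: List[str] = []

    def walk(node: str, path: List[str]) -> None:
        for child in schema.get(node, []):
            if child in path:
                # Skip elements already on the current path so that a
                # recursive schema cannot cause unbounded enumeration.
                continue
            new_path = path + [child]
            if child == target:
                results.append("/".join(new_path))
            walk(child, new_path)

    walk(start, [start])
    return results


if __name__ == "__main__":
    for p in child_paths(SCHEMA, "site", "item"):
        print(p)
```

Under this toy schema, `site//item` expands to the two explicit paths through `africa` and `asia`. An optimizer could then compare the cost of evaluating the union of these paths against the cost of the original descendant step; the cost model itself is outside the scope of this sketch.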