The explosive growth of XML has led to an increasing need for scalable XML retrieval systems. Our XML retrieval system, the SQLGenerator, stores XML of any schema in a fixed-schema relational database and supports a full-featured semistructured query language, XML-QL, through optimized translation of its semantics to relational SQL queries. We examine the scalability of this approach with respect to increasing data size. We index four XML collections, ranging in size from 500 MB to 2 GB, that were generated using a standard XML generator, XBench. We then compare the execution times of 11 standard XBench queries, which cover a wide range of semistructured query features and whose semantics were directly translatable from their original XQuery language to XML-QL. Although it is difficult to estimate the theoretical baseline for scalability of these query features in an RDBMS, the runtimes of many of the queries grow linearly with the size of the document collection.
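The general idea of storing XML of any schema in a fixed relational schema and answering path queries through SQL can be illustrated with a minimal sketch in Python and SQLite. The table layout (`node`), the helper names (`shred`, `path_to_sql`, `query`), and the sample document are illustrative assumptions; they are not the SQLGenerator's actual schema or its XML-QL translation, and only simple child-axis paths are handled.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical fixed schema: one row per element, whatever the XML's own schema is.
SCHEMA = """
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER,
    tag       TEXT NOT NULL,
    text      TEXT
);
CREATE INDEX node_parent_tag ON node (parent_id, tag);
"""


def shred(conn, xml_text):
    """Store every element of an XML document as a row in the node table."""
    root = ET.fromstring(xml_text)
    stack = [(root, None)]
    while stack:
        elem, parent_id = stack.pop()
        cur = conn.execute(
            "INSERT INTO node (parent_id, tag, text) VALUES (?, ?, ?)",
            (parent_id, elem.tag, (elem.text or "").strip()),
        )
        node_id = cur.lastrowid
        # Push children in reverse so rows are inserted in document order.
        for child in reversed(list(elem)):
            stack.append((child, node_id))


def path_to_sql(path):
    """Translate a child-axis path (list of tags from the root) into self-joins."""
    joins = ["node AS n0"]
    for i in range(1, len(path)):
        joins.append(f"JOIN node AS n{i} ON n{i}.parent_id = n{i - 1}.id")
    conds = ["n0.parent_id IS NULL"]
    conds += [f"n{i}.tag = ?" for i in range(len(path))]
    last = len(path) - 1
    return (
        f"SELECT n{last}.text FROM " + " ".join(joins)
        + " WHERE " + " AND ".join(conds)
        + f" ORDER BY n{last}.id"
    )


def query(conn, path):
    """Run the translated SQL; the tags are bound as query parameters."""
    return [row[0] for row in conn.execute(path_to_sql(path), path)]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
shred(conn, "<catalog><item><title>A</title></item>"
            "<item><title>B</title></item></catalog>")
print(query(conn, ["catalog", "item", "title"]))  # prints ['A', 'B']
```

In this sketch each step of the path adds one self-join on the `node` table, so the cost of a query depends on how well the RDBMS can use the `(parent_id, tag)` index as the collection grows, which is the kind of behavior the scalability experiments measure.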