Since its advent, the Extensible Markup Language (XML) has gained tremendous popularity in many different application areas. However, XML data is generally verbose and redundant, so it requires considerable disk space to store and bandwidth to transfer. To overcome this problem, many methods for compressing XML documents have been proposed. In general, data compression requires a model that predicts the next symbol in the data. In this paper, we compare different models suitable for XML compression. We also present a novel modeling method and measure the information content of a set of XML documents under different modeling methods.
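As a rough illustration of what measuring information content under a model means, the sketch below sums -log2 p(symbol | context) over a string, with the probabilities estimated from the string itself using the preceding `order` characters as context. This is a generic static context model chosen for brevity, not the modeling method proposed in the paper. The function name `information_content` and the sample document are hypothetical, and the cost of transmitting the model itself is ignored.

```python
import math
from collections import Counter, defaultdict


def information_content(data: str, order: int = 0) -> float:
    """Total bits for `data` under a static order-`order` context model.

    Assumption for illustration: symbol probabilities are the empirical
    frequencies observed in `data` itself, conditioned on the preceding
    `order` characters (fewer at the very start of the string).
    """
    # Count how often each symbol follows each context.
    counts = defaultdict(Counter)
    for i, symbol in enumerate(data):
        context = data[max(0, i - order):i]
        counts[context][symbol] += 1

    # Sum -log2 p(symbol | context) over every occurrence.
    bits = 0.0
    for symbol_counts in counts.values():
        total = sum(symbol_counts.values())
        for n in symbol_counts.values():
            bits -= n * math.log2(n / total)
    return bits


if __name__ == "__main__":
    # Hypothetical sample document, not taken from the paper's data set.
    sample = "<a><b>x</b><b>y</b></a>"
    for k in (0, 1, 2):
        print(f"order {k}: {information_content(sample, k):.2f} bits")
```

A model that captures more of the regularity in the markup assigns higher probabilities to the symbols that actually occur, so the total bit count drops. Comparing such totals across models is one simple way to compare how well they fit XML data.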