Static compression for dynamic texts

3 Author(s)
Moffat, A. (Dept. of Comput. Sci., Melbourne Univ., Parkville, Vic., Australia); Sharman, N.; Zobel, J.

The authors have explored the particular needs of large information retrieval systems, in which hundreds of megabytes of data are stored, retrieval is non-sequential, and new text is continually being appended. It has been shown that the word-based model can be adapted to cope well both with dynamic environments and with situations in which decode-time memory is limited. In the latter case, as little as 100 Kb of main memory is sufficient to achieve excellent compression, provided a suitable choice of tokens is used as the compression lexicon. To solve the former problem, a new paradigm of compression has been introduced, in which some components of the compression model are required to remain static to ensure that all parts of the text can be decoded, and some parts are extensible, so that new text can also influence the assignment of codewords. An additional heuristic, Swap-to-Near-the-Front, allows collections to be seeded with as little as 1/1000 of their final text with minimal loss of compression efficiency. The resulting "almost static" compression method is ideal for large dynamic collections.
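The idea of a model that is partly static and partly extensible can be illustrated with a minimal sketch. This is not the authors' method: the class name `AppendOnlyLexicon`, the tokenizer, and the variable-length byte code are all assumptions made for illustration, and the sketch does not attempt to reproduce the Swap-to-Near-the-Front heuristic, whose details the abstract does not give. It shows only one property described above: word indices assigned from a seed text never change, so text encoded earlier stays decodable, while words first seen in later text are appended to the lexicon.

```python
import re


def tokenize(text):
    # Alternating runs of word and non-word characters; joining them restores the text.
    return re.findall(r"\w+|\W+", text)


def varint(n):
    # Variable-length byte code (an assumption): small indices use fewer bytes.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


class AppendOnlyLexicon:
    """Hypothetical lexicon: seed entries are fixed, new words are appended."""

    def __init__(self, seed_text):
        counts = {}
        for tok in tokenize(seed_text):
            counts[tok] = counts.get(tok, 0) + 1
        # Frequent seed tokens get the smallest indices, hence the shortest codes.
        self.words = sorted(counts, key=lambda t: (-counts[t], t))
        self.index = {w: i for i, w in enumerate(self.words)}
        self.static_size = len(self.words)  # entries below this never move

    def encode(self, text):
        out = bytearray()
        for tok in tokenize(text):
            i = self.index.get(tok)
            if i is None:
                # Extensible part: a new word takes the next free index.
                i = len(self.words)
                self.words.append(tok)
                self.index[tok] = i
            out += varint(i)
        return bytes(out)

    def decode(self, data):
        tokens = []
        n = shift = 0
        for byte in data:
            n |= (byte & 0x7F) << shift
            if byte & 0x80:
                shift += 7
            else:
                tokens.append(self.words[n])
                n = shift = 0
        return "".join(tokens)


lex = AppendOnlyLexicon("the cat sat on the mat")
old = lex.encode("the cat sat")
new = lex.encode("the dog sat on a log")  # appends "dog", "a", "log"
print(lex.decode(old), "|", lex.decode(new))
```

Because existing indices are never reassigned, the block encoded before the lexicon grew can still be decoded afterwards. The cost is that words arriving late keep large indices however frequent they become, which is presumably the kind of inefficiency a reordering heuristic is meant to reduce.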

Published in:

Data Compression Conference, 1994. DCC '94. Proceedings

Date of Conference:

29-31 Mar 1994