By Topic

Bounds on expansion in LZ'77-like coding

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
V. Castelli ; IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA ; L. A. Lastras-Montano

We investigate the maximum increase in number of phrases that results from changing k consecutive symbols in a string x having length n parsed using an LZ'77-like algorithm. We consider a class of compression algorithms that partition a sequence into a collection y of nonoverlapping, variable-length phrases and encode them. Each phrase either is a singleton or matches a substring that starts to its left. We show that changing a single symbol of x in position i can yield an expansion that is of order O(n-i)23/ as (n-i)→∞. Our lower bound requires an alphabet size of O(n-i)13/. We also show that changing k consecutive symbols starting from position i can yield an expansion having a similar but somewhat more involved form. The paper contains both analytically derived upper and lower bounds, and algorithms for numerically computing tighter bounds. While deriving the bounds, we provide a detailed analysis of how expansion can arise when changing consecutive symbols. This problem is motivated by management policies for computer systems, such as the IBM Memory eXpansion Technology (MXT) or the IBM iSeries compressed disks, that use LZ'77-like coding on small compression units, such as 1-4 kbyte, and store the compressed data in memory or on disk tracks. Here, when a change of a portion of the compression unit occurs, for example, an L2 cache line, or a 512-byte disk sector, the data is recompressed and potentially stored in a different location. Knowing the maximum expansion, rather than the average expansion, is an important factor for designing policies for allocation and management of memory or disk space.

Published in:

IEEE Transactions on Information Theory  (Volume:52 ,  Issue: 5 )