Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171)

13-13 June 1997

  • Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171)

    Publication Year: 1997
    Freely Available from IEEE
  • Index of authors

    Publication Year: 1997, Page(s): 399
    Freely Available from IEEE
  • Inferring lexical and grammatical structure from sequences

    Publication Year: 1997, Page(s):265 - 274
    Cited by:  Patents (1)

    In a wide variety of sequences from various sources, from music and text to DNA and computer programs, two different but related kinds of structure can be discerned. First, some segments tend to be repeated exactly, such as motifs in music, words or phrases in text, identifiers and syntactic idioms in computer programs. Second, these segments interact with each other in variable but constrained wa...
  • A probabilistic approach to some asymptotics in source coding

    Publication Year: 1997, Page(s):97 - 106

    Renewal theory is a powerful tool in the analysis of source codes. In this paper, we use renewal theory to obtain some asymptotic properties of finite-state noiseless channels. We discuss the relationship between these results and earlier uses of renewal theory to analyze the Lempel-Ziv codes and the Tunstall code. As a new application of our results, we provide a simple derivation of the asymptot...
  • On the approximate pattern occurrences in a text

    Publication Year: 1997, Page(s):253 - 264
    Cited by:  Papers (1)

    Consider a given pattern H and a random text T generated according to the Bernoulli model. We study the frequency of approximate occurrences of the pattern H in a random text when overlapping copies of the approximate pattern are counted separately. We provide exact and asymptotic formulae for mean, variance and probability of occurrence as well as asymptotic results including the central...
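    The counting problem in this abstract can be illustrated with a minimal sketch. It assumes that "approximate" means Hamming distance at most k (the truncated abstract does not say which distance is used), and the function names below are hypothetical. The mean follows from linearity of expectation under an i.i.d. (Bernoulli) text model; it is not the paper's derivation.

```python
def count_approx_occurrences(text: str, pattern: str, k: int) -> int:
    """Count windows of text within Hamming distance k of pattern.

    Overlapping windows are counted separately.
    Assumption: Hamming distance is the notion of approximation.
    """
    m = len(pattern)
    count = 0
    for i in range(len(text) - m + 1):
        mismatches = sum(a != b for a, b in zip(text[i:i + m], pattern))
        if mismatches <= k:
            count += 1
    return count


def expected_approx_occurrences(n: int, pattern: str, k: int, probs: dict) -> float:
    """Mean number of approximate occurrences in an i.i.d. text of length n.

    probs maps each symbol to its probability in the text model.
    dist[j] holds the probability of exactly j mismatches so far;
    paths with more than k mismatches are dropped.
    """
    dist = [1.0] + [0.0] * k
    for sym in pattern:
        p_match = probs[sym]
        new_dist = [0.0] * (k + 1)
        for j, weight in enumerate(dist):
            new_dist[j] += weight * p_match
            if j + 1 <= k:
                new_dist[j + 1] += weight * (1.0 - p_match)
        dist = new_dist
    windows = max(0, n - len(pattern) + 1)
    return windows * sum(dist)
```

    Each window contributes its own match probability to the mean regardless of overlap; overlap only matters for the variance, which this sketch does not compute.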
  • On the role of data compression in new products

    Publication Year: 1997

    Summary form only given. Discusses the role of data compression in storage subsystems (including caches and controllers) and operating systems (code compression).
  • A progressive Ziv-Lempel algorithm for image compression

    Publication Year: 1997, Page(s):136 - 144
    Cited by:  Papers (2)  |  Patents (1)

    We describe an algorithm that gives a progression of compressed versions of a single image. Each stage of the progression is a lossy compression of the image, with the distortion decreasing in each stage, until the last image is losslessly compressed. Progressive encodings are useful in applications such as Web browsing and multicast, where the best rate/distortion tradeoff often is not known in a...
  • Vector quantization and density estimation

    Publication Year: 1997, Page(s):172 - 193
    Cited by:  Papers (11)  |  Patents (2)

    The connection between compression and the estimation of probability distributions has long been known for the case of discrete alphabet sources and lossless coding. A universal lossless code which does a good job of compressing must implicitly also do a good job of modeling. In particular, with a collection of codebooks, one for each possible class or model, if codewords are chosen from among the...
  • Practical implementation of the lossless compression algorithm

    Publication Year: 1997, Page(s):390 - 397

    A combination of the LZ78 method with a new scheme of model contexting is introduced. The proposed scheme also uses a hashing function. This approach speeds up the searching process and improves on model contexting.
  • Kolmogorov random graphs

    Publication Year: 1997, Page(s):78 - 96

    We investigate topological, combinatorial, statistical, and enumeration properties of finite graphs with high Kolmogorov complexity (almost all graphs) using the novel incompressibility method. Example results are: (i) the mean and variance of the number of (possibly overlapping) ordered labeled subgraphs of a labeled graph as a function of its randomness deficiency and (ii) a new elementary proof...
  • Asymmetry in Ziv/Lempel '78 parsing

    Publication Year: 1997, Page(s):320 - 328

    We compare the number of phrases created by Ziv/Lempel '78 parsing of a binary sequence and of its reversal. We show that the two parsings can vary by a factor that grows at least as fast as the logarithm of the sequence length. We then show that under a suitable condition, the factor can even become polynomial, and argue that the condition may not be necessary.
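    The comparison described in this abstract can be tried on small inputs with a minimal sketch of the standard LZ78 incremental parsing. The function name is hypothetical, and the example string is an arbitrary illustration, not one of the paper's constructions.

```python
def lz78_phrase_count(sequence: str) -> int:
    """Count the phrases in the LZ78 incremental parsing of sequence.

    Each phrase is the shortest prefix of the remaining input that has
    not appeared as an earlier phrase. A trailing fragment that repeats
    an earlier phrase is counted as one final phrase.
    """
    seen = set()
    current = ""
    phrases = 0
    for symbol in sequence:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases += 1
            current = ""
    if current:
        phrases += 1
    return phrases


if __name__ == "__main__":
    s = "0010111"
    print(lz78_phrase_count(s), lz78_phrase_count(s[::-1]))
```

    Even this seven-symbol string parses into a different number of phrases forwards and backwards, which is the asymmetry the abstract quantifies for long sequences.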
  • Multi-string search in BSP

    Publication Year: 1997, Page(s):240 - 252

    We have studied the worst-case complexity of the multi-string search problem in the bulk synchronous parallel (BSP) model (Valiant 1990). For this purpose, we have devised a very simple way to distribute the blind trie data structure among the p processors so that the communication cost is balanced. In the light of the very efficient algorithms and data structures known for external memory and the...
  • A parallel decoder for LZ2 compression using the ID update heuristic

    Publication Year: 1997, Page(s):368 - 373
    Cited by:  Papers (3)

    The LZ2 compression method seems hardly parallelizable since some related heuristics are known to be P-complete. In spite of such a negative result, the decoding process can be parallelized efficiently for the next character heuristic. We show another parallel decoding algorithm for LZ2 compression using the ID update heuristic. The algorithm works in O(log^2 n) time with O(n/log(n)) proce...
  • Multialphabet coding with separate alphabet description

    Publication Year: 1997, Page(s):56 - 65
    Cited by:  Papers (15)

    For lossless universal source coding of memoryless sequences with an a priori unknown alphabet size (multialphabet coding), the alphabet of the sequence must be described as well as the sequence itself. Usually an efficient description of the alphabet can be made only by taking into account some additional information. We show that these descriptions can be separated in such a way that the encodin...
  • Hardness of flip-cut problems from optical mapping [DNA molecules application]

    Publication Year: 1997, Page(s):275 - 284

    Optical mapping is a new technology for constructing restriction maps. Associated computational problems include aligning multiple partial restriction maps into a single “consensus” restriction map, and determining the correct orientation of each molecule, which was formalized as the exclusive binary flip cut (EBFC) problem by Muthukrishnan and Parida (see Proc. of the First ACM Confer...
  • Compression of low entropy strings with Lempel-Ziv algorithms

    Publication Year: 1997, Page(s):107 - 121
    Cited by:  Papers (1)

    We compare the compression ratio of the Lempel-Ziv algorithms with the empirical entropy of the input string. We show that although these algorithms are optimal according to the generally accepted definition, we can find families of low entropy strings which are not compressed optimally. More precisely, we show that the compression ratio achieved by LZ78 (resp. LZ77) can be much higher than the ze...
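    The two quantities being compared can be computed side by side with a minimal sketch. It assumes the zeroth-order empirical entropy is the quantity of interest and uses a simplified LZ78 cost model (each phrase costs a dictionary index plus one raw symbol). Both function names and the cost model are illustrative assumptions, not the paper's definitions.

```python
import math
from collections import Counter


def empirical_entropy_h0(s: str) -> float:
    """Zeroth-order empirical entropy of s, in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def lz78_bits(s: str, alphabet_size: int) -> int:
    """Output size in bits under a simplified LZ78 cost model (assumption).

    Each phrase costs ceil(log2(dictionary size)) bits for the index of
    its longest previously seen prefix, plus ceil(log2(alphabet_size))
    bits (at least 1) for the new symbol.
    """
    symbol_bits = max(1, math.ceil(math.log2(alphabet_size)))
    seen = {""}  # the dictionary starts with the empty phrase
    current = ""
    bits = 0
    for symbol in s:
        current += symbol
        if current not in seen:
            bits += math.ceil(math.log2(len(seen))) + symbol_bits
            seen.add(current)
            current = ""
    if current:
        # Trailing fragment repeats an earlier phrase: index only.
        bits += math.ceil(math.log2(len(seen)))
    return bits


if __name__ == "__main__":
    s = "a" * 1000 + "b"
    print(len(s) * empirical_entropy_h0(s), lz78_bits(s, 2))
```

    Comparing `len(s) * empirical_entropy_h0(s)` with `lz78_bits(s, 2)` on highly skewed strings gives a rough feel for the gap the abstract describes.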
  • A universal upper bound on the performance of the Lempel-Ziv algorithm on maliciously-constructed data

    Publication Year: 1997, Page(s):123 - 135

    We consider the performance of the Lempel-Ziv (1978) algorithm on finite strings and infinite sequences having unbalanced statistics. We show that such strings and sequences are compressed by the Lempel-Ziv algorithm. We show that the converse does not hold, i.e., that there are sequences with perfectly balanced asymptotic statistics that the Lempel-Ziv algorithm compresses optimally.
  • Code and parse trees for lossless source encoding

    Publication Year: 1997, Page(s):145 - 171
    Cited by:  Papers (27)  |  Patents (1)

    This paper surveys the theoretical literature on fixed-to-variable-length lossless source code trees, called code trees, and on variable-length-to-fixed lossless source code trees, called parse trees. In particular, the following code tree topics are outlined in this survey: characteristics of the Huffman (1952) code tree; Huffman-type coding for infinite source alphabets and universal coding; the...
  • Optimization of the SW algorithm for high-dimensional compression

    Publication Year: 1997, Page(s):194 - 203

    This paper describes an algorithm and a software package SW (Spherical Wavelets) that implements a method for compression of scalar functions defined on 3D objects. This method combines discrete second generation wavelet transforms with an extension of the embedded zerotree coding method. We present some results on optimizing the performance of the SW algorithm via the use of arithmetic coding, di...
  • Sequence sorting in secondary storage

    Publication Year: 1997, Page(s):329 - 346

    We investigate the I/O complexity of the problem of sorting sequences (or strings of characters) in external memory, which is a fundamental component of many large-scale text applications. In the standard unit-cost RAM comparison model, the complexity of sorting K strings of total length N is Θ(K log_2 K + N). By analogy, in the external memory (or I/O) model, where the internal memo...
  • Thresholding wavelets for image compression

    Publication Year: 1997, Page(s):374 - 389
    Cited by:  Patents (7)

    The paper addresses the problem of thresholding wavelet coefficients in a transform-based algorithm for still image compression. Processing data before the quantization phase is a crucial step in a compression algorithm, especially in applications which require high compression ratios. In the paper, after a review on the applications of wavelets to image compression, a new solution to the problem ...
  • Near-lossless image compression schemes based on weighted finite automata encoding and adaptive context modelling

    Publication Year: 1997, Page(s):66 - 77

    We study high-fidelity image compression with a given tight bound on the maximum error magnitude. We propose a weighted finite automata (WFA) recursive encoding scheme on the adaptive context modelling based quantizing prediction residue images. By incorporating the proposed recursive WFA encoding techniques into the context modelling based nearly-lossless CALIC (context based adaptive lossless im...
  • Error resilient data compression with adaptive deletion

    Publication Year: 1997, Page(s):285 - 294

    In earlier work we presented the k-error protocol, a technique for protecting a dynamic dictionary method from error propagation as the result of any k errors on the communication channel or compressed file. Here we further develop this approach and provide experimental evidence that this approach is highly effective in practice against a noisy channel or faulty storage medium. That is, for LZ2-ba...
  • A criterion for model selection using minimum description length

    Publication Year: 1997, Page(s):204 - 214

    Rissanen (1978) proposed the idea that the goodness of fit of a parametric model of the probability density of a random variable could be thought of as an information coding problem. He argued that the best model was that which was able to describe the training data together with the model parameters using the fewest number of bits of information (Occam's razor). This paper builds upon that basic ...
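    The basic idea attributed to Rissanen above can be illustrated with a minimal sketch of the textbook two-part description length: the bits needed for the data under the fitted model, plus (k/2) log2 n bits for k parameters. The sketch applies it to choosing the order of a binary Markov chain. This is not the criterion the paper itself proposes (the abstract is truncated before stating it), and the function names and the one-bit-per-symbol cost for the first `order` symbols are assumptions.

```python
import math
from collections import Counter


def description_length(bits: str, order: int) -> float:
    """Two-part code length, in bits, of a binary string under a Markov model.

    Data cost: negative log2 likelihood under maximum-likelihood
    transition probabilities, plus `order` raw bits for the first
    symbols, which have no full context (assumption).
    Model cost: 0.5 * log2(n) bits for each of the 2**order parameters.
    """
    n = len(bits)
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(order, n):
        context = bits[i - order:i]
        context_counts[context] += 1
        transition_counts[(context, bits[i])] += 1
    data_bits = float(order)
    for (context, _symbol), count in transition_counts.items():
        data_bits -= count * math.log2(count / context_counts[context])
    model_bits = 0.5 * (2 ** order) * math.log2(n)
    return data_bits + model_bits


def best_order(bits: str, max_order: int) -> int:
    """Return the Markov order with the shortest total description."""
    return min(range(max_order + 1), key=lambda r: description_length(bits, r))


if __name__ == "__main__":
    print(best_order("01" * 50, 3))
```

    On a strictly alternating string, an order-1 model predicts every symbol exactly, so its small parameter cost is easily repaid; higher orders fit no better and only pay for extra parameters.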
  • Hashing on strings, cryptography, and protection of privacy

    Publication Year: 1997
    Cited by:  Papers (10)

    Summary form only given. The issues of privacy and reliability of personal data are of paramount importance. If L is a list of people carrying some harmful defective gene, we want questions as to whether a person is in L to be reliably answered without compromising the data concerning anybody else. Reliability means that once the list is formed, nobody can play with the answer. Thus the answer sho...