
IEEE Transactions on Information Theory

Issue 7 • July 2004


Displaying Results 1 - 22 of 22
  • Table of contents

    Publication Year: 2004 , Page(s): c1
    PDF (33 KB)
    Freely Available from IEEE
  • IEEE Transactions on Information Theory publication information

    Publication Year: 2004 , Page(s): c2
    PDF (38 KB)
    Freely Available from IEEE
  • Problems on Sequences: Information Theory and Computer Science Interface

    Publication Year: 2004 , Page(s): 1385 - 1392
    Cited by:  Papers (4)
    PDF (224 KB) | HTML
  • Markov types and minimax redundancy for Markov sources

    Publication Year: 2004 , Page(s): 1393 - 1402
    Cited by:  Papers (19)
    PDF (312 KB) | HTML

    Redundancy of universal codes for a class of sources determines by how much the actual code length exceeds the optimal code length. In the minimax scenario, one designs the best code for the worst source within the class. Such minimax redundancy comes in two flavors: average minimax or worst case minimax. We study the worst case minimax redundancy of universal block codes for Markovian sources of any order. We prove that the maximal minimax redundancy for Markov sources of order r is asymptotically equal to (1/2) m^r (m-1) log₂ n + log₂ A_m^r - (ln ln m^(1/(m-1)))/ln m + o(1), where n is the length of a source sequence, m is the size of the alphabet, and A_m^r is an explicit constant (e.g., we find that for a binary alphabet m=2 and Markov order r=1 the constant is A_2^1 = 16·G ≈ 14.655449504, where G is Catalan's constant). Unlike previous attempts, we view the redundancy problem as an asymptotic evaluation of certain sums over a set of matrices representing Markov types. The enumeration of Markov types is accomplished by reducing it to counting Eulerian paths in a multigraph. In particular, we propose exact and asymptotic formulas for the number of strings of a given Markov type. All of these findings are obtained by analytic and combinatorial tools of analysis of algorithms.

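    The type-counting viewpoint above can be probed by brute force for tiny cases. Below is a minimal Python sketch (an illustration, not the paper's analytic machinery): it groups all binary strings of a small length by their order-1 transition-count matrix and prints the size of each type class, ignoring the initial-symbol refinement that some definitions include.

        from collections import Counter
        from itertools import product

        def markov_type(s):
            # Order-1 Markov type: the vector of transition counts k_ij.
            counts = Counter(zip(s, s[1:]))
            return tuple(counts[(i, j)] for i in '01' for j in '01')

        # Class sizes are what the paper evaluates asymptotically by
        # counting Eulerian paths in a multigraph.
        classes = Counter(markov_type(''.join(bits))
                          for bits in product('01', repeat=6))
        for t, size in sorted(classes.items()):
            print(f"k=(k00,k01,k10,k11)={t}: {size} strings")
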
  • Power laws for monkeys typing randomly: the case of unequal probabilities

    Publication Year: 2004 , Page(s): 1403 - 1414
    Cited by:  Papers (4)
    PDF (320 KB) | HTML

    An early result in the history of power laws, due to Miller, concerned the following experiment. A monkey types randomly on a keyboard with N letters (N>1) and a space bar, where a space separates words. A space is hit with probability p; all other letters are hit with equal probability (1-p)/N. Miller proved that in this experiment, the rank-frequency distribution of words follows a power law. The case where letters are hit with unequal probability has been the subject of recent confusion, with some suggesting that in this case the rank-frequency distribution follows a lognormal distribution. We prove that the rank-frequency distribution follows a power law for assignments of probabilities that have rational log-ratios for any pair of keys, and we present an argument of Montgomery that settles the remaining cases, also yielding a power law. The key to both arguments is the use of complex analysis. The method of proof produces simple explicit formulas for the coefficient in the power law in cases with rational log-ratios for the assigned probabilities of keys. Our formula in these cases suggests an exact asymptotic formula in the cases with an irrational log-ratio, and this formula is exactly what was proved by Montgomery.

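    A quick simulation makes the rank-frequency claim tangible. The sketch below uses assumed letter probabilities (any unequal profile works); under a power law, log(frequency) falls roughly linearly in log(rank).

        import random
        from collections import Counter

        # Monkey typing: letters a..e with unequal probabilities scaled by
        # (1 - p); the space bar, hit with probability p, ends a word.
        random.seed(1)
        p = 0.2
        letters = 'abcde'
        weights = [0.3, 0.25, 0.2, 0.15, 0.1]   # assumed letter profile
        probs = [w * (1 - p) for w in weights] + [p]
        text = ''.join(random.choices(letters + ' ', probs, k=500_000))

        freq = Counter(w for w in text.split() if w)
        ranked = sorted(freq.values(), reverse=True)

        # Print frequency at a few ranks to eyeball the log-log slope.
        for r in (1, 10, 100, 1000):
            if r <= len(ranked):
                print(r, ranked[r - 1])
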
  • Grammar-based lossless universal refinement source coding

    Publication Year: 2004 , Page(s): 1415 - 1424
    Cited by:  Papers (4)
    PDF (280 KB) | HTML

    A sequence y = (y_1,...,y_n) is said to be a coarsening of a given finite-alphabet source sequence x = (x_1,...,x_n) if, for some function φ, y_i = φ(x_i) (i = 1,...,n). In lossless refinement source coding, it is assumed that the decoder already possesses a coarsening y of a given source sequence x. It is the job of the lossless refinement source encoder to furnish the decoder with a binary codeword B(x|y) which the decoder can employ in combination with y to obtain x. We present a natural grammar-based approach for finding the binary codeword B(x|y) in two steps. In the first step of the grammar-based approach, the encoder furnishes the decoder with O(√n log₂ n) code bits at the beginning of B(x|y) which tell the decoder how to build a context-free grammar G_y which represents y. The encoder possesses a context-free grammar G_x which represents x; in the second step of the grammar-based approach, the encoder furnishes the decoder with code bits in the rest of B(x|y) which tell the decoder how to build G_x from G_y. We prove that our grammar-based lossless refinement source coding scheme is universal in the sense that its maximal redundancy per sample is O(1/log₂ n) for n source samples, with respect to any finite-state lossless refinement source coding scheme. As a by-product, we provide a useful notion of the conditional entropy H(G_x|G_y) of the grammar G_x given the grammar G_y, which is approximately equal to the length of the codeword B(x|y).

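    To fix ideas about the setting (not the paper's scheme), here is a naive refinement encoder: since the decoder knows y_i = φ(x_i), the encoder need only send, per position, the index of x_i inside the preimage set of y_i. The grammar-based method above does far better by exploiting structure across positions; the phi map below is made up for illustration.

        from math import ceil, log2

        # Decoder already has the coarsening y_i = phi(x_i); a naive
        # refinement code (NOT the paper's grammar-based scheme) sends,
        # per position, the index of x_i within the preimage set of y_i.
        phi = {'a': 'V', 'e': 'V', 'b': 'C', 'c': 'C', 'd': 'C'}
        preimage = {}
        for sym, cls in phi.items():
            preimage.setdefault(cls, []).append(sym)

        x = "badcabe"
        y = ''.join(phi[c] for c in x)   # side information at the decoder
        bits = sum(ceil(log2(len(preimage[c]))) for c in y)
        print(y, '->', bits, "bits to refine", len(x), "symbols")
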
  • Compression of words over a partially commutative alphabet

    Publication Year: 2004 , Page(s): 1425 - 1441
    Cited by:  Papers (5)
    PDF (376 KB) | HTML

    Concurrency is a fundamental concept in computer science which is concerned with the study of systems involving multiple processes. The order of events in a concurrent system is unpredictable because of the independence of events occurring in the individual processes. Trace theory is a successful model for the execution of concurrent processes which employs congruence classes of words over partially commutative alphabets. These congruence or interchange classes generalize the more familiar notions of strings and type classes. Motivated by recent work in the areas of program profiling and compression of executable code, we consider a rate distortion problem in which the objective is to reproduce a string which is equivalent to the original string. This leads to a generalization of Kolmogorov complexity and a new graph entropy called the interchange entropy. We provide some of the basic properties of the interchange entropy. We also consider some universal compression schemes for this problem and show that for a large collection of dependence alphabets we can asymptotically attain the interchange entropy.

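    Membership in an interchange (trace) class is decidable by the classical projection criterion, which the sketch below implements: two words are congruent iff they have the same letter counts and identical projections onto every dependent pair of letters. This is standard trace theory, not the compression scheme of the paper.

        from itertools import combinations

        def trace_equivalent(u, v, independent):
            # Projection test: u and v are congruent over a partially
            # commutative alphabet iff they agree after projecting onto
            # every pair of dependent letters (and have equal multisets).
            if sorted(u) != sorted(v):
                return False
            for a, b in combinations(set(u) | set(v), 2):
                if frozenset((a, b)) in independent:
                    continue  # a and b commute; their interleaving is free
                pu = [c for c in u if c in (a, b)]
                pv = [c for c in v if c in (a, b)]
                if pu != pv:
                    return False
            return True

        # a and c are independent, so swapping adjacent a,c keeps the class.
        ind = {frozenset(('a', 'c'))}
        print(trace_equivalent("bac", "bca", ind))  # True
        print(trace_equivalent("abc", "bac", ind))  # False (a,b dependent)
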
  • Linear time universal coding and time reversal of tree sources via FSM closure

    Publication Year: 2004 , Page(s): 1442 - 1468
    Cited by:  Papers (29)
    PDF (728 KB) | HTML

    Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations and supporting data structures for finite-memory processes, as well as the major impact these structures have on the universal algorithms in which they are used. We first generalize the class of tree models, and then define and investigate the properties of the finite-state machine (FSM) closure of a tree, which is the smallest FSM that generates all the processes generated by the tree. The interaction between FSM closures, generalized context trees (GCTs), and classical data structures such as compact suffix trees brings together the information-theoretic and the computational aspects, leading to the first algorithm for linear time encoding/decoding of a lossless twice-universal code in the class of tree models. The implemented code is a two-pass version of Context. The corresponding optimal context selection rule and context transitions use tools similar to those employed in efficient implementation of the popular Burrows-Wheeler transform (BWT), yielding similar computational complexities. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. FSM closures are also applied to an investigation of the effect of time reversal on tree models, motivated in part by the following question: When compressing a data sequence using a universal scheme in the class of tree models, can it make a difference whether we read the sequence from left to right or from right to left? Given a tree model of a process, we show constructively that the number of states in the tree model corresponding to the reversed process might be, in the extreme case, quadratic in the number of states of the original tree. This result answers the above motivating question in the affirmative.

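    The basic object here is easy to illustrate: in a tree model, the state that conditions the next symbol is the shortest suffix of the past that reaches a leaf of the context tree. The sketch below uses a small hypothetical tree (contexts written most-recent-symbol first); the FSM closure studied in the paper is precisely what removes the per-step backward scan this naive lookup performs.

        # Hypothetical context tree; '10' means "last symbol 1, preceded
        # by 0".
        LEAVES = {'0', '10', '11'}

        def context(history):
            # Walk backward from the present until a leaf is reached.
            suffix = ''
            for sym in reversed(history):
                suffix += sym
                if suffix in LEAVES:
                    return suffix
            return None  # history too short to resolve a state

        print(context('10110'))  # '0'   (last symbol already decides)
        print(context('10101'))  # '10'  (needed two symbols of history)
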
  • Universal compression of memoryless sources over unknown alphabets

    Publication Year: 2004 , Page(s): 1469 - 1481
    Cited by:  Papers (50)
    PDF (328 KB) | HTML

    It has long been known that the compression redundancy of independent and identically distributed (i.i.d.) strings increases to infinity as the alphabet size grows. It is also apparent that any string can be described by separately conveying its symbols, and its pattern: the order in which the symbols appear. Concentrating on the latter, we show that the patterns of i.i.d. strings over all, including infinite and even unknown, alphabets, can be compressed with diminishing redundancy, both in block and sequentially, and that the compression can be performed in linear time. To establish these results, we show that the number of patterns is the Bell number, that the number of patterns with a given number of symbols is the Stirling number of the second kind, and that the redundancy of patterns can be bounded using results of Hardy and Ramanujan on the number of integer partitions. The results also imply an asymptotically optimal solution for the Good-Turing probability-estimation problem.

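    The combinatorial claims are easy to check directly for small n. The Python sketch below computes the pattern of a string (each symbol replaced by the order of its first appearance) and verifies that the number of distinct patterns of length-4 strings equals the Bell number B_4 = 15.

        from itertools import product

        def pattern(s):
            # 'abracadabra' -> (1,2,3,1,4,1,5,1,2,3,1)
            first, out = {}, []
            for c in s:
                first.setdefault(c, len(first) + 1)
                out.append(first[c])
            return tuple(out)

        def bell(n):
            # Bell numbers via the Bell triangle; B_n ends row n.
            row = [1]
            for _ in range(n - 1):
                new = [row[-1]]
                for x in row:
                    new.append(new[-1] + x)
                row = new
            return row[-1]

        pats = {pattern(s) for s in product('abcd', repeat=4)}
        print(len(pats), bell(4))  # 15 15
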
  • On the efficient evaluation of probabilistic similarity functions for image retrieval

    Publication Year: 2004 , Page(s): 1482 - 1496
    Cited by:  Papers (34)  |  Patents (2)
    PDF (608 KB) | HTML

    Probabilistic approaches are a promising solution to the image retrieval problem that, when compared to standard retrieval methods, can lead to a significant gain in retrieval accuracy. However, this occurs at the cost of a significant increase in computational complexity. In fact, closed-form solutions for probabilistic retrieval are currently available only for simple probabilistic models such as the Gaussian or the histogram. We analyze the case of mixture densities and exploit the asymptotic equivalence between likelihood and Kullback-Leibler (KL) divergence to derive solutions for these models. In particular, we show that the divergence 1) can be computed exactly for vector quantizers (VQs) and 2) has an approximate solution for Gauss mixtures (GMs) that, in high-dimensional feature spaces, introduces no significant degradation of the resulting similarity judgments. In both cases, the new solutions have closed form and computational complexity equivalent to that of standard retrieval approaches.

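    For the Gaussian case mentioned above, the KL divergence is genuinely closed form; the mixture case is where approximations enter. The sketch below gives the standard Gaussian formula plus a crude component-matching surrogate for mixtures, which is an illustrative assumption, not the approximation derived in the paper.

        import numpy as np

        def kl_gauss(m0, S0, m1, S1):
            # Closed-form KL divergence between multivariate Gaussians.
            d = len(m0)
            iS1 = np.linalg.inv(S1)
            diff = m1 - m0
            return 0.5 * (np.trace(iS1 @ S0) + diff @ iS1 @ diff - d
                          + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

        # Crude mixture surrogate (assumption for illustration only):
        # match each component of P to its closest component of Q and
        # average, weighted by P's mixing weights.
        def kl_mix(wp, mp, Sp, wq, mq, Sq):
            return sum(w * min(kl_gauss(m, S, m2, S2)
                               for m2, S2 in zip(mq, Sq))
                       for w, m, S in zip(wp, mp, Sp))

        m = [np.zeros(2), np.ones(2)]
        S = [np.eye(2), 2 * np.eye(2)]
        print(kl_mix([0.5, 0.5], m, S, [0.6, 0.4], m, S))  # 0.0
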
  • Some stochastic properties of memoryless individual sequences

    Publication Year: 2004 , Page(s): 1497 - 1505
    Cited by:  Papers (1)
    PDF (232 KB) | HTML

    An individual sequence of real numbers is memoryless if no continuous Markov prediction scheme of finite order can outperform the best constant predictor under the squared loss. It is established that memoryless sequences satisfy an elementary law of large numbers, and sliding-block versions of Hoeffding's inequality and the central limit theorem. It is further established that memoryless binary sequences have convergent sample averages of every order, and that their limiting distributions are Bernoulli. Several examples and sources of memoryless sequences are given, and it is shown how memoryless binary sequences may be constructed from aggregating methods for sequential prediction.

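    The defining property can be probed empirically: on a memoryless sequence, an order-1 predictor should gain essentially nothing over the best constant predictor under squared loss. The sketch below compares the two on an i.i.d. binary sequence (an illustration of the definition, not a test from the paper).

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.integers(0, 2, 100_000).astype(float)

        # Best constant predictor under squared loss: the mean.
        const_loss = np.mean((x - x.mean()) ** 2)

        # Best linear order-1 predictor x_t ~ a*x_{t-1} + b (least squares).
        A = np.stack([x[:-1], np.ones(len(x) - 1)], axis=1)
        coef, *_ = np.linalg.lstsq(A, x[1:], rcond=None)
        lin_loss = np.mean((x[1:] - A @ coef) ** 2)

        print(const_loss, lin_loss)  # nearly equal for an i.i.d. sequence
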
  • Finite-memory universal prediction of individual sequences

    Publication Year: 2004 , Page(s): 1506 - 1523
    Cited by:  Papers (7)
    PDF (448 KB) | HTML

    The problem of predicting the next outcome of an individual binary sequence, under the constraint that the universal predictor has a finite memory, is explored. In this analysis, the finite-memory universal predictors are either deterministic or random time-invariant finite-state (FS) machines with K states (K-state machines). The paper provides bounds on the asymptotic achievable regret of these constrained universal predictors as a function of K, the number of their states, for long enough sequences. The specific results are as follows. When the universal predictors are deterministic machines, the comparison class consists of constant predictors, and prediction is with respect to the 0-1 loss function (Hamming distance), we get tight bounds indicating that the optimal asymptotic regret is 1/(2K). In the case of K-state deterministic universal predictors and the constant-predictors comparison class, but with prediction with respect to the self-information (code length) and the square-error loss functions, we show an upper bound on the regret (coding redundancy) of O(K^(-2/3)) and a lower bound of Θ(K^(-4/5)). For these loss functions, if the predictor is allowed to be a random K-state machine, i.e., a machine with random state transitions, we get a lower bound of Θ(1/K) on the regret, with a matching upper bound of O(1/K) for the square-error loss, and an upper bound of O(log K/K) for the self-information loss. In addition, we provide results for all these loss functions in the case where the comparison class consists of all predictors that are order-L Markov machines.

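    A concrete K-state machine helps make the regret statements tangible. The saturating counter below is a classic deterministic K-state predictor (a natural instance of the machines analyzed, not necessarily the paper's optimal construction); its excess 0-1 loss over the best constant predictor shrinks as K grows.

        import random

        # States 0..K-1; predict 1 iff the state is in the upper half.
        def counter_predict(bits, K):
            state, mistakes = K // 2, 0
            for b in bits:
                pred = 1 if state >= K // 2 else 0
                mistakes += pred != b
                state = min(K - 1, state + 1) if b else max(0, state - 1)
            return mistakes

        random.seed(0)
        bits = [1 if random.random() < 0.7 else 0 for _ in range(100_000)]
        best_const = min(sum(bits), len(bits) - sum(bits))
        for K in (2, 8, 32):
            # Excess mistakes over the best constant predictor.
            print(K, counter_predict(bits, K) - best_const)
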
  • Performance analysis of grammar-based codes revisited

    Publication Year: 2004 , Page(s): 1524 - 1535
    Cited by:  Papers (3)
    PDF (552 KB)

    The compression performance of grammar-based codes is revisited from a new perspective. Previously, the compression performance of grammar-based codes was evaluated against that of the best arithmetic coding algorithm with finite contexts. In this correspondence, we first define semifinite-state sources and finite-order semi-Markov sources. Based on these definitions and the idea of run-length encoding (RLE), we then extend traditional RLE algorithms to context-based RLE algorithms: RLE algorithms with k contexts and RLE algorithms of order k, where k is a nonnegative integer. For each individual sequence x, let r*_{sr,k}(x) and r*_{sr|k}(x) be the best compression rate given by RLE algorithms with k contexts and by RLE algorithms of order k, respectively. It is proved that for any x, r*_{sr,k}(x) is no greater than the best compression rate among all arithmetic coding algorithms with k contexts. Furthermore, it is shown that there exist stationary, ergodic semi-Markov sources for which the best RLE algorithms without any context outperform the best arithmetic coding algorithms with any finite number of contexts. Finally, we show that the worst case redundancies of grammar-based codes against r*_{sr,k}(x) and r*_{sr|k}(x) among all length-n individual sequences x from a finite alphabet are upper-bounded by d₁ log log n/log n and d₂ log log n/log n, respectively, where d₁ and d₂ are constants. This redundancy result is stronger than all previous corresponding results.

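    For reference, the baseline the correspondence generalizes is plain run-length encoding; the context-based variants condition the runs on k contexts or on order-k context. The sketch below is only the classical starting point, not the extended algorithms.

        from itertools import groupby

        def rle(s):
            # Classical run-length encoding: (symbol, run length) pairs.
            return [(sym, len(list(run))) for sym, run in groupby(s)]

        def unrle(pairs):
            return ''.join(sym * n for sym, n in pairs)

        enc = rle("aaabbbbbcaa")
        print(enc)  # [('a', 3), ('b', 5), ('c', 1), ('a', 2)]
        assert unrle(enc) == "aaabbbbbcaa"
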
  • On the hardness of finding optimal multiple preset dictionaries

    Publication Year: 2004 , Page(s): 1536 - 1539
    Cited by:  Papers (2)  |  Patents (1)
    PDF (144 KB)

    We show that the following simple compression problem is NP-hard: given a collection of documents, find the pair of Huffman dictionaries that minimizes the total compressed size of the collection, where the best dictionary from the pair is used to compress each document. We also show the NP-hardness of finding optimal multiple preset dictionaries for LZ77-based compression schemes. Our reductions make use of the catalog segmentation problem, a natural partitioning problem. Our results justify heuristic attacks used in practice.

  • Monotonicity-based fast algorithms for MAP estimation of Markov sequences over noisy channels

    Publication Year: 2004 , Page(s): 1539 - 1544
    Cited by:  Papers (16)
    PDF (248 KB)

    In this correspondence, we study an algorithmic approach to the problem of maximum a posteriori (MAP) estimation of Markov sequences transmitted over noisy channels, also known as the MAP decoding problem. For the class of memoryless binary channels that produce independent substitution and erasure errors, the MAP sequence estimation problem can be formulated and solved as one of finding the longest path in a weighted directed acyclic graph. For algorithmic efficiency, we transform the graph problem into one of matrix search. If the underlying matrix is totally monotone, the complexity of MAP sequence estimation can be greatly reduced. We give a sufficient condition for the matrix induced by MAP sequence estimation to be totally monotone, which is indeed the case if the input sequence is Gaussian Markov. Under this condition, the complexity of MAP decoding can be reduced from O(N²M) to O(NM), where N is the size of the source alphabet and M is the length of the input sequence. Furthermore, for fixed-length-coded Markov sequences we propose a block parsing strategy that reduces the complexity of MAP sequence estimation to O(M + N²M/log M) or to O(M + NM/log M), depending on whether total monotonicity holds. Another significance of this correspondence lies in the applicability of the presented algorithmic approach, which has been thoroughly studied in the computer science literature, to many other discrete optimization problems encountered in both source and channel coding, ranging from optimal multiresolution and multiple-description quantizer design, to context quantization for minimum conditional entropy, and to optimal packetization with uneven error protection.

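    The algorithmic core referenced above is row-minima search in a (totally) monotone matrix: once the column of each row's minimum is known to move rightward with the row index, divide and conquer finds all row minima in O((N + M) log N) cost evaluations instead of O(NM). A minimal sketch with a toy cost function (not the MAP cost from the correspondence):

        def monotone_row_minima(nrows, ncols, cost):
            # argmin[i] = column of row i's minimum; monotonicity gives
            # argmin[0] <= argmin[1] <= ..., so each recursion halves rows
            # while shrinking the admissible column window.
            argmin = [0] * nrows
            def solve(r0, r1, c0, c1):
                if r0 > r1:
                    return
                mid = (r0 + r1) // 2
                best = min(range(c0, c1 + 1), key=lambda j: cost(mid, j))
                argmin[mid] = best
                solve(r0, mid - 1, c0, best)
                solve(mid + 1, r1, best, c1)
            solve(0, nrows - 1, 0, ncols - 1)
            return argmin

        # Toy cost whose row minima march rightward (monotone by design).
        print(monotone_row_minima(5, 5, lambda i, j: (j - i) ** 2))
        # [0, 1, 2, 3, 4]
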
  • Shared information and program plagiarism detection

    Publication Year: 2004 , Page(s): 1545 - 1551
    Cited by:  Papers (54)
    PDF (328 KB)

    A fundamental question in information theory and in computer science is how to measure similarity or the amount of shared information between two sequences. We have proposed a metric, based on Kolmogorov complexity, to answer this question and have proven it to be universal. We apply this metric in measuring the amount of shared information between two computer programs, to enable plagiarism detection. We have designed and implemented a practical system SID (Software Integrity Diagnosis system) that approximates this metric by a heuristic compression algorithm. Experimental results demonstrate that SID has clear advantages over other plagiarism detection systems. SID system server is online at http://software.bioinformatics.uwaterloo.ca/SID/.

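    The metric can be approximated with any off-the-shelf compressor, in the spirit of SID. The sketch below uses the well-known normalized compression distance with zlib standing in for SID's heuristic compression algorithm; related inputs should typically score lower than unrelated ones.

        import zlib

        def ncd(x: bytes, y: bytes) -> float:
            # Normalized compression distance: a computable stand-in for
            # the Kolmogorov-complexity metric.
            cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
            cxy = len(zlib.compress(x + y))
            return (cxy - min(cx, cy)) / max(cx, cy)

        prog1 = b"def add(a, b):\n    return a + b\n" * 20
        prog2 = b"def plus(x, y):\n    return x + y\n" * 20
        other = b"import os\nprint(os.listdir('.'))\n" * 20
        print(ncd(prog1, prog2))  # typically lower: shared structure
        print(ncd(prog1, other))  # typically higher: little shared info
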
  • Universal entropy estimation via block sorting

    Publication Year: 2004 , Page(s): 1551 - 1561
    Cited by:  Papers (25)
    PDF (384 KB)

    In this correspondence, we present a new universal entropy estimator for stationary ergodic sources, prove almost sure convergence, and establish an upper bound on the convergence rate for finite-alphabet finite memory sources. The algorithm is motivated by data compression using the Burrows-Wheeler block sorting transform (BWT). By exploiting the property that the BWT output sequence is close to a piecewise stationary memoryless source, we can segment the output sequence and estimate probabilities in each segment. Experimental results show that our algorithm outperforms Lempel-Ziv (LZ) string-matching-based algorithms.

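    The block-sorting idea underlying the estimator is compact enough to sketch: transform the sequence with a (naive) BWT, cut the output into segments, and average plug-in entropies per segment. The paper segments adaptively and proves convergence; the fixed-size segmentation below is only illustrative.

        from collections import Counter
        from math import log2

        def bwt(s):
            # Naive O(n^2 log n) BWT via sorted rotations; fine for a demo.
            s += '\0'  # unique end marker
            rots = sorted(s[i:] + s[:i] for i in range(len(s)))
            return ''.join(r[-1] for r in rots)

        def segment_entropy(seq, seg=16):
            # Average plug-in (empirical) entropy over fixed-size segments.
            rates = []
            for i in range(0, len(seq), seg):
                block = seq[i:i + seg]
                n = len(block)
                rates.append(-sum(c / n * log2(c / n)
                                  for c in Counter(block).values()))
            return sum(rates) / len(rates)  # bits per symbol

        text = "abracadabra" * 30
        print(segment_entropy(bwt(text)))  # lower: BWT groups contexts
        print(segment_entropy(text))
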
  • Contributors

    Publication Year: 2004 , Page(s): 1562 - 1565
    Cited by:  Papers (1)
    PDF (61 KB)
    Freely Available from IEEE
  • 2005 IEEE International Symposium on Information Theory

    Publication Year: 2004 , Page(s): 1566
    PDF (280 KB)
    Freely Available from IEEE
  • 2004 IEEE Membership Application

    Publication Year: 2004 , Page(s): 1567 - 1568
    PDF (744 KB)
    Freely Available from IEEE
  • IEEE Transactions on Information Theory information for authors

    Publication Year: 2004 , Page(s): c3
    PDF (23 KB)
    Freely Available from IEEE
  • Blank page [back cover]

    Publication Year: 2004 , Page(s): c4
    PDF (2 KB)
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Information Theory publishes papers concerned with the transmission, processing, and utilization of information.


Meet Our Editors

Editor-in-Chief
Frank R. Kschischang

Department of Electrical and Computer Engineering