Skip to Main Content
We study universal compression for discrete data sequences that were corrupted by noise. We show that while, as expected, there exist many cases in which the entropy of these sequences increases from that of the original data, somewhat surprisingly and counter-intuitively, universal coding redundancy of such sequences cannot increase compared to the original data. We derive conditions that guarantee that this redundancy does not decrease asymptotically (in first order) from the original sequence redundancy in the stationary memoryless case. We then provide bounds on the redundancy for coding finite length (large) noisy blocks generated by stationary memoryless sources and corrupted by some specific memoryless channels. Finally, we propose a sequential probability estimation method that can be used to compress binary data corrupted by some noisy channel. While there is much benefit in using this method in compressing short blocks of noise corrupted data, the new method is more general and allows sequential compression of binary sequences for which the probability of a bit is known to be limited within any given interval (not necessarily between 0 and 1). Additionally, this method has many different applications, including, prediction, sequential channel estimation, and others.