Skip to Main Content
In this correspondence, we present a new universal entropy estimator for stationary ergodic sources, prove almost sure convergence, and establish an upper bound on the convergence rate for finite-alphabet finite memory sources. The algorithm is motivated by data compression using the Burrows-Wheeler block sorting transform (BWT). By exploiting the property that the BWT output sequence is close to a piecewise stationary memoryless source, we can segment the output sequence and estimate probabilities in each segment. Experimental results show that our algorithm outperforms Lempel-Ziv (LZ) string-matching-based algorithms.