The problem of joint universal source coding and density estimation is considered in the setting of fixed-rate lossy coding of continuous-alphabet memoryless sources. For a wide class of bounded distortion measures, it is shown that any compactly parametrized family of $\mathbb{R}^d$-valued independent and identically distributed (i.i.d.) sources with absolutely continuous distributions satisfying appropriate smoothness and Vapnik-Chervonenkis (VC) learnability conditions, admits a joint scheme for universal lossy block coding and parameter estimation, such that when the block length $n$ tends to infinity, the overhead per-letter rate and the distortion redundancies converge to zero as $O(n^{-1} \log n)$ and $O(\sqrt{n^{-1} \log n})$, respectively. Moreover, the active source can be determined at the decoder up to a ball of radius $O(\sqrt{n^{-1} \log n})$ in variational distance, asymptotically almost surely. The system has finite memory length equal to the block length, and can be thought of as blockwise application of a time-invariant nonlinear filter with initial conditions determined from the previous block. Comparisons are presented with several existing schemes for universal vector quantization, which do not include parameter estimation explicitly, and an extension to unbounded distortion measures is outlined. Finally, finite mixture classes and exponential families are given as explicit examples of parametric sources admitting joint universal compression and modeling schemes of the kind studied here.