The problem of estimating density functionals such as entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffers from the curse of dimensionality, wherein the mean squared error decays increasingly slowly as a function of the sample size $T$ as the dimension $d$ of the samples increases. In particular, the rate is often glacially slow, of order $O(T^{-\gamma/d})$, where $\gamma > 0$ is a rate parameter. Examples of such estimators include kernel density estimators, k-nearest neighbor (k-NN) density estimators, k-NN entropy estimators, and intrinsic dimension estimators. In this paper, we propose a weighted affine combination of an ensemble of such estimators, where optimal weights can be chosen such that the weighted estimator converges at a much faster, dimension-invariant rate of $O(T^{-1})$. Furthermore, we show that these optimal weights can be determined by solving a convex optimization problem that can be performed offline and does not require training data. We illustrate the superior performance of our weighted estimator for two important applications: 1) estimating the Panter-Dite distortion-rate factor; and 2) estimating the Shannon entropy for testing the probability distribution of a random sample.
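To make the offline weight computation concrete, the following is a minimal sketch under assumptions that the abstract does not state. It assumes each base estimator is indexed by a parameter l, that its leading bias terms scale as l^(i/d) for i = 1, ..., d-1, and that suitable weights are the minimum 2-norm vector that sums to one while cancelling those terms. The function name `ensemble_weights` and the parameter grid are hypothetical. Under these assumptions the convex problem reduces to a minimum-norm solution of a linear system, which depends only on the parameter grid and the dimension, not on any data.

```python
import numpy as np

def ensemble_weights(params, d):
    """Minimum 2-norm weights w with sum(w) = 1 and
    sum(w * params**(i/d)) = 0 for i = 1, ..., d-1.

    The l**(i/d) form of the bias terms is an assumption made for
    illustration; it is not taken from the abstract.
    """
    params = np.asarray(params, dtype=float)
    # Row 0 is all ones (affine constraint); rows 1..d-1 hold the
    # assumed bias terms that the weights should cancel.
    A = np.vstack([params ** (i / d) for i in range(d)])
    b = np.zeros(d)
    b[0] = 1.0
    # For an underdetermined system, lstsq returns the solution of
    # A w = b with the smallest Euclidean norm.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

# Hypothetical grid of 10 estimator parameters for d = 4.
params = np.linspace(1.0, 3.0, 10)
w = ensemble_weights(params, d=4)

# Given base estimates e[j] computed with params[j], the weighted
# ensemble estimate would be np.dot(w, e).
```

Because the weights depend only on `params` and `d`, they can be computed once and reused for any sample of the same dimension. Keeping the weight norm small matters because the variance of an affine combination grows with the size of its weights.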