In this paper we define a metric between probability distributions on two distinct finite sets of possibly different cardinalities. The metric is defined in terms of the joint distribution on the product of the two sets that has the two given distributions as its marginals and has minimum entropy. Computing the metric exactly turns out to be NP-hard. Therefore, an efficient greedy algorithm is presented for finding an upper bound on the distance. We then study the problem of optimal order reduction in the metric defined here. It is shown that every optimal reduced-order approximation must be an aggregation of the original distribution, and that optimal reduced-order approximation is equivalent to finding an aggregation with maximum entropy. This problem also turns out to be NP-hard, so again a greedy algorithm is constructed for finding a suboptimal reduced-order approximation. Taken together, these results permit the approximation of an independent and identically distributed (i.i.d.) process over a set of large cardinality by another i.i.d. process over a set of smaller cardinality. In future work, attempts will be made to extend these results to Markov processes over finite sets.
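To make the two greedy ideas concrete, the following Python sketch may help. It is not the paper's algorithm, whose details the abstract does not give. It rests on three assumptions: that the distance has the variation-of-information form 2H(θ) − H(φ) − H(ψ) for a joint distribution θ with marginals φ and ψ; that a low-entropy joint distribution can be built by repeatedly pairing the largest remaining masses; and that a high-entropy aggregation can be built by placing each mass, largest first, into the currently lightest bin. All function names are hypothetical.

```python
import heapq
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-mass entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def greedy_coupling(phi, psi, tol=1e-12):
    """Joint distribution with marginals phi and psi, built by repeatedly
    pairing the largest remaining masses (a heuristic, not a minimizer)."""
    # Max-heaps via negated masses; each entry is (-mass, index).
    rows = [(-p, i) for i, p in enumerate(phi) if p > tol]
    cols = [(-q, j) for j, q in enumerate(psi) if q > tol]
    heapq.heapify(rows)
    heapq.heapify(cols)
    joint = {}
    while rows and cols:
        neg_p, i = heapq.heappop(rows)
        neg_q, j = heapq.heappop(cols)
        p, q = -neg_p, -neg_q
        m = min(p, q)  # move as much mass as both sides allow
        joint[(i, j)] = joint.get((i, j), 0.0) + m
        if p - m > tol:
            heapq.heappush(rows, (-(p - m), i))
        if q - m > tol:
            heapq.heappush(cols, (-(q - m), j))
    return joint


def distance_upper_bound(phi, psi):
    """Assumed form 2H(theta) - H(phi) - H(psi); any valid coupling theta
    yields an upper bound on the minimum over all couplings."""
    joint = greedy_coupling(phi, psi)
    return 2 * entropy(joint.values()) - entropy(phi) - entropy(psi)


def greedy_aggregation(phi, m):
    """Group the elements of phi into m bins, largest mass first, always
    adding to the lightest bin, so the bin totals stay close to uniform."""
    bins = [(0.0, k) for k in range(m)]  # (total mass, bin index)
    heapq.heapify(bins)
    groups = [[] for _ in range(m)]
    for i in sorted(range(len(phi)), key=lambda idx: -phi[idx]):
        mass, k = heapq.heappop(bins)
        groups[k].append(i)
        heapq.heappush(bins, (mass + phi[i], k))
    totals = [sum(phi[i] for i in g) for g in groups]
    return groups, totals
```

For instance, `distance_upper_bound([0.5, 0.5], [0.5, 0.25, 0.25])` pairs the two masses of 0.5 first and then splits the remaining 0.5 across the two masses of 0.25, while `greedy_aggregation([0.4, 0.3, 0.2, 0.1], 2)` groups the elements so that both bins carry mass 0.5. Because both problems are NP-hard, neither heuristic is guaranteed to be optimal in general.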