In practice, researchers can often collect only a single, possibly large, dataset and must make inferences from that one sample. Building on the polarisation operator technique of Bowman et al. (1969), we computed the dependence of joint entropy and mutual information estimates on the sample size as asymptotic series. These expressions enabled us to control the bias that finite sample sizes introduce into the estimates and to obtain an expression for their accuracy. The result is important in data mining, where joint entropy and mutual information are used to find interdependences within large data sets with unknown underlying structure.
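To make the finite-sample bias concrete, the following minimal sketch computes plug-in estimates of entropy and mutual information from a single sample and applies the well-known first-order Miller-Madow correction, (K - 1) / (2N), where K is the number of distinct observed values and N the sample size. This is an illustration under stated assumptions, not the asymptotic series derived in this work: the function names are hypothetical, the data are assumed to be discrete, and only the leading bias term is handled.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate in nats."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in estimate plus the first-order bias term (K - 1) / (2N).

    K is the number of distinct values seen in the sample. This is the
    standard Miller-Madow correction, used here only as an illustration.
    """
    n = len(samples)
    k = len(set(samples))
    return plugin_entropy(samples) + (k - 1) / (2 * n)


def mutual_information(xs, ys, entropy=plugin_entropy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), using the chosen entropy estimator."""
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)


if __name__ == "__main__":
    xs = [0, 0, 1, 1]
    ys = [0, 1, 0, 1]
    print(plugin_entropy(xs))
    print(miller_madow_entropy(xs))
    print(mutual_information(xs, ys))
```

Because the plug-in entropy underestimates the true entropy on average, the correction term is positive and shrinks as 1/N. Passing `entropy=miller_madow_entropy` to `mutual_information` applies the same correction to each of the three entropies in the decomposition.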