Strong Lower Bounds for Approximating Distribution Support Size and the Distinct Elements Problem

4 Author(s)
Raskhodnikova, S. (Pennsylvania State Univ., State College); Ron, D.; Shpilka, A.; Smith, A.

We consider the problem of approximating the support size of a distribution from a small number of samples, when each element in the distribution appears with probability at least $1/n$. This problem is closely related to the problem of approximating the number of distinct elements in a sequence of length $n$. For both problems, we prove a nearly linear in $n$ lower bound on the query complexity, applicable even for approximation with additive error. At the heart of the lower bound is a construction of two positive integer random variables, $X_1$ and $X_2$, with very different expectations and the following condition on the first $k$ moments: $E[X_1]/E[X_2] = E[X_1^2]/E[X_2^2] = \dots = E[X_1^k]/E[X_2^k]$. Our lower bound method is also applicable to other problems. In particular, it gives new lower bounds for the sample complexity of (1) approximating the entropy of a distribution and (2) approximating how well a given string is compressed by the Lempel-Ziv scheme.
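
The moment condition in the abstract can be checked numerically for small cases. The following is a minimal sketch, not the paper's construction: the helper names (`moment`, `moment_ratios`, `satisfies_moment_condition`) and the toy pair of random variables are assumptions chosen purely for illustration. The toy pair matches the ratios of the first two moments but not the third, and its expectations are not "very different" in the sense the lower bound requires.

```python
from fractions import Fraction


def moment(dist, j):
    """Return E[X^j] for a finite distribution given as {value: probability}."""
    return sum(p * Fraction(x) ** j for x, p in dist.items())


def moment_ratios(dist1, dist2, k):
    """Return [E[X1^j] / E[X2^j] for j = 1..k] as exact fractions."""
    return [moment(dist1, j) / moment(dist2, j) for j in range(1, k + 1)]


def satisfies_moment_condition(dist1, dist2, k):
    """True if the ratios of the first k moments are all equal."""
    return len(set(moment_ratios(dist1, dist2, k))) == 1


# Hypothetical toy pair (illustration only, not taken from the paper):
# X1 is the constant 2; X2 is 1 with probability 3/4 and 3 with probability 1/4.
X1 = {2: Fraction(1)}
X2 = {1: Fraction(3, 4), 3: Fraction(1, 4)}

print(moment_ratios(X1, X2, 3))               # ratios for j = 1, 2, 3
print(satisfies_moment_condition(X1, X2, 2))  # condition with k = 2
print(satisfies_moment_condition(X1, X2, 3))  # condition with k = 3
```

Exact rational arithmetic (`fractions.Fraction`) is used so that equality of the ratios is tested without floating-point error. For this toy pair the first two ratios both equal 4/3, while the third is 16/15, so the condition holds for $k = 2$ and fails for $k = 3$.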

Published in:

Foundations of Computer Science, 2007. FOCS '07. 48th Annual IEEE Symposium on

Date of Conference:

21-23 Oct. 2007