Skip to Main Content
We consider the problem of approximating the number of differences between sets held on remote hosts using minimum communication. Efficient solutions to this problem are important for streamlining a variety of communication sensitive network applications, including data synchronization in mobile networks, gossip protocols and content delivery networks. Using tools from the field of interactive communication, we show that this problem requires about as much communication as the problem of exactly determining such differences. As a result, we propose a heuristic solution based on the counting Bloom filter. We provide analytic bounds on the expected performance of our protocol and also experimental evidence that they can outperform existing difference approximation techniques.