The problem of exchanging correlated data with minimum communication has attracted considerable interest in recent years. We consider the problem of exchanging two similar strings held by different hosts. Our approach transforms a string into a multiset of substrings, which are reconciled efficiently using known multiset reconciliation algorithms and then reassembled on the remote host using tools from graph theory. We present analyses, experiments, and results showing that the communication complexity of our approach for high-entropy data compares favorably to existing algorithms, including rsync, a widely used string reconciliation engine. We also quantify the trade-off between the communication and computation complexity of our approach.
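
The first step described above, turning each string into a multiset of substrings and finding what differs between the two multisets, can be illustrated with a minimal sketch. It assumes fixed-length overlapping substrings of length k and computes the difference locally with Python's `collections.Counter`; the abstract does not specify the substring scheme or which multiset reconciliation algorithm is used, so the names `shingles` and `multiset_difference` and the choice of k are hypothetical, and the graph-based reassembly step is not shown.

```python
from collections import Counter


def shingles(s: str, k: int) -> Counter:
    """Return the multiset of all overlapping length-k substrings of s.

    Assumption: fixed-length overlapping substrings; the source does not
    state how substrings are chosen.
    """
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def multiset_difference(a: Counter, b: Counter) -> tuple:
    """Return (elements only in a, elements only in b), with multiplicity.

    A real reconciliation protocol would discover these differences with
    little communication; here both multisets are simply compared locally.
    """
    return a - b, b - a


# Two similar strings, standing in for the data held by the two hosts.
local = shingles("katana", 3)
remote = shingles("katanas", 3)
only_local, only_remote = multiset_difference(local, remote)
```

In this sketch the two hosts differ by a single substring, which suggests why reconciling multisets can be cheap when the strings are similar: the cost depends on the number of differing substrings rather than on the full string length.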
Published in: Parallel and Distributed Systems, IEEE Transactions on (Volume: 17, Issue: 11)
Date of Publication: Nov. 2006