Estimation of deduplication ratios in large data sets | IEEE Conference Publication | IEEE Xplore