A practical and effective sampling selection strategy for large scale deduplication | IEEE Conference Publication | IEEE Xplore