Abstract:
HyperLogLog Counting is widely used for cardinality estimation and underlies many algorithms in data analysis, commodity recommendation, and database optimization. Internet companies running large-scale businesses such as electronic commerce urgently need distributed, real-time cardinality estimation with high accuracy and low time cost. In this paper, we propose Hermes, a distributed real-time cardinality estimation algorithm. Hermes dynamically adjusts the estimated cardinality according to the result of HyperLogLog Counting and also optimizes the data distribution strategy of existing distributed cardinality estimation algorithms. Experimental results show that Hermes achieves lower estimation error and lower time cost than existing algorithms.
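For orientation, the baseline that Hermes builds on can be sketched as follows. This is a minimal illustration of standard HyperLogLog with register merging across shards, not the Hermes algorithm: the abstract does not specify Hermes's dynamic adjustment or its data distribution strategy, so neither is shown. The class name `HyperLogLog`, the parameter `p`, the SHA-1 based hash, and the sharding in the usage lines are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Baseline HyperLogLog sketch (illustrative; not the Hermes algorithm)."""

    def __init__(self, p=12):
        # The alpha constant used in estimate() assumes m >= 128, i.e. p >= 7.
        if p < 7:
            raise ValueError("p must be at least 7 for this sketch")
        self.p = p                      # number of bits used as register index
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the first 8 bytes of a SHA-1 digest.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                # top p bits select the register
        rest = h & ((1 << width) - 1)   # remaining bits determine the rank
        # Rank = 1-based position of the leftmost 1-bit in `rest`
        # (width + 1 if `rest` is zero).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Sketches built on different nodes combine by register-wise maximum.
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


# Usage: four hypothetical nodes each sketch a slice of the stream,
# then a coordinator merges the sketches and reads one estimate.
shards = [HyperLogLog() for _ in range(4)]
for i in range(20000):
    shards[i % 4].add(f"user-{i}")
combined = HyperLogLog()
for shard in shards:
    combined.merge(shard)
print(round(combined.estimate()))
```

Because merging is a register-wise maximum, the combined sketch is identical to one built on a single node over the whole stream, which is what makes HyperLogLog a natural fit for the distributed setting the paper targets.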
Date of Conference: 24-29 July 2016
Date Added to IEEE Xplore: 03 November 2016
ISBN Information:
Electronic ISSN: 2161-4407