Logic-Partition Based Gaussian Sampling for Online Aggregation | IEEE Conference Publication | IEEE Xplore

Logic-Partition Based Gaussian Sampling for Online Aggregation


Abstract:

Online aggregation is a widely used technique that returns approximate query results computed over random samples, giving users a fast way to trade off time against accuracy. The key issue in online aggregation is how to guarantee the efficiency and effectiveness of random sample collection. State-of-the-art approaches obtain uniform samples either by random sampling or by sequential sampling with preprocessing. The former suffers from inefficient random access to the whole dataset, especially for skewed data distributions, and the latter is limited by time-consuming preprocessing. To make sampling more efficient, we propose a scalable sampling algorithm called logic-partition based Gaussian sampling. The basic idea of our solution is to convert random sampling into near-sequential sampling without any extra preprocessing, achieving a balance between sampling efficiency and sample quality. Extensive experiments using the TPC-H benchmark with skewed data distributions demonstrate the superior performance of our solution.
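
To illustrate the general idea of online aggregation described above, the following minimal sketch estimates an AVG over uniformly random samples and reports a running estimate with an approximate 95% confidence interval. It is not the logic-partition based Gaussian sampling algorithm, whose details the abstract does not give. The function name `online_avg`, its parameters, and the in-memory list standing in for a table column are all assumptions made for illustration.

```python
import math
import random


def online_avg(values, batch_size=1000, z=1.96, seed=0):
    """Yield (rows_seen, estimate, half_width) after each batch of samples.

    Hypothetical helper: `values` stands in for one column of a table.
    """
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)  # visiting rows in random order = sampling without replacement

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for i, idx in enumerate(order, start=1):
        x = values[idx]
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # Report progress after every full batch and once at the very end.
        if i % batch_size == 0 or i == len(order):
            var = m2 / (n - 1) if n > 1 else 0.0
            half_width = z * math.sqrt(var / n)  # normal-approximation interval
            yield n, mean, half_width


if __name__ == "__main__":
    column = [i % 97 for i in range(5000)]
    for rows, est, hw in online_avg(column, batch_size=1000):
        print(f"rows={rows} estimate={est:.3f} +/- {hw:.3f}")
```

A user can stop consuming the generator as soon as the interval is narrow enough, which is the time versus accuracy trade-off the abstract refers to. The random row order in this sketch is exactly the random-access pattern the abstract identifies as costly on large datasets.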
Date of Conference: 13-16 August 2017
Date Added to IEEE Xplore: 07 September 2017
Conference Location: Shanghai, China
