Skip to Main Content
Decision support and knowledge discovery systems often compute aggregate values of interesting attributes by processing a huge amount of data in very large databases and/or warehouses. In particular, iceberg query is a special type of aggregation query that computes aggregate values above a user-provided threshold. Usually, only a small number of results will satisfy the threshold constraint. Yet, the results often carry very important and valuable business insights. Because of the small result set, iceberg queries offer many opportunities for deep query optimization. However, most existing iceberg query processing algorithms do not take advantage of the small-result-set property and rely heavily on the tuple-scan-based approach. This incurs intensive disk accesses and computation, resulting in long processing time especially when data size is large. Bitmap index, which builds one bitmap vector for each attribute value, is gaining popularity in both column-oriented and row-oriented databases in recent years. It occupies less space than the raw data and gives opportunities for more efficient query processing. In this paper, we exploited the property of bitmap index and developed a very effective bitmap pruning strategy for processing iceberg queries. Our index-pruning-based approach eliminates the need of scanning and processing the entire data set (table) and thus speeds up the iceberg query processing significantly. Experiments show that our approach is much more efficient than existing algorithms commonly used in row-oriented and column-oriented databases.