Efficient Algorithms for On-line Analysis Processing On Compressed Data Warehouses

1 Author(s)
Jianzhong Li ; Harbin Institute of Technology, China

Data compression is an effective technique to improve the performance of data warehouses. Aggregation and cube are important operations for on-line analytical processing (OLAP). It is a major challenge to develop efficient algorithms for aggregation and cube operations on compressed data warehouses. Many efficient algorithms to compute aggregation and cube for relational OLAP have been developed. Some work has been done on efficiently computing aggregation and cube for multidimensional data warehouses (MDWs), which store datasets in multidimensional arrays rather than in tables. However, to our knowledge, there is little work to date in the literature describing aggregation algorithms on compressed data warehouses for multidimensional OLAP. The goal of this paper is to develop efficient algorithms to compute aggregation and cube on compressed MDWs. For aggregation operations, four algorithms are proposed in this paper. These algorithms operate directly on datasets compressed by mapping-complete compression methods, without the need to first decompress them. The algorithms have different performance behaviors as a function of the dataset parameters, sizes of outputs and main memory availability. The algorithms are described and the I/O and CPU cost functions are presented in this paper. A decision procedure to select the most efficient algorithm for a given aggregation request is also proposed. The analysis and experimental results show that the algorithms have better performance on sparse data than the previous aggregation algorithms. For cube operations, this paper presents a novel algorithm to compute cubes on compressed data warehouses. The proposed algorithm also operates directly on compressed datasets without the need to first decompress them. The algorithm is applicable to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the proposed algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm to generate an optimal plan for computing cube on data warehouses is also proposed in the paper. In conclusion, direct manipulation of compressed data is an important tool for managing very large data warehouses. Aggregation and cube are just two important operations in this direction. Additional algorithms will be needed for OLAP on compressed multidimensional data warehouses. We are currently working on algorithms for other operations on compressed MDWs. We are also working on algorithms for OLAP operations applicable to compression methods other than mapping-complete compression methods.
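
To make the idea of aggregating directly on compressed data concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simple stand-in encoding in which a sparse multidimensional array is stored as sorted (linear offset, value) pairs for its non-zero cells; the abstract does not specify the mapping-complete methods it uses, so this encoding and the names `linearize`, `delinearize`, `compress` and `aggregate_sum` are illustrative assumptions. The sketch computes a group-by SUM by mapping each stored offset back to its coordinates, without ever rebuilding the dense array.

```python
from collections import defaultdict


def linearize(coord, shape):
    """Map a coordinate tuple to its row-major offset in an array of `shape`."""
    offset = 0
    for c, n in zip(coord, shape):
        offset = offset * n + c
    return offset


def delinearize(offset, shape):
    """Inverse of linearize: recover the coordinate tuple from a row-major offset."""
    coord = []
    for n in reversed(shape):
        offset, c = divmod(offset, n)
        coord.append(c)
    return tuple(reversed(coord))


def compress(cells, shape):
    """Assumed encoding: keep only non-zero cells as sorted (offset, value) pairs."""
    return sorted((linearize(c, shape), v) for c, v in cells.items() if v != 0)


def aggregate_sum(compressed, shape, group_dims):
    """SUM grouped by `group_dims`, scanning only the stored (compressed) cells."""
    totals = defaultdict(int)
    for offset, value in compressed:
        coord = delinearize(offset, shape)
        key = tuple(coord[d] for d in group_dims)
        totals[key] += value
    return dict(totals)


# Hypothetical 3 x 4 x 2 dataset with four non-zero cells out of 24.
shape = (3, 4, 2)
cells = {(0, 1, 0): 5, (0, 3, 1): 2, (2, 1, 1): 7, (2, 1, 0): 1}
compressed = compress(cells, shape)

by_dim0 = aggregate_sum(compressed, shape, group_dims=(0,))
by_dim01 = aggregate_sum(compressed, shape, group_dims=(0, 1))
print(by_dim0)
print(by_dim01)
```

In this sketch the work is proportional to the number of stored cells rather than to the full array size, which is one intuition for why operating on the compressed form can help on sparse data.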

Published in:

Network and Parallel Computing Workshops, 2007. NPC Workshops. IFIP International Conference on

Date of Conference:

18-21 Sept. 2007