Data warehouses store large volumes of data according to a multidimensional model that provides fast access for online analysis. The constant growth in the quantity and complexity of data stored in data warehouses has led to a variety of data warehouse applications on distributed systems. The main benefits of these architectures are parallelized query execution and higher storage capacity. Computing grids in particular are built to combine a large number of heterogeneous distributed resources. Their lack of centralized control, however, conflicts with the centralized structure of classical data warehouses. Autonomous data management on grid nodes requires efficient communication during query evaluation. The architecture we present supports a global data localization method with the help of a specialized catalog service. Our work is based on a model for unique identification and efficient local indexing of the warehouse data. Local indexes integrate computable aggregates to make maximum use of locally materialized data and thereby facilitate cost-optimized query execution. The grid services implementing these functionalities are deployed on the GGM project's test environment.
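
The interplay between local indexes, computable aggregates, and the catalog service can be illustrated with a minimal Python sketch. It is not the implementation described here: the names `LocalIndex`, `Catalog`, `computable`, and `locate`, the tuple-based identifiers, and the month-to-quarter hierarchy are all hypothetical and chosen only for illustration. The sketch assumes that an aggregate counts as computable on a node when every finer-grained block it is built from is materialized on that node.

```python
from dataclasses import dataclass, field

# Hypothetical time hierarchy: each month rolls up into exactly one quarter.
MONTH_TO_QUARTER = {
    "2024-01": "2024-Q1", "2024-02": "2024-Q1", "2024-03": "2024-Q1",
    "2024-04": "2024-Q2", "2024-05": "2024-Q2", "2024-06": "2024-Q2",
}


@dataclass
class LocalIndex:
    """Index kept by one grid node over the data it has materialized."""
    node: str
    # Identifiers such as ("month", "2024-01") or ("quarter", "2024-Q1").
    materialized: set = field(default_factory=set)

    def computable(self):
        """Quarter aggregates derivable from locally materialized months."""
        months_by_quarter = {}
        for month, quarter in MONTH_TO_QUARTER.items():
            months_by_quarter.setdefault(quarter, set()).add(("month", month))
        return {
            ("quarter", quarter)
            for quarter, months in months_by_quarter.items()
            if months <= self.materialized  # all constituent months are local
        }

    def available(self):
        """Everything the node can serve: stored data plus computable aggregates."""
        return self.materialized | self.computable()


class Catalog:
    """Assumed catalog service: maps a data identifier to the nodes that can serve it."""

    def __init__(self):
        self.entries = {}

    def publish(self, index):
        for ident in index.available():
            self.entries.setdefault(ident, set()).add(index.node)

    def locate(self, ident):
        return sorted(self.entries.get(ident, set()))


node_a = LocalIndex("node_a", {("month", m) for m in ("2024-01", "2024-02", "2024-03")})
node_b = LocalIndex("node_b", {("month", "2024-04"), ("quarter", "2024-Q1")})

catalog = Catalog()
catalog.publish(node_a)
catalog.publish(node_b)

q1_nodes = catalog.locate(("quarter", "2024-Q1"))
q2_nodes = catalog.locate(("quarter", "2024-Q2"))
```

In this sketch `node_a` can answer a first-quarter query without storing the quarterly aggregate, because all three months are local, while `node_b` answers it from a stored aggregate. Neither node can serve the second quarter, so a query planner would have to combine data from several nodes for it.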