Distributed Data Warehouses (DDWs) offer several advantages over traditional environments. Such an architecture improves system performance by allowing data to be spread across data marts. Consequently, queries can be run over smaller data sets, which reduces their execution time. To design an effective distributed model, it is important to adopt an appropriate methodology for data fragmentation and fragment allocation. Nevertheless, very few works address this problem in a distributed context. This paper focuses on DDWs. It proposes a data mining-based horizontal fragmentation methodology for a relational DDW environment. This methodology combines the well-known predicate construction technique with a clustering method to fragment Data Warehouse (DW) relations. Fragments are then allocated to the corresponding sites according to their frequency of use. Using the APB-1 Release II benchmark, we show experimentally that DW decentralization gives better performance: the execution time of global queries is reduced by 80%.
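The following Python sketch illustrates the general idea of the pipeline: group queries by the predicates they use, then allocate each resulting fragment to the site that uses it most. It is not the algorithm evaluated in the paper. The workload, the site names, the Jaccard similarity measure, the greedy single-pass clustering, and the 0.5 threshold are all assumptions chosen for illustration, standing in for the clustering method the methodology actually uses.

```python
from collections import defaultdict

# Hypothetical workload (all values invented for illustration): each query
# lists the simple predicates it uses and how often each site issues it.
QUERIES = {
    "q1": {"predicates": {"region = 'EU'", "year = 2023"},
           "freq": {"site_A": 40, "site_B": 5}},
    "q2": {"predicates": {"region = 'EU'"},
           "freq": {"site_A": 25, "site_B": 0}},
    "q3": {"predicates": {"region = 'US'", "year = 2023"},
           "freq": {"site_A": 2, "site_B": 30}},
    "q4": {"predicates": {"region = 'US'"},
           "freq": {"site_A": 0, "site_B": 20}},
}


def jaccard(a, b):
    """Jaccard similarity between two predicate sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_queries(queries, threshold=0.5):
    """Greedy single-pass clustering (an assumed stand-in technique).

    A query joins the first cluster whose accumulated predicate set is
    similar enough; otherwise it starts a new cluster. Each cluster's
    predicate set describes one candidate horizontal fragment.
    """
    clusters = []
    for name in sorted(queries):
        preds = queries[name]["predicates"]
        for cluster in clusters:
            if jaccard(preds, cluster["predicates"]) >= threshold:
                cluster["queries"].append(name)
                cluster["predicates"] |= preds
                break
        else:
            clusters.append({"queries": [name], "predicates": set(preds)})
    return clusters


def allocate(clusters, queries):
    """Assign each fragment to the site with the highest total access frequency."""
    allocation = {}
    for index, cluster in enumerate(clusters):
        totals = defaultdict(int)
        for name in cluster["queries"]:
            for site, count in queries[name]["freq"].items():
                totals[site] += count
        allocation[index] = max(totals, key=totals.get)
    return allocation


if __name__ == "__main__":
    clusters = cluster_queries(QUERIES)
    for index, site in allocate(clusters, QUERIES).items():
        print(index, sorted(clusters[index]["predicates"]), "->", site)
```

On this toy workload, the EU-oriented queries and the US-oriented queries form two clusters, and each fragment is placed on the site that issues those queries most often.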