The Internet now offers users more than regular Web pages. Decision makers can issue analytical, as opposed to transactional, queries that involve massive data (such as aggregations of millions of rows in a relational database) in order to identify useful trends and patterns. Such queries are referred to as On-Line Analytical Processing (OLAP) queries. Typically, pages carrying query results do not exhibit temporal locality and are therefore not considered for caching at WWW proxies. For OLAP, this becomes a major hurdle, as the cost of such queries is much higher than that of traditional transactional queries. This paper proposes a systematic technique to reduce the response time of OLAP queries that originate from geographically distributed private LANs and are issued through the Web to the central data warehouse (DW) of an enterprise. An active caching scheme is proposed that enables the LAN proxies to cache parts of the data, together with the semantics of the DW, in order to process queries and construct the resulting pages. OLAP queries arriving at a proxy are satisfied either locally or from the DW, depending on the relative access costs. We formulate a cost model that characterizes the latencies of these queries, taking into consideration normal Web access as well as analytical processing. We also propose a cache admittance and replacement algorithm that outperforms a widely accepted caching algorithm.
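The routing decision described above, answering a query at the proxy or forwarding it to the DW depending on relative access cost, can be illustrated with a minimal sketch. The abstract does not state the paper's actual cost model, so everything here is an assumption made for illustration: the `QueryCost` fields, the `choose_source` function, and the simple rule of adding DW processing time to network transfer time are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryCost:
    """Hypothetical per-query latency estimates, in seconds."""
    local_processing: float  # time to compute the answer from data cached at the proxy
    dw_processing: float     # time for the DW to compute the answer
    network_transfer: float  # time to ship the result page from the DW to the proxy


def choose_source(cost: QueryCost, answerable_locally: bool) -> str:
    """Return "proxy" or "dw", whichever has the lower estimated latency.

    Assumption: remote latency is DW processing time plus transfer time.
    """
    if not answerable_locally:
        # The cached data cannot produce this result, so the DW must answer.
        return "dw"
    remote_latency = cost.dw_processing + cost.network_transfer
    return "proxy" if cost.local_processing <= remote_latency else "dw"
```

Under these assumed costs, a query that the proxy can compute in 2.0 s is served locally when the DW path would take 0.5 s + 3.0 s, and is forwarded to the DW when local processing would take 5.0 s.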