The performance of proxy caches for database federations that serve a large number of users depends crucially on their physical design. Current techniques for physical design, automated or otherwise, depend on identifying a representative workload. In proxy caches, however, such techniques are inadequate because workload characteristics change rapidly. This is clearly evident at the proxy cache of SkyQuery, an astronomy federation, which receives a continuously evolving workload. We present novel techniques for automated physical design that adapt to the workload and balance the performance benefits of physical design decisions against the cost of implementing them. These include both competitive and incremental algorithms that optimize the combined cost of query evaluation and physical design changes. Our techniques are general in that they make no assumptions about the underlying schema or the incoming workload. Preliminary experiments on the TPC-D benchmark demonstrate significant improvement in response time when the physical design continually adapts to the workload using our online algorithm, compared with offline techniques.
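To make the trade-off between query evaluation cost and the cost of changing the physical design concrete, the following is a minimal sketch of a generic rent-or-buy style online policy. It is not the algorithm presented in this work. The function name `adapt_online`, the configuration labels, and all cost values are hypothetical and chosen only for illustration. The policy stays in the current configuration until the extra cost it has paid relative to some alternative reaches the cost of switching, then switches.

```python
from typing import Dict, List, Tuple


def adapt_online(
    query_costs: List[Dict[str, float]],
    switch_cost: float,
    start: str,
) -> Tuple[float, List[str]]:
    """Illustrative online policy (hypothetical, not the paper's method).

    query_costs[i][c] is the assumed cost of evaluating query i under
    physical configuration c. Returns the total cost paid and the
    configuration that was active for each query.
    """
    current = start
    total = 0.0
    history: List[str] = []
    regret: Dict[str, float] = {}  # extra cost paid versus each alternative

    for costs in query_costs:
        # Evaluate the query under whatever design is currently in place.
        total += costs[current]
        history.append(current)

        # Accumulate how much cheaper each alternative would have been,
        # never letting the running difference drop below zero.
        for alt, alt_cost in costs.items():
            if alt == current:
                continue
            gain = costs[current] - alt_cost
            regret[alt] = max(0.0, regret.get(alt, 0.0) + gain)

        # Switch once the accumulated loss justifies the transition cost.
        candidate = max(regret, key=regret.get, default=None)
        if candidate is not None and regret[candidate] >= switch_cost:
            total += switch_cost
            current = candidate
            regret = {}

    return total, history
```

For example, with three identical queries that cost 10 without an index and 1 with it, and an assumed switching cost of 15, `adapt_online([{"none": 10.0, "index": 1.0}] * 3, 15.0, "none")` evaluates the first two queries without the index, pays for the change, and evaluates the third with the index. An offline technique tuned to a fixed workload makes this decision once and cannot revisit it as the workload shifts.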