Metric-space similarity search has proven suitable for searching large collections of complex objects such as images. A number of distributed index data structures, with corresponding parallel query processing algorithms, have been proposed for clusters of distributed-memory processors. Previous work has shown that the best performance is achieved with global indexing rather than local indexing. However, global indexing is prone to performance degradation when the query load becomes unbalanced across processors. This paper proposes a query scheduling algorithm that solves this problem. It adaptively balances the processing load of user queries that are dynamically skewed towards particular sections of the distributed index. Sections hit heavily by queries can be kept replicated. Experimental results show that, with 1%-10% replication, performance improves significantly (e.g., 35%) under skewed workloads.
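To make the idea of replicating heavily hit index sections concrete, the following is a minimal Python sketch. It is not the scheduling algorithm proposed in the paper, whose details are not given here. The class `ReplicaAwareScheduler`, its methods, the modulo placement of sections on processors, and the hit-count policy for choosing replicas are all hypothetical assumptions made for illustration.

```python
from collections import Counter


class ReplicaAwareScheduler:
    """Illustrative only: routes queries by index section, spreading
    queries for replicated ("hot") sections across all processors."""

    def __init__(self, num_processors, num_sections, replication_fraction=0.1):
        self.num_processors = num_processors
        self.num_sections = num_sections
        # Assumption: at most this many sections are replicated everywhere.
        self.max_replicated = max(1, int(num_sections * replication_fraction))
        self.load = [0] * num_processors  # queries assigned per processor
        self.hits = Counter()             # queries seen per section
        self.replicated = set()           # sections currently replicated

    def home(self, section):
        # Assumption: sections are placed on processors by simple modulo.
        return section % self.num_processors

    def route(self, section):
        """Return the processor chosen for a query on `section`."""
        self.hits[section] += 1
        if section in self.replicated:
            # Any processor holds a copy, so pick the least-loaded one.
            target = min(range(self.num_processors), key=lambda p: self.load[p])
        else:
            # Only the home processor holds this section.
            target = self.home(section)
        self.load[target] += 1
        return target

    def refresh_replicas(self):
        """Replicate the most frequently hit sections, then reset the counts."""
        top = self.hits.most_common(self.max_replicated)
        self.replicated = {section for section, _ in top}
        self.hits.clear()
```

In this sketch, a workload skewed towards one section first piles up on that section's home processor. After `refresh_replicas()` marks the section as replicated, later queries for it are spread over all processors, which is the kind of adaptive balancing the abstract describes.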