Skip to Main Content
We propose and analyze a distributed cooperative caching strategy based on the Evolutive Summary Counters (ESC), a new data structure that stores an approximated record of the data accesses in each computing node of a search engine. The ESC capture the frequency of accesses to the elements of a data collection, and the evolution of the access patterns for each node in a network of computers. The ESC can be efficiently summarized into what we call ESC-summaries to obtain approximate statistics of the document entries accessed by each computing node. We use the ESC-summaries to introduce two algorithms that manage our distributed caching strategy, one for the distribution of the cache contents, ESC-placement, and another one for the search of documents in the distributed cache, ESC-search. While the former improves the hit rate of the system and keeps a large ratio of data accesses local, the latter reduces the network traffic by restricting the number of nodes queried to find a document. We show that our cooperative caching approach outperforms state-of-the-art models in both hit rate, throughput, and location recall for multiple scenarios, i.e., different query distributions and systems with varying degrees of complexity.