Modern P2P systems use hybrid search to improve search efficiency. They use a synopsis of neighborhood content to determine whether a structured or an unstructured overlay should satisfy a particular query. Because a synopsis is limited in size, it cannot hold every term from every file in the neighborhood. The challenge is to choose which terms the synopsis should represent. In this work, we investigated the distribution of query terms and file terms in Gnutella networks. We observed a mismatch between the terms that were popular in file names and the terms that were popular in user-generated queries. Because query behavior changed over time, a synopsis based only on a static set of popular file terms was ill-suited to supporting efficient searches. We used these observations to design a synopsis creation algorithm that dynamically adapted to the query workload and selected synopsis terms that reflected popularity in both the query workload and the file distribution. Our preliminary experimental analysis showed that our Query-Adaptive synopsis improved search performance over the traditional file-based synopsis model.
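To make the idea of a query-adaptive synopsis concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in this work. It assumes a fixed term capacity, a sliding window over recent query terms, and a linear blend of normalized query and file popularity. The class name `QueryAdaptiveSynopsis`, the parameters `capacity`, `window`, and `alpha`, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
from collections import Counter, deque


class QueryAdaptiveSynopsis:
    """Hypothetical sketch: choose synopsis terms from neighborhood file
    terms, weighting each by its popularity in recent queries."""

    def __init__(self, capacity, window=1000, alpha=0.5):
        self.capacity = capacity      # maximum number of terms the synopsis can hold
        self.alpha = alpha            # assumed weight of query vs. file popularity
        self.file_counts = Counter()  # term -> number of neighborhood files containing it
        # Sliding window of recent query terms, so old query behavior ages out.
        self.recent_queries = deque(maxlen=window)

    def add_file(self, name):
        # Simplification: tokenize file names on whitespace only.
        self.file_counts.update(set(name.lower().split()))

    def observe_query(self, query):
        self.recent_queries.extend(query.lower().split())

    def terms(self):
        query_counts = Counter(self.recent_queries)
        max_f = max(self.file_counts.values(), default=1)
        max_q = max(query_counts.values(), default=1)

        def score(term):
            f = self.file_counts[term] / max_f  # normalized file popularity
            q = query_counts[term] / max_q      # normalized query popularity
            return self.alpha * q + (1 - self.alpha) * f

        # Only terms that occur in neighborhood files are candidates;
        # ties are broken alphabetically to keep the result deterministic.
        ranked = sorted(self.file_counts, key=lambda t: (-score(t), t))
        return set(ranked[: self.capacity])
```

With no queries observed, this sketch falls back to the most frequent file terms, which corresponds to a purely file-based synopsis. As queries arrive, terms that are rare among files but frequently requested can displace them, and the bounded window lets the selection follow changes in query behavior.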