Skip to Main Content
Resource selection, also called server selection, collection selection or database selection, is a foundational problem in distributed information retrieval (DIR). This paper introduces a set-covering-based algorithm for resource selection in DIR, with consideration of overlapping extent between resources. Give different document with different weight according to its position in merged results for question Q. Only results that have not appeared in some earlier selected resource are focused on in later selected resources. The score of each resource is decided by the total weights of those merged results included in, and only the resource with max score is selected in each selecting step. So, the selecting order is the actual rank of selected resources which are used to search the question Qpsila, which is similar to question Q. The approach saves big searching time due to overlapping between databases and, at the same time, enhances user's recall rate and precision.