Distributed peer-to-peer (P2P) systems such as Chord allow peers to search efficiently by object identifier rather than by keyword. More specifically, they combine a fixed overlay structure with a hashing scheme so that an object lookup returns the address of the node storing the object. A lookup follows a path that moves progressively closer to the destination. These systems are designed to optimize object retrieval by minimizing the number of messages and hops required to retrieve the object. Their disadvantage is that they address only the problem of searching for keys, and thus cannot capture the relevance of the documents stored in the system. This problem is common to traditional distributed hash tables (DHTs) because they usually ignore information retrieval algorithms and thereby rely on keyword-based searches. In this paper, we propose augmenting the P2P DHT system Chord with mechanisms for locating data using latent semantic indexing (LSI), an information retrieval technique, to facilitate content-based full-text search in large distributed information systems. Chord-LSI uses LSI to guide content placement in Chord such that documents relevant to a query are likely to be collocated on a small number of nodes. During a search, Chord-LSI transmits a small amount of data and searches a small number of nodes. Simulation results show that the Chord-LSI model is 17% more effective than Chord models.
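As an illustration of the identifier-based lookup described above, the following Python sketch routes a request around a small Chord-style ring using finger tables. It is a simplified, single-process model and not the Chord-LSI implementation: the names `ChordRing`, `key_id`, `in_open`, and `in_half_open` are hypothetical, the use of SHA-1 and the identifier width `m` are assumptions, and node joins, failures, and the LSI-guided placement are left out.

```python
import hashlib
from bisect import bisect_left


def key_id(name: str, m: int) -> int:
    """Hash a string onto the m-bit identifier ring (SHA-1 is an assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b), allowing wrap-around."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], allowing wrap-around."""
    return a < x <= b if a < b else (x > a or x <= b)


class ChordRing:
    """Static model of a Chord-style ring with precomputed finger tables."""

    def __init__(self, node_ids, m: int):
        self.m = m
        self.nodes = sorted({n % (2 ** m) for n in node_ids})
        # fingers[n][i] is the first node at or after n + 2^i on the ring.
        self.fingers = {
            n: [self.successor(n + 2 ** i) for i in range(m)]
            for n in self.nodes
        }

    def successor(self, ident: int) -> int:
        """Node responsible for an identifier: the first node id >= ident."""
        idx = bisect_left(self.nodes, ident % (2 ** self.m))
        return self.nodes[idx % len(self.nodes)]

    def lookup(self, start: int, key: int):
        """Route from node `start` toward `key`; return (owner, hops)."""
        key %= 2 ** self.m
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if in_half_open(key, node, succ):
                return succ, hops
            # Forward to the farthest finger that still precedes the key.
            node = next(
                (f for f in reversed(self.fingers[node])
                 if in_open(f, node, key)),
                succ,
            )
            hops += 1
```

For example, on a 6-bit ring with nodes 1, 8, 14, 21, 32, 38, 42, 48, 51, and 56, `lookup(8, 54)` returns `(56, 2)`: the request is forwarded from node 8 to node 42, then to node 51, whose successor 56 is responsible for the key. Because placement here depends only on the hash of the identifier (for instance `key_id("some document title", 6)`), two documents on the same topic generally land on unrelated nodes, which is the limitation that content-aware placement is meant to address.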