A natural language information retrieval system ranks documents by the similarity between the user's query keywords and each document. Because users typically supply only a few keywords when searching the Web, much effort has gone into constructing more useful queries. Since a few keywords carry little information, relevance feedback is generally used to compensate for this weakness of general retrieval methods. This paper proposes a term cluster query expansion model based on classification information of retrieved documents. The model generates classification information from the top-ranked n documents returned by the retrieval system. From the extracted classification information, term clusters (m), one representing each group, are generated, and the model then lets the user select the term cluster that corresponds to the user's information need. The query keywords are expanded by a relevance feedback algorithm based on the selected classification information. In experiments on a test collection, retrieval effectiveness improved by 13.2% over the initial query when the Rocchio method was used.
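The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the grouping of the top-ranked documents is assumed to be given, the term cluster is approximated by the most frequent terms in a group, and the Rocchio step uses only the positive-feedback part (alpha times the query plus beta times the centroid of the selected group). The function names, the sample documents, and the values of alpha and beta are all assumptions made for illustration.

```python
from collections import Counter


def term_vector(text):
    """Raw term-frequency vector for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def term_cluster(docs, size=3):
    """Most frequent terms in one group of documents (ties broken alphabetically).

    Assumption: stands in for the term cluster that represents a group.
    """
    totals = Counter()
    for doc in docs:
        totals.update(term_vector(doc))
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]]


def rocchio_expand(query, relevant_docs, alpha=1.0, beta=0.75):
    """Positive-feedback Rocchio: alpha * q + beta * centroid(relevant_docs)."""
    weights = {t: alpha * w for t, w in term_vector(query).items()}
    if not relevant_docs:
        return weights
    for doc in relevant_docs:
        for term, freq in term_vector(doc).items():
            weights[term] = weights.get(term, 0.0) + beta * freq / len(relevant_docs)
    return weights


# Hypothetical top-ranked documents, already assigned to groups.
groups = {
    "animal": ["jaguar cat habitat", "jaguar cat prey"],
    "car": ["jaguar car engine", "jaguar car dealer"],
}

# One term cluster per group, shown to the user for selection.
clusters = {label: term_cluster(docs) for label, docs in groups.items()}

# Stands in for the user's choice of term cluster.
selected = "car"

# Expand the initial query using only the documents of the selected group.
expanded = rocchio_expand("jaguar", groups[selected])
```

In this toy example, selecting the "car" cluster adds weight to terms such as "car", "engine", and "dealer", while terms from the "animal" group never enter the expanded query. That is the disambiguating effect the user's selection is meant to provide.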