This investigation presents a novel approach to semantic segment extraction and matching for retrieving information from Internet FAQs with natural language queries. Two semantic segments, the question category segment (QS) and the keyword segment (KS), are extracted from the input queries and the FAQ questions with a semiautomatically derived question-semantic grammar. A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the FAQ collection. Additionally, the vector space model (VSM) is adopted to measure the similarity between the query and the answers of the QA pairs. Finally, a multistage ranking strategy is adopted to determine the optimally performing combination of similarity metrics. The experimental results illustrate that the proposed method achieves an average rank of 4.52 and a top-10 recall rate of 90.89 percent. Compared with the query-expansion method, this method improves the performance by 4.82 places in the average rank of correct answers, 25.34 percent in the top-5 recall rate, and 5.21 percent in the top-10 recall rate.
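As an illustration of the vector space model step only, the following minimal sketch scores FAQ answers against a query using TF-IDF weights and cosine similarity. The abstract does not specify the weighting scheme, tokenization, or any implementation details, so everything here is an assumption: whitespace tokenization, a smoothed IDF formula, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_answers`), and the sample answers are all hypothetical. It does not cover the QS/KS semantic matching or the multistage ranking strategy.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not given.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts mapping term to weight) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF, chosen here only to avoid zero weights.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_answers(query, answers):
    """Return (score, answer_index) pairs sorted by descending similarity."""
    vectors, idf = tfidf_vectors(answers)
    # Query terms that never occur in the answers are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    # Hypothetical FAQ answers, for illustration only.
    answers = [
        "reset your password from the account settings page",
        "shipping takes three to five business days",
        "contact support to close your account",
    ]
    for score, idx in rank_answers("how do i reset my password", answers):
        print(f"{score:.3f}  {answers[idx]}")
```

In a system like the one described, a score of this kind would be one of several similarity metrics combined in the ranking stage; how those metrics are combined is not stated in the abstract and is not modeled here.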