This paper addresses the identification and extraction of topic elements in Chinese interactive text. A topic segmentation method based on time sequence is implemented, and a novel algorithm for identifying and extracting topic elements is then proposed. First, noise filtering and Chinese word segmentation are applied to the original corpus. Second, an identification and extraction method that operates on groups of mixed turns is used to extract topic elements such as time, place, and person. Finally, the algorithm is evaluated in terms of identification recall, identification accuracy, and extraction accuracy. The experimental results demonstrate the effectiveness of the algorithm.
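As a rough illustration of what segmentation by time sequence might look like, the following minimal sketch splits a chronologically ordered chat log wherever the pause between consecutive messages exceeds a threshold. The function name `segment_by_time`, the `(timestamp, speaker, text)` message layout, and the five-minute `max_gap` default are assumptions made for this example only; the abstract does not specify the paper's actual segmentation rule.

```python
from datetime import datetime, timedelta


def segment_by_time(messages, max_gap=timedelta(minutes=5)):
    """Group messages into candidate topic segments by time gaps.

    `messages` is assumed to be a list of (timestamp, speaker, text)
    tuples sorted by timestamp. A new segment starts whenever the pause
    since the previous message is longer than `max_gap` (a hypothetical
    threshold, not a value taken from the paper).
    """
    segments = []
    current = []
    previous_time = None
    for message in messages:
        timestamp = message[0]
        # A long silence is treated as a likely topic boundary.
        if previous_time is not None and timestamp - previous_time > max_gap:
            segments.append(current)
            current = []
        current.append(message)
        previous_time = timestamp
    if current:
        segments.append(current)
    return segments
```

Each resulting segment could then be passed to the later stages (noise filtering, word segmentation, and extraction of time, place, and person elements). A fixed gap threshold is only one possible choice; a real system might adapt it to the pace of the conversation.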