This paper emphasizes the role of named-entity-based patterns in measuring the relevance between document sentences and the topic for topic-focused extractive summarization. Patterns are defined as informative, semantically sensitive text bi-grams that contain at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. When handling topic questions, question types are also taken into account if they are available. To alleviate coverage problems, the pattern and uni-gram models are integrated so that they complement each other in similarity calculation. Automatic ROUGE evaluations indicate that the resulting system outperforms the best-performing system at the Document Understanding Conference (DUC) 2005.
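To make the idea of combining pattern and uni-gram evidence concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a toy entity lexicon (`TOY_ENTITIES`), a simple coverage-based overlap score, and a linear interpolation with a hypothetical `weight` parameter; the eight extraction templates and the question-type handling are not specified in the abstract and are not reproduced here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon mapping lowercased tokens to named-entity classes.
# A real system would obtain these labels from a named entity recognizer.
TOY_ENTITIES: Dict[str, str] = {
    "ibm": "ORGANIZATION",
    "paris": "LOCATION",
    "london": "LOCATION",
}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (raw.strip(".,;:?!").lower() for raw in text.split())
    return [tok for tok in tokens if tok]


def extract_patterns(tokens: List[str],
                     entities: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Collect bi-grams that contain at least one named entity.

    Each qualifying bi-gram is stored twice: once with its surface tokens
    and once with every entity replaced by its semantic class.
    """
    patterns: Set[Tuple[str, str]] = set()
    for left, right in zip(tokens, tokens[1:]):
        if left in entities or right in entities:
            patterns.add((left, right))
            patterns.add((entities.get(left, left), entities.get(right, right)))
    return patterns


def coverage(candidate: Set, reference: Set) -> float:
    """Fraction of reference items that also occur in the candidate."""
    if not reference:
        return 0.0
    return len(candidate & reference) / len(reference)


def relevance(sentence: str, topic: str,
              entities: Dict[str, str] = TOY_ENTITIES,
              weight: float = 0.5) -> float:
    """Interpolate pattern-level and uni-gram-level coverage of the topic.

    `weight` is an assumed mixing parameter, not a value from the paper.
    """
    s_tokens, t_tokens = tokenize(sentence), tokenize(topic)
    pattern_score = coverage(extract_patterns(s_tokens, entities),
                             extract_patterns(t_tokens, entities))
    unigram_score = coverage(set(s_tokens), set(t_tokens))
    return weight * pattern_score + (1.0 - weight) * unigram_score
```

In this sketch, a sentence that mentions a different location than the topic can still match through the class-level bi-gram, while the uni-gram term keeps sentences with no entity patterns from scoring zero outright. This mirrors the coverage motivation stated above, under the assumptions listed.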