Maximum marginal relevance (MMR) is widely used in summarization for its simplicity and efficacy, and has been shown to achieve performance comparable to other approaches for meeting summarization. A crucial question in MMR is how to represent the similarity of two text segments. In this paper, we evaluate different similarity measures in the MMR framework for meeting summarization on the ICSI meeting corpus. We introduce a corpus-based measure that captures similarity at the semantic level, and compare it with cosine similarity and with a centroid score that considers only the salient words in the segments. Our experimental results, evaluated with the ROUGE summarization metrics, show that both the centroid score and the corpus-based similarity measure yield better performance than the commonly used cosine similarity. In addition, adding part-of-speech information to the corpus-based approach helps in the human transcript condition, but not when using ASR output.
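To make the role of the similarity measure concrete, the following is a minimal sketch of greedy MMR selection with a pluggable similarity function. It is an illustration under stated assumptions, not the implementation evaluated here: it assumes the standard MMR formulation (relevance to the whole document, minus redundancy with segments already selected), uses raw term counts for cosine similarity, and the names `cosine_similarity`, `mmr_select`, and the weight `lam` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(segments, k, lam=0.7, sim=cosine_similarity):
    """Greedily pick up to k segment indices by maximum marginal relevance.

    Assumed scoring rule (standard MMR, not taken from the abstract):
        lam * sim(segment, document) - (1 - lam) * max sim(segment, selected)
    """
    # Treat the concatenation of all segments as the "document" (assumption).
    document = [tok for seg in segments for tok in seg]
    relevance = [sim(seg, document) for seg in segments]

    selected = []
    candidates = list(range(len(segments)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to any already chosen segment.
            redundancy = max(
                (sim(segments[i], segments[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    toy_segments = [
        "we should fix the recognizer".split(),
        "we should fix the recognizer".split(),
        "the budget meeting is on friday".split(),
    ]
    print(mmr_select(toy_segments, k=2, lam=0.5))
```

Because `sim` is a parameter, a different measure (for example a centroid score or a corpus-based semantic similarity) could be substituted without changing the selection loop, which is the comparison this paper carries out.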