Skip to Main Content
In this paper, experiments have addressed the calculation of inter-annotator inconsistency in selecting the content in both manual and automatic summarization of sample TOEFL essays. A new finding is that the linguistic quality of source essay has a very strong correlation with the degree of disagreement among human assessors to what should be included in a summary. This leads to a fully automated essay evaluation technique based on degree of disagreement among automated summarizes. ROUGE evaluation is used to measure the degree of inconsistency among the participants (human summarizers and automatic summarizers). This automated essay evaluation technique is potentially an important contribution with wider significance.
Computer Science and Information Engineering, 2009 WRI World Congress on (Volume:5 )
Date of Conference: March 31 2009-April 2 2009