By Topic

Probabilistic quality assessment of articles based on learning editing patterns

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Jingyu Han ; School of Computing, Nanjing University of Posts & Telecommunications, China 210003 ; Chuandong Wang ; Xiong Fu ; Kejia Chen

As a new model of distributed, collaborative information source, such as Wikipedia, is emerging, its content is constantly being generated, updated and maintained by various users and its data quality varies from time to time. Thus the quality assessment of the content is a pressing concern now. We observe that each article usually goes through a series of editing phases such as building structure, contributing text, discussing text, etc., gradually getting into the final quality state and that the articles of different quality classes exhibit specific edit cycle patterns. We propose a new approach to Assess Quality based on article's Editing History (AQEH) for a specific domain as follows. First, each article's editing history is transformed into a state sequence borrowing Hidden Markov Model(HMM). Second, edit cycle patterns are first extracted for each quality class and then each quality class is further refined into quality corpora by clustering. Now, each quality class is clearly represented by a series of quality corpora and each quality corpus is described by a group of frequently co-occurring edit cycle patterns. Finally, article quality can be determined in probabilistic sense by comparing the article with the quality corpora. Experimental results demonstrate that our method can capture and predict web article's quality accurately and objectively.

Published in:

Computer Science and Service System (CSSS), 2011 International Conference on

Date of Conference:

27-29 June 2011