The work paradigm of crowdsourcing holds great potential for organizations by providing access to a large workforce. However, growth in crowd work entails growing effort to evaluate the quality of submissions. Because evaluations by experts are inefficient, time-consuming, expensive, and not guaranteed to be effective, our paper presents a concept for an automated classification process for crowd work. Using the example of crowd-generated patent transcripts, we build on interdisciplinary research to present an approach to classifying them along two dimensions: correctness and readability. To achieve this, we identify and select text attributes from different disciplines as input for machine-learning classification algorithms and evaluate the suitability of three well-regarded algorithms: Neural Networks, Support Vector Machines (SVMs), and k-Nearest Neighbor. Key findings are that the proposed classification approach is feasible and that the SVM classifier performs best in our experiment.
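To illustrate the general idea of feeding text attributes into a classifier, here is a minimal sketch of one of the three evaluated algorithms, k-Nearest Neighbor, on two hypothetical text attributes (token count and average word length). The feature choices, labels, and toy examples are illustrative assumptions, not the paper's actual feature set or data:

```python
import math
from collections import Counter

def features(text):
    """Hypothetical text attributes: token count and average word length."""
    words = text.split()
    n = len(words)
    avg_len = sum(len(w) for w in words) / n if n else 0.0
    return (n, avg_len)

def knn_classify(train, x, k=3):
    """Majority vote among the k training examples nearest to x (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy labelled examples: (feature vector, quality label) -- illustrative only.
train = [
    (features("short garbled txt"), "low"),
    (features("ok brief note here"), "low"),
    (features("a complete and carefully written transcript of the claim"), "high"),
    (features("the apparatus comprises a housing and a fastening element"), "high"),
]

label = knn_classify(train, features("the device includes a rotatable shaft and a bearing assembly"))
print(label)
```

In practice, the same feature vectors could be passed to an SVM or neural-network classifier, which is what the paper's comparison evaluates; kNN is shown here only because it is compact enough to implement directly.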