A learner corpus is a computerized textual database of the language produced by foreign language learners. Annotated learner corpora contain invaluable meta-information about learners and the errors they make. With proper feature extraction and machine learning techniques, it is possible to extract implicit and explicit knowledge from learner corpora and to develop useful applications that support effective foreign language teaching and learning, such as automatic proficiency level checking, error-driven learning, and personalized learning. In this paper, we use a learner corpus and experiment with feature extraction and machine learning techniques to explore such applications. In particular, we report our experimental results in automatic proficiency checking with the ID3 and C4.5 decision tree algorithms, Bayesian networks, and SVM. We also briefly outline other potential applications of learner corpora, such as error-driven learning that combines implicit and explicit features with machine learning.
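To make the decision tree step concrete, the following is a minimal sketch of the attribute selection criteria that ID3 (information gain) and C4.5 (gain ratio) are generally known to use. It is not the implementation used in this paper. The feature names (`article_errors`, `sentence_length`), their values, and the proficiency labels are hypothetical toy data invented for illustration and are not drawn from any learner corpus.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(rows, labels, feature):
    """ID3 criterion: reduction in label entropy after splitting on `feature`."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, feature):
    """C4.5 criterion: information gain normalized by the split's own entropy."""
    split_info = entropy([row[feature] for row in rows])
    if split_info == 0:
        return 0.0  # the feature takes a single value, so it cannot split
    return information_gain(rows, labels, feature) / split_info


# Hypothetical toy data: each row describes one learner text (assumption).
rows = [
    {"article_errors": "high", "sentence_length": "short"},
    {"article_errors": "high", "sentence_length": "long"},
    {"article_errors": "low", "sentence_length": "short"},
    {"article_errors": "low", "sentence_length": "long"},
]
labels = ["beginner", "beginner", "advanced", "advanced"]

# The feature a tree would split on first under the ID3 criterion.
best_feature = max(rows[0], key=lambda f: information_gain(rows, labels, f))
print(best_feature)
```

In this toy setting, the error-based feature separates the two proficiency labels completely while the length-based feature carries no information about them, so a tree built with either criterion would split on the error-based feature first.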