Skip to Main Content
Active Learning is an important prospect of machine learning for information extraction to deal with the problems of high cost of collecting labeled examples. It makes more efficient use of the learner's time by asking them to label only instances that are most useful for the trainer. We propose a novel method for solving this problem and show that it favorably results in the increased performance. Our proposed framework is based on an ensemble approach, where Support Vector Machine and Conditional Random Field are used as the base learners. The intuition is that both learning approaches are somewhat orthogonal in their advantages, so a combination of them can yield superior results. The proposed approach is applied for solving the problem of named entity recognition (NER) in two Indian language, namely Hindi and Bengali. Results show that the proposed technique indeed improves the performance of the system.