Importance sampling of delta-AUC: A basis for active learning for improved keyword search | IEEE Conference Publication | IEEE Xplore


Abstract:

We present an importance-sampling-based approach to the active learning problem of selecting additional training data to supplement a seed model. Our proposed Δ-AUC selection optimizes AUC improvement in keyword search and is evaluated on the Spanish Fisher corpus. We show that, across different training data sizes, Δ-AUC selection consistently outperforms random sampling by 1.05% to 2.69% absolute AUC and requires no more than 60% of the transcriptions needed by random sampling to achieve the same AUC. On terms not seen in the original seed model training, the proposed algorithm achieves a 3.47% better AUC and a 4.66% reduction in word error rate. We also introduce a regression analysis model that can refine our Δ-AUC strategy in the future.
Date of Conference: 20-25 March 2016
Date Added to IEEE Xplore: 19 May 2016
ISBN Information:
Electronic ISSN: 2379-190X
Conference Location: Shanghai, China