
Language Model Adaptation for Downstream Tasks using Text Selection


Abstract:

Previous research shows that the domain of the training data has a large impact on downstream-task performance, and that selecting data from an appropriate domain leads to improved performance. Text classification can help discriminate between data belonging to different domains. In this paper, we use a text classification method to select data from a particular domain (the task-specific target domain). We experiment with different sizes of the target-domain corpus to explore the effect of the method. A pretrained RoBERTa model is adapted to the target domain using the selected data prior to training the model on the downstream tasks. Our experiments show that using a simple domain classifier to select a small dataset for adapting the model helps stabilize downstream-task performance.
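The abstract outlines a three-step pipeline: train a domain classifier, use it to select in-domain text from a larger pool, and continue pretraining RoBERTa (masked language modeling) on the selection before fine-tuning on the downstream task. The sketch below is not the authors' code; it is a minimal illustration of that pipeline, assuming scikit-learn for the domain classifier and the Hugging Face transformers library for RoBERTa. All corpora, the selection ratio, and the training hyperparameters are placeholder assumptions.

```python
import numpy as np
import torch
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from transformers import (
    RobertaTokenizerFast,
    RobertaForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder corpora -- replace with real text collections (assumption: plain strings).
in_domain_texts = ["example sentence from the task-specific target domain"]
general_texts = ["example sentence from a general, out-of-domain corpus"]
candidate_pool = ["large unlabeled text collection to select in-domain data from"]

# 1. Train a simple domain classifier: in-domain (1) vs. general (0).
vectorizer = TfidfVectorizer(max_features=50_000)
X = vectorizer.fit_transform(in_domain_texts + general_texts)
y = [1] * len(in_domain_texts) + [0] * len(general_texts)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# 2. Score the candidate pool and keep the most in-domain texts
#    (top 10% is an illustrative choice, not a value from the paper).
scores = clf.predict_proba(vectorizer.transform(candidate_pool))[:, 1]
k = max(1, int(0.1 * len(candidate_pool)))
selected = [candidate_pool[i] for i in np.argsort(scores)[::-1][:k]]

# 3. Adapt pretrained RoBERTa to the selected in-domain text via masked LM.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

class TextDataset(torch.utils.data.Dataset):
    """Wraps a list of raw texts as tokenized examples for the Trainer."""
    def __init__(self, texts):
        self.enc = tokenizer(texts, truncation=True, max_length=512)
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {key: values[i] for key, values in self.enc.items()}

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir="roberta-domain-adapted",  # illustrative output path
    per_device_train_batch_size=8,
    num_train_epochs=1,
)
Trainer(
    model=model,
    args=args,
    train_dataset=TextDataset(selected),
    data_collator=collator,
).train()
```

The adapted checkpoint would then be loaded (e.g., with RobertaForSequenceClassification.from_pretrained) and fine-tuned on the downstream task as usual; the abstract does not specify the paper's actual selection threshold, corpus sizes, or training schedule.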
Date of Conference: 25-27 March 2022
Date Added to IEEE Xplore: 19 September 2022
Conference Location: Xi'an, China
