Data Selection Driven by Item Difficulty: On Investigating Data Efficient Practice for Hyperparameter Search | IEEE Conference Publication | IEEE Xplore


Abstract:

Foundation models shift interest toward adapting existing models instead of creating proprietary models from scratch. Despite this change, hyperparameter optimization (HPO) is still needed. Users adapting systems powered by those models on proprietary data should not considerably increase the overall resource footprint with an extensive hyperparameter search. Given that this footprint is also proportional to the data used in HPO, we investigate how a user can effectively reduce the amount of data used, leveraging the deep learning model's empirical facility to output the expected correct result for an item in the dataset. In this work, we describe a methodology for accomplishing this data reduction by estimating a measure of an item's difficulty. This method keeps only a portion of the data that conserves the overall proportions of item difficulty throughout the dataset, while helping to order the items meaningfully. The rationale is derived from results in curriculum learning research, as we try to answer whether the adapted models could help organize and select subsets of data representative of the whole. Preliminary results of evaluating the method are provided for image recognition and scientific named entity recognition (NER). We observe that the amount of data for HPO can be reduced as far as 60% and still point to the same choice of hyperparameters compared to using the whole training set.
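
The selection idea described in the abstract (keep a fraction of the data while conserving the proportions of item difficulty, then order the items) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how difficulty is computed or grouped, so the sketch assumes a per-item difficulty score in [0, 1] is already available (for example, one minus the adapted model's probability for the correct label), assumes equal-width difficulty bins, and uses a hypothetical function name, `stratified_subset`.

```python
import random
from collections import defaultdict


def stratified_subset(difficulties, keep_fraction, n_bins=5, seed=0):
    """Pick item indices so each difficulty bin keeps about the same share.

    difficulties: one score in [0, 1] per item (assumed to be precomputed).
    keep_fraction: fraction of each bin to keep, in (0, 1].
    n_bins: number of equal-width difficulty bins (an assumption of this sketch).
    Returns the selected indices ordered from easiest to hardest.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    rng = random.Random(seed)

    # Group item indices by difficulty bin; a score of 1.0 goes in the last bin.
    bins = defaultdict(list)
    for idx, score in enumerate(difficulties):
        bin_id = min(int(score * n_bins), n_bins - 1)
        bins[bin_id].append(idx)

    # Sample the same fraction from every non-empty bin (at least one item),
    # which keeps the difficulty proportions of the full dataset.
    selected = []
    for bin_id in sorted(bins):
        members = bins[bin_id]
        k = max(1, round(len(members) * keep_fraction))
        selected.extend(rng.sample(members, k))

    # Order the kept items from easy to hard, curriculum style.
    selected.sort(key=lambda i: difficulties[i])
    return selected


# Toy usage: 50 easy, 30 medium, and 20 hard items; keep 40% of each group.
scores = [0.1] * 50 + [0.5] * 30 + [0.9] * 20
subset = stratified_subset(scores, keep_fraction=0.4)
```

The returned indices could then be used to build the reduced training set on which each hyperparameter configuration is evaluated. Whether such a subset leads to the same hyperparameter choice as the full set is the empirical question the paper investigates.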
Date of Conference: 14-15 April 2024
Date Added to IEEE Xplore: 18 June 2024
Conference Location: Lisbon, Portugal
