Abstract:
Foundation models shift the focus from building proprietary models from scratch to adapting existing ones. Despite this change, hyperparameter optimization (HPO) is still needed. Users adapting systems powered by these models to proprietary data should not considerably increase the overall resource footprint through extensive hyperparameter search. Since this footprint is also proportional to the amount of data used in HPO, we investigate how a user can effectively reduce that amount, leveraging the deep learning model's empirical ability to output the expected correct result for an item in the dataset. In this work, we describe a methodology that accomplishes this data reduction by estimating a measure of each item's difficulty. The method keeps only a portion of the data that preserves the overall proportions of item difficulty across the dataset, while also ordering the items meaningfully. The rationale derives from curriculum learning research, as we ask whether the adapted models themselves can help organize and select subsets of data representative of the whole. Preliminary results of evaluating the method are provided for image recognition and scientific named entity recognition (NER). We observe that the amount of data for HPO can be reduced by as much as 60% while still pointing to the same choice of hyperparameters as using the whole training set.
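To make the described selection step concrete, below is a minimal sketch of difficulty-stratified subsampling. It assumes per-item loss from the adapted model serves as the difficulty score and uses quantile bins to preserve the difficulty distribution; the paper's exact difficulty measure, bin count, and sampling details are not specified in the abstract, so all of these are illustrative assumptions.

```python
import numpy as np

def difficulty_stratified_subset(losses, keep_fraction=0.4, n_bins=10, seed=0):
    """Select a subset that preserves the dataset's difficulty distribution.

    losses        : per-item difficulty scores (here: the adapted model's loss,
                    an assumed proxy for the paper's difficulty measure).
    keep_fraction : fraction of items to keep (0.4 corresponds to the ~60%
                    reduction reported in the abstract).
    n_bins        : number of difficulty strata (quantile bins).
    Returns indices of kept items, ordered from easiest to hardest.
    """
    rng = np.random.default_rng(seed)
    losses = np.asarray(losses)

    # Quantile edges give (approximately) equally populated difficulty strata.
    edges = np.quantile(losses, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.digitize(losses, edges[1:-1]), 0, n_bins - 1)

    kept = []
    for b in range(n_bins):
        idx = np.flatnonzero(bins == b)
        if len(idx) == 0:
            continue
        # Keep the same fraction from every stratum, so the subset mirrors
        # the overall proportions of item difficulty.
        n_keep = max(1, int(round(keep_fraction * len(idx))))
        kept.append(rng.choice(idx, size=n_keep, replace=False))
    kept = np.concatenate(kept)

    # Order the subset from easy to hard, echoing the curriculum-learning
    # rationale of presenting items in a meaningful order.
    return kept[np.argsort(losses[kept])]
```

HPO trials would then run on the items indexed by the returned array rather than the full training set; the stratification is what keeps the reduced set representative of the whole.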
Published in: 2024 IEEE/ACM 3rd International Conference on AI Engineering – Software Engineering for AI (CAIN)
Date of Conference: 14-15 April 2024
Date Added to IEEE Xplore: 18 June 2024
Conference Location: Lisbon, Portugal