Abstract:
Transformer-based large language models such as BERT have demonstrated the ability to derive contextual information for a word from the words surrounding it. However, when applied to specific domains such as medicine, insurance, or scientific disciplines, publicly available models trained on general-knowledge sources such as Wikipedia may not be as effective at inferring the appropriate context as domain-specific models trained on specialized corpora. Given the limited availability of training data for specific domains, pre-trained models can be fine-tuned via transfer learning using relatively small domain-specific corpora. However, there is currently no standardized method for quantifying how effectively these domain-specific models acquire the necessary domain knowledge. To address this issue, we explore hidden-layer embeddings and introduce domain_gain, a measure that quantifies a model's ability to infer the correct context. In this paper, we show how our measure can be used to determine whether words with multiple meanings are more likely to be associated with their domain-related meanings rather than their colloquial meanings.
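The abstract does not give the exact domain_gain formula, but the idea of probing hidden-layer embeddings for an ambiguous word can be illustrated with a minimal sketch. The snippet below, assuming the Hugging Face transformers library, a bert-base-uncased baseline, and made-up example sentences, extracts a polysemous word's contextual embedding and compares it against embeddings from a domain-specific context and a colloquial context; the simple similarity difference at the end is only a hypothetical proxy for the paper's measure, not its actual definition.

```python
# Illustrative sketch (not the paper's exact domain_gain definition): extract
# hidden-layer embeddings from a pre-trained BERT model and compare a
# polysemous word's contextual embedding against reference embeddings drawn
# from a domain-specific sentence and a colloquial sentence.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-uncased"  # assumed general-domain baseline
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()


def word_embedding(sentence: str, word: str, layer: int = -1) -> torch.Tensor:
    """Return the hidden-state vector of `word` at a given layer,
    averaged over its WordPiece sub-tokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, hidden_dim)
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the word's sub-token span inside the encoded sentence.
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i:i + len(word_ids)] == word_ids:
            return hidden[i:i + len(word_ids)].mean(dim=0)
    raise ValueError(f"'{word}' not found in sentence")


# "discharge" is ambiguous: a clinical event vs. an electrical phenomenon.
target = word_embedding("The patient was stable at discharge.", "discharge")
domain_ref = word_embedding("Discharge planning began after the surgery.", "discharge")
colloquial_ref = word_embedding("The battery discharge drained the phone.", "discharge")

cos = torch.nn.functional.cosine_similarity
# Hypothetical gain-style score: how much closer the word sits to the
# domain sense than to the colloquial sense.
gain = (cos(target, domain_ref, dim=0) - cos(target, colloquial_ref, dim=0)).item()
print(f"domain-vs-colloquial similarity gain: {gain:.3f}")
```

A positive score under this toy setup would suggest the model resolves the ambiguous word toward its domain sense; comparing such scores before and after domain-specific fine-tuning is the kind of question the paper's domain_gain measure is designed to answer.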
Published in: 2023 IEEE Conference on Artificial Intelligence (CAI)
Date of Conference: 05-06 June 2023
Date Added to IEEE Xplore: 02 August 2023