Abstract:
With the large-scale growth of medical data, Chinese medical named entity recognition (NER), a crucial prerequisite for extracting high-quality medical information, has attracted extensive attention. Many studies have confirmed that additional lexical and radical information can effectively improve Chinese medical NER models. Still, a method that effectively utilizes both types of information has been lacking. In this paper, various ways of introducing lexical and radical information into the NER task are investigated in depth. On this basis, TsERL (Two-stage Enhancement of Radical and Lexicon) is proposed for Chinese medical NER; it constructs a relation graph among radicals, characters, and words. For lexical enhancement, TsERL uses a word graph to match each character with a dynamic number of words from three aspects according to the match position. TsERL follows the adaptive embedding (AE) paradigm, which is transferable, while maintaining better performance than the dynamic architecture (DA) paradigm. For radical enhancement, TsERL constructs a global radical graph according to the role of radicals in the Chinese writing system. On this graph, character nodes can perceive information from other character nodes and radical nodes through multi-hop relationships. Using a graph attention network (GAT), TsERL obtains character type information from the radical graph and lexical boundary information from the word graph. Experiments on two public datasets show that TsERL outperforms the compared models.
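The position-based lexical matching mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say what the three aspects are; the code below assumes they correspond to a matched word beginning at, passing through, or ending at a character. The function name `match_words_by_position`, the toy lexicon, and the `max_len` parameter are hypothetical and are not taken from TsERL.

```python
def match_words_by_position(sentence, lexicon, max_len=4):
    """For each character, collect lexicon words that begin at,
    pass through (middle), or end at that character.

    Assumption: "begin/middle/end" is one plausible reading of the
    three match positions; the paper may define them differently.
    """
    matches = [{"begin": [], "middle": [], "end": []} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Only multi-character words (length 2..max_len) are considered.
        for end in range(start + 2, min(start + max_len, n) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            matches[start]["begin"].append(word)
            for k in range(start + 1, end - 1):
                matches[k]["middle"].append(word)
            matches[end - 1]["end"].append(word)
    return matches


# Toy example: "糖尿病患者" (diabetes patient) with a three-word lexicon.
toy_lexicon = {"糖尿", "糖尿病", "患者"}
toy_matches = match_words_by_position("糖尿病患者", toy_lexicon)
for char, m in zip("糖尿病患者", toy_matches):
    print(char, m)
```

Each character ends up with a variable number of matched words per position, which is the sense in which the word counts are dynamic. How those word sets are turned into graph edges and embeddings is specific to the paper and is not sketched here.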
Date of Conference: 06-08 December 2022
Date Added to IEEE Xplore: 02 January 2023