Patient data is critical in the healthcare domain, and it should be secure, consistent, and coded so that it can be transferred safely from one user to another. SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) is a standardized reference terminology consisting of hundreds of thousands of concepts, each identified by a SNOMED CT code. This paper describes the extraction of natural language concepts from free-text discharge summary reports and their mapping to SNOMED CT codes. To evaluate the extracted medical concepts, we selected a corpus of 300 discharge summaries provided by the University of Pittsburgh Medical Centre and compared it with the SNOMED CT concept file, a preprocessed and cleaned file listing SNOMED CT concepts. We present ongoing research on SNOMED CT concept extraction from discharge summaries using natural language processing, introducing the SNOMED CT core concepts as a gazetteer list for concept extraction. Of the 390023 concepts in the SNOMED CT core concepts gazetteer list, 21563 (about 5.5%) were found in the test set of discharge summaries.
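To make the gazetteer idea concrete, the following is a minimal sketch of dictionary-based concept lookup, not the pipeline used in this work. It assumes a small in-memory gazetteer, simple lowercase tokenization, and greedy longest-match-first lookup; the terms, the placeholder codes (which are not real SNOMED CT codes), and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Toy gazetteer: lower-cased term -> placeholder code.
# These codes are illustrative placeholders, NOT real SNOMED CT codes.
GAZETTEER = {
    "myocardial infarction": "CODE-001",
    "hypertension": "CODE-002",
    "diabetes mellitus": "CODE-003",
    "chest pain": "CODE-004",
}

# Longest gazetteer entry, measured in tokens; bounds the n-gram window.
MAX_TERM_TOKENS = max(len(term.split()) for term in GAZETTEER)


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_concepts(text):
    """Return (term, code) pairs found in text, preferring longer matches."""
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate n-gram first, then shrink the window.
        for n in range(min(MAX_TERM_TOKENS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in GAZETTEER:
                matches.append((candidate, GAZETTEER[candidate]))
                i += n  # skip past the matched tokens
                break
        else:
            i += 1  # no gazetteer entry starts at this token
    return matches


def count_matched_concepts(summaries):
    """Count distinct gazetteer codes that occur in at least one summary."""
    found = set()
    for summary in summaries:
        for _term, code in extract_concepts(summary):
            found.add(code)
    return len(found)


if __name__ == "__main__":
    note = ("Patient admitted with chest pain; "
            "history of diabetes mellitus and hypertension.")
    print(extract_concepts(note))
```

Counting distinct codes across all summaries, as `count_matched_concepts` does, corresponds to the kind of coverage figure reported above (concepts found out of concepts listed). Note that this sketch matches surface strings only: it does not handle negation, abbreviations, or spelling variants.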