This paper presents VnHIES, an ontology-based health care information extraction system. In the system, we develop and use two algorithms, the "semantic elements extracting algorithm" and the "new semantic elements learning algorithm", for extracting health care semantic words and enhancing the ontology. The former extracts concepts (Cs), descriptions of concepts (Ds), concept-description pairs (C-D), and names of diseases (Ns) in the health care information domain from Web pages. The latter uses those extracted semantic elements to produce suggestions that may contain new semantic elements, which domain users can later use to enrich the ontology. After the semantic elements are extracted, a "document weighting algorithm" is applied to obtain summary information about the document with respect to all extracted semantic words. This summary is stored in a knowledge base, consisting of the ontology and a database, for later use in other applications. Our experimental results show that the approach is promising, with high accuracy in semantic extraction and efficiency in ontology upgrade. VnHIES can be used in many health care information management systems, such as medical document classification and health care information retrieval systems. VnHIES is implemented for the Vietnamese language.
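The extraction and weighting steps described above can be illustrated with a minimal sketch. It is not the VnHIES implementation: the abstract does not specify the algorithms, so the toy ontology, the function names `extract_semantic_elements` and `document_weights`, the fixed token window used to pair a concept with a description, and the frequency-based weight are all assumptions made for illustration. English words stand in for the Vietnamese vocabulary.

```python
import re
from collections import Counter

# Hypothetical toy ontology (assumption): the real VnHIES ontology
# is not described in the abstract.
ONTOLOGY = {
    "concepts": {"symptom", "treatment"},
    "descriptions": {"fever", "cough", "rest"},
    "diseases": {"influenza", "dengue"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def extract_semantic_elements(text, ontology=ONTOLOGY, window=3):
    """Look up Cs, Ds, and Ns in the ontology and pair each concept with
    descriptions found within `window` tokens after it (assumed heuristic)."""
    tokens = tokenize(text)
    cs = [t for t in tokens if t in ontology["concepts"]]
    ds = [t for t in tokens if t in ontology["descriptions"]]
    ns = [t for t in tokens if t in ontology["diseases"]]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in ontology["concepts"]:
            for other in tokens[i + 1 : i + 1 + window]:
                if other in ontology["descriptions"]:
                    pairs.append((tok, other))
    return {"Cs": cs, "Ds": ds, "C-D": pairs, "Ns": ns}


def document_weights(elements, n_tokens):
    """Weight each extracted word by its relative frequency in the document
    (an assumed stand-in for the paper's document weighting algorithm)."""
    if n_tokens == 0:
        return {}
    counts = Counter(elements["Cs"] + elements["Ds"] + elements["Ns"])
    return {term: count / n_tokens for term, count in counts.items()}


if __name__ == "__main__":
    page = "Influenza: a common symptom is fever and cough. Treatment is rest."
    found = extract_semantic_elements(page)
    print(found)
    print(document_weights(found, len(tokenize(page))))
```

In this sketch the resulting weight dictionary plays the role of the per-document summary that would be stored in the knowledge base. Words that match no ontology entry are simply ignored here, whereas the system described above would pass candidates to the learning algorithm as suggestions for domain users.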