Traditional text mining systems employ shallow parsing techniques and focus on concept extraction and taxonomic relation extraction. This paper presents a novel system called CRCTOL for mining rich semantic knowledge, in the form of an ontology, from domain-specific text documents. Because CRCTOL uses a full-text parsing technique and incorporates both statistical and lexico-syntactic methods, the knowledge it extracts is more concise and carries richer semantics than that produced by alternative systems. We conduct a case study in which CRCTOL extracts ontological knowledge, specifically key concepts and semantic relations, from a collection of terrorism-domain texts. A quantitative evaluation against Text-To-Onto, a state-of-the-art ontology learning system, shows that CRCTOL achieves much better precision and recall for both concept and relation extraction, especially on sentences with complex structures.
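As a reminder of what the precision and recall figures measure, the following is a minimal sketch of how extracted concepts might be scored against a manually built reference set. It is not taken from the paper: the function name `precision_recall` and the sample concept sets are hypothetical, and exact string matching between extracted and reference items is an assumption made for illustration.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of an extracted set against a reference set.

    Assumes items match only when they are exactly equal (an illustrative
    simplification, not necessarily the matching rule used in the paper).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: fraction of extracted items that are in the reference set.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of reference items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data, invented for illustration only.
gold_concepts = {"terrorist group", "bomb attack", "hostage", "suicide bomber"}
extracted_concepts = {"terrorist group", "bomb attack", "government", "hostage"}

p, r = precision_recall(extracted_concepts, gold_concepts)
print(f"precision={p:.2f} recall={r:.2f}")
```

The same function applies to relation extraction if each relation is represented as a hashable tuple, for example `(subject, relation, object)`.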
Fifth IEEE International Conference on Data Mining
Date of Conference: 27-30 Nov. 2005