This paper addresses the automated construction of knowledge discovery workflows, given the types of inputs and the required outputs of the knowledge discovery process. Our methodology consists of two main ingredients. The first is a formal conceptualization of knowledge types and data mining algorithms by means of a knowledge discovery ontology. The second is workflow composition formalized as a planning task using the ontology of domain and task descriptions. Two versions of a forward chaining planning algorithm were developed. The baseline version demonstrates the suitability of the knowledge discovery ontology for planning and uses Planning Domain Definition Language (PDDL) descriptions of algorithms; to this end, a procedure for converting data mining algorithm descriptions into PDDL was developed. The second version directly queries the ontology using a reasoner. The proposed approach was tested in two use cases, one from scientific discovery in genomics and another from advanced engineering. The results show the feasibility of automated workflow construction achieved by tight integration of planning and ontological reasoning.
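To make the idea of workflow composition as forward chaining planning concrete, the following is a minimal illustrative sketch and not the planner developed in this paper. It assumes that each algorithm is described only by the set of knowledge types it consumes and the set it produces, and that plain string sets stand in for the ontology and the PDDL descriptions. The names `Algorithm`, `plan_workflow`, and the three example algorithms are hypothetical.

```python
from collections import deque
from typing import NamedTuple, Optional


class Algorithm(NamedTuple):
    """Hypothetical algorithm description: required and produced knowledge types."""
    name: str
    inputs: frozenset
    outputs: frozenset


def plan_workflow(available, goal, algorithms) -> Optional[list]:
    """Breadth-first forward chaining from the available knowledge types.

    Returns a shortest list of algorithm names whose application yields
    every type in `goal`, or None if no such sequence exists.
    """
    start = frozenset(available)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for alg in algorithms:
            # An algorithm is applicable when all of its inputs are present;
            # skip it if it would add nothing new to the state.
            if alg.inputs <= state and not alg.outputs <= state:
                successor = state | alg.outputs
                if successor not in visited:
                    visited.add(successor)
                    queue.append((successor, steps + [alg.name]))
    return None


# Assumed toy algorithm descriptions, for illustration only.
ALGORITHMS = [
    Algorithm("Discretize", frozenset({"Dataset"}), frozenset({"DiscretizedDataset"})),
    Algorithm("RuleLearner", frozenset({"DiscretizedDataset"}), frozenset({"RuleSet"})),
    Algorithm("Clusterer", frozenset({"Dataset"}), frozenset({"Clustering"})),
]

if __name__ == "__main__":
    print(plan_workflow({"Dataset"}, {"RuleSet"}, ALGORITHMS))
```

In this sketch the applicability check is a simple subset test. In the second planner version described above, that check would instead be answered by querying the ontology with a reasoner, which this example does not attempt to reproduce.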