Skip to Main Content
Ontologies are successfully used as semantic guides when navigating through the huge and ever increasing quantity of digital documents. Nevertheless, the size of numerous domain ontologies tends to grow beyond the human capacity to grasp information. This growth is problematic for a lot of key applications that require user interactions such as document annotation or ontology modification/evolution. The problem could be partially overcome by providing users with a subontology focused on their current concepts of interest. A subontology restricted to this sole set of concepts is of limited interest since their relationships can generally not be explicit without adding some of their hyponyms and hypernyms. This paper proposes efficient algorithms to identify these additional key concepts based on the closure of two common graph operators: the least common-ancestor (lca) and greatest common descendant (gcd). The resulting method produces ontology excerpts focused on a set of concepts of interest and is fast enough to be used in interactive environments. As an example, we use the resulting program, called OntoFocus (http://www.ontotoolkit.mines-ales.fr/), to restrict, in few seconds, the large Gene Ontology (~30,000 concepts) to a subontology focused on concepts annotating a gene related to breast cancer.