Efforts to bridge the semantic gap in video indexing and retrieval have continuously expanded the vocabulary of high-level features or semantic descriptors, sometimes organized in light-scale, corpus-specific computational ontologies. This paper presents a computer-supported manual annotation method that relies on very large-scale, shared, commonsense ontologies for the selection of semantic descriptors. The ontological terms are accessed through a linguistic interface that relies on multilingual dictionaries and action/event template structures (or frames). Manually generating or checking annotations provides ground-truth data for evaluation purposes and training data for knowledge acquisition. The novelty of the approach lies in the use of widely shared, large-scale ontologies, which prevent arbitrariness of annotation and favor interoperability. We test the viability of the approach through user studies on the annotation of narrative videos.