We present an ontology-guided, symbol-based image parser that draws on semantic, spoken-language descriptions of entities in images as well as the real-world spatial relationships defined between these entities. Our parsing approach explicitly describes objects and the relationships between them using linguistically meaningful modes of color, texture, and coarse expressions of shape. The image parser is built on a syntactic image-grammar framework and performs a (near) global optimization using superpixels as an initial set of subpatterns. It hypothesizes the entities in an image from their local semantic attributes and verifies them globally using more global features and their relative spatial locations. Evaluations of the parser are performed on selected images, which we make publicly available along with their manual segmentations and our labeling results.
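The hypothesize-and-verify loop over superpixels described above can be sketched as follows. This is a minimal illustrative Python sketch, not the authors' implementation: the toy superpixel data, the two-entry ontology, and all function names (`hypothesize`, `verify`, `parse`) are assumptions introduced here to show the control flow of local hypothesis generation followed by global spatial verification.

```python
# Minimal sketch of an ontology-guided hypothesize-and-verify parsing loop.
# All names, the toy attribute model, and the ontology entries are
# illustrative assumptions, not the paper's actual implementation.

# Each superpixel carries local semantic attributes (color, texture)
# and a normalized vertical position y in [0, 1] (0 = top of image).
SUPERPIXELS = [
    {"id": 0, "color": "blue",  "texture": "smooth", "y": 0.1},
    {"id": 1, "color": "green", "texture": "coarse", "y": 0.9},
    {"id": 2, "color": "gray",  "texture": "smooth", "y": 0.5},
]

# Hypothetical ontology: entity -> expected local attributes plus a
# coarse global spatial constraint ("above" / "below" the image midline).
ONTOLOGY = {
    "sky":   {"color": "blue",  "texture": "smooth", "position": "above"},
    "grass": {"color": "green", "texture": "coarse", "position": "below"},
}

def hypothesize(sp):
    """Propose entity labels whose local semantic attributes match."""
    return [name for name, attrs in ONTOLOGY.items()
            if sp["color"] == attrs["color"]
            and sp["texture"] == attrs["texture"]]

def verify(sp, label):
    """Check a hypothesis against its global spatial constraint."""
    pos = ONTOLOGY[label]["position"]
    return sp["y"] < 0.5 if pos == "above" else sp["y"] > 0.5

def parse(superpixels):
    """Label each superpixel: hypothesize locally, then verify globally."""
    labels = {}
    for sp in superpixels:
        accepted = [h for h in hypothesize(sp) if verify(sp, h)]
        labels[sp["id"]] = accepted[0] if accepted else "unknown"
    return labels

print(parse(SUPERPIXELS))  # -> {0: 'sky', 1: 'grass', 2: 'unknown'}
```

The sketch only illustrates the two-stage structure; the paper's parser additionally performs a (near) global optimization over the superpixel subpatterns rather than labeling each superpixel independently.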