This paper presents a method of learning and recognizing generic object categories using part-based spatial models. The models are multiscale, with a scene component that specifies relationships between the object and surrounding scene context, and an object component that specifies relationships between parts of the object. The underlying graphical model forms a tree structure, with a star topology for both the contextual and object components. A partially supervised paradigm is used for learning the models, where each training image is labeled with bounding boxes indicating the overall location of object instances, but parts or regions of the objects and scene are not specified. The parts, regions and spatial relationships are learned automatically. We demonstrate the method on the detection task on the PASCAL 2006 Visual Object Classes Challenge dataset, where objects must be correctly localized. Our results demonstrate better overall performance than those of previously reported techniques, in terms of the average precision measure used in the PASCAL detection evaluation. Our results also show that incorporating scene context into the models improves performance in comparison with not using such contextual information.
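To make the star topology concrete, the following is a minimal sketch of how a star-shaped part model can be scored for a single root placement. It is not the paper's implementation: the abstract does not give the scoring function, so the quadratic displacement penalty, the function name `star_model_score`, and the parameters `anchors` and `deform_weight` are all illustrative assumptions. The one property it relies on is that, in a star, each part depends only on the root, so each part can be placed independently once the root is fixed.

```python
import numpy as np

def star_model_score(root_score, part_scores, anchors, deform_weight):
    """Best total score for one fixed root placement in a star-shaped model.

    Hypothetical interface (not from the paper):
      root_score    -- appearance score of the root at its fixed location
      part_scores   -- list of 2D arrays; part_scores[i][y, x] is the
                       appearance score of part i at pixel (y, x)
      anchors       -- list of (y, x) ideal positions, one per part,
                       already expressed in image coordinates
      deform_weight -- weight of the assumed quadratic displacement penalty
    """
    total = float(root_score)
    placements = []
    for scores, (ay, ax) in zip(part_scores, anchors):
        # Coordinate grids matching the score map.
        ys, xs = np.indices(scores.shape)
        # Assumed spatial cost: squared distance from the ideal position.
        penalty = deform_weight * ((ys - ay) ** 2 + (xs - ax) ** 2)
        combined = scores - penalty
        # Parts are conditionally independent given the root, so each
        # part is maximized on its own.
        best = np.unravel_index(np.argmax(combined), combined.shape)
        total += float(combined[best])
        placements.append((int(best[0]), int(best[1])))
    return total, placements

# Toy usage: one part whose strongest response is far from its anchor.
part = np.zeros((3, 3))
part[0, 0] = 1.0
score, where = star_model_score(0.5, [part], [(2, 2)], deform_weight=0.1)
```

With a small `deform_weight` the part follows its strongest appearance response, and with a large one it stays at its anchor. The same structure would apply to a contextual component whose scene regions are tied to the object location, although the paper's actual formulation may differ.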
Date of Conference: 17-22 June 2007