Skip to Main Content
We present a method to learn probabilistic object models (POMs) with minimal supervision, which exploit different visual cues and perform tasks such as classification, segmentation, and recognition. We formulate this as a structure induction and learning task and our strategy is to learn and combine elementary POMs that make use of complementary image cues. We describe a novel structure induction procedure, which uses knowledge propagation to enable POMs to provide information to other POMs and ldquoteach themrdquo (which greatly reduces the amount of supervision required for training and speeds up the inference). In particular, we learn a POM-IP defined on interest points using weak supervision ,  and use this to train a POM-mask, defined on regional features, which yields a combined POM that performs segmentation/localization. This combined model can be used to train POM-edgelets, defined on edgelets, which gives a full POM with improved performance on classification. We give detailed experimental analysis on large data sets for classification and segmentation with comparison to other methods. Inference takes five seconds while learning takes approximately four hours. In addition, we show that the full POM is invariant to scale and rotation of the object (for learning and inference) and can learn hybrid objects classes (i.e., when there are several objects and the identity of the object in each image is unknown). Finally, we show that POMs can be used to match between different objects of the same category, and hence, enable objects recognition.