A novel model for object category recognition in real-world scenes is proposed. Each image is represented by a set of triangular labelled graphs, each encoding the appearance and geometry of a 3-tuple of distinctive image regions. In the learning stage, the model automatically learns a set of codebooks of model graphs for each object category, where each codebook records which local structures may appear on which parts of object instances of the target category. A two-stage method for optimal matching is developed: in the first stage, a Bayesian classifier based on ICA factorization efficiently selects the matching codebook, and in the second stage, a nearest-neighbour classifier assigns the test graph to one of the learned model graphs in the selected codebook. Each matched test graph casts votes for the possible identity and pose of an object instance, and a Hough transform in pose space is then used to identify and localize the object instances. An extensive evaluation on several large datasets validates the robustness of the proposed model for object category recognition and localization in the presence of scale and rotation changes.
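The voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the pose parameterization (position, scale, rotation), the bin sizes, the vote threshold, and the name `hough_vote` are all assumptions chosen for illustration. Each matched graph contributes one vote, the votes are accumulated in a coarse grid over pose space, and sufficiently populated cells are reported as detections.

```python
import math
from collections import Counter


def hough_vote(votes, xy_bin=20.0, log_scale_bin=0.5,
               angle_bin=math.pi / 6, min_votes=3):
    """Accumulate pose votes in a coarse grid and return the peaks.

    votes: iterable of (category, x, y, scale, angle) tuples, one per
           matched test graph (hypothetical format, scale > 0, angle in radians).
    Returns a list of (category, (x, y, scale, angle), count), strongest first.
    """
    n_angle_bins = max(1, round(2 * math.pi / angle_bin))
    accumulator = Counter()   # cell -> number of votes
    members = {}              # cell -> poses that voted for it

    for category, x, y, scale, angle in votes:
        cell = (
            category,
            math.floor(x / xy_bin),
            math.floor(y / xy_bin),
            # Bin scale logarithmically so that doubling and halving
            # span the same number of cells.
            math.floor(math.log2(scale) / log_scale_bin),
            # Wrap the angle so that 0 and 2*pi fall in the same cell.
            math.floor((angle % (2 * math.pi)) / angle_bin) % n_angle_bins,
        )
        accumulator[cell] += 1
        members.setdefault(cell, []).append((x, y, scale, angle))

    detections = []
    for cell, count in accumulator.most_common():
        if count < min_votes:
            break  # most_common() is sorted, so the rest are weaker
        poses = members[cell]
        n = len(poses)
        mean_x = sum(p[0] for p in poses) / n
        mean_y = sum(p[1] for p in poses) / n
        # Geometric mean for scale, circular mean for rotation.
        mean_scale = 2 ** (sum(math.log2(p[2]) for p in poses) / n)
        mean_angle = math.atan2(
            sum(math.sin(p[3]) for p in poses),
            sum(math.cos(p[3]) for p in poses),
        ) % (2 * math.pi)
        detections.append(
            (cell[0], (mean_x, mean_y, mean_scale, mean_angle), count)
        )
    return detections
```

A hard grid like this can split a true peak across neighbouring cells. Practical systems often vote into adjacent cells as well or refine peaks with a clustering step; the abstract does not say which variant is used.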