Skip to Main Content
Detecting an object part relies on two sources of information - the appearance of the part itself and the context supplied by surrounding parts. In this paper we consider problems in which a target part cannot be recognized reliably using its own appearance, such as detecting low-resolution hands, and must be recognized using the context of surrounding parts. We develop the `chains model' which can locate parts of interest in a robust and precise manner, even when the surrounding context is highly variable and deformable. In the proposed model, the relation between context features and the target part is modeled in a non-parametric manner using an ensemble of feature chains leading from parts in the context to the detection target. The method uses the configuration of the features in the image directly rather than through fitting an articulated 3-D model of the object. In addition, the chains are composable, meaning that new chains observed in the test image can be composed of sub-chains seen during training. Consequently, the model is capable of handling object poses which are infrequent, even non-existent, during training. We test the approach in different settings, including object parts detection, as well as complete object detection. The results show the advantages of the chains model for detecting and localizing parts of complex deformable objects.