This paper proposes a method for detecting object classes that exhibit variable shape structure in heavily cluttered images. The term "variable shape structure" is used to characterize object classes in which some shape parts can be repeated an arbitrary number of times, some parts can be optional, and some parts can have several alternative appearances. Hidden state shape models (HSSMs), a generalization of hidden Markov models (HMMs), are introduced to model object classes of variable shape structure using a probabilistic framework. A polynomial inference algorithm automatically determines object location, orientation, scale, and structure by finding the globally optimal registration of model states with the image features, even in the presence of clutter. Experiments with real images demonstrate that the proposed method can localize objects of variable shape structure with high accuracy. For the task of hand shape localization and structure identification, the proposed method is significantly more accurate than previously proposed methods based on chamfer-distance matching. Furthermore, by integrating simple temporal constraints, the proposed method gains speed-ups of more than an order of magnitude and produces highly accurate results in experiments on nonrigid hand motion tracking.