Tracking multiple humans in complex situations is challenging. The difficulties are tackled with appropriate knowledge in the form of various models in our approach. Human motion is decomposed into its global motion and limb motion. In the first part, we show how multiple human objects are segmented and their global motions are tracked in 3D using ellipsoid human shape models. Experiments show that it successfully applies to the cases where a small number of people move together, have occlusion, and cast shadow or reflection. In the second part, we estimate the modes (e.g., walking, running, standing) of the locomotion and 3D body postures by making inference in a prior locomotion model. Camera model and ground plane assumptions provide geometric constraints in both parts. Robust results are shown on some difficult sequences.