Linear and multilinear models of object shape and appearance (PCA, 3DMM, AAM/ASM, and multilinear tensors) are widely used in computer vision. In this paper, we analyze the applicability of these heuristic models starting from the fundamental physical laws of object motion and image formation. We prove that, under suitable conditions, the image appearance space can be closely approximated as multilinear, with the illumination and texture subspaces trilinearly combined with the direct sum of the motion and deformation subspaces. This result provides a physics-based understanding of many of the successes and limitations of the linear and multilinear approaches in the computer vision literature, and identifies some of the conditions under which they are valid. It also provides an analytical representation of the image space in terms of the physical factors that affect the image formation process. We present a numerical analysis of the accuracy of the physics-based models, together with tracking results on real data.
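To make the trilinear structure concrete, the following is a minimal sketch rather than the model derived in the paper. It assumes a hypothetical 4-way coefficient array `core[p][i][t][k]` (pixel, illumination, texture, motion/deformation) and represents the direct sum of the motion and deformation subspaces by concatenating their coefficient vectors. The names `direct_sum`, `render_trilinear`, and `core` are introduced here for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def direct_sum(motion: Vector, deform: Vector) -> List[float]:
    """Concatenate coefficients: a vector in the direct sum of the two subspaces."""
    return list(motion) + list(deform)


def render_trilinear(core, illum: Vector, texture: Vector,
                     motion: Vector, deform: Vector) -> List[float]:
    """Evaluate an assumed trilinear appearance model, one value per pixel.

    core[p][i][t][k] is a hypothetical coefficient array indexed by pixel p,
    illumination coefficient i, texture coefficient t, and combined
    motion/deformation coefficient k.
    """
    md = direct_sum(motion, deform)
    image = []
    for pixel_block in core:
        value = 0.0
        for i, l_i in enumerate(illum):
            for t, t_j in enumerate(texture):
                for k, m_k in enumerate(md):
                    # Each term is linear in l_i, t_j, and m_k separately.
                    value += pixel_block[i][t][k] * l_i * t_j * m_k
        image.append(value)
    return image
```

Under these assumptions, the output is linear in each of the three coefficient vectors when the other two are held fixed, and the contributions of motion and deformation add, which is what combining them by a direct sum implies.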