This paper presents an approach to shape and motion estimation that integrates heterogeneous knowledge into a single model-based framework. We describe the observed scenes in terms of structured geometric elements (points, line segments, rectangles, 3D corners) that share explicit Euclidean relationships (orthogonality, parallelism, collinearity, coplanarity). Camera trajectories are represented with adaptive models that account for the regularity of typical camera motions. Two different strategies for automatic model building yield reduced models for shape and motion estimation with a minimal number of parameters. These models increase robustness to noise and occlusions, improve the reconstruction, and provide a high-level representation of the observed scene. The parameters are computed optimally within a sequential Bayesian estimation procedure that gives accurate and reliable results on synthetic and real video imagery.