Multibody structure-and-motion (MSaM) is the problem of establishing the multiple-view geometry of several views of a 3D scene taken at different times, where the scene consists of multiple rigid objects moving relative to each other. We examine the case of two views. The setting is as follows: given is a set of corresponding image points in two images, which originate from an unknown number of moving scene objects, each giving rise to its own motion model. Furthermore, the measurement noise is unknown, and the correspondences contain a number of gross errors, which are outliers to all models. The task is to find an optimal set of motion models for the measurements. The problem is solved through Monte-Carlo sampling, careful statistical analysis of the sampled set of motion models, and simultaneous selection of multiple motion models to best explain the measurements. The framework is not restricted to any particular model selection mechanism because it is developed from a Bayesian viewpoint: different model selection criteria are seen as different priors for the set of moving objects, which allow one to bias the selection procedure for different purposes.
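The sample-then-select idea can be illustrated with a deliberately simplified toy sketch. It is not the method described above. It assumes that each object's motion between the two images is a pure 2D translation (standing in for a full two-view motion model), that the inlier tolerance `tol` is known (whereas the abstract treats the noise as unknown), and that a greedy loop with a fixed per-model penalty `model_cost` is an acceptable crude stand-in for simultaneous selection under a prior on the set of models. All function and parameter names are hypothetical.

```python
import random

def sample_hypotheses(matches, n_samples, rng):
    """Monte-Carlo step: each hypothesis is the translation implied by one
    randomly drawn correspondence (the minimal sample for this toy model)."""
    hyps = []
    for _ in range(n_samples):
        (x1, y1), (x2, y2) = rng.choice(matches)
        hyps.append((x2 - x1, y2 - y1))
    return hyps

def inliers(matches, t, tol):
    """Indices of correspondences whose displacement lies within tol of t."""
    found = set()
    for i, ((x1, y1), (x2, y2)) in enumerate(matches):
        dx, dy = x2 - x1 - t[0], y2 - y1 - t[1]
        if dx * dx + dy * dy <= tol * tol:
            found.add(i)
    return found

def select_models(matches, hyps, tol, model_cost):
    """Greedy selection (an assumption, not simultaneous selection): keep
    adding the hypothesis that explains the most still-unexplained matches,
    as long as that gain exceeds the per-model penalty model_cost."""
    unexplained = set(range(len(matches)))
    if not hyps:
        return [], unexplained
    support = [inliers(matches, t, tol) for t in hyps]
    chosen = []
    while True:
        gains = [len(s & unexplained) for s in support]
        best = max(range(len(hyps)), key=gains.__getitem__)
        if gains[best] <= model_cost:
            break
        chosen.append(hyps[best])
        unexplained -= support[best]
    return chosen, unexplained

# Synthetic data (assumed for illustration): two objects plus gross outliers.
rng = random.Random(0)

def make_object(n, t):
    """n noisy correspondences that all move by translation t."""
    pts = []
    for _ in range(n):
        x, y = rng.uniform(0, 100), rng.uniform(0, 100)
        nx, ny = rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)
        pts.append(((x, y), (x + t[0] + nx, y + t[1] + ny)))
    return pts

matches = make_object(20, (5.0, 0.0)) + make_object(15, (-3.0, 4.0))
# Five outliers whose displacements agree with no object and not with each other.
matches += [((float(k), float(k)), (k + 20.0 + 10 * k, k - 20.0 - 10 * k))
            for k in range(5)]

hyps = sample_hypotheses(matches, n_samples=50, rng=rng)
chosen, unexplained = select_models(matches, hyps, tol=1.0, model_cost=3)
print("selected translations:", chosen)
print("matches left as outliers:", sorted(unexplained))
```

Raising `model_cost` biases the result toward fewer models, which loosely mirrors how different selection criteria act as different priors; a faithful implementation would estimate the noise level from the data and choose all models jointly rather than one at a time.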