We present an original multiview stereo reconstruction algorithm which allows the 3D-modeling of urban scenes as a combination of meshes and geometric primitives. The method provides a compact model while preserving details: Irregular elements such as statues and ornaments are described by meshes, whereas regular structures such as columns and walls are described by primitives (planes, spheres, cylinders, cones, and tori). We adopt a two-step strategy consisting first in segmenting the initial mesh-based surface using a multilabel Markov Random Field-based model and second in sampling primitive and mesh components simultaneously on the obtained partition by a Jump-Diffusion process. The quality of a reconstruction is measured by a multi-object energy model which takes into account both photo-consistency and semantic considerations (i.e., geometry and shape layout). The segmentation and sampling steps are embedded into an iterative refinement procedure which provides an increasingly accurate hybrid representation. Experimental results on complex urban structures and large scenes are presented and compared to state-of-the-art multiview stereo meshing algorithms.