This paper explores the combination of inertial sensor data with vision. Visual and inertial sensing are two sensory modalities that can be exploited to provide robust solutions for image segmentation and the recovery of 3D structure from images, increasing the capabilities of autonomous robots and enlarging the application potential of vision systems. In biological systems, the information provided by the vestibular system is fused with vision at a very early processing stage, playing a key role in the execution of visual movements such as gaze holding and tracking, while visual cues aid spatial orientation and body equilibrium. In this paper, we set out a framework for using inertial sensor data in vision systems and describe some of the results obtained. The unit sphere projection camera model is used, providing a simple model for inertial data integration. Using the vertical reference provided by the inertial sensors, the image horizon line can be determined. Using just one vanishing point and the vertical, we can recover the camera's focal distance and provide an external bearing for the system's navigation frame of reference. Knowing the geometry of a stereo rig and its pose from the inertial sensors, the collineations of level planes can be recovered, providing enough constraints to segment and reconstruct vertical features and level planar patches.
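The horizon and focal-distance statements can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: a pinhole camera with square pixels, image coordinates measured from a known principal point, and a vanishing point that belongs to level (horizontal) lines. Under these assumptions the ray through the vanishing point, (u, v, f), is perpendicular to the measured vertical n, which gives one linear equation in f. The function names are hypothetical and this is not necessarily the formulation used in the paper.

```python
import math


def normalize(vec):
    """Return vec scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in vec))
    return tuple(c / norm for c in vec)


def focal_from_vanishing_point(vertical, vp):
    """Estimate the focal distance f (in pixels).

    vertical: gravity direction in the camera frame (from the inertial sensors).
    vp:       (u, v) vanishing point of level lines, relative to the
              principal point (assumed known).
    Uses the constraint n . (u, v, f) = 0 for a level direction.
    """
    nx, ny, nz = normalize(vertical)
    u, v = vp
    if abs(nz) < 1e-9:
        # Optical axis is level: the constraint no longer involves f.
        raise ValueError("focal distance is unobservable in this pose")
    return -(nx * u + ny * v) / nz


def horizon_line(vertical, f):
    """Return (a, b, c) such that a*u + b*v + c = 0 is the image horizon."""
    nx, ny, nz = normalize(vertical)
    return (nx, ny, nz * f)
```

For example, with the vertical measured as `(0.0, 0.8, 0.6)` and a level-line vanishing point at `(500.0, -600.0)`, `focal_from_vanishing_point` yields a focal distance of about 800 pixels, and that vanishing point lies on the line returned by `horizon_line`. The estimate degrades as the optical axis approaches the level plane, since the vertical component along the axis then tends to zero.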