In this paper, we present a novel method to fuse observations from an inertial measurement unit (IMU) and visual sensors, such that initial conditions of the inertial integration, including gravity estimation, can be recovered quickly and in a linear manner, thus removing any need for special initialization procedures. The algorithm is implemented using a graphical simultaneous localization and mapping like approach that guarantees constant time output. This paper discusses the technical aspects of the work, including observability and the ability for the system to estimate scale in real time. Results are presented of the system, estimating the platforms position, velocity, and attitude, as well as gravity vector and sensor alignment and calibration on-line in a built environment. This paper discusses the system setup, describing the real-time integration of the IMU data with either stereo or monocular vision data. We focus on human motion for the purposes of emulating high-dynamic motion, as well as to provide a localization system for future human-robot interaction.