This paper develops a square root unscented Kalman filter (SRUKF) for performing video-rate visual simultaneous localization and mapping (SLAM) using a single camera. The conventional UKF has been proposed previously for SLAM, improving the handling of nonlinearities compared with the more widely used extended Kalman filter (EKF). However, no account was taken of the comparative complexity of the algorithms: in SLAM, the UKF scales as $O(N^3)$ in the state length, compared to the EKF's $O(N^2)$, which makes it unsuitable for video-rate applications unless the scene contains unrealistically few points. Here, it is shown that the SRUKF provides the same results as the UKF to within machine accuracy and that it can be re-posed with complexity $O(N^2)$ for state estimation in visual SLAM. This paper presents results from video-rate experiments on live imagery. Trials using synthesized data show that the consistency of the SRUKF is routinely better than that of the EKF, but that its overall cost settles at an order of magnitude greater than that of the EKF for large scenes.
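As a rough illustration of why a square-root formulation can match the full-covariance result at lower cost, the sketch below applies a rank-one Cholesky update, a standard building block of square-root unscented filters, and compares it with refactorizing the updated covariance from scratch. It is not the re-posed SLAM filter described above; the function names `cholesky`, `chol_update`, and `max_abs_diff` and the 3x3 test matrix are assumptions made for this example only.

```python
import math


def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite). O(n^3)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def chol_update(L, x, sign=1.0):
    """Cholesky factor of L L^T + sign * x x^T, computed in O(n^2).

    sign=+1.0 is a rank-one update, sign=-1.0 a downdate. The downdate
    assumes the result stays positive definite (hypothetical helper, no checks).
    """
    n = len(L)
    L = [row[:] for row in L]  # work on copies so the inputs are untouched
    x = list(x)
    for k in range(n):
        r = math.sqrt(L[k][k] ** 2 + sign * x[k] ** 2)
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + sign * s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]
    return L


def max_abs_diff(A, B):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    # Illustrative covariance and update vector (made-up values).
    P = [[4.0, 2.0, 0.6], [2.0, 3.0, 0.4], [0.6, 0.4, 2.0]]
    x = [1.0, 2.0, -0.5]
    n = len(P)
    P_new = [[P[i][j] + x[i] * x[j] for j in range(n)] for i in range(n)]

    S_fast = chol_update(cholesky(P), x)  # O(n^2) factor update
    S_full = cholesky(P_new)              # O(n^3) refactorization
    print("max difference between factors:", max_abs_diff(S_fast, S_full))
```

The two factors should agree up to floating-point rounding, which mirrors, in miniature, the claim that propagating the square root of the covariance need not change the estimate. The $O(N^2)$ result for visual SLAM depends on the structure of that particular problem and is not reproduced by this sketch.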