We present a vision-based approach to mobile robot localization that integrates an image retrieval system with Monte-Carlo localization. The image retrieval process is based on features that are invariant with respect to image translations and rotations, and robust to limited changes in scale. Because the features are local, the system is robust against distortions and occlusions, which is especially important in populated environments. The sample-based Monte-Carlo localization technique enables our robot to localize itself globally, to track its position reliably, and to recover from localization failures. The two techniques are combined by extracting, for each image, a set of possible viewpoints using a two-dimensional map of the environment. Our technique was implemented and tested extensively. We present several experiments demonstrating the reliability and robustness of our approach, even in dynamic environments and in the presence of large odometry errors.
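The Monte-Carlo localization step referred to above can be sketched as a standard particle filter cycle of prediction, weighting, and resampling. The sketch below is illustrative only: the pose representation, the Gaussian motion-noise model, and the `observation_likelihood` callback (which in the paper's setting would come from the image retrieval system) are assumptions, not the authors' implementation.

```python
import math
import random

def mcl_update(particles, odometry, observation_likelihood, motion_noise=0.1):
    """One Monte-Carlo localization step: predict, weight, resample.

    particles: list of (x, y, theta) pose hypotheses
    odometry: (dx, dy, dtheta) motion measured since the last update
    observation_likelihood: function mapping a pose to p(observation | pose)
    """
    dx, dy, dth = odometry
    # Prediction: move every particle by the odometry plus Gaussian noise,
    # modeling the "larger errors in the odometry" the approach must tolerate.
    moved = [(x + dx + random.gauss(0, motion_noise),
              y + dy + random.gauss(0, motion_noise),
              th + dth + random.gauss(0, motion_noise))
             for (x, y, th) in particles]
    # Correction: weight each particle by how well it explains the observation.
    weights = [observation_likelihood(p) for p in moved]
    total = sum(weights)
    if total == 0:
        # No particle explains the observation: fall back to the prediction.
        return moved
    weights = [w / total for w in weights]
    # Resampling: draw a new particle set proportional to the weights, so
    # probability mass concentrates on plausible poses.
    return random.choices(moved, weights=weights, k=len(particles))
```

Because the initial particle set can be spread over the whole map, the same update rule performs global localization, position tracking, and recovery from localization failures.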