Vision-based topological localization and mapping for autonomous robotic systems have received increasing research interest in recent years. Mapping larger environments requires models at different levels of abstraction and the ability to handle large amounts of data efficiently. The most successful approaches to appearance-based localization and mapping with large datasets typically represent locations using local image features. We study the feasibility of performing these tasks in urban environments using global descriptors instead, taking advantage of the increasingly common panoramic datasets. This paper describes how to represent a panorama using the global gist descriptor while maintaining desirable invariance properties for location recognition and loop detection. We propose several gist similarity measures and algorithms for appearance-based localization, together with an online loop-closure detection method in which the probability of loop closure is estimated in a Bayesian filtering framework using the proposed image representation. Extensive experimental validation shows that, with wide field-of-view images, the performance of the proposed methods in urban environments is comparable to that of local-feature-based approaches.
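
As a rough illustration of the two ideas summarized above, the following minimal sketch compares two panoramas through per-view global descriptors and then applies one Bayesian measurement update over candidate locations. Everything specific here is an assumption made for illustration and is not taken from the paper: the panorama is assumed to be split into equal-width views ordered by heading, rotation invariance is approximated by minimizing over circular shifts of those views, and the likelihood is an exponential of the distance with a hypothetical scale `sigma`. The function names `panorama_distance` and `bayes_update` are likewise hypothetical, and no motion (prediction) step is modeled.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def panorama_distance(p, q):
    """Smallest mean per-view distance over all circular shifts of q.

    p, q: lists of per-view descriptors (assumed layout: one descriptor
    per equal-width view of the panorama, ordered by heading).
    """
    n = len(p)
    if n == 0 or len(q) != n:
        raise ValueError("panoramas must have the same non-zero number of views")
    best = math.inf
    for shift in range(n):
        # Align view i of p with view (i + shift) of q, wrapping around.
        d = sum(l2(p[i], q[(i + shift) % n]) for i in range(n)) / n
        best = min(best, d)
    return best


def bayes_update(prior, distances, sigma=1.0):
    """One measurement update of a discrete Bayes filter.

    prior: belief over candidate locations (non-negative, sums to 1).
    distances: descriptor distance from the current image to each candidate.
    sigma: hypothetical scale of the assumed likelihood exp(-d / sigma).
    """
    if len(prior) != len(distances):
        raise ValueError("prior and distances must have the same length")
    weighted = [b * math.exp(-d / sigma) for b, d in zip(prior, distances)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all candidates have zero posterior weight")
    return [w / total for w in weighted]


if __name__ == "__main__":
    # Toy 4-view panoramas with 2-D descriptors; the query is the first
    # map panorama seen with a different heading (views rotated by one).
    map_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    map_b = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
    query = map_a[1:] + map_a[:1]

    distances = [panorama_distance(query, m) for m in (map_a, map_b)]
    posterior = bayes_update([0.5, 0.5], distances)
    print(distances, posterior)
```

In this toy setup the rotated query matches its source panorama exactly under some shift, so the update concentrates belief on that location. A real system would replace the toy vectors with actual gist descriptors and add a transition model between updates.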