Conventional wide-baseline image matching relies solely on identifying the same local features in the two images being matched, establishing pixel-to-pixel correspondences by the nearest-neighbor matching criterion. However, this approach incurs a large number of mismatches, especially for images containing complicated scenes. To reduce mismatches effectively, we propose to use established coherent region-to-region correspondences to verify whether each pixel-to-pixel match previously constructed from scale-invariant feature transform (SIFT) descriptors is a correct match or a mismatch. To establish coherent one-to-one region correspondences, over-segmentation is first performed on the entire image, and the resulting image segments are merged into larger regions by the proposed segment-merging operations. A bipartite graph model is then applied to the merged regions, and the Hungarian method is used to establish one-to-one coherent region pairs. Extensive experimental results demonstrate that the proposed mismatch removal method significantly reduces the number of incorrect SIFT-based pixel-to-pixel matching pairs in wide-baseline image matching.
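The region-pairing and match-verification steps can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the region dissimilarity matrix `cost` is taken as given, since how regions are compared is not stated here; a pixel-to-pixel match is kept only when its two keypoints fall in paired regions, which is one plausible reading of the verification step; and the names `hungarian`, `filter_matches`, `region_of_a`, `region_of_b`, and `region_pair` are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost one-to-one assignment for an n x m matrix (n <= m).

    Returns a list whose entry i is the column assigned to row i.
    Standard O(n^2 * m) potential-based Hungarian method.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)        # p[j] = row matched to column j, 0 = unmatched
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta, j1 = INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached an unmatched column
                break
        while j0:            # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment


def filter_matches(matches, region_of_a, region_of_b, region_pair):
    """Keep matches (ka, kb) whose keypoints lie in paired regions.

    region_of_a[ka] / region_of_b[kb]: region index of each keypoint
    (assumed input). region_pair[r]: region in image B paired with
    region r of image A.
    """
    return [(ka, kb) for ka, kb in matches
            if region_pair[region_of_a[ka]] == region_of_b[kb]]


# Toy example: 3 merged regions per image, hypothetical dissimilarities.
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
region_pair = hungarian(cost)          # [1, 0, 2]

matches = [(0, 0), (1, 1), (2, 2)]     # candidate keypoint index pairs
region_of_a = [0, 1, 2]
region_of_b = [1, 0, 0]
kept = filter_matches(matches, region_of_a, region_of_b, region_pair)
print(region_pair, kept)               # [1, 0, 2] [(0, 0), (1, 1)]
```

In the toy example the third candidate match is rejected because its keypoint in the second image lies in a region that is not paired with the region of its keypoint in the first image. In practice the candidate matches would come from nearest-neighbor matching of SIFT descriptors and the region labels from the segmentation and merging stages.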