Abstract:
Referring expression comprehension (REC) aims to ground a referring expression in an image. Many mainstream frameworks use a two-stage process: the first-stage model generates candidate region proposals, and the second-stage model locates the referent among them. Existing proposal generators create proposals based entirely on the image, so there is a gap between the generated proposals and the referring expression, which creates a bottleneck that limits the performance of the whole model. To break this bottleneck, we introduce a novel language-guided proposal generation network, LPGN. Moreover, we introduce an uncertainty-aware proposal generation strategy to tackle the vagueness of language and thereby improve training effectiveness. LPGN is easy to integrate into existing two-stage REC models because it is agnostic to the second-stage model. Through extensive experiments on benchmark datasets, we demonstrate that LPGN generates higher-quality proposals than existing proposal generators and effectively alleviates the proposal bottleneck of existing two-stage REC models.
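The two-stage structure described in the abstract, and the sense in which a proposal generator can be agnostic to the second stage, can be illustrated with a minimal Python sketch. This is not LPGN's implementation: every name below (`Box`, `comprehend`, `image_only_generator`, `language_guided_generator`, `pick_last`) is hypothetical, and the generators return hard-coded toy boxes rather than running any model. The only point shown is that a generator that also receives the expression can be swapped in without changing the second stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned region proposal (hypothetical representation)."""
    x1: float
    y1: float
    x2: float
    y2: float


# Hypothetical stage interfaces; the image is left untyped for this toy example.
ProposalGenerator = Callable[[object, str], List[Box]]
Matcher = Callable[[object, str, Sequence[Box]], int]


def comprehend(image: object, expression: str,
               generate: ProposalGenerator, match: Matcher) -> Box:
    """Two-stage REC: generate proposals, then pick the referent among them."""
    proposals = generate(image, expression)
    if not proposals:
        raise ValueError("the proposal generator returned no candidates")
    return proposals[match(image, expression, proposals)]


def image_only_generator(image: object, expression: str) -> List[Box]:
    """Toy stand-in for a conventional generator: ignores the expression."""
    return [Box(0, 0, 10, 10), Box(20, 20, 40, 40)]


def language_guided_generator(image: object, expression: str) -> List[Box]:
    """Toy stand-in for a language-guided generator (assumption, not LPGN):
    adds an extra candidate when the expression mentions 'left'."""
    boxes = image_only_generator(image, expression)
    if "left" in expression:
        boxes.append(Box(0, 5, 8, 30))
    return boxes


def pick_last(image: object, expression: str, proposals: Sequence[Box]) -> int:
    """Toy second stage: always selects the last proposal."""
    return len(proposals) - 1
```

In this sketch, `comprehend` and `pick_last` are untouched when `image_only_generator` is replaced by `language_guided_generator`; only the candidate set handed to the second stage changes.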
Date of Conference: 18-22 July 2022
Date Added to IEEE Xplore: 26 August 2022