Video caption detection and extraction is an important step for information retrieval in video databases. In this paper, we extract text from video by fully exploiting the temporal information the video contains. First, we create a binary abstract sequence from a video segment. By analyzing the statistical pixel changes in this sequence, we can effectively locate the frames in which captions appear or disappear. Finally, we extract the captions to create a summary of the video segment.
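
To make the pipeline concrete, the following is a minimal sketch of the general idea, not the method as implemented in the paper. It assumes that captions are brighter than the background, that the binary abstract sequence can be approximated by fixed-threshold binarization of each frame, and that the pixel-change statistic is a simple count of pixels switching on or off between consecutive binary frames. The function names and the parameters `threshold` and `min_pixels` are hypothetical and chosen only for illustration.

```python
def binarize(frame, threshold=200):
    """Map a grayscale frame (rows of 0-255 ints) to a 0/1 image.

    Assumption: caption pixels are brighter than `threshold`.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in frame]


def count_transitions(prev, curr):
    """Count pixels that switch on (0->1) and off (1->0) between two binary frames."""
    on = off = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            if p == 0 and c == 1:
                on += 1
            elif p == 1 and c == 0:
                off += 1
    return on, off


def find_caption_events(frames, threshold=200, min_pixels=4):
    """Return (appear, disappear): frame indices where many pixels switch on or off.

    `min_pixels` is a hypothetical cutoff standing in for the statistical test.
    """
    binary = [binarize(f, threshold) for f in frames]
    appear, disappear = [], []
    for i in range(1, len(binary)):
        on, off = count_transitions(binary[i - 1], binary[i])
        if on >= min_pixels:
            appear.append(i)
        if off >= min_pixels:
            disappear.append(i)
    return appear, disappear


def make_frame(caption_on, height=4, width=6):
    """Build a synthetic dark frame, with a bright bottom row when a caption is shown."""
    frame = [[30] * width for _ in range(height)]
    if caption_on:
        frame[-1] = [255] * width
    return frame


# Synthetic segment: the caption is visible in frames 2, 3 and 4 only.
frames = [make_frame(on) for on in (False, False, True, True, True, False)]
appear, disappear = find_caption_events(frames)
print(appear, disappear)  # prints: [2] [5]
```

In this toy segment the caption is first visible in frame 2 and first absent again in frame 5, so any frame between those two indices could be passed on to the extraction step. Real footage would need a more robust statistic than a raw pixel count, since scene cuts and camera motion also cause many pixels to change at once.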