With the escalating demand for information indexing and retrieval, many attempts have been made to extract text from images and videos. Extracting scene text from images and video is challenging because of complex backgrounds, variable font sizes, dissimilar styles, unknown layouts, poor resolution and blurring, and variations in position and viewing angle. The primary objective of the proposed system is to detect and extract text from digital videos. In this paper we propose a hybrid approach that extracts text from videos by integrating two popular text extraction methods: the region based method and the connected component (CC) based method. First, the video is split into frames and the key frames are acquired. A text region indicator (TRI) is developed to estimate the text prevailing confidence and the candidate regions by performing binarization. An artificial neural network (ANN) is used as the classifier to separate text components from non-text components, and optical character recognition (OCR) is used for verification. Text is grouped by constructing a minimum spanning tree using the bounding box (BB) distance.
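The final grouping step can be illustrated with a minimal sketch. It is not the implementation used in this work: the box format `(x0, y0, x1, y1)`, the definition of BB distance as the gap between two axis-aligned boxes, the use of Kruskal's algorithm, and the `max_gap` cut-off for splitting the tree into text groups are all assumptions made for illustration, as are the function names.

```python
from itertools import combinations
from math import hypot


def bb_distance(a, b):
    """Assumed BB distance: gap between two axis-aligned boxes
    (x0, y0, x1, y1); 0 when the boxes touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return hypot(dx, dy)


class DisjointSet:
    """Small union-find structure used by Kruskal's algorithm and by the grouping."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return False
        self.parent[ri] = rj
        return True


def minimum_spanning_tree(boxes):
    """Kruskal's algorithm over the complete graph of boxes.
    Returns the tree as a list of (distance, i, j) edges."""
    edges = sorted(
        (bb_distance(boxes[i], boxes[j]), i, j)
        for i, j in combinations(range(len(boxes)), 2)
    )
    sets = DisjointSet(len(boxes))
    return [(d, i, j) for d, i, j in edges if sets.union(i, j)]


def group_boxes(boxes, max_gap):
    """Hypothetical grouping rule: drop tree edges longer than max_gap
    and return the remaining connected groups as lists of box indices."""
    sets = DisjointSet(len(boxes))
    for d, i, j in minimum_spanning_tree(boxes):
        if d <= max_gap:
            sets.union(i, j)
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(sets.find(i), []).append(i)
    return sorted(groups.values())


# Three character boxes on one line and two on another, far away.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10),
         (100, 50, 110, 60), (112, 50, 122, 60)]
groups = group_boxes(boxes, max_gap=5)  # [[0, 1, 2], [3, 4]]
```

In this sketch the tree links neighbouring components through their shortest gaps, so removing the few long edges separates distinct text lines or words. The threshold would have to be tuned to the font sizes present in the key frames.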