Automatic TV commercial block detection (CBD) and commercial block segmentation (CBS) are two key components of a smart commercial digesting system. In this paper, we address CBD and CBS through the collaborative exploitation of the visual, audio, and textual characteristics embedded in commercials. Whereas most previous works rely exclusively on visual and audio characteristics, we also exploit the abundant textual characteristics associated with commercials. Additionally, we propose Tri-AdaBoost, an interactive ensemble learning method, to form a consolidated semantic fusion across visual, audio, and textual characteristics. To segment a detected commercial block into individual commercials, additional informative descriptors, including textual characteristics, are introduced to improve the robustness of detecting frames marked with product information (FMPI). Together with the audio spectral variation pointer and silent position characteristics, FMPI provides a complementary representation for modeling intra-commercial similarity and inter-commercial dissimilarity. Experiments are conducted on a large video dataset drawn from China Central Television (CCTV) channels and TRECVID'05, and the results show the effectiveness of the proposed scheme.
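As an illustration of one audio cue named above, the silent position, the following is a minimal sketch of how candidate boundaries between commercials might be located from short-time energy. It is not the authors' implementation: the function name `silent_positions`, the frame length, the energy threshold, and the minimum run length are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def silent_positions(
    samples: Sequence[float],
    frame_len: int = 400,         # hypothetical: samples per non-overlapping frame
    energy_thresh: float = 1e-4,  # hypothetical: mean-square energy below this is "silent"
    min_frames: int = 3,          # hypothetical: shortest silent run worth reporting
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) runs of silent frames; end is exclusive."""
    # Mean-square energy of each complete, non-overlapping frame.
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)

    runs = []
    run_start = None
    # A sentinel of infinite energy closes a run that reaches the end of the signal.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < energy_thresh:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                runs.append((run_start, i))
            run_start = None
    return runs
```

In a segmentation pipeline of the kind described, such silent runs would only be candidate cut points; they would need to be combined with the other cues (for example FMPI detections) before a boundary is declared.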