Automatic video segmentation is the necessary first step in organizing a long video file into smaller units. The smallest basic unit is a shot. Related shots are typically grouped into a higher-level unit called a scene, and each scene is part of a story. Browsing these scenes unfolds the entire story of a film, enabling users to locate desired video segments quickly and efficiently. Existing scene definitions are rather broad, which makes it difficult to compare the performance of existing techniques and to develop better ones. This paper introduces a stricter scene definition for narrative films and presents ShotWeave, a novel technique for clustering related shots into scenes according to that definition. The crux of ShotWeave is its feature extraction and comparison. Visual features are extracted from selected regions of the representative frames of shots. These regions capture the essential information needed to maintain viewers' train of thought across shot breaks. The new feature comparison is based on common continuity-editing techniques used in filmmaking. Experiments were performed on full-length films with a wide range of camera motions and complex compositions of shots. The experimental results show that ShotWeave outperforms two recent techniques that use global visual features in terms of both segmentation accuracy and time.