Segmentation algorithms traditionally employ low-level features to divide images into regions that show a certain degree of homogeneity. However, low-level features, whether spatial or temporal, are not always reliable when processing real-world video sequences, because of issues such as illumination changes or complex backgrounds. Furthermore, real-world objects can be composed of different regions with heterogeneous features. Although the inclusion of motion can mitigate some of these effects, many problems remain. This paper proposes the use of spatio-temporal mid-level features that are related, on the one hand, to geometric properties of real objects and, on the other, to well-known motion patterns. Specifically, the proposed algorithm uses a mid-level module that controls the subsequent segmentation using these kinds of features. Experiments and evaluations show that the inclusion of mid-level features can help to obtain perceptually more meaningful segmentations, resulting in regions that are closer to semantic concepts.