The visual quality of compressed video sequences depends on factors including spatial texture content and cognition-based factors such as prior knowledge and the task at hand. The MOSp metric is a full-reference objective quality metric that predicts the perceived quality of sequences with video compression-induced impairments, based on the spatial texture content and the mean squared error between the original and compressed video sequences. In this paper, we extend the MOSp metric to incorporate cognition-based factors in order to identify regions in a video scene that attract human attention. The proposed metric has been tested on a variety of multimedia sequences of Common Intermediate Format (CIF) resolution, compressed at a wide range of bitrates using the H.264/AVC coding standard. This metric shows a higher correlation with mean opinion score (MOS) than popular metrics such as peak signal-to-noise ratio (PSNR), the National Telecommunications and Information Administration/Institute for Telecommunication Sciences video quality metric, PSNRplus, and the Yonsei University metric. Results also show that by extending the MOSp metric to incorporate cognition-based factors such as skin information, its correlation with subjective scores (MOS) can be significantly improved in video content where humans are present. This algorithm is particularly useful for real-time quality estimation of multimedia sequences with block-based video compression-induced impairments, because all the parameters of the metric can be calculated automatically with a modest amount of processing overhead.
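For readers unfamiliar with the two quantities named above, the following is a minimal sketch of the mean squared error between an original and a compressed frame, and of the PSNR derived from it. It is not the MOSp metric itself, whose texture and attention terms are not specified in this abstract. The function names `mse` and `psnr`, the representation of a frame as a list of rows of pixel values, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized frames.

    Each frame is assumed to be a list of rows of numeric pixel values.
    """
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of rows")
    total = 0.0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        if len(ref_row) != len(dis_row):
            raise ValueError("rows must have the same length")
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return total / count


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / err)
```

Both functions operate on a single frame; a sequence-level figure would typically be obtained by averaging over frames, which is again an assumption here rather than something the abstract states.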