Abstract:
The integration of OpenCV, an open-source computer vision library, with deep learning models has led to significant advancements in various computer vision applications. In this paper, we propose an enhancement to the OpenCV-Python library integrated within the YOLOv5 deep learning model, extending its support to the HEVC/H.265, VP9, and AV1 video codecs by means of the FFmpeg multimedia framework. Furthermore, we explore the transcoding capabilities of the MPEG-4 and AVC/H.264 codecs when processing videos with annotated bounding boxes that highlight detected objects. To evaluate the performance of these codecs objectively, we employ well-established video quality metrics (PSNR, SSIM). In the experiment, we analyze key parameters such as compression efficiency and processing speed. The measurements indicate that the HEVC/H.265 codec attains the best compression performance, producing an output file 16.6% smaller than the input video file, while the MPEG-4 codec pre-configured in OpenCV-Python YOLOv5 offers a 20.7% higher transcoding rate than HEVC/H.265. However, as anticipated, the latest-generation AV1 compression standard does not stand out on these performance parameters in our case.
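As an illustration of the PSNR metric and the file-size comparison mentioned in the abstract, the following is a minimal sketch and not the authors' evaluation code. The function names `psnr` and `size_reduction_percent` are hypothetical, and frames are assumed to be flattened sequences of 8-bit sample values of equal length.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences.

    Assumption: both inputs are flat sequences of sample values (e.g. 8-bit luma),
    so max_value defaults to 255.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def size_reduction_percent(input_bytes, output_bytes):
    """Percentage by which the output file is smaller than the input file."""
    if input_bytes <= 0:
        raise ValueError("input_bytes must be positive")
    return 100.0 * (input_bytes - output_bytes) / input_bytes
```

Under these assumptions, a hypothetical 1000-byte input transcoded to 834 bytes corresponds to a 16.6% reduction, the same form of figure the abstract reports for HEVC/H.265.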
Published in: 2023 Communication and Information Technologies (KIT)
Date of Conference: 11-13 October 2023
Date Added to IEEE Xplore: 02 November 2023