Prediction of Imaging Data Based on Signal Processing Boosted by Neural Networks with Continual Learning


Abstract:

Modern video codecs rely on data prediction techniques to attain good compression performance. Such prediction techniques work on a block-by-block basis and rely on traditional signal processing, namely interpolation and extrapolation techniques, to predict each frame. Neural networks (NN) trained off-line have recently been shown to improve these traditional prediction techniques. However, as with any NN that requires training before operation, the prediction performance is conditioned on the training data. Furthermore, the learned parameters must be included in the compressed bitstream to reverse the prediction process when decompressing the frames, thus increasing bitrates. This work proposes a novel strategy to improve the prediction obtained by traditional signal processing by using a boosting technique that is based on fully-connected NNs (FC-NNs) that continually learn as the frames are predicted block-by-block. The proposed strategy does not require storing the learned parameters to reverse the prediction process because these parameters are refined in an on-line manner using only the data being predicted. Our evaluations show that the proposed strategy can improve the traditional block-based prediction and other related NN-based strategies with prediction accuracy gains of up to 17.064 dB PSNR. Our evaluations also show that the increase in computational complexity is negligible, with an average increase of 1.1%.
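To make the boosting idea concrete, the following is a minimal illustrative sketch, not the architecture from the paper: the class name `OnlineBooster`, the layer sizes, the learning rate, and the training signal are all assumptions chosen only to show how a small FC-NN can be refined on-line from reconstructed data, so that both encoder and decoder can run identical updates without signaling any learned parameters.

```python
import numpy as np

class OnlineBooster:
    """Tiny one-hidden-layer fully-connected network updated by SGD as
    blocks are coded. It learns to map reference pixels to a correction
    of the traditional prediction. Because updates use only reconstructed
    data available at both encoder and decoder, no weights need to be
    stored in the bitstream. All sizes and rates here are illustrative."""
    def __init__(self, n_ref, n_out, hidden=32, lr=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (hidden, n_ref))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_out, hidden))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, refs):
        """Predict a correction term from the flattened reference pixels."""
        self.x = refs
        self.h = np.maximum(0.0, self.W1 @ refs + self.b1)  # ReLU hidden layer
        return self.W2 @ self.h + self.b2

    def update(self, pred_corr, true_corr):
        """One SGD step on the squared error between the predicted and the
        actual correction (reconstructed block minus traditional prediction),
        which the decoder can compute identically after decoding the block."""
        g = 2.0 * (pred_corr - true_corr) / pred_corr.size  # dL/d(output)
        gW2 = np.outer(g, self.h)
        gh = self.W2.T @ g
        gh[self.h <= 0] = 0.0                               # ReLU gradient mask
        gW1 = np.outer(gh, self.x)
        self.W2 -= self.lr * gW2
        self.b2 -= self.lr * g
        self.W1 -= self.lr * gW1
        self.b1 -= self.lr * gh
```

Per block, both sides would form the boosted prediction as the traditional prediction plus `forward(refs)`, and after the block is reconstructed, call `update` with the observed correction; since encoder and decoder see the same reconstructed data, their network states stay synchronized.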
Date of Conference: 22-25 September 2024
Date Added to IEEE Xplore: 04 November 2024
Conference Location: London, United Kingdom

1. Introduction

The High Efficiency Video Coding (HEVC) [1] and the Versatile Video Coding (VVC) [2] standards define prediction strategies to improve the performance of video codecs by reducing the amount of data needed to represent each frame. Among these strategies, the so-called intra-prediction strategy first divides each frame into non-overlapping blocks and then predicts these blocks sequentially following a specific order. Such a sequential prediction relies on several prediction modes that use the pixels surrounding each block as reference. Specifically, all prediction modes are tested on each block and the one that provides the best prediction is selected. Once a block is predicted, a residual block is computed as the difference between the original block and its prediction. If the prediction is accurate, the residual block contains values very close to zero, which can be compressed into a small bitstream very efficiently by applying a linear transformation, e.g., the Discrete Cosine Transform, followed by quantization and entropy coding. Each compressed residual block is decompressed before the prediction of the next block continues because the decompressed blocks are used as reference to predict subsequent blocks (see Fig. 1). To reconstruct a frame after compression, a similar block-wise process is followed: the first compressed residual block is decompressed and added to the corresponding predicted block, which can be computed because the initial reference pixels are readily available and the prediction mode used is signaled in the bitstream. This first reconstructed block is then employed to reconstruct the next one, and the process is repeated until the last block is reconstructed.
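The block-wise predict/quantize/reconstruct loop described above can be sketched as follows. This is a simplified illustration, not the HEVC/VVC codecs themselves: only a single DC-style prediction mode is used, the transform and entropy-coding stages are replaced by a plain uniform quantizer, and all function names and parameters (`dc_predict`, `encode_decode`, `decode`, block size `B`, step `q`) are hypothetical.

```python
import numpy as np

def dc_predict(recon, y, x, B):
    """DC-style intra-prediction: the mean of previously reconstructed
    pixels in the row above and the column to the left of the block."""
    refs = []
    if y > 0:
        refs.append(recon[y - 1, x:x + B])   # row above the block
    if x > 0:
        refs.append(recon[y:y + B, x - 1])   # column to the left
    if not refs:
        return np.full((B, B), 128.0)        # first block: mid-gray default
    return np.full((B, B), np.concatenate(refs).mean())

def encode_decode(frame, B=8, q=4.0):
    """Encoder loop: predict each block from reconstructed neighbors,
    quantize the residual, and immediately reconstruct the block so that
    subsequent blocks are predicted from decoded (not original) pixels."""
    H, W = frame.shape
    recon = np.zeros_like(frame, dtype=float)
    residuals = {}                           # stand-in for the bitstream payload
    for y in range(0, H, B):
        for x in range(0, W, B):
            pred = dc_predict(recon, y, x, B)
            res_q = np.round((frame[y:y + B, x:x + B] - pred) / q)
            residuals[(y, x)] = res_q
            recon[y:y + B, x:x + B] = pred + res_q * q
    return residuals, recon

def decode(residuals, H, W, B=8, q=4.0):
    """Decoder loop: repeats the same prediction from already
    reconstructed blocks, so it reproduces the encoder's reconstruction
    exactly from the residuals alone."""
    recon = np.zeros((H, W))
    for y in range(0, H, B):
        for x in range(0, W, B):
            pred = dc_predict(recon, y, x, B)
            recon[y:y + B, x:x + B] = pred + residuals[(y, x)] * q
    return recon
```

The key property the sketch demonstrates is that encoder and decoder run the same prediction on the same reconstructed data, so the decoder's output matches the encoder's reconstruction exactly, with per-pixel error bounded by half the quantization step.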

References
1.
G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the high efficiency video coding (HEVC) standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012.
2.
S. Cho, S.-C. Lim, and J. Kang, “Evaluation Report of SDR Test Sequences (4K5–9 and 1080p1-5),” Document JVET-E0053, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, 2017.
3.
P. Helle, J. Pfaff, M. Schafer, R. Rischke, H. Schwarz, D. Marpe, and T. Wiegand, “Intra Picture Prediction for Video Coding with Neural Networks,” in 2019 Data Compression Conference (DCC), 2019, pp. 448–457.
4.
J. Li, B. Li, J. Xu, R. Xiong, and W. Gao, “Fully Connected Network-Based Intra Prediction for Image Coding,” IEEE Transactions on Image Processing, vol. 27, no. 7, pp. 3236–3247, 2018.
5.
J. Pfaff, P. Helle, D. Maniry, S. Kaltenstadler, B. Stallenberger, P. Merkle, M. Siekmann, H. Schwarz, D. Marpe, and T. Wiegand, “Intra prediction modes based on neural networks,” Doc. JVET-J0037-v2, Joint Video Exploration Team of ITU-T VCEG and ISO/IEC MPEG, 2018.
6.
M. Santamaria, S. Blasi, E. Izquierdo, and M. Mrak, “Analytic simplification of neural network based intra-prediction modes for video compression,” in 2020 IEEE International Conference on Multimedia & Expo Workshops. IEEE, 2020, pp. 1–4.
7.
T. Dumas, F. Galpin, and P. Bordes, “Iterative Training of Neural Networks for Intra-prediction,” IEEE Transactions on Image Processing, vol. 30, pp. 697–711, 2020.
8.
W. Cui, T. Zhang, S. Zhang, F. Jiang, W. Zuo, Z. Wan, and D. Zhao, “Convolutional Neural Networks Based Intra Prediction for HEVC,” in Data Compression Conference, 2017, pp. 436–436.
9.
I. Schiopu, H. Huang, and A. Munteanu, “CNN-based Intra-prediction for Lossless HEVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 7, pp. 1816–1828, 2019.
10.
D. Minnen, G. Toderici, M. Covell, T. Chinen, N. Johnston, J. Shor, S. J. Hwang, D. Vincent, and S. Singh, “Spatially adaptive image compression using a tiled deep network,” in 2017 IEEE Int. Conference on Image Processing, 2017, pp. 2796–2800.
11.
V. Sanchez, M. Hernandez-Cabronero, and J. Serra-Sagrista, “Block-wise intra-prediction of imaging data based on overfitted neural networks with on-line learning,” in IEEE 31st Int. Workshop on Machine Learning for Signal Processing, 2021, pp. 1–6.
12.
V. Sanchez, M. Hernandez-Cabronero, and J. Serra-Sagrista, “Hybrid Intra-Prediction in Lossless Video Coding using Overfitted Neural Networks,” in Data Compression Conference, 2021, pp. 369–369.
13.
F. Bossen, “Common test conditions and software reference configurations,” JCTVC-L1100, vol. 12, 2013.
14.
H. Yu, R. Cohen, K. Rapaka, and J. Xu, “Common test conditions for screen content coding,” Doc. JCTVC-T1015, Joint Collaborative Team on Video Coding (JCT-VC), 2015.
15.
J. Pfaff, A. Filippov, S. Liu, X. Zhao, J. Chen, S. De-Luxan-Hernandez, T. Wiegand, V. Rufitskiy, A. K. Ramasubramonian, and G. Van der Auwera, “Intra prediction and mode coding in VVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3834–3847, 2021.
16.
J. Lainema, F. Bossen, W.-J. Han, J. Min, and K. Ugur, “Intra coding of the HEVC standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1792–1801, 2012.
17.
B. Bross, Y.-K. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the versatile video coding (VVC) standard and its applications,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736–3764, 2021.
