Jean-Francois Daneault - IEEE Xplore Author Profile

Showing 1-25 of 130 results

Dependent quantization (DQ) is one of the key coding tools in the Versatile Video Coding (VVC) standard. DQ employs two scalar quantizers, with the per-coefficient quantizer selection being governed by a parity-driven 4-state state machine. As the design is normatively enforced, usage of DQ requires a rate-distortion optimized quantization (RDOQ) with a per-coefficient decision and state update, e…
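The parity-driven state machine described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the transition table reflects the commonly described 4-state DQ design of VVC (states 0/1 select quantizer Q0, states 2/3 select Q1, and the next state depends only on the parity of the current level), with all names chosen for readability.

```python
# Illustrative sketch of the 4-state trellis behind dependent quantization
# (DQ) in VVC. States 0 and 1 use scalar quantizer Q0, states 2 and 3 use
# Q1; the next state depends only on the parity of the coded level.

# next_state[current_state][parity of coded level]
NEXT_STATE = [
    [0, 2],  # from state 0
    [2, 0],  # from state 1
    [1, 3],  # from state 2
    [3, 1],  # from state 3
]

def quantizer_for_state(state: int) -> int:
    """States 0 and 1 select quantizer Q0, states 2 and 3 select Q1."""
    return 0 if state < 2 else 1

def run_state_machine(levels):
    """Return the quantizer index chosen for each coefficient level,
    scanning the levels in coding order starting from state 0."""
    state, choices = 0, []
    for level in levels:
        choices.append(quantizer_for_state(state))
        state = NEXT_STATE[state][level & 1]
    return choices
```

Because the state depends on the parity of previously coded levels, an RDOQ encoder has to track these transitions per coefficient, which is exactly the per-coefficient decision and state update the abstract refers to.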
The continuous improvements on image compression with variational autoencoders have led to learned codecs competitive with conventional approaches in terms of rate-distortion efficiency. Nonetheless, taking the quantization into account during the training process remains a problem, since it produces zero derivatives almost everywhere and needs to be replaced with a differentiable approximation w…
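The zero-derivative problem mentioned above is commonly worked around with a straight-through estimator (STE) or with additive uniform noise during training. A framework-free sketch of both standard proxies (the backward pass is shown schematically, not via an autodiff library):

```python
import random

# Two standard differentiable proxies for hard rounding, as commonly used
# when training learned image codecs. Rounding itself has zero derivative
# almost everywhere, so a substitute is needed in the backward pass.

def quantize_ste(y: float) -> float:
    """Forward pass of the straight-through estimator: hard rounding."""
    return float(round(y))

def ste_grad(upstream_grad: float) -> float:
    # The STE replaces d(round)/dy, which is zero almost everywhere,
    # with the identity, so gradients flow through the quantizer.
    return upstream_grad

def quantize_noise(y: float) -> float:
    """Training-time relaxation: additive uniform noise in [-0.5, 0.5),
    a continuous stand-in for the rounding error."""
    return y + random.uniform(-0.5, 0.5)
```

In an autodiff framework the STE is typically written as `y + stop_gradient(round(y) - y)`, which realizes exactly the forward/backward pair above.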
This paper presents a convolutional neural network (CNN)-based enhancement to inter prediction in Versatile Video Coding (VVC). Our approach aims at improving the prediction signal of inter blocks with a residual CNN that incorporates spatial and temporal reference samples. It is motivated by the theoretical consideration that neural network-based methods have a higher degree of signal adaptivity…
Modern hybrid video codecs like Versatile Video Coding (VVC) heavily rely on transform coding tools. Given a prediction signal at the encoder, the residual is transformed using trigonometric transforms. Rate-distortion-optimized quantization (RDOQ) and entropy coding of the transformed residual is well-understood due to the orthogonality and the energy compaction of these transforms. Within this s…
In recent years, deep video coding has attracted a lot of research interest. Usually, it employs the concept of inter coding by transmitting features in a latent space that represent a motion field or a residual. However, in such a setting there are still redundancies between the features of consecutive frames. In previous approaches, these redundancies are exploited for compression by adding an…
In this paper, we present a novel in-loop filter for video coding which is based on a convolutional neural network (CNN). For that, the adaptive loop filter (ALF) of Versatile Video Coding (VVC) is generalized to define the model architecture for a CNN-based in-loop filter which requires significantly lower computational complexity compared to other existing CNN-based in-loop filters. Experimental…
Distributed learning requires a frequent communication of neural network update data. For this, we present a set of new compression tools, jointly called differential neural network coding (dNNC). dNNC is specifically tailored to efficiently code incremental neural network updates and includes tools for federated BatchNorm folding (FedBNF), structured and unstructured sparsification, tensor row sk…
Research on deep-learned end-to-end video compression has attracted a lot of attention in recent years. A central component of many approaches is to perform motion-compensated prediction by using convolutional neural networks (CNN) which determine a compressed representation of the motion field as features. Often, this task is divided into searching motion vectors by one networ…
In August 2022, ISO/IEC MPEG published the first international standard on compression of neural networks, namely Neural Network Coding (NNC, MPEG-7 part 17). It compresses neural networks to about 5% to 15% in size at virtually no performance loss. In NNC, the model weights are usually quantized and then encoded into the bitstream using DeepCABAC entropy coding. In order to improve the coding eff…
Data-driven optimization is employed to study alternative approaches [1] to the probability estimator of the Enhanced Compression Model (ECM) (which includes additional coding tools on top of the Versatile Video Coding standard). In ECM, each context model uses a weighted sum of two hypotheses for probability estimation with different associated adaptation rates. Four alternative approaches ar…
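The two-hypothesis scheme mentioned above can be sketched as follows. This is an illustrative toy model, not the normative ECM estimator: the bit widths, adaptation shifts, and the plain average of the two hypotheses are assumptions chosen for clarity.

```python
# Toy sketch of a two-hypothesis probability estimator of the kind used per
# context model in CABAC-style entropy coders: two estimates track the bin
# statistics with a fast and a slow adaptation rate, and their combination
# drives the arithmetic coder. All constants are illustrative.

PROB_BITS = 15               # probabilities stored as integers in [0, 2**15)
RATE_FAST, RATE_SLOW = 4, 7  # adaptation shifts (illustrative values)

class TwoHypothesisEstimator:
    def __init__(self):
        half = 1 << (PROB_BITS - 1)   # initialize both hypotheses to p = 0.5
        self.p_fast = half
        self.p_slow = half

    @property
    def probability(self) -> float:
        """Estimated probability of bin == 1 (average of both hypotheses)."""
        return (self.p_fast + self.p_slow) / (2 << PROB_BITS)

    def update(self, bin_value: int):
        """Exponential update of each hypothesis toward the observed bin."""
        target = (1 << PROB_BITS) if bin_value else 0
        self.p_fast += (target - self.p_fast) >> RATE_FAST
        self.p_slow += (target - self.p_slow) >> RATE_SLOW
```

The fast hypothesis reacts quickly to local statistics while the slow one smooths over longer windows; the paper's data-driven study concerns alternatives to exactly this kind of combination rule.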
Copy prediction is a well-known category of prediction techniques in video coding where the current block is predicted by copying the samples from a similar block present somewhere in the already decoded stream of samples. Motion-compensated prediction, intra block copy, and template matching prediction are examples. While the displacement information of the similar block is transmitted to…
Recently, convolutional neural network (CNN)-based in-loop filters have been introduced for video coding and they show huge coding gains. However, one of the main issues of this approach is the high computational complexity of these filters. In this paper, we present various settings for CNN-based in-loop filters targeting the reduction of their decoder complexity and describe the corresponding…
Variational autoencoders have shown promising results for still image compression and have gained a lot of attention in this field. Recently, noteworthy attempts were made to extend such end-to-end methods to the setting of video compression. Here, low-latency scenarios have been commonly investigated. In this paper, it is shown that the compression efficiency in this setting is improved by ap…
This paper presents an improved probability estimation scheme for the entropy coder of Incremental Neural Network Coding (INNC), which is currently under standardization in ISO/IEC MPEG. More specifically, the paper first analyzes the compression performance of INNC and how the bitstream size relates to the neural network (NN) layers. For the layers requiring the most bits, it analyzes the coded N…
The performance of variational auto-encoders (VAE) for image compression has steadily grown in recent years, thus becoming competitive with advanced visual data compression technologies. These neural networks transform the source image into a latent space with a channel-wise representation. In most works, the latents are scalar quantized before being entropy coded. On the other hand, vector quanti…
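The contrast between scalar quantization of latents and the vector alternative the abstract hints at can be sketched in a few lines; the step size and codebook below are made-up values for illustration only.

```python
# Toy contrast between the scalar quantization commonly applied to VAE
# latents (each element quantized independently to a uniform grid) and a
# minimal vector quantizer (whole latent mapped to its nearest codeword).

def scalar_quantize(latent, step=1.0):
    """Quantize each latent element independently to a uniform grid."""
    return [step * round(v / step) for v in latent]

def vector_quantize(latent, codebook):
    """Map the whole latent vector to its nearest codebook entry
    under squared Euclidean distance."""
    return min(codebook,
               key=lambda c: sum((v - w) ** 2 for v, w in zip(latent, c)))
```

Scalar quantization keeps the entropy model simple and per-channel, whereas a vector quantizer can exploit dependencies between latent elements at the cost of a codebook search.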
This paper presents a convolutional neural network (CNN) based solution for inter prediction in Versatile Video Coding (VVC). Our approach aims at improving the motion-compensated prediction signal of inter blocks with a residual CNN that incorporates spatial and temporal reference samples, i.e., intra-inter prediction. It is motivated by two considerations. Firstly, incorporating intra reference…
In this paper, a data-driven generalization of the adaptive loop filter (ALF) of Versatile Video Coding (VVC) is presented. It is shown how the conventional ALF process of classification and FIR filtering can be generalized to define a natural model architecture for convolutional neural network (CNN) based in-loop filters. Experimental results show that over VVC, average bit-rate savings of 3.85%/…
Federated learning (FL) scenarios inherently generate a large communication overhead by frequently transmitting neural network updates between clients and server. To minimize the communication cost, introducing sparsity in conjunction with differential updates is a commonly used technique. However, sparse model updates can slow down convergence speed or unintentionally skip certain update aspects,…
Deep-learned variational auto-encoders (VAE) have shown remarkable capabilities for lossy image compression. These neural networks typically employ non-linear convolutional layers for finding a compressible representation of the input image. Advanced techniques such as vector quantization, context-adaptive arithmetic coding and variable-rate compression have been implemented in these auto-encoders…
The novel Neural Network Compression and Representation Standard (NNR), recently issued by ISO/IEC MPEG, achieves very high coding gains, compressing neural networks to 5% in size without accuracy loss. The underlying NNR encoder technology includes parameter quantization, followed by efficient arithmetic coding, namely DeepCABAC. In addition, NNR also allows very flexible adaptations, such as sig…
Given the capabilities of massive GPU hardware, there has been a surge of using artificial neural networks (ANN) for still image compression. These compression systems usually consist of convolutional layers and can be considered as non-linear transform coding. Notably, these ANNs are based on an end-to-end approach where the encoder determines a compressed version of the image as features. In con…
This paper presents two new methods for fast VVC intra-picture encoding. Both are based on an approach that uses a CNN for block-adaptive parameter estimation. The parameters restrict the multi-type-tree (MTT) partitionings tested by the encoder. The methods aim to improve this approach by imposing further constraints with additional parameters. Adding parameters increases the time required for tra…
The final version of the Versatile Video Coding (VVC) standard incorporates the Matrix-Based Intra Prediction (MIP) tool. It consists of additional intra prediction modes which were derived from a data-driven training. These modes, in general, are applied to the luma component only. This paper describes how to apply the MIP modes to the chroma components in certain cases. In the generic case, if a…
This paper presents a new method for fast VVC intra-picture encoding using a CNN. The CNN operates on the original samples of 32 x 32 blocks. Given a current block, it derives for each of the block's multi-type trees (MTTs), which are nested in quad-tree (QT) nodes, a parameter pair. The parameter pairs constrain the minimum width and height of the sub-blocks in their MTTs. …
This paper presents a CNN to reduce the encoding time of a VVC-based intra-picture encoder. For encoding a 32 x 32 block, the CNN estimates two partitioning parameters that restrict the allowed coding block width and height. To estimate them such that the encoder skips testing inefficient partitioning modes, we train the CNN as follows: First, we generate training data by encoding sequences withou…
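The pruning idea shared by the fast-encoding abstracts above can be sketched as follows. This is a hypothetical illustration, not the papers' encoder logic: the split list and divisor model are simplifications, and `min_w`/`min_h` stand in for the CNN-estimated partitioning parameters.

```python
# Hypothetical sketch of how CNN-estimated minimum block dimensions could
# prune an encoder's partitioning search: candidate splits whose sub-blocks
# would fall below the predicted minima are skipped entirely.

SPLITS = {
    "no_split": (1, 1),  # (width divisor, height divisor)
    "quad":     (2, 2),  # quad-tree split
    "bt_hor":   (1, 2),  # horizontal binary split
    "bt_ver":   (2, 1),  # vertical binary split
}

def allowed_splits(width, height, min_w, min_h):
    """Keep only splits whose sub-blocks satisfy the predicted minima."""
    return [name for name, (dw, dh) in SPLITS.items()
            if width // dw >= min_w and height // dh >= min_h]
```

Every pruned split removes an entire subtree of rate-distortion evaluations, which is where the encoding-time reduction comes from.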