Rate-Distortion Optimization Using Adaptive Lagrange Multipliers

In current standardized hybrid video encoders, the Lagrange multiplier determination model is a key component of rate-distortion optimization. This model originated some 20 years ago, based on an entropy-constrained high-rate approximation and on experimental results obtained using an H.263 reference encoder on limited test material. In this paper, we present a comprehensive analysis of the results of a Lagrange multiplier selection experiment conducted on various video content using H.264/AVC and HEVC reference encoders. These results show that the original Lagrange multiplier selection methods, employed in both video encoders, are able to achieve optimum rate-distortion performance for I and P frames, but fail to perform well for B frames. A relationship is identified between the optimum Lagrange multipliers for B frames and distortion information obtained from the experimental results, leading to a novel Lagrange multiplier determination approach. The proposed method adaptively predicts the optimum Lagrange multiplier for B frames based on the distortion statistics of recent reconstructed frames. After integration into both H.264/AVC and HEVC reference encoders, this approach was evaluated on 36 test sequences with various resolutions and differing content types. The results show consistent bitrate savings for various hierarchical B frame configurations with minimal additional complexity. BD-rate savings average approximately 3% when constant quantization parameter (QP) values are used for all frames, and 0.5% when non-zero QP offset values are employed for different B frame hierarchical levels.


I. INTRODUCTION
Video compression has been a key enabler for video storage, conferencing, broadcasting and streaming since the early 1980s, when the first widely adopted international coding standard, H.261 [1], was established. With recent advances in video and communication technologies, the demand for video content is ever increasing, with 73% of all Internet bandwidth consumed by video in 2016. This figure is predicted to increase to 82% in 2021 [2].
The latest video compression standard, High Efficiency Video Coding (HEVC) [3], offers improved compression performance over its predecessors, especially on higher resolution video content. This improvement is due to the introduction of new coding tools, such as more flexible macroblock sizes for prediction and transformation, finer intra prediction modes, improved de-blocking and loop filters, and enhanced interpolation in motion compensation. However, the rate-distortion optimization (RDO) module in the HEVC reference encoder (and also that used as the basis for many actual HEVC deployments) employs a model almost identical to those used in most video encoders since H.263 [4].
The RDO approach employed in both HEVC and H.264/AVC [5] reference software (although non-normative) is based on an entropy-constrained high-rate approximation [6,7]. This method formulates the coding parameter selection problem as finding the minimum of a Lagrange cost function, trading off rate (R) and distortion (D), and exploits the relationship between R and D using an approximate logarithmic function with constant parameters [8]. Model parameters were determined based on I and P frame coding results for three sequences using H.263, and were reported to be content independent [8]. As coding tools have advanced, especially with the frequent use of bi-directional inter prediction and different referencing structures in modern encoders, the optimality of this model has not been fully re-assessed.
In this paper, we address three specific research questions in order to improve the Lagrange multiplier (λ) determination method in RDO: 1) Optimality: does the RDO model employed by modern hybrid encoders still provide optimum rate-distortion (RD) performance for all I, P, and B frames? 2) Independence: are the optimum Lagrange multipliers still approximately constant across various video content with identical quantization parameters (QP)? 3) Predictability: if a negative answer is observed in question 2), can any video features or coding statistics be used to predict the optimum Lagrange multipliers? This paper provides a comprehensive extension of our previous work in [9], where the Lagrange multiplier selection approach was originally introduced and applied to a simple Group of Pictures (GOP) structure (GOP length 4 with non-hierarchical B frames). We explore the answers to the questions proposed above, and an adaptive λ determination model, extended from that in [9], is presented for application scenarios with various GOP lengths. Both hierarchical and non-hierarchical GOP structures are employed to test the performance of this approach.
The remainder of this paper is structured as follows. Section II describes the RDO problem and outlines some of the most influential Lagrange multiplier determination models. The experiment on multiplier selection and its results are presented in Section III. In Section IV, a content-based adaptive Lagrange multiplier determination method is proposed, while the evaluation results of this approach are reported and discussed in Section V. Finally, Section VI provides conclusions and implications for future research directions.

II. BACKGROUND
This section is divided into three subsections. The first introduces the rate-distortion optimization (RDO) problem in the context of hybrid video encoders. Previous research work on RDO modeling is then briefly reviewed, followed by a description of the Lagrange multiplier determination models employed in both H.264/AVC and HEVC reference encoders.

A. The Rate Distortion Optimization Problem
Typically, hybrid video encoders select optimum coding parameters $p_{opt}$ by minimizing a Lagrange cost function $J$ of rate $R$ and distortion $D$ [10]:

$$p_{opt} = \arg\min_{p} J(p), \qquad J(p) = D(p) + \lambda R(p), \tag{1}$$

where $p$ is the vector of coding parameters including prediction modes, block partitions, etc., and λ represents the Lagrange multiplier. This optimization process is iterated during the compression process at various block levels for different types of frame.
Finding the minimum of a function is a common problem in calculus [11]. In our case this can be solved when the cost function $J$ is a convex function of $p$, and both $R$ and $D$ are continuous and differentiable everywhere [12]. The Lagrange multiplier λ can then be derived by setting the derivative of $J$ to zero. Then:

$$\lambda = -\frac{\partial D}{\partial R}. \tag{2}$$

In order to determine λ, the RD curve should be known beforehand. However, this leads to a chicken-and-egg problem: it is, in general, difficult to predict accurate RD characteristics of videos before encoding them. In the literature, various solutions have been proposed to solve this problem and these are discussed below.
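The Lagrangian selection described above can be illustrated with a small sketch. It assumes a cost of the form J = D + λR evaluated over a discrete set of candidate coding choices; the candidate names and the distortion and rate figures are hypothetical and are not taken from any encoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical coding choice for a block (illustrative values only)."""
    name: str
    distortion: float  # e.g. sum of squared errors for the block
    rate: float        # bits needed to code the block with this choice


def lagrange_cost(candidate: Candidate, lam: float) -> float:
    """Lagrange cost J = D + lambda * R for a single candidate."""
    return candidate.distortion + lam * candidate.rate


def select_best(candidates, lam: float) -> Candidate:
    """Return the candidate that minimizes the Lagrange cost."""
    return min(candidates, key=lambda c: lagrange_cost(c, lam))


# Hypothetical candidates: cheap but inaccurate through to costly but accurate.
candidates = [
    Candidate("skip", distortion=900.0, rate=2.0),
    Candidate("inter_16x16", distortion=400.0, rate=30.0),
    Candidate("intra_4x4", distortion=150.0, rate=120.0),
]
```

With these illustrative numbers, a small λ favors the low-distortion choice, while a large λ favors the low-rate choice, which is the trade-off that the multiplier controls.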

B. Rate Distortion Modeling
One important group of rate-distortion models is based on the distribution of transformed residuals. For example, the generalized Gaussian distribution based RD models presented in [13][14][15] model residual energy with good precision. This type of model is, however, only appropriate for two-pass compression, since its model parameters are content dependent and can only be determined after the first pass. RD models based on the Cauchy distribution [16] have been proposed which overcome this shortcoming, providing more accurate estimation of transform residuals. It should be noted, however, that the parameter determination of Cauchy distribution based RDO is difficult due to the diverging characteristic of the model statistics. The Laplace distribution is considered as a specific case of a generalized Gaussian distribution, and RD models based on this [12] offer a trade-off between prediction accuracy and algorithm complexity. Methods in this class also include ρ-domain algorithms [17].
Another group of RDO methods employs heuristic approaches to estimate Lagrange multipliers [12]. Typical examples of these include methods based on bitrate statistics [18,19] and local context [20]. However, this type of approach sometimes fails to perform well due to inaccurate empirical RD models.
Alongside advances in quality assessment [21][22][23], perceptual video compression algorithms [24][25][26] have been presented that achieve improved rate quality performance [27]. The structural similarity index (SSIM) [21] is one of the most commonly used quality assessment methods for in-loop rate quality optimization (RQO) due to its efficiency and simplicity. Recent work [28][29][30] has demonstrated the rate quality performance improvement possible with SSIM-based RQO when compared to conventional RDO approaches. It should be emphasized that RQO-inspired video compression is still in its infancy, and quality metrics with lower computational complexity (e.g. SSIM) do not always correlate well with subjective quality opinions [23]. In contrast, more advanced methods, such as MOVIE [22] and STMAD [31], are not appropriate for in-loop application due to their high complexity and/or latency characteristics. More recent contributions, such as PVM (Perception-based Video Metric) [23], offer the potential for lower latency and complexity, but are still immature in this respect.
In the context of the above discussion, our focus in this paper is on the enhancement of existing rate-distortion optimization methods using mean squared error (MSE) to assess video quality.While the use of perceptual metrics may provide more robustness in the future, this approach is not applicable at the present time due to the complexity and consistency issues associated with existing metrics.

C. RDO in H.264/AVC and HEVC Reference Encoders
The RD model most commonly used in modern hybrid video encoders was proposed by Sullivan and Wiegand [8] for entropy-constrained quantization based on a high-rate approximation [6,7], where $R$ is formulated as a logarithmic function of $D$:

$$R(D) = a \log_2\left(\frac{b}{D}\right), \tag{3}$$

where $a$ and $b$ are two parameters characterizing the relationship between $R$ and $D$. According to the high-rate approximation, the distortion $D$ can be modeled using the quantization interval $Q$ as:

$$D = \frac{Q^2}{3}, \tag{4}$$

where $Q$ can be obtained from the quantization parameter (QP) in H.264 or HEVC using:

$$Q^2 = 2^{(QP-12)/3}. \tag{5}$$

If (3)-(5) are substituted into (2), it provides the result:

$$\lambda = c \cdot Q^2 = c \cdot 2^{(QP-12)/3}, \tag{6}$$

in which $c = \ln 2/(3a)$.
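The substitution step can be written out explicitly. The following is a short derivation sketch, assuming the logarithmic model $R(D) = a\log_2(b/D)$, the high-rate relation $D = Q^2/3$, and $\lambda = -\partial D/\partial R$; it only makes the intermediate steps visible and shows where the constant $c = \ln 2/(3a)$ comes from.

```latex
\begin{align*}
R(D) &= a \log_2\frac{b}{D}
  \;\Rightarrow\; \frac{dR}{dD} = -\frac{a}{D \ln 2}, \\
\lambda &= -\frac{dD}{dR} = \frac{D \ln 2}{a}, \\
D &= \frac{Q^2}{3}
  \;\Rightarrow\; \lambda = \frac{\ln 2}{3a}\, Q^2 = c\, Q^2,
  \qquad c = \frac{\ln 2}{3a}.
\end{align*}
```

Note that the parameter $b$ disappears on differentiation, so under this model only $a$ influences the multiplier.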
In order to determine the value of c, Sullivan and Wiegand [8] conducted a Lagrange multiplier selection experiment on three video sequences using an H.263 reference encoder.The experimental results show that the parameter c is approximately independent of video content, with a fixed value of 0.85 for inter frames.
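As a numerical illustration, the sketch below evaluates a multiplier of the form λ = c · 2^((QP − 12)/3) with c = 0.85, the inter-frame value quoted above. The function names are illustrative and are not taken from the JM or HM source code, and the frame-type-specific adjustments used by those encoders are not modeled here.

```python
def quant_step_squared(qp: float) -> float:
    """Squared quantization interval, assuming Q^2 = 2^((QP - 12) / 3)."""
    return 2.0 ** ((qp - 12) / 3.0)


def lagrange_multiplier(qp: float, c: float = 0.85) -> float:
    """Content-independent multiplier lambda = c * Q^2 (c = 0.85 as in [8])."""
    return c * quant_step_squared(qp)


# Multipliers for the QP values used in the selection experiment (27 to 42, step 5).
lambdas = {qp: lagrange_multiplier(qp) for qp in range(27, 43, 5)}
```

Under this model the multiplier doubles for every increase of 3 in QP, and depends on QP alone, which is the content independence questioned in this paper.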
Extended from this model, the Lagrange multiplier determination approaches employed in H.264 and HEVC reference encoders (JM and HM respectively), as described in (7) and (8), have been developed with the consideration of bi-directional inter frames. In this paper, we follow the same definitions of I, P and B frames as in the H.264/AVC [5] and HEVC [32] standards. For clarity, we further adopt the following definitions here. $B_p$ frames are defined as B frames which are inter-predicted only from temporally previous frames. Frames using both previous and subsequent reference frames are defined as $B_b$ frames.
In equation (8), $f$ is referred to as the 'QPfactor' in the HEVC HM reference encoder, having a default value of 0.5. The 'QPfactor' can be configured differently for frames at various temporal layers in a GOP [33,34]. $N_B$ is the number of consecutive $B_b$ frames in a GOP. The Lagrange multiplier model for the HEVC reference encoder has been modified according to the recent recommended configurations in JCTVC-X0038 [35] (using a 'QPfactor' value of 1 and a larger 'QPoffset' for each hierarchical B frame level) to achieve improved rate-distortion performance. It is noted that, in H.264/AVC and HEVC reference encoders, the Lagrange multiplier is modeled as a function of QP, and is independent of video content. Alternative solutions, such as [36], have also been proposed using a fixed Lagrange multiplier to determine QPs for frames at various hierarchical levels.

III. AN EXPERIMENT ON λ SELECTION
It was noted in Section II that the Lagrange multiplier determination methods used in H.264/AVC and HEVC reference encoders employ a basic model whose parameter was empirically derived in [8] using an H.263 reference encoder.The optimality of this model has not been fully validated on modern video encoders using different referencing structures, which can lead to significant changes in RD characteristics.

A. Experimental Methodology
In order to investigate the optimality of the λ determination approaches, we conducted a Lagrange multiplier selection experiment comparing the RD performance using various test multiplier values $\lambda_{test}$ with that using the corresponding original multipliers $\lambda_{orig}$, derived from (7) and (8). The range of $\lambda_{test}$ is given by (9).
This experiment was conducted using various test material at CIF (352 × 288) resolution (YUV 4:2:0). In total, nine sequences from DynTex [37], the BVI texture database [38], and standard test sequence pools were employed. This dataset was further divided into three classes according to dominant video content: (A) slow movement videos, (B) dynamic texture clips, and (C) mixed content. TABLE I provides a list of these sequences, while Fig. 1 shows their sample frames.
In this experiment, five subtests were conducted with different objectives. The first two assessed the optimality of the model for I and P/$B_p$ frames respectively, while the last three investigated that for $B_b$ frames with different GOP structures. Note that we only modified the Lagrange multipliers for the tested frame types. The GOP configurations for these five tests are given in TABLE II.
JM 15.1 and HM 14.0 were used for H.264 and HEVC respectively; identical QP values were employed for all types of frames; the range of tested QP values was from 27 to 42 with an interval of 5; Main profile and non-hierarchical B frames were selected for JM; Main profile and hierarchical B frames were tested for HM.
It should be noted that constant QP values (QP offset equals zero) are used for all frames in this experiment for the HEVC HM encoder.This differs from the recommended configurations in [33,39] and [35], where fixed QP offset values are used for different hierarchical B frame levels to improve overall rate-distortion (R-D) performance.Based on the recent work in [40], using constant QP offset values does not always offer optimum R-D performance for all types of content, and QP offset values in the HEVC reference encoder should be adapted based upon video content.Since the purpose of this paper is solely to investigate the influence of Lagrange multipliers on R-D performance, constant QP values are employed in our training process, as this eliminates the confounding influence of QP offset.

B. Results for I and P/$B_p$ Frames
The performance of the Lagrange multiplier determination methods for I and P/$B_p$ frames is shown in Fig. 2(a-d), where the original multiplier values ($\lambda_{orig}$) are plotted alongside the optimum ones ($\lambda_{opt}$). These optimum Lagrange multipliers were selected to give the best overall RD performance for all frames, compared to the original RD curves generated using $\lambda_{orig}$. It can be observed that the $\lambda_{orig}$ curves associated with I and P/$B_p$ frames do correlate well with the corresponding $\lambda_{opt}$ values for both H.264 and HEVC encoders, although several outliers exist for the case of P/$B_p$ frames. Among all 9 test sequences and 4 QP values, only 3 $\lambda_{orig}$ values out of 36 fail to offer optimum RD performance for H.264 P frame coding, while 4 outliers appear for HEVC P/$B_p$ frames. This indicates that the original λ determination models used in both encoders perform well for I and P frame encoding.

C. Results for $B_b$ Frames
Fig. 2(e,f) illustrates the test results for $B_b$ frames with various GOP sizes. The test multiplier values were only applied to $B_b$ frames, which use both temporally previous and subsequent frames as references for inter-prediction. In these cases, $\lambda_{orig}$ values fail to correlate well with $\lambda_{opt}$ for both HEVC and H.264, regardless of whether the GOP length is 4, 8 or 16. The failure becomes more evident for static scene content (Class A) and dynamic textures (Class B). These results confirm our conjecture in Section I that the conventional RDO module does need to be improved for modern video encoders.
As a result of this model failure, the distortion difference between $B_b$ and P/$B_p$ frames varies among test sequences. Here we define the ratio between the mean squared error (MSE) of P/$B_p$ frames ($MSE_P$) and that of $B_b$ frames ($MSE_B$) as follows (only the Y components of reconstructed and original frames are used for calculating MSE):

$$r_{MSE} = \frac{MSE_P}{MSE_B}. \tag{10}$$
In order to investigate the relationship between $r_{MSE}$ and the mismatch between $\lambda_{opt}$ and $\lambda_{orig}$ for $B_b$ frames, a second ratio is defined as:

$$r_{\lambda} = \frac{\lambda_{opt}}{\lambda_{orig}}. \tag{11}$$

Fig. 4 demonstrates the relationship between $r_{MSE}$ and $r_{\lambda}$ for $B_b$ frame coding with various GOP structures. It can be seen that the scatter plots for GOP length 4 using H.264 and HEVC encoders both fit well to a power function, and those for GOP 8 and 16 follow similar fitting curves to the GOP 4 cases, only with a shift to the left. Based on this observation, we employ a four-parameter power function to fit the correlation between $r_{\lambda}$ and $r_{MSE}$ for all cases, as given below:

$$r_{\lambda} = a\,(r_{MSE} + d)^{b} + c. \tag{12}$$

Here $a$, $b$, $c$ and $d$ are parameters determined by fitting to the results obtained on the dataset used in this experiment.
The overall bitrate saving at each $\lambda_{opt}$ for $B_b$ frames over the original RD curve using the corresponding $\lambda_{orig}$ value, for the three tested GOP settings, is illustrated in Fig. 3. The savings are content dependent and vary from 0% to 25% for H.264 and from 0% to 18% for HEVC. It can also be clearly seen that, for both encoders and all three GOP lengths, bitrate savings are below 2% if $r_{\lambda}$ falls within the range between the two blue dotted lines.
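The two ratios and the power function can be sketched in code as follows. This is a minimal illustration, assuming that the distortion ratio is the P/B_p frame MSE divided by the B_b frame MSE and that the fitted curve has the form a(r_MSE + d)^b + c. The parameter values in the example are placeholders chosen for easy arithmetic; they are not the fitted values reported in TABLE III.

```python
def mse(original, reconstructed) -> float:
    """Mean squared error between two equally sized sequences of luma samples."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def distortion_ratio(mse_p: float, mse_b: float) -> float:
    """r_MSE: MSE of P/B_p frames divided by MSE of B_b frames."""
    if mse_b <= 0:
        raise ValueError("mse_b must be positive")
    return mse_p / mse_b


def predicted_lambda_ratio(r_mse: float, a: float, b: float, c: float, d: float) -> float:
    """Four-parameter power function a * (r_mse + d) ** b + c."""
    base = r_mse + d
    if base <= 0:
        raise ValueError("r_mse + d must be positive for a real-valued power")
    return a * base ** b + c


# Placeholder parameters (hypothetical, for illustration only).
example_ratio = predicted_lambda_ratio(distortion_ratio(8.0, 2.0), a=1.0, b=0.5, c=0.0, d=0.0)
```

The parameter d shifts the curve horizontally, which matches the observation that the GOP 8 and GOP 16 results follow the GOP 4 curve with a shift to the left.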

D. Summary
In summary, our Lagrange multiplier selection experiment assessed the optimality of existing λ models in both H.264 and HEVC reference encoders, and we explored the answers to the questions proposed in Section I. Based on the experimental results above, the main findings are summarized below.
1) Existing Lagrange multiplier determination models in both H.264 and HEVC reference encoders are close to optimum for I and P/$B_p$ frames, but do not perform well for $B_b$ frames.
2) Optimum λ values for $B_b$ frames in both encoders are content dependent: higher for static scenes and lower in cases with significant dynamic content.
3) Distortion statistics could be used to predict optimum Lagrange multipliers for $B_b$ frames.

IV. PROPOSED ALGORITHM
In order to adaptively predict optimum Lagrange multipliers for $B_b$ frames, a novel content-based determination approach is proposed, inspired by the experimental results in Section III and our preliminary model in [9], which uses lower Lagrange multiplier values for dynamic scenes, and higher ones for static content. This method operates under the assumption that within a few temporally localized frames, provided there are no significant content changes, the RD characteristics are approximately uniform. Lagrange multipliers can thus be adaptively modified according to distortion statistics from recently encoded frames. This assumption may of course break down when there are scene cuts. To account for this, a shot cut detector should be employed prior to λ adaptation.
A diagrammatic illustration of the proposed method is shown in Fig. 5. Before encoding each frame, possible shot transitions are first identified using a scene cut detection approach based on histogram differences. In cases with scene cuts, when the uniform assumption is not applicable, all statistical variables are reset, and the original Lagrange multipliers are used for that frame.
The proposed algorithm consists of three primary sub-stages: scene cut detection, distortion information updating and Lagrange multiplier modification. These are described in detail below.

A. Scene Cut Detection
Scene cut detection can be based on numerous measures including histogram differences (HistD), edge change ratio, and sum of absolute differences [41].In the context of video compression, we employ a simple but efficient HistD-based approach with a constant threshold.
In our method, the normalized luma histogram of the current frame, $Hist_t$, is first computed alongside that of its previous coded frame (if applicable), $Hist_{tp}$. Their average absolute difference $HistD_t$ is then compared with a fixed threshold $TH_{SC}$ to identify a scene cut. This process is described by (13) and (14):

$$HistD_t = \frac{1}{2^L} \sum_{i=0}^{2^L-1} \left| Hist_t(i) - Hist_{tp}(i) \right|, \tag{13}$$

$$\begin{cases} HistD_t \geq TH_{SC}, & \text{there is a scene cut in this frame} \\ HistD_t < TH_{SC}, & \text{there is no scene cut in this frame} \end{cases} \tag{14}$$

where $L$ represents the bit depth, and $TH_{SC}$ is chosen as 0.002 for the normalized histogram of 8 bit luminance. This value was obtained empirically through a preliminary training process on limited sequences, and it was found not to be significantly sensitive to content type.
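A minimal sketch of a histogram-difference scene cut test is given below. It assumes that the histogram is normalized by the number of pixels, that the difference is averaged over the 2^L bins, and that the threshold of 0.002 quoted above applies to 8 bit luma. The function names are illustrative, and frames are represented as flat lists of integer luma samples for simplicity.

```python
def normalized_histogram(luma, bit_depth: int = 8):
    """Histogram of integer luma samples, normalized so that the bins sum to 1."""
    if len(luma) == 0:
        raise ValueError("frame must contain at least one sample")
    bins = 1 << bit_depth
    hist = [0] * bins
    for value in luma:
        hist[value] += 1
    total = len(luma)
    return [count / total for count in hist]


def histogram_difference(hist_cur, hist_prev) -> float:
    """Average absolute bin-wise difference between two normalized histograms."""
    return sum(abs(a - b) for a, b in zip(hist_cur, hist_prev)) / len(hist_cur)


def is_scene_cut(luma_cur, luma_prev, threshold: float = 0.002, bit_depth: int = 8) -> bool:
    """Flag a scene cut when the histogram difference reaches the threshold."""
    diff = histogram_difference(
        normalized_histogram(luma_cur, bit_depth),
        normalized_histogram(luma_prev, bit_depth),
    )
    return diff >= threshold
```

Each frame is visited once to build its histogram, so the cost of this test grows linearly with the number of pixels.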
As shown in Fig. 5, when a scene cut is detected, all existing statistical variables are reset, and the original Lagrange multipliers for this frame will be used in rate-distortion optimization.

B. Distortion Information Updating
To adaptively adjust Lagrange multipliers, sufficient distortion information, from at least one consecutive GOP, must be recorded. Here two distortion statistics, $D_P$ and $D_B$, are defined for P/$B_p$ and $B_b$ frames respectively:

$$D_{P/B,k} = \theta_1 D_{P/B,k-1} + \theta_2\, MSE_{P/B,k}, \tag{15}$$

in which $D_{P/B,k}$ represents the accumulated distortion $D_P$ or $D_B$ based on frame type, having its initial value set to zero. $k$ is the number of P/$B_p$ or $B_b$ frames which have been coded, counted in encoding order. $MSE_{P/B,k}$ stands for the mean squared error of the most recently coded frame. $\theta_1$ and $\theta_2$ are pre-configured weighting parameters, combining the existing distortion with the latest MSE. Here $\theta_1 + \theta_2 = 1$ and $\theta_2 > \theta_1$. This configuration places greater emphasis on recently encoded frames.
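The distortion update can be sketched as a small stateful helper. This assumes a recursive weighted update of the form D_k = θ1 · D_{k−1} + θ2 · MSE_k with D_0 = 0, which is one reading of the description above. The class name and the weights 0.25 and 0.75 are hypothetical, chosen only to satisfy θ1 + θ2 = 1 and θ2 > θ1; they are not the values listed in TABLE III.

```python
class DistortionTracker:
    """Accumulated distortion for one frame type (P/B_p or B_b)."""

    def __init__(self, theta1: float, theta2: float):
        # The weights must sum to one, with more weight on the newest frame.
        if abs(theta1 + theta2 - 1.0) > 1e-9 or not theta2 > theta1:
            raise ValueError("require theta1 + theta2 = 1 and theta2 > theta1")
        self.theta1 = theta1
        self.theta2 = theta2
        self.reset()

    def reset(self) -> None:
        """Clear the statistics, for example after a detected scene cut."""
        self.value = 0.0
        self.count = 0

    def update(self, mse: float) -> float:
        """Blend the stored distortion with the MSE of the latest coded frame."""
        self.value = self.theta1 * self.value + self.theta2 * mse
        self.count += 1
        return self.value


# One tracker per frame type, using hypothetical weights.
tracker_p = DistortionTracker(0.25, 0.75)
tracker_b = DistortionTracker(0.25, 0.75)
```

Because the older statistics are multiplied by θ1 at every update, the influence of a frame decays geometrically with its distance from the current frame.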

C. Lagrange Multiplier Modification
With sufficient distortion information recorded from previously coded frames, the Lagrange multiplier for the current $B_b$ frame ($\lambda_n$) is adaptively modified from that of the most recently coded $B_b$ frame ($\lambda_{n-1}$) following:

$$\lambda_n = \left[ a\,(r_{MSE} + d)^{b} + c \right] \lambda_{n-1}, \tag{16}$$

where $r_{MSE}$ is the distortion ratio, derived as in (17), in which $m$ represents the number of encoded P/$B_p$ frames when $n$ $B_b$ frames have been processed. It is noted that the proposed model predicts the optimum Lagrange multiplier values based on the distortion statistics of previously encoded frames rather than those for the current frame. This may lead to a slightly inaccurate estimation when the rate-distortion performance between frames is not identical. This inaccuracy can be avoided if the Lagrange multiplier is only modified when $a\,(r_{MSE} + d)^{b} + c$ becomes significantly different from 1. In Section III-C, the bitrate savings obtained using optimum Lagrange multipliers were observed to become less significant (below 2%) when $r_{\lambda}$ falls within a certain range $(r_1, r_2)$. We thus exploit this observation in our approach, keeping the modified Lagrange multiplier constant:

$$\lambda_n = \lambda_{n-1}, \quad \text{if } r_1 < a\,(r_{MSE} + d)^{b} + c < r_2. \tag{18}$$

In cases when there is a significant difference between $\lambda_n$ and $\lambda_{n-1}$, we confine this change to within a ±5% range to avoid noticeable quality variations, i.e.

$$0.95\,\lambda_{n-1} \leq \lambda_n \leq 1.05\,\lambda_{n-1}. \tag{19}$$

Using this adaptive algorithm, Lagrange multipliers for all $B_b$ frames can be iteratively obtained, and used for mode, partition and prediction selections at various levels in the rate-distortion optimization process.
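The update procedure can be sketched as a single function. This is an illustrative reading of the description above, assuming that the previous multiplier is scaled by a(r_MSE + d)^b + c, left unchanged when that factor lies strictly inside (r1, r2), and otherwise limited to a ±5% change. The distortion ratio is taken as an input here, since its exact indexing is not reproduced in this sketch, and all parameter values in the example are hypothetical rather than those of TABLE III.

```python
def update_lambda(lambda_prev: float, r_mse: float,
                  a: float, b: float, c: float, d: float,
                  r1: float, r2: float, max_change: float = 0.05) -> float:
    """Return the multiplier for the current B_b frame from the previous one."""
    base = r_mse + d
    if base <= 0:
        raise ValueError("r_mse + d must be positive for a real-valued power")
    scale = a * base ** b + c

    # Dead zone: a scale factor close to 1 brings little gain, so keep lambda.
    if r1 < scale < r2:
        return lambda_prev

    # Otherwise apply the scale, limited to a +/- max_change relative step.
    lower = (1.0 - max_change) * lambda_prev
    upper = (1.0 + max_change) * lambda_prev
    return min(max(scale * lambda_prev, lower), upper)


# Hypothetical parameters: scale equals r_mse, dead zone (0.97, 1.03).
params = dict(a=1.0, b=1.0, c=0.0, d=0.0, r1=0.97, r2=1.03)
next_lambda = update_lambda(10.0, 2.0, **params)
```

Because each step is bounded, a large predicted change is reached gradually over several $B_b$ frames rather than in a single jump, which limits frame-to-frame quality variation.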

D. Model Parameters
There are in total nine parameters employed in our adaptive algorithm: $a$, $b$, $c$, $d$, $r_1$, $r_2$, $\theta_1$, $\theta_2$ and $TH_{SC}$. The first three are obtained from the power function fitting described in Section III-C. Parameter $d$ is related to the GOP size used, and its values for GOP lengths 4, 8 and 16 are also determined from the fitting. These three GOP lengths are commonly used in both H.264 and HEVC reference encoders. For other GOP configurations, the value of parameter $d$ may vary and could be obtained using the same approach. Moreover, the thresholding parameters $r_1$ and $r_2$ are configured based on the experimental results in Fig. 3. Finally, $\theta_1$ and $\theta_2$ are weighting parameters for updating distortion information. All parameter values are listed in TABLE III.

V. RESULTS AND DISCUSSION
After integration into the H.264/AVC and HEVC reference software, the proposed adaptive Lagrange multiplier determination method was tested on a video dataset with various content at different resolutions.The RD performance of the proposed method is compared with that of the original λ determination model in both encoders under multiple test conditions.The computational complexity of this approach is also estimated.

A. Test Dataset
Thirty-six test clips are used, all in progressive YUV 4:2:0 format, obtained from public video databases including the HEVC recommended test pool [39,42], the DynTex database [37], the BVI video texture dataset [38] alongside other commonly used sequences.These test sequences can be divided into three content classes: (A) static scenes, (B) dynamic scenes, and (C) mixed scenes, as in Section III.Videos in each class can be further classified into four groups according to their spatial resolutions: three at CIF (352×288) resolution, three at 416×240, three at 832×480 and three at 1920×1080.The latter three groups contain videos at different spatial resolutions with identical content, in order to investigate the influence of various resolutions.A description of these videos is provided in Table IV, and their sample frames are shown in Fig. 6.
In order to quantify the content of this dataset, three low-level feature descriptors were computed for each original video: mean spatial information (SI), colorfulness (CF) and mean temporal information (TI). The detailed description of these features can be found in [43,44]. The coverage and distribution of these features on the test dataset are shown in Fig. 7. It is noted that this dataset offers good coverage over these descriptors, compared with other public video databases reported in [43].

B. Test Conditions
The proposed algorithm was fully tested under six groups of test conditions with different GOP structures (GOP length 4, 8 and 16 with hierarchical and non-hierarchical B frames), as summarized in Table VI. Other primary configurations include: JM 15.1 and HM 14.0 were employed as reference modules for H.264 and HEVC respectively; uniform QPs were used for all test frames, from 22 to 42 with an interval of 5; High profile and Main profile were selected for H.264 and HEVC encoders respectively; only one I frame was encoded for each test.
The compression performance of the proposed algorithm for both H.264 and HEVC was benchmarked against the corresponding anchor encoders based on the Bjøntegaard delta measurements (BD-rate and BD-PSNR) [45] for two cases: (i) all frames and (ii) only $B_b$ frames.

C. Test Results for Various GOP Structures
The average bitrate savings together with the mean PSNR gains over the anchor encoders are shown in Table V, where the results under various test conditions are provided for all frames and for $B_b$ frames only. For seven test configurations, the average BD-rate and BD-PSNR values are summarized for four resolution groups (I, II, III, and IV) and three content classes (A, B and C). It can be observed that the proposed method consistently offers superior overall performance for different test groups (resolution) and classes (content). The average bitrate savings for hierarchical B frame structure configurations are approximately 3% over both H.264 and HEVC anchor encoders. It is also noted that this improvement becomes more significant on static and dynamic content (Class A and B) than on video clips with mixed content (Class C) for the various test conditions.
In order to provide some indication of perceptual quality improvements using the proposed method, Table VII shows additional comparative results using the PVM metric [23]. PVM was chosen as it provides improved correlation with subjective scores across a wide variety of content types and distortions. The results in Table VII show close agreement with the PSNR-based results in Table V, further validating the benefits of our approach.
Example RD curves comparing the proposed method with conventional H.264 and HEVC reference encoders for three test sequences are shown in Fig. 8. The selected sequences represent typical content from each class under various test conditions. Evident bitrate savings can be observed for the proposed method over the anchor approach, especially for the 'Fungus' sequence in Class A and 'Shadow' in Class B.
The proposed model was also tested on the HEVC reference encoder using the Random Access (RA) configurations (GOP length 8) in JCTVC-L1100 [33] and JCTVC-X0038 [35], in which fixed non-zero QP offset values are utilized for different B frame hierarchical levels. These configurations have been shown to yield improved overall rate-distortion performance. However, they may also produce significant temporal quality variations due to the large QP differences between frames.
The compression results for different test groups (classified by resolution) are summarized in TABLE VIII. It should be noted that our method was trained using a constant QP configuration (QP offset equals 0). Therefore, it will clearly not be optimum when large QP offsets are employed. Nevertheless, the proposed approach still shows consistent overall bitrate savings across video groups at various spatial resolutions, with average BD-rate savings of 1.1% and 0.5% for JCTVC-L1100 and JCTVC-X0038 respectively. According to the results in [40], the use of fixed QP offset values does not provide optimum R-D performance for all types of content. More significant bitrate savings may therefore be possible if our Lagrange multiplier determination approach is combined with a content-based adaptive QP model. This is a topic for future research.

TABLE VIII: HEVC compression results (BD-rate against original HEVC HM) based on the RA configurations (GOP 8) in [33] and [35].

E. Complexity Estimation
Finally, the computational complexity of the proposed algorithm was estimated based on the relative execution time of the proposed and original anchor encoders. The average encoding times for both the proposed method and the anchor were calculated. The results are shown in Table IX, which presents the percentage increase in encoding time for the proposed method, referenced against the anchor. The average additional complexity of our approach was found to be small, with only 5% and 2% increases over the H.264 JM and HEVC HM modules respectively. The results also indicate that the increases are mainly (more than 90% of the additional complexity) due to the scene cut detection method used, which only consumes linear time O(n), where n is the number of pixels in each frame. All complexity figures were obtained using an Intel Core i7-2600 CPU @ 3.40 GHz PC platform.

VI. CONCLUSIONS
In this paper, we conducted a Lagrange multiplier selection experiment using modern hybrid video encoders. Experimental results demonstrate the optimality of existing λ determination methods in H.264/AVC and HEVC reference encoders for encoding I and P/$B_p$ frames, but highlight the shortcomings of these models for $B_b$ frames. A relationship was discovered between two ratio indices (the distortion ratio between P/$B_p$ and $B_b$ frames, and the ratio between the optimum Lagrange multipliers and the original ones), which led to a new adaptive determination method for $B_b$ frame encoding. This approach has been evaluated for various content types and test conditions. The results show consistent RD performance improvement over the anchor encoders, for both H.264 and HEVC with various hierarchical B-frame configurations. BD-rate savings average 3% when constant QP values are used for all frames, and 0.5% when non-zero QP offset values are employed for different levels in the B-frame hierarchy. In terms of future work, the authors suggest combining adaptive λ determination with varied quantization parameters, and also performance evaluation using subjective quality assessment.

Fig. 1: Sample frames from test sequences used in the λ selection experiment.

Fig. 2: The optimum Lagrange multipliers ($\lambda_{opt}$) versus corresponding original values ($\lambda_{orig}$). Results for (a) H.264 JM I frame, (b) HEVC HM I frame, (c) H.264 JM P frame, (d) HEVC HM P frame, (e) H.264 JM B frame for GOP length 4, 8 and 16 and (f) HEVC HM B frame for GOP length 4, 8 and 16. The position of each number represents the $\lambda_{opt}$ value for that sequence at a given QP, while the red curves represent the $\lambda_{orig}$ values as a function of QP. In subfigures (e) and (f), numbers in blue, pink, and black refer to the results for GOP length 4, 8 and 16 respectively.

Fig. 3: The bitrate savings at various $r_{\lambda}$ for GOP length 4, 8 and 16. These are based on the results for all frames. Results for (a) H.264 and (b) HEVC. The position of each number represents the bitrate saving at the corresponding $r_{\lambda}$ ratio for that sequence.

Fig. 8: RD curves for HD sequences Cactus, Drops and Squirrel under various test conditions.

TABLE I: The video dataset used in the λ selection experiment.

TABLE II: The GOP settings of the conducted λ selection tests.

TABLE III: Model parameters used in the proposed λ determination method for both H.264/AVC and HEVC reference encoders.

TABLE IV: Test clips used for evaluating compression performance.

TABLE V: Summary of compression results. N.B. X-HB represents GOP length X with hierarchical B frames, in which X stands for either 4, 8, or 16. X-NHB stands for GOP length X with non-hierarchical B frames.

TABLE VI: Various test conditions.

TABLE VII: Compression results in terms of BD-rate based on PVM.

TABLE IX: Complexity analysis.