Fast Depth Map Coding Based on Bayesian Decision Theorem for 3D-HEVC

The depth map compression of 3D High-Efficiency Video Coding (3D-HEVC) inherits the prediction structure adopted by HEVC and adds supplementary coding modes to better represent depth images. Combined with the existing tools, these new coding modes achieve high coding efficiency but lead to a very large increase in coding time. This article proposes a fast coding method for depth maps in 3D-HEVC. The proposed scheme utilizes the Bayesian decision rule and the correlations between the corresponding texture video and spatially adjacent treeblocks to analyze the treeblock features of the depth map. Based on this analysis, we propose two approaches: early SKIP/Merge mode selection and adaptive CU pruning termination. Simulation results show that the proposed method reduces coding complexity by 51.2% while maintaining the rate-distortion (RD) performance.


I. INTRODUCTION
During the last few years, three-dimensional (3D) video applications, such as 3D movies, 3D games, and free-viewpoint television (FTV), have attracted great attention [1]. MPEG is currently developing multi-view video plus depth (MVD), which is a depth-enhanced 3D representation [2], [3]. The depth map denotes the geometric distance between the objects and the camera and can be used to generate intermediate views through depth image-based rendering (DIBR) [4]. Therefore, JCT-3V developed 3D-HEVC to enhance the compression performance of depth maps [5], [6].
Since a depth image has large homogeneous areas demarcated by sharp object edges, the characteristics of the depth map differ from those of texture video [7]. Unlike in texture coding, preserving the sharp depth edges matters more than the visual quality of the depth map itself. Accordingly, several prediction techniques are designed in 3D-HEVC, such as Depth Modeling Modes (DMMs) [8], Depth Intra Skip [9], and segment-wise depth coding (SDC) [10], for better compression of edges in depth images. The increase in coding performance is accompanied by a substantial increase in coding time. Consequently, a fast depth coding approach is devised in this article, which decreases the coding runtime while maintaining coding quality.
The associate editor coordinating the review of this manuscript and approving it for publication was You Yang.
A number of fast depth coding methods [11]-[17] have been reported to decrease the complexity of multi-view video coding (MVC). All these algorithms perform well in reducing coding complexity, while the loss of video quality for MVC is very small. However, these approaches are not well suited to 3D-HEVC, because the new quadtree structure and the depth intra modes introduced for 3D-HEVC are not taken into account.
Recently, studies on 3D-HEVC have been conducted to decrease depth coding complexity. In [18], a fast method for the depth map of 3D-HEVC is introduced to reduce complexity, in which edge classification is used to selectively omit DMMs in intra prediction. In [19], a fast coding method is devised to speed up depth map prediction; it employs the depth correlation of neighboring CUs and the texture-depth correlation to adaptively adjust the mode decision process. Based on an adaptive threshold, an early SKIP decision is developed in [20] to decrease the complexity of inter frames. In [21], a fast approach based on detecting edge regions is used to increase the runtime saving of depth coding: when the treeblock does not contain object edges in the depth map, the DMMs are not tested. In [22], a fast approach and an early mode decision method are presented to expedite depth compression. Reference [23] devises a fast depth coding approach that decreases the number of candidate modes. A quadtree limitation method based on data mining is introduced in [24] to reduce the complexity of depth map coding. A low-complexity intra decision approach based on spatiotemporal, inter-component, and inter-view correlations is used in [25] to accelerate coding. Reference [26] designs a fast edge detection-based approach for the depth map, which increases the time saving of depth map coding while maintaining the encoding efficiency. By identifying the flat regions and the texture orientation of the depth map, [27] devises a fast edge detection-based decision method that decreases the encoding runtime. Reference [28] introduces a fast coding method based on static decision trees for the depth map of 3D-HEVC. Fast approaches for the mode decision of 3D-HEVC are devised in our previous works [29] and [30] to reduce the coding complexity. The above methods for 3D-HEVC compress depth maps well while saving coding time.
Nevertheless, most of the above schemes significantly reduce the RD performance.
The Bayesian theorem is a promising tool that has been shown to reduce coding complexity effectively, so it is widely used in video compression applications. Some recent HEVC works have investigated Bayesian methods [31]-[34]. To the best of our knowledge, there are few studies on depth map compression using the Bayesian theorem in 3D-HEVC. This paper designs a fast depth map coding method for 3D-HEVC based on the Bayesian theorem, which comprises early SKIP/Merge mode decision and adaptive CU pruning termination.
The remainder of this paper is organized as follows. The mode statistical analysis of depth map coding is introduced in Section 2. Section 3 describes the proposed fast method in detail. Section 4 discusses the experimental results, and Section 5 concludes the paper.

II. OBSERVATIONS AND ANALYSIS
At each possible depth level in the depth coding of 3D-HEVC, the optimum mode is the mode with the smallest RD cost in the mode decision process, which is computed as
$$ J_{mode} = \left( SSE_{luma} + \omega_{chroma} \cdot SSE_{chroma} \right) + \lambda_{mode} \cdot R_{mode} \tag{1} $$
where SSE_luma and SSE_chroma are the distortions between the current treeblock and its reconstructed treeblock on the luminance and chrominance components, respectively; ω_chroma denotes the weight parameter for chroma; and λ_mode and R_mode are the Lagrange multiplier and the total bitrate cost, respectively. The "try all, then select the best" scheme provides excellent RD performance, but it also incurs heavy complexity, which prevents the encoder from being used in real-time applications.
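The exhaustive selection described above can be illustrated with a minimal sketch. It assumes the usual Lagrangian form J = (SSE_luma + ω_chroma · SSE_chroma) + λ_mode · R_mode; the `ModeResult` type, the function names, and all numeric values are hypothetical and are not taken from the HTM software.

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    """Distortion and rate measured for one candidate mode (hypothetical container)."""
    name: str
    sse_luma: float
    sse_chroma: float
    bits: float


def rd_cost(m: ModeResult, w_chroma: float, lam: float) -> float:
    # J = (SSE_luma + w_chroma * SSE_chroma) + lambda * R
    return (m.sse_luma + w_chroma * m.sse_chroma) + lam * m.bits


def best_mode(candidates, w_chroma: float, lam: float) -> ModeResult:
    # "Try all, then select the best": every candidate is fully evaluated.
    return min(candidates, key=lambda m: rd_cost(m, w_chroma, lam))


# Illustrative numbers only.
candidates = [
    ModeResult("SKIP", 1200.0, 300.0, 2.0),
    ModeResult("Inter2Nx2N", 900.0, 250.0, 40.0),
    ModeResult("Intra", 800.0, 200.0, 95.0),
]
chosen = best_mode(candidates, w_chroma=1.0, lam=10.0)
```

With these made-up values SKIP wins because its very low rate outweighs its higher distortion, which is the typical situation in homogeneous depth regions.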
To better understand the features of the depth map, some statistical analyses are carried out, covering the prediction mode distribution of the 3D-HEVC coder and the allocation of computational complexity. The experiments are conducted on the 3D-HEVC reference software HTM 16.1. Table 1 shows the inter mode distribution for different QPs. It can be seen that most treeblocks select SKIP/Merge as the optimal mode in inter coding, and the percentage of SKIP/Merge treeblocks increases with QP. About 87.1% of depth map treeblocks select SKIP, while no more than 5.1% select each of the Merge, 2N×2N, Intra, and other inter modes. This is because depth videos contain large static areas, which mostly choose the SKIP/Merge mode. In particular, for sequences with small global motion the share of SKIP mode is extremely high, reaching 98.9% at QP 45. Therefore, if the proposed scheme pre-determines SKIP/Merge as the best inter mode of the depth map, a large amount of computation can be saved.
Table 2 illustrates the depth level distribution. The CU depth levels of the depth map are "Level 0", "Level 1", "Level 2", and "Level 3". It is observed from Table 2 that 66.5%, 17.9%, 9.2%, and 6.5% of depth map treeblocks select "Level 0", "Level 1", "Level 2", and "Level 3", respectively. The share of "Level 0" is considerably high on average; for example, it is 78.6% for the sequence "Poznan_Hall2", which has large homogeneous regions. Moreover, about 84.4% of treeblocks select "Level 0" or "Level 1", whereas only about 15.7% of depth treeblocks select "Level 2" or "Level 3" as the best depth level. Fig. 1 shows the complexity for different maximum CU depth levels. From the figure we can conclude that the coding complexity increases considerably with the depth level. In addition, Table 2 demonstrates that the percentage of depth map treeblocks coded with "Level 2" and "Level 3" is very small.
If an adaptive termination scheme is used during the CU pruning procedure, the coding time saving increases at larger CU depths, but the RD performance degrades slightly. Thus, if an early termination decision is made during the CU pruning procedure, the additional compression time of larger CU depths can be avoided.

III. PROPOSED ALGORITHM
A. EARLY SKIP/MERGE MODE SELECTION BASED ON BAYESIAN DECISION RULE
In the depth compression of 3D-HEVC, the candidate modes for inter treeblocks include SKIP/Merge, which codes no reference picture index or motion vector difference, as well as the normal inter modes and the intra modes. Thus, when SKIP/Merge is decided in advance, the variable-size inter mode computation for the current CU in depth coding is notably reduced. As analyzed in Section 2, if a suitable condition identifies such a depth map treeblock early, SKIP/Merge will very likely be chosen. In 3D-HEVC encoders, this simplifies the inter mode decision and avoids a large amount of computation.
A Bayesian classifier computes conditional probabilities and derives the posterior probability given several observations, so it can detect whether an inter mode is optimal for the current treeblock. Based on this concept, an early SKIP/Merge decision based on the Bayesian theorem is made before the full RD calculation for depth coding. If SKIP/Merge is selected by the Bayesian decision rule, the current treeblock is coded in SKIP/Merge mode, and the other inter and intra modes are no longer evaluated. Otherwise, if the Bayesian rule determines that the current treeblock is not SKIP/Merge, the other inter and intra modes are tested by the 3D-HEVC encoder.
The SKIP/Merge probability depends on the depth video itself. The SKIP/Merge distribution of the depth map for different QPs is given in Section 2. Table 1 shows that the percentage of SKIP/Merge varies markedly as the QP value increases, and that SKIP/Merge is selected more frequently than the other inter modes. Based on this characteristic, early SKIP/Merge selection based on the Bayesian decision rule is introduced to avoid testing unnecessary inter modes for the 3D-HEVC depth map. Let κ_1, κ_2, κ_3, and κ_4 denote the SKIP/Merge, Inter 2N×2N, Intra, and other inter modes, respectively. To determine the SKIP/Merge mode, the Bayesian theorem is used to obtain the posterior probability P(κ_i | x),
$$ P(\kappa_i \mid x) = \frac{p(x \mid \kappa_i)\, P(\kappa_i)}{p(x)} \tag{2} $$
where p(x) denotes the mixture density function, and P(κ_i) and p(x | κ_i) denote the a priori probability of κ_i and the conditional probability density function, respectively. According to the Bayesian decision theorem, the coding mode is chosen as
$$ \kappa^{*} = \arg\max_{\kappa_i} \; p(x \mid \kappa_i)\, P(\kappa_i) \tag{3} $$
Since P(κ_i) and p(x | κ_i) are unknown during depth encoding, we estimate them from the corresponding texture treeblocks (as shown in Fig. 2). Simulation results verified that the a priori probabilities are similar in the corresponding texture treeblocks [30]. On this basis, a tolerance threshold ε is introduced to choose the optimal inter mode κ_i for depth coding (Eq. (4)). If this condition is met, the inter mode prediction stops and SKIP/Merge is determined early for the current depth treeblock.
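A minimal sketch of the early SKIP/Merge test may help make the decision concrete. The posterior follows Bayes' theorem directly; the thresholding rule, however, is only an assumption made for illustration (SKIP/Merge must be the maximum a posteriori mode and its posterior must reach 1 − ε), since the exact form of the ε condition is not reproduced here. The function names and the sample probabilities are hypothetical.

```python
MODES = ("SKIP/Merge", "Inter2Nx2N", "Intra", "OtherInter")  # kappa_1 .. kappa_4


def posteriors(priors, likelihoods):
    """Return P(kappa_i | x) for every mode via Bayes' theorem."""
    joint = [lk * pr for lk, pr in zip(likelihoods, priors)]
    evidence = sum(joint)  # mixture density p(x)
    if evidence == 0.0:
        return [0.0] * len(joint)
    return [j / evidence for j in joint]


def early_skip_merge(priors, likelihoods, eps):
    """Assumed rule (not the paper's Eq. (4)): accept SKIP/Merge early when it
    is the MAP mode and its posterior is at least 1 - eps."""
    post = posteriors(priors, likelihoods)
    return post[0] == max(post) and post[0] >= 1.0 - eps


# Priors and likelihoods would be estimated from the corresponding texture
# treeblocks; the numbers below are made up.
priors = [0.87, 0.05, 0.04, 0.04]
static_block = [0.60, 0.20, 0.10, 0.10]
moving_block = [0.05, 0.60, 0.20, 0.15]
```

Under this assumed rule a larger ε accepts SKIP/Merge more often, which is at least consistent with the reported trend that larger ε values save more encoding time.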

B. ADAPTIVE CU PRUNING TERMINATION BASED ON BAYESIAN DECISION RULE
In Section 3.1, the Bayesian decision rule allows for an early SKIP/Merge mode selection. If the Bayesian decision rule selects SKIP/Merge, the current depth treeblock is coded in SKIP/Merge mode and the other depth levels are not evaluated. On the other hand, if the Bayesian rule rejects SKIP/Merge, the CU should be split. Therefore, an adaptive CU pruning termination method for depth treeblocks based on the Bayesian classifier is exploited to further decrease the computational complexity.
Each CU of the depth map has two possible states, not-splitting and splitting. Let η_1 and η_2 represent the not-splitting and splitting patterns, respectively. For a CU of the depth map, we use the Bayesian theorem for the adaptive CU pruning termination,
$$ P(\eta_i \mid y) = \frac{p(y \mid \eta_i)\, P(\eta_i)}{p(y)} \tag{5} $$
where P(η_i | y) denotes the a posteriori probability of choosing pattern η_i based on an observation y, P(η_i) denotes the a priori probability, and p(y | η_i) and p(y) denote the conditional probability density and the mixture density function, respectively. In fact, the prediction modes of a depth map treeblock are similar to those of the corresponding texture treeblock and the spatially adjacent treeblocks. Based on this analysis, a mode complexity parameter (DMC) is defined in Eq. (6), where σ_i denotes the weight factor, with Σ_i σ_i = 1, and ς denotes the direction weight factor, which is set according to the influence of the relevant treeblock on the current depth treeblock. The predictor set comprises the spatially adjacent treeblocks and the co-located texture video treeblock. Our method uses the DMC of a depth treeblock as the observed feature y in the Bayesian decision rule. It can be observed that most not-splitting CUs have a small DMC value, and the percentage of not-splitting CUs gradually increases as the DMC value becomes smaller. Based on extensive experiments, when the DMC is larger than 1.2, p(y | η_2) is larger than p(y | η_1).
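The following sketch illustrates how a neighbor-based complexity feature could feed the two-class posterior. The DMC formula used here is an assumption: it is taken to be a normalized weighted sum of per-predictor complexity scores, with the direction weight folded into each score. The function names, the score values, and the probabilities are hypothetical.

```python
def dmc(weights, scores):
    """Assumed DMC: sum_i sigma_i * c_i with sum_i sigma_i = 1, where c_i is a
    hypothetical complexity score of predictor i (spatial neighbor or
    co-located texture treeblock)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * c for w, c in zip(weights, scores))


def posterior_not_split(prior_ns, lik_ns, lik_s):
    """P(eta_1 | y) from Bayes' theorem for the two-class split decision."""
    prior_s = 1.0 - prior_ns
    evidence = lik_ns * prior_ns + lik_s * prior_s  # mixture density p(y)
    if evidence == 0.0:
        return 0.5  # no information: fall back to an uninformative value
    return lik_ns * prior_ns / evidence


# Co-located texture block weighted most, three spatial neighbors equally.
y = dmc([0.4, 0.2, 0.2, 0.2], [0, 1, 0, 2])
p_ns = posterior_not_split(prior_ns=0.7, lik_ns=0.5, lik_s=0.1)
```

A small feature value combined with a high not-splitting likelihood yields a posterior close to 1, which is the case where pruning can stop early.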
In 3D-HEVC, a wrong decision causes a loss of coding efficiency. The proposed algorithm uses CL_1 to represent the loss of compression efficiency when a splitting CU is wrongly classified as not-splitting, and CL_2 to represent the loss when a not-splitting CU is wrongly classified as splitting. Based on these definitions, the Bayesian cost BC(η_i | y) is calculated as in Eq. (7), where E[·] represents the expected value. When BC(η_1 | y) is smaller than BC(η_2 | y), mode η_1 should be chosen and the early CU pruning termination procedure can be run. To account for the Bayesian error that arises because P(η_i) and p(y | η_i) are predicted from previously encoded depth maps, we introduce the tolerance parameter χ in Eq. (8), where χ ≤ 1. When a depth map treeblock satisfies Eq. (8), the CU pruning process can be terminated. Since P(η_i) and p(y | η_i) are not known for the current depth treeblock, we obtain these parameters from previously coded frames of the depth map. Therefore, the proposed method does not require a separate training sequence. In practice, the proposed method uses a few depth map frames (about 5 frames) to compute the initial values of the prior and the likelihood. Consequently, the Bayesian decision rule makes no decisions in the training frames of the depth video; there, the full RDO of the HTM encoder supplies all the decisions needed to compute the parameters. The proposed scheme is then applied to the subsequent frames of the depth map.
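The training and termination logic above can be sketched as follows. Two forms are assumed here rather than taken from the paper: the Bayesian costs follow the standard Bayes-risk form, BC(η_1 | y) = CL_1 · P(η_2 | y) and BC(η_2 | y) = CL_2 · P(η_1 | y), and pruning terminates when BC(η_1 | y) < χ · BC(η_2 | y). The histogram estimator, its bin width, and all names are hypothetical.

```python
from collections import Counter


class SplitStats:
    """Histogram estimates of the prior and likelihood, collected from training
    frames that were coded with full RDO."""

    def __init__(self, bin_width=0.4):
        self.bin_width = bin_width
        self.counts = {"ns": Counter(), "s": Counter()}  # not-split / split

    def _bin(self, dmc):
        return int(dmc / self.bin_width)

    def observe(self, dmc, was_split):
        self.counts["s" if was_split else "ns"][self._bin(dmc)] += 1

    def prior(self, label):
        total = sum(sum(c.values()) for c in self.counts.values())
        return sum(self.counts[label].values()) / total if total else 0.0

    def likelihood(self, dmc, label):
        n = sum(self.counts[label].values())
        return self.counts[label][self._bin(dmc)] / n if n else 0.0


def terminate_pruning(stats, dmc, cl1, cl2, chi):
    """Assumed test: stop splitting when BC(eta_1|y) < chi * BC(eta_2|y)."""
    joint_ns = stats.likelihood(dmc, "ns") * stats.prior("ns")
    joint_s = stats.likelihood(dmc, "s") * stats.prior("s")
    # p(y) scales both sides equally, so unnormalized posteriors suffice.
    bc_not_split = cl1 * joint_s  # risk of wrongly choosing not-splitting
    bc_split = cl2 * joint_ns     # risk of wrongly choosing splitting
    return bc_not_split < chi * bc_split


stats = SplitStats()
for dmc_value, was_split in [(0.2, False)] * 8 + [(0.2, True)] + \
                            [(1.5, True)] * 3 + [(1.5, False)]:
    stats.observe(dmc_value, was_split)
```

Because χ ≤ 1 scales the right-hand side down, a smaller χ makes termination harder to trigger, trading time saving for a lower risk of a wrong decision.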

C. OVERALL ALGORITHM
In light of the analysis in the previous sections, the overall method consists of early SKIP/Merge mode selection and adaptive CU pruning termination. The specific steps of the overall proposed method are as follows:
Step 1: Start the mode decision for a depth treeblock.
Step 2: Perform early SKIP/Merge selection. Estimate the a priori probabilities P(κ_i) and the conditional densities p(x | κ_i) from the corresponding texture treeblocks.
Step 3: Evaluate the current depth treeblock with Eq. (4). If SKIP/Merge alone is selected as the optimal mode, determine its RD cost and go to Step 6.
Step 4: Execute adaptive CU pruning termination. Obtain P(η_i) and p(y | η_i) from previously coded frames of the depth map. Compute the DMC of the current treeblock with Eq. (6), and look up p(y | η_i) based on the DMC.
Step 5: If Eq. (8) is satisfied, terminate the CU pruning process; otherwise, split the current depth map treeblock.
Step 6: Determine the optimum mode.
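The steps above can be sketched as a recursive control flow. This is only an outline under simplifying assumptions: the two Bayesian tests are passed in as hypothetical predicates that depend on the CU depth alone, and the actual mode evaluation is replaced by a trace of visited CUs.

```python
def code_treeblock(skip_merge_early, prune_terminates, depth=0, max_depth=3):
    """Return the list of (depth, decision) pairs visited for one treeblock.

    skip_merge_early(depth) and prune_terminates(depth) are hypothetical
    stand-ins for the early SKIP/Merge test and the CU pruning test.
    """
    # Steps 2-3: early SKIP/Merge selection; on success jump to Step 6.
    if skip_merge_early(depth):
        return [(depth, "SKIP/Merge")]
    # Otherwise the remaining inter and intra modes are evaluated at this depth.
    trace = [(depth, "full mode decision")]
    # Steps 4-5: adaptive CU pruning termination, else split into four sub-CUs.
    if depth == max_depth or prune_terminates(depth):
        return trace
    for _ in range(4):
        trace += code_treeblock(skip_merge_early, prune_terminates,
                                depth + 1, max_depth)
    return trace


# A treeblock that fails the early test at depth 0 but passes it at depth 1.
trace = code_treeblock(lambda d: d >= 1, lambda d: False)
```

The trace makes the saving visible: every early exit removes a whole subtree of mode evaluations that the exhaustive search would otherwise visit.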

IV. RESULTS AND ANALYSIS
To assess the performance of the proposed approach, eight MVD videos from the CTC [35] are used in experiments on HTM 16.1 [36]. The test sequences include synthetic videos with high-accuracy depth maps and natural videos with estimated depth maps (see Table 3). The simulation settings are as follows: three-view case; QP pairs (texture, depth): (25, 34), (30, 39), (35, 42), and (40, 45). The "VSRS-1D-Fast" software is used to render the synthesized views [37]. The Bjontegaard Delta Bitrate (BDBR) [38] is used to evaluate the coding performance of the proposed algorithm. The coding runtime saving "TS" is employed to measure the complexity, which is defined as
$$ TS = \frac{Time_{Anchor} - Time_{proposed}}{Time_{Anchor}} \times 100\% \tag{9} $$
where Time_proposed and Time_Anchor represent the coding runtime of the proposed algorithm and of the anchor, respectively.
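As a small worked example of the runtime metric, the sketch below assumes the usual definition TS = (Time_Anchor − Time_proposed) / Time_Anchor × 100%. The function name and the runtimes are illustrative and do not correspond to any measured sequence.

```python
def time_saving(time_anchor: float, time_proposed: float) -> float:
    """Coding runtime saving in percent relative to the anchor encoder."""
    if time_anchor <= 0:
        raise ValueError("anchor runtime must be positive")
    return (time_anchor - time_proposed) / time_anchor * 100.0


# Illustrative runtimes in seconds: the faster encoder needs 150 s instead of 200 s.
ts = time_saving(time_anchor=200.0, time_proposed=150.0)
```

A positive value means the proposed encoder is faster than the anchor; a negative value would indicate a slowdown.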

A. RESULTS OF INDIVIDUAL STAGE
Tables 3 and 4 show the results of the individual stages of the proposed algorithm, namely early SKIP/Merge mode selection (ESMMS) and adaptive CU pruning termination (ACUPT). Table 3 shows the performance of the ESMMS approach with three different ε values compared with HTM 16.1. It can be observed from Table 3 that ESMMS greatly decreases the depth encoding time with similar coding performance. It saves 37.6%, 45.9%, and 52.1% of the depth encoding time when ε = 0.1, 0.5, and 0.9, respectively. Sequences with large homogeneous regions, such as "Poznan_Hall2", achieve a very large reduction in coding runtime, and the reduction is still evident for large-motion sequences, such as "Undo_Dancer". Meanwhile, the loss of RD performance is acceptable, with a BDBR increase of less than 1.16% on the synthesized views when ε = 0.5 and 0.9. The simulation results demonstrate that the ESMMS method efficiently skips unnecessary inter modes. Table 4 shows that about 10.9%, 12.5%, and 13.7% of the encoding time is saved on average by ACUPT when χ = 0.2, 0.5, and 0.8, respectively. Meanwhile, the average BDBR increase of the synthesized views is less than 0.27%, which is a negligible loss of coding efficiency. Thus, the ACUPT approach maintains the coding performance while reducing the CU processing time.
B. THE PERFORMANCE OF THE PROPOSED OVERALL SCHEME
Table 5 gives the results of the overall scheme, which combines the ESMMS and ACUPT methods. Based on the simulation results in Section 4.1, we set ε = 0.5 and χ = 0.8, because these two parameters balance the RD performance against the encoder complexity. It can be found from Table 5 that the overall scheme saves 51.2% of the coding time with a 1.07% BDBR increase for the synthesized views. Fig. 3 illustrates more detailed results of the overall approach for the test sequences "Kendo" and "Undo_Dancer". It can be seen from Fig. 3 that the overall scheme achieves consistent encoding runtime savings while maintaining similar RD performance. In addition, the coding time savings increase when the compression bitrate increases and the QP decreases. Specifically, as the QP increases, the percentage of SKIP/Merge selected by ESMMS and the percentage of not-splitting CUs selected by ACUPT both increase.

C. COMPLEXITY ANALYSIS
As described in Section 3.2, the parameters of the density distributions are first trained offline and then updated using several frames (about 5 frames). Therefore, only the compression time of the training frames influences the complexity. Based on this, the ratio of training time OT is defined as
$$ OT = \frac{T_t}{T_t + T_o} \times 100\% \tag{10} $$
where T_t and T_o denote the coding time of the training frames and of the other test frames, respectively. Fig. 4 shows OT for the proposed scheme. It can be observed that the training frames take 4.4% of the entire compression time on average, which is an acceptable ratio. Furthermore, the encoding runtime consumed by the training frames is included in the complexity reduction reported for the entire algorithm. The results in Fig. 4 show that the proposed method can be used in practical applications.
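The training overhead can be checked with a one-line computation. The sketch assumes that OT is the share of the training frames in the entire compression time, OT = T_t / (T_t + T_o) × 100%, which matches the description of OT as a fraction of the entire compression time; the function name and the runtimes are illustrative.

```python
def training_ratio(t_train: float, t_other: float) -> float:
    """Share of the total coding time spent on training frames, in percent."""
    total = t_train + t_other
    if total <= 0:
        raise ValueError("total coding time must be positive")
    return t_train / total * 100.0


# Illustrative runtimes in seconds: 5 s of training frames out of 100 s overall.
ot = training_ratio(t_train=5.0, t_other=95.0)
```

Since the training frames are coded with full RDO, this share is the part of the runtime on which the fast decisions bring no saving.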

D. RESULTS OF THE PROPOSED SCHEME COMPARED WITH THE STATE-OF-THE-ART METHODS
In addition to the HTM 16.1 encoder, the proposed overall scheme is also compared in Figs. 5 and 6 with well-known fast 3D-HEVC methods, including EBIMS [18], CRDSD [20], FMDGI [23], EISSI [25], and FMDTE [30]. Table 5 shows that the overall method saves 51.2% of the coding time, while the BDBR increases by only 1.07% for the synthesized views. Figs. 5 and 6 show that the five recent methods have good coding performance, yet their encoding runtime savings are smaller than that of the proposed scheme. Specifically, among these methods the BDBR of the EISSI scheme is the smallest, while that of the FMDGI approach is the largest. For all the 3D sequences, the proposed overall algorithm achieves the largest computation reduction: compared with the five recent methods, a further 8.0%-29.7% of the depth encoding time is saved. Moreover, the BDBR loss is negligible, with less than a 0.7% BDBR increase compared with EBIMS, CRDSD, EISSI, and FMDTE, and with better coding efficiency than the FMDGI method. Overall, the proposed scheme achieves the best encoding time saving among the compared methods with almost equal or better RD performance.

V. CONCLUSION
To decrease the encoding time, this article introduces a fast scheme for the depth map compression of 3D-HEVC, which consists of an early SKIP/Merge mode selection scheme and an adaptive CU pruning termination scheme. The proposed scheme utilizes the Bayesian decision rule and the correlations between the corresponding texture video and spatially adjacent treeblocks to analyze the treeblock properties of the depth map and to avoid evaluating unnecessary modes. Simulation results indicate that the proposed algorithm effectively decreases coding complexity with only a negligible loss of RD performance. Furthermore, the computational complexity and the coding efficiency can be balanced by adjusting the parameters.