Efficient Unequal Error Protection Techniques for Tile-Based Transmission of HEVC Videos

The video bitstream compressed by the high efficiency video coding (HEVC) standard is extremely vulnerable to channel errors. For robust transmission of such compressed videos, techniques can be formed based on the specific characteristics of the compression standard. Exploiting the new coding features introduced in HEVC, such as flexible block partitioning and tiles, this paper proposes unequal error protection (UEP) schemes that aim to enhance the quality of the important regions. The proposed algorithms are implemented in two and three levels. For the two-level UEP, tiles are prioritized based on their motion density, which is defined as the ratio of motion vector magnitudes to the block size in a compressed video frame. Furthermore, a three-level UEP is proposed to improve the protection of low-important tiles, which may include moving objects. For this purpose, clustering algorithms utilizing kernel density estimation (KDE) and density-based spatial clustering of applications with noise (DBSCAN) are modified based on the motion density of coding tree units (CTUs). In effect, this implements an object detection algorithm in the compressed domain. Simulation results confirm that the proposed UEP schemes achieve better objective quality than conventional UEP and equal error protection (EEP) approaches.


I. INTRODUCTION
The usage of services such as video conferencing, multimedia messaging, video sharing and Internet TV has grown at an unprecedented rate in recent years. During transmission, errors caused by network congestion and interference present significant challenges to achieving high video quality at the receiver's end. Therefore, adopting an efficient strategy in the transmission system is crucial to deliver video data at a high quality for an available bitrate. In this role, compression and error correction are essential techniques to maintain an adequate quality of the delivered video stream. In compression, redundancies are reduced so that the compressed video can be transmitted over the band-limited channel. High efficiency video coding (HEVC) is the latest compression standard, whose compression efficiency is about 50% higher than that of its predecessor, H.264/advanced video coding (AVC). However, due to its high compression rate, the resulting bitstream becomes highly vulnerable to errors that occur during transmission [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Nilanjan Dey.
Unequal error protection (UEP) is recognized as one of the most effective techniques for protecting the compressed video bitstream against channel errors. In this method, different parts of the bitstream are unequally protected based on their importance. The application of UEP to data transmission over a noisy channel was discussed in [2]. Since then, the concept has been widely investigated for different applications. The early works relating to UEP schemes in video communications are presented in [3]-[5]. Over the last decade, a number of UEP techniques were proposed to enhance the protection of the compressed video bitstream. These techniques are based on specific characteristics of the compressed video such as scalability [1], group of pictures [5], flexible macroblock ordering [6] and data partitioning [7]. Despite the effectiveness of these techniques for videos compressed by the H.264 standard, some of them, including flexible macroblock ordering and data partitioning, are not applicable to HEVC. Recently, some studies investigated UEP techniques suitable for the HEVC bitstream, which are based on scalable video coding [1], [8]. Scalable video coding results in a loss of compression efficiency compared to non-scalable coding. In addition, due to the strong spatio-temporal dependencies between scalable video layers, any error occurring in the hierarchical encoding significantly deteriorates the quality of the received video.
In HEVC, video frames can be partitioned into a number of tiles [9]. Tiles are rectangular groups of coding tree units (CTUs), which are separated by vertical and/or horizontal boundaries. They are designed to break prediction dependencies across these boundaries. This is done by disabling both intra-prediction and motion vector prediction across tile boundaries. Therefore, each tile can be processed independently. This feature improves video coding efficiency, as multiple tiles can be processed at the same time. Video applications, such as video conferencing and video surveillance, benefit from the region-of-interest functionality of tiles. Region based prioritized encoding of tiles can be done to achieve better visual quality in a specific region of the frames. This idea is also extended to the transmission of tiles, which are prioritized based on the region of interest. Furthermore, the flexibility in defining tiles inside a picture increases their suitability for video applications.
In previous work, a content aware UEP scheme was presented for the transmission of HEVC frames [10]. A motion density based scheme was applied to identify important tiles in video frames. This paper extends the UEP scheme presented in [10] to provide a low-complexity and more efficient protection technique for the HEVC bitstream. This is done based on the high temporal correlation between consecutive frames. Due to this correlation, temporally neighbouring tiles can be used to determine the importance of the current tile. As a result, the newly proposed technique requires less processing to determine the importance of tiles in the current frame. The importance of tiles is expressed in two levels, where the motion density of each tile is compared with a threshold value.
An improvement in video quality can be achieved by increasing the protection of high-important tiles, whose motion densities are greater than the threshold value. There is usually a non-uniform distribution of motion density around the threshold value. Tiles containing moving objects tend to have very high motion densities compared to other tiles. These large motion densities affect the threshold value. As a result, most of the tiles in a frame are classified as low-important. In many situations, tiles that are in the neighbourhood of a high-important tile may contain part of a moving object. When such tiles are transmitted as low-important ones, that part of the object can be decoded with an error. Although this may not significantly impact the objective video quality, it degrades the quality of the video perceived by the viewer. Therefore, it is expected that by allocating better protection to such neighbouring tiles, the video quality will be improved. For this purpose, a three-level UEP scheme is proposed, where low-important tiles in the video frame are further divided into low-high and low-low sub-levels. Simulation results indicate an improvement in video performance compared to the previous UEP and equal error protection (EEP) techniques.
The paper is organized as follows. Section II discusses motion density, which is applied to detect important regions in a video frame. An analysis of the correlation between tiles in consecutive frames is done in Section III. The proposed two-level UEP scheme is presented in Section IV. Section V discusses simulation results for the two-level UEP method. Section VI discusses the implementation of the three-level UEP scheme. Simulation results for the three-level UEP are given in Section VII. Finally, conclusions are provided in Section VIII.

II. PRELIMINARIES
A. CORRELATION BETWEEN COLLOCATED TILES IN CONSECUTIVE FRAMES
In video sequences, the correlation coefficient between blocks is calculated as follows:

$$\rho = \frac{\sum_{i=1}^{N_P}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{N_P}(X_i-\bar{X})^2\,\sum_{i=1}^{N_P}(Y_i-\bar{Y})^2}} \qquad (1)$$

where $N_P$ is the total number of pixels inside one block. $X_i$ and $Y_i$ are the luminance values of the $i$-th pixel in the blocks of consecutive frames. $\bar{X}$ and $\bar{Y}$ are the mean values of the $X_i$ and the $Y_i$, respectively. As expected, this correlation coefficient lies between −1 and 1. In this paper, if $|\rho| < 0.5$, a low correlation between blocks is concluded. Otherwise, the considered blocks have a high correlation. A high value of the correlation coefficient means that there is a high similarity between the considered blocks. In this case, the blocks either have consistent motion activity or belong to the same object.
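As an illustration of this definition, the following minimal sketch computes the coefficient for two blocks given as flat lists of luminance values. The function names and the handling of a completely flat block (returning 0) are assumptions made for the example, not part of the paper.

```python
import math


def block_correlation(x, y):
    """Pearson correlation between two equally sized luminance blocks.

    x and y are flat lists holding the luminance of each pixel in the
    block of the previous and the current frame, respectively.
    """
    if len(x) != len(y) or not x:
        raise ValueError("blocks must be non-empty and of equal size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    if den == 0:
        # Assumption: a flat block has no defined correlation; report 0.
        return 0.0
    return num / den


def is_highly_correlated(rho, limit=0.5):
    """Apply the |rho| >= 0.5 rule used in the text."""
    return abs(rho) >= limit
```

The same function can be applied to collocated tiles by passing all pixels of the tile instead of the pixels of a single block.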
This definition can be applied to determine the correlation between collocated tiles in two consecutive frames. For this purpose, $N_P$ in equation (1) represents the number of pixels inside a tile. Figure 1 shows the correlation coefficients calculated for tiles in the Sunflower.yuv and Pedestrian.yuv video sequences. In the obtained results, correlation coefficients fluctuate between 0.5 and 0.98 for the 10th and 13th tiles of the given video sequences, respectively. For all frames, the correlation coefficient of the 10th tile of the Sunflower.yuv video is greater than or equal to 0.85. This is because the tile contained the same object throughout the video sequence. On the other hand, the correlation coefficients for the 13th tile are comparatively lower than those of the 10th. This is due to the presence of motion activity in the tile.
Collocated tiles may not share similar motion information. A tile in a frame may have higher or lower motion activity compared to the collocated tile of the previous frame. Therefore, measuring the motion activity of tiles is essential.

B. MOTION DENSITY
In HEVC, compressed video frames consist of a number of coding tree units (CTUs), which are divided into coding units (CUs). CUs may be of variable sizes, which are determined by the size of the CTU and the coding tree depth. These CUs represent the basic processing unit, to which a coding mode is assigned. When a CU is encoded in inter-prediction mode, it splits into one, two or four prediction units (PUs). The motion vectors of these PUs represent the displacement of CUs between two frames. Therefore, the motion vector magnitude, calculated from the absolute values of the horizontal and vertical motion vector components, and the size of a CU are both important for determining the motion activity of a CU. This motivates the use of motion density (MD) to measure the motion activity of CUs, which is defined as follows:

$$MD_{CU} = \frac{MV}{S} \qquad (2)$$

where $MV$ is the total magnitude of motion vectors inside a CU and $S$ is the index determined based on the size of the CU. For a CU including $M \times M$ pixels, where $M = 2^n$ and $3 \le n \le 6$, $S$ is given by equation (3). The motion density of a CTU is determined as the average of the motion densities of all of its CUs. That is:

$$MD_{CTU} = \frac{1}{N_{CU}}\sum_{k=1}^{N_{CU}} MD^{k}_{CU} \qquad (4)$$

where $N_{CU}$ is the number of CUs and $MD^{k}_{CU}$ is the motion density of the $k$-th CU inside the CTU, calculated by equation (2).
Similarly, the motion density of a tile can be defined on the basis of the motion densities of its CTUs, as follows:

$$MD_{tile} = \frac{1}{N_{CTU}}\sum_{k=1}^{N_{CTU}} MD^{k}_{CTU} \qquad (5)$$

where $N_{CTU}$ is the number of CTUs inside the tile and $MD^{k}_{CTU}$ is the motion density of the $k$-th CTU obtained from equation (4).
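The aggregation from CUs to CTUs to tiles can be sketched as follows. This is only an illustration under stated assumptions: each CU is represented by a hypothetical pair `(mv, s)` holding its total motion vector magnitude and its size index, the size index is supplied by the caller because its defining equation is not reproduced here, and the tile value is assumed to be a plain average over its CTUs.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    values = list(values)
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


def cu_motion_density(mv, s):
    """Motion density of one CU: motion vector magnitude over size index."""
    return mv / s


def ctu_motion_density(cus):
    """Average motion density over the CUs of one CTU.

    cus is a list of (mv, s) pairs (hypothetical representation).
    """
    return mean(cu_motion_density(mv, s) for mv, s in cus)


def tile_motion_density(ctus):
    """Average motion density over the CTUs of one tile (assumed average).

    ctus is a list of CTUs, each given as a list of (mv, s) pairs.
    """
    return mean(ctu_motion_density(cus) for cus in ctus)
```

For example, `tile_motion_density([[(4, 2), (6, 3)], [(8, 2)]])` averages a CTU with density 2.0 and a CTU with density 4.0.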
Figure 2 shows the motion density of tiles of the Sunflower.yuv and Pedestrian.yuv videos. It is observed that a tile has different motion densities in different frames. Even when a tile shares visual similarities across consecutive frames, as indicated by Figure 1, it may have different motion activity in each of them.

III. ANALYSIS OF MOTION DENSITY OF TILES IN CONSECUTIVE FRAMES
Although collocated tiles in consecutive frames are visually similar, they may not have similar motion activity, and hence may have different values of motion density. This is mainly evident when the motion information of blocks in consecutive frames is not highly correlated. Such a low correlation arises when the position of a fast moving object differs between two consecutive frames. In this case, the motion information of blocks in the same position of adjacent frames may not be consistent. As a result, the motion density of the collocated tile in the previous frame may not directly represent the motion activity of the tile in the current frame.
To overcome the above-mentioned problem, motion activity can be measured over a larger area. This helps to compensate for the difference in motion activity that arises due to the fast motion of small objects. Instead of calculating the correlation between collocated tiles, the correlation between frames based on the motion density of tiles is considered. For this purpose, equation (1) can be modified as follows:

$$\rho_{MD} = \frac{\sum_{i=1}^{N_T}\left(MD^{n}_{tile,i}-\overline{MD}^{\,n}_{tile}\right)\left(MD^{n-1}_{tile,i}-\overline{MD}^{\,n-1}_{tile}\right)}{\sqrt{\sum_{i=1}^{N_T}\left(MD^{n}_{tile,i}-\overline{MD}^{\,n}_{tile}\right)^2\,\sum_{i=1}^{N_T}\left(MD^{n-1}_{tile,i}-\overline{MD}^{\,n-1}_{tile}\right)^2}} \qquad (6)$$

where $N_T$ is the total number of tiles in a frame, $MD^{n}_{tile,i}$ and $MD^{n-1}_{tile,i}$ are the motion densities of the $i$-th tile in the $n$-th and $(n-1)$-th frames, respectively, and the overlined terms are the corresponding mean values over the tiles of each frame. In the Sunflower.yuv video, correlation coefficients greater than 0.5 and 0.8 are realized for a total of 47 and 149 frames, respectively. Similarly, in the Pedestrian.yuv video, 78 and 173 frames have correlations higher than 0.5 and 0.8, respectively. Considering this high correlation for a large number of frames, it is concluded that the motion density of a tile in the previous frame can be utilized to estimate the motion activity of the collocated tile in the current frame. This makes it possible to evaluate the importance of a tile in the entire bitstream on the basis of the motion density of the tiles in its previous frame.

IV. PROPOSED TWO-LEVEL UEP APPROACH
In the previous section, a high correlation between consecutive frames based on the motion density of tiles was observed. This means that the motion densities of tiles in the previous frame can be utilized for determining the importance of tiles in the current frame. This is accomplished by defining the difference between the motion densities of the considered tiles in equation (7), where $N_f$ is the total number of video frames, and $MD^{i}_{k}$ and $MD^{i-1}_{k}$ are the motion densities of the $k$-th tile in the $i$-th and $(i-1)$-th frames, respectively.
Let us consider the threshold for evaluating the priority of tiles based on the average of the motion densities of tiles in the same frame. Then, for the $k$-th tile of the $(i-1)$-th frame, the threshold is represented by: The difference between the motion density of the $k$-th tile in the $(i-1)$-th frame and its threshold value is given by: Assume MD From (7) and (11), Thus, the $k$-th tile in the $i$-th frame ($T^{i}_{k}$) is transmitted as a low-important tile.
On the other hand, when $MD^{i}_{k} > 0$, $T^{i}_{k}$ is transmitted as a low-important tile if, Otherwise, $T^{i}_{k}$ is transmitted as a high-important tile. Alternatively, consider MD From (7) and (15), it is understood that, Hence, $T^{i}_{k}$ is represented as a high-important tile. On the other hand, when Otherwise, it is a low-important tile.
The common term appearing in equations (14) and (18) can be represented as the threshold value for determining the priority of the $k$-th tile in the $i$-th frame. This is given by:

To determine the priority of tiles in the first frame of the video, the average value of the motion densities of tiles in the same frame is considered. If the motion density of a tile is higher than the average, it is considered a high-important tile. Otherwise, the tile is a low-important one. For all other frames, $2 \le i \le N_f$, the UEP method is implemented based on the following algorithm:

In order to implement this motion density based UEP scheme, the HM encoder can be modified to calculate the motion density and prioritize tiles during the encoding process. Let $M$ be the number of cores used in encoding a video bitstream. The configuration for the HEVC encoder is set to generate $N_T$ tiles per frame. For simplicity of the analysis, it is assumed that all tiles have a similar computational complexity. Let $t$ be the total processing time for each tile by a core, which includes calculation of the motion density and evaluation of the tile's priority. Then, the processing time required to prioritize a tile based on the motion density of tiles in the same frame is approximately given by:

When a tile's priority is evaluated based on the motion density of tiles from the previous frame, it can be prioritized without knowledge of the other tiles of the current frame. Therefore, the processing time of a tile is reduced to $t$. For one frame, the difference in encoding time is given by:

For $F_n$ frames, the amount of time saved in the encoding process is given by:

V. SIMULATION ANALYSIS
The results are compared with the previous scheme (UEP_OLD) proposed in [10] and equal error protection (EEP). Another technique, which prioritizes tiles based on the number of PUs (UEP_PU), is also considered. In [8], it has been mentioned that the number of PUs inside tiles can be used to determine the processing complexity of tiles. Hence, tiles are prioritized based on their processing complexity.
The error protection is done by a multi-rate (3730, 2238) quasi-cyclic low-density parity check (QC-LDPC) code [12], which is effective in combating bit errors occurring in the physical layer. Out of 2238 message bits, 746 bits are protected with high priority at rate 0.5, while the remaining 1492 bits are protected with low priority at rate 0.67. In this case, the average code rate is approximately 0.6. For EEP, a code with a rate of 2/3 is formed based on the method described in [13]. To construct blocks of bits that match the specifications of the channel code, zero padding is applied to each tile. Furthermore, the number of message bits in the chosen error protection codes is matched to the size of tiles to significantly reduce the number of zero bits. In the provided simulations, the number of padded zero bits is kept below approximately 2% of the overall number of transmitted bits. Codewords are modulated by binary phase-shift keying (BPSK) and transmitted over an additive white Gaussian noise (AWGN) channel.
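The code parameters above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration; it assumes that the low-priority rate quoted as 0.67 is exactly 2/3, and the function name `uep_summary` is hypothetical.

```python
from fractions import Fraction


def uep_summary(levels):
    """Return (message bits, coded bits, average code rate).

    levels is a list of (message_bits, code_rate) pairs, one per
    priority level of the UEP code.
    """
    message = sum(bits for bits, _ in levels)
    coded = sum(Fraction(bits) / rate for bits, rate in levels)
    return message, coded, Fraction(message) / coded


# Two-level UEP from the text; the rate 0.67 is assumed to mean 2/3.
two_level = [(746, Fraction(1, 2)), (1492, Fraction(2, 3))]
message, coded, average_rate = uep_summary(two_level)
print(message, coded, float(average_rate))
```

Under this assumption the two levels contribute 1492 and 2238 coded bits, which adds up to the codeword length of 3730 and gives an average rate of 2238/3730 = 0.6, in agreement with the figures quoted in the text.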
For video quality assessment, peak signal-to-noise ratio (PSNR) is the most commonly used method [14]. There are other objective methods, including the structural similarity index (SSIM) and the video quality metric (VQM), but they are not used as frequently as PSNR [15]. PSNR does have the limitation of disregarding viewing conditions and the characteristics of the human visual system. However, for a particular video content and similar encoding configurations, PSNR values provide a reliable interpretation of the video quality [14], [15]. In order to illustrate the subjective benefits of the technique, a visual comparison of original and received video frames is also provided. Therefore, this paper compares different UEP techniques based on PSNR and on a visual comparison of video frames.
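For reference, PSNR is conventionally computed from the mean squared error between the original and the received frame. The sketch below uses this standard definition; the 8-bit peak value of 255 and the flat-list frame representation are assumptions for the example, since the paper does not state how its PSNR values were computed.

```python
import math


def psnr(reference, received, peak=255):
    """PSNR in dB between two frames given as flat lists of samples.

    Assumes 8-bit samples by default (peak value 255).
    """
    if len(reference) != len(received) or not reference:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

A sequence-level value is typically obtained by averaging the per-frame results.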

1) CONFIGURATION OF TILES CONSIDERED FOR SIMULATIONS
Due to their partitioning flexibility, tiles can be arranged in different ways for video transmission. In general, there are two different tile partitioning configurations, uniform and non-uniform. Non-uniform tile partitioning methods aim to improve the speed of HEVC encoding by balancing the spatial load across different tiles [16], [17]. On the other hand, uniform methods have been equally effective for videos with a balanced workload distribution. Uniform tile partitioning also avoids the complexity of determining optimal tile boundaries for each frame. In this paper, uniform tile partitioning is considered.
It is well known that increasing the number of tiles improves encoding and decoding time. However, a high number of tiles may affect coding efficiency and visual quality [16]. By contrast, lowering the number of tiles may improve visual quality, but it may significantly increase the load imbalance between tiles and the encoding time. Table 1 presents PSNR values for video sequences encoded with different numbers of tiles. These results indicate that the received video quality does not vary significantly when the number of tiles is kept at or below 5 × 5. In this paper, 4 × 4 and 5 × 5 tile partitionings are applied. Furthermore, in simulations, each tile is encapsulated into a slice with a distinct header. At any bitrate, a portion of the bits is used to carry the header information of slices. As the inclusion of slice headers may result in a reduction of video quality at any given bitrate, the simulations are conducted at a higher bitrate, 2000 kbps, in order to minimize the impact of slice headers. In this case, headers consume only about 0.5% of the total compressed video bits. It should also be considered that encapsulating tiles within slices helps to avoid the transmission of extra bits to the decoder. These bits indicate the row and column locations of tiles, loop filter control and the bitstream location information of all but the first tile in each frame.

B. SIMULATION RESULTS FOR TWO-LEVEL UEP TECHNIQUE
Figure 4 shows the rate-distortion performance of the considered Sunflower.yuv and Pedestrian.yuv videos when the proposed UEP technique is applied. At $E_b/N_0 = 3.2$ dB, the performance of the different configurations is noticeably close. Although the difference in PSNR is larger at lower bitrates, the configurations perform more or less the same at higher bitrates (>2000 kbps). This is because at lower bitrates the video quality is compromised by the proportion of bits used to carry header information. At higher bitrates, the proportion of bits carrying header information is significantly reduced and the video quality is improved. Considering this result, the simulations henceforth are conducted for 5 × 5 tiles per frame at 2000 kbps.
Figure 6 shows the rate-distortion performance of the considered videos for the UEP and EEP techniques. At $E_b/N_0 = 3.2$ dB, the proposed UEP scheme shows the best performance. At a bitrate of 2000 kbps, it outperforms UEP_OLD by 0.92 and 0.25 dB in the Sunflower.yuv and Pedestrian.yuv videos, respectively. To demonstrate a subjective comparison at the same bitrate, Figure 7 shows reconstructed frames of the Sunflower.yuv and Pedestrian.yuv video sequences at

VI. PROPOSED THREE-LEVEL UEP METHOD
The three-level UEP scheme aims to increase the protection of those tiles that are positioned in the neighbourhood of high-important tiles. As these neighbouring tiles may contain some parts of moving objects, their protection is necessary to improve the overall video performance. Therefore, in order to identify tiles that contain moving objects, an object detection algorithm is presented, which is implemented in the compressed domain. The algorithm consists of two main steps, which are segmentation and clustering of CTUs based on their motion density values. In the first step, CTUs in the frame are classified into two groups, $G^{1}_{CTU}$ and $G^{2}_{CTU}$, based on the value of their motion density, which is calculated by equation (2). In the second step, the CTUs with high motion density are spatially clustered.

To perform KDE, a kernel function centered at the motion density value is formed for every CTU, which is given by [18]:

where $K[x]$ is a Gaussian kernel function defined as $K[x] = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}x^{2}\right)$. In this equation, $h$ is called the bandwidth. For a Gaussian kernel, it is given by:

where $N$ is the number of CTUs inside a frame and $\sigma_{MD_{CTU}}$ is the standard deviation of the motion densities of CTUs. After determining the kernel functions $K_f$, the probability density function (PDF) is estimated as follows:

where $K^{i}_{f}$ is the kernel function centered at the motion density of the $i$-th CTU.
The threshold for classifying CTUs is calculated as the first local minimum of $f(x)$. This threshold is represented as $\beta$. If the motion density of a CTU is higher than $\beta$, it is included in $G^{1}_{CTU}$. By contrast, CTUs whose motion densities are lower than $\beta$ belong to $G^{2}_{CTU}$. To study the effectiveness of this technique, the motion density of CTUs in different frames of the Hall monitor.yuv video is evaluated. For the 37th frame, most of the CTUs have a motion density in the range of 0 to 5. By contrast, the motion densities of some CTUs are larger than 40. The threshold for classifying CTUs ($\beta$) is calculated as 7.6. This classification of CTUs is shown in Figure 9(b).
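The segmentation step can be sketched as below. This is a minimal illustration and not the authors' implementation: it assumes the standard Gaussian kernel density estimate, it assumes Silverman's rule of thumb for the bandwidth (the paper's own bandwidth expression is not reproduced here), and it locates the first local minimum by scanning a uniform grid. All function names are hypothetical.

```python
import math
import statistics


def gaussian_kernel(x):
    """Standard Gaussian kernel K[x]."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth (assumption: Silverman's rule)."""
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("all motion densities are equal; no threshold exists")
    return (4.0 * sigma ** 5 / (3.0 * len(values))) ** 0.2


def estimate_pdf(values, h):
    """Return f(x), the kernel density estimate of the motion densities."""
    n = len(values)
    return lambda x: sum(gaussian_kernel((x - v) / h) for v in values) / (n * h)


def first_local_minimum(f, lo, hi, steps=2000):
    """Scan [lo, hi] on a uniform grid and return the first interior minimum."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [f(x) for x in xs]
    for i in range(1, steps):
        if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]:
            return xs[i]
    return None


def classify_ctus(md_values):
    """Split CTU indices into the high (G1) and low (G2) motion groups."""
    h = silverman_bandwidth(md_values)
    f = estimate_pdf(md_values, h)
    beta = first_local_minimum(f, min(md_values), max(md_values))
    if beta is None:
        # Unimodal density: no threshold, every CTU stays in the low group.
        return None, [], list(range(len(md_values)))
    g1 = [i for i, v in enumerate(md_values) if v > beta]
    g2 = [i for i, v in enumerate(md_values) if v <= beta]
    return beta, g1, g2
```

Because the bandwidth rule is an assumption, the threshold produced by this sketch for a given frame will generally differ from the value of $\beta$ reported in the text.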

B. SPATIAL CLUSTERING OF CTUs WITH HIGH MOTION DENSITY
This process applies the density-based spatial clustering of applications with noise (DBSCAN) technique to identify and extract the clusters of CTUs belonging to $G^{1}_{CTU}$ [19]. Let $m$ be the number of CTUs in $G^{1}_{CTU}$. A bivariate data set $X$ is then formed as a set of $m$ data points, $X = \{D_i = (x_i, y_i) \mid 1 \le i \le m\}$, where $(x_i, y_i)$ represents the x- and y-axis locations of the corresponding CTU.
Let $D_p$ and $D_q$, $1 \le p \le m$, $1 \le q \le m$, be two data points in $X$. Then, the $\varepsilon$-neighbourhood of data point $D_p$ includes $D_q$ if $d(D_p, D_q) \le \varepsilon$. Here, $\varepsilon$ is the radius of the neighbourhood region, and $d(D_p, D_q)$ is the Euclidean distance between the two data points $D_p$ and $D_q$. It is calculated as follows:

$$d(D_p, D_q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$

Thus, the $\varepsilon$-neighbourhood of data point $D_p$ is represented as follows:

$$N_\varepsilon(D_p) = \{D_q \in X \mid d(D_p, D_q) \le \varepsilon\}$$

Let $\eta$ be the minimum number of data points required to form a cluster. An arbitrary data point $D_a$, $D_a \in X$, is called direct density reachable from $D_p$ when the following conditions are satisfied:
1) $D_a \in N_\varepsilon(D_p)$.
2) $|N_\varepsilon(D_p)| \ge \eta$.
Based on the DBSCAN algorithm [19], a cluster of data points can be formed when they satisfy the following properties:
1) All the data points must be mutually density-connected.
2) If a data point is density-connected to any data point of the cluster, it is part of the cluster as well.
The parameters $\varepsilon$ and $\eta$ influence the performance of the DBSCAN algorithm. For different data sets, different values of $\varepsilon$ and $\eta$ provide the optimal clustering results. Generally, $\eta$ is determined based on the number of dimensions and the presence of noise in the data set. As a rule of thumb, $\eta = 2 \times$ (number of dimensions of the data points) can be used [20]. However, larger values are usually preferred for data sets with noise to yield more significant clusters.
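To make the clustering step concrete, the following is a compact, unoptimized sketch of the generic DBSCAN procedure applied to CTU grid positions. It follows the textbook algorithm of [19] rather than the modified version used in this paper, and the names `region_query` and `dbscan` are chosen for the example. A neighbourhood is taken to include the point itself.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (inclusive)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster  # i is a core point and starts a new cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
        cluster += 1
    return labels
```

With CTU positions `[(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (10, 10)]`, `eps = 1` and `min_pts = 3`, the six adjacent CTUs form one cluster and the isolated CTU is labelled as noise.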
As previously stated, the $m$ data points represent the locations of CTUs. Therefore, there is a limit to the maximum number of data points that can occur within a given neighbourhood. Let $\eta_{max}$ be the maximum number of data points in a neighbourhood with radius $\varepsilon$. Table 2 shows the values of $\eta_{max}$ for different values of $\varepsilon$. To form a cluster, the minimum value of $\varepsilon$ has to be 1, because the closest CTUs are at a distance of 1 unit from one another. If $\varepsilon$ is set to 1, the number of points required to form the densest cluster becomes 5, as seen from the table. If $\eta = \eta_{max}$, the DBSCAN algorithm only searches for the densest clusters, which may not always be present in the data. Therefore, to choose an optimum value of $\eta$, neighbourhood density is defined as follows:

Let $Dist_i$ be the distances from the $i$-th data point to the other $m - 1$ data points, sorted in ascending order. In this case, $A = \{d(D_i, D_{\eta-1}) \mid 1 \le i \le m\}$, $A \in Dist_i$, is the set of distances from each data point to its $(\eta - 1)$-th nearest neighbour. The smallest of these distances ($\gamma$) is defined by:

$$\gamma = \min(A)$$

This $\gamma$ represents the radius of the densest neighbourhood that includes $\eta$ data points. Based on the values of $\gamma$ and $\eta$, $\rho_{nh}$ is calculated from equation (28). If $\rho_{nh} \ge 0.5$, $\varepsilon = \gamma$. On the other hand, if $\rho_{nh} < 0.5$, $\eta$ is increased by 1 and the process is repeated until a suitable combination of $\varepsilon$ and $\eta$ is obtained.

In Figure 9(b), two groups of CTUs can be noticed. CTUs whose motion densities are higher than the threshold are indicated in white, whereas other CTUs are black. It is also observed that there may be CTUs with high motion densities even in stationary areas. Such CTUs can be interpreted as noise. To filter them, the density based clustering algorithm is applied. Figure 9(c) shows the cluster of CTUs formed by the DBSCAN technique, which represents the detected region. These regions are spread across tile boundaries. Thus, tiles that encapsulate the detected region are prioritized.
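Two quantities from the parameter selection above can be illustrated with a short sketch, assuming CTU positions lie on a unit grid. `eta_max` counts the grid positions inside a neighbourhood of radius $\varepsilon$ (centre included), and `densest_radius` computes $\gamma$ as the smallest distance from any point to its $(\eta - 1)$-th nearest neighbour. Both function names are hypothetical, and the neighbourhood density $\rho_{nh}$ itself is not computed because its defining equation is not reproduced here.

```python
import math


def eta_max(eps):
    """Number of unit-grid positions within distance eps of a grid point.

    The centre position is counted, matching a neighbourhood that
    includes the point itself.
    """
    r = int(math.floor(eps))
    limit = eps * eps + 1e-9  # small tolerance for floating-point radii
    return sum(1
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               if dx * dx + dy * dy <= limit)


def densest_radius(points, eta):
    """Smallest distance from any point to its (eta - 1)-th nearest neighbour.

    Requires eta >= 2 and at least eta points.
    """
    best = math.inf
    for i, (px, py) in enumerate(points):
        dists = sorted(math.hypot(px - qx, py - qy)
                       for j, (qx, qy) in enumerate(points) if j != i)
        best = min(best, dists[eta - 2])
    return best
```

For $\varepsilon = 1$ the count is 5 (the CTU itself and its four direct neighbours), which agrees with the value quoted from Table 2; the counts for other radii follow from the grid assumption and have not been checked against the table.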
Based on the moving object detection technique described above, a three-level UEP of HEVC compressed frames can be formed. The method is summarized in the following algorithm, which is implemented for each frame.

VII. SIMULATION RESULTS FOR THREE-LEVEL PROTECTION
Simulations are performed to evaluate the performance of the proposed three-level UEP technique. In this case, the error protection is achieved by a (4716, 3144) UEP QC-LDPC code. Out of 3144 message bits, 524, 1048 and 1572 bits are transmitted with high, low-high and low-low priority, respectively. The average code rate is maintained at 0.67 [12]. Other settings, including modulation and channel type, are kept the same as in the previous simulations described in Section V.
The performance of the proposed three-level UEP technique is compared with the previously proposed two-level UEP method. For this purpose, the error protection for two-level  N 0 = 4 dB, which is 0.6 dB higher than the previously proposed two-level UEP. This is due to utilizing a higher code rate on the UEP QC-LDPC code.

VIII. CONCLUSION AND FUTURE WORK
In this paper, efficient UEP techniques were proposed for the robust transmission of HEVC compressed videos. Analyses were conducted to verify the correlation between collocated tiles in consecutive frames. Based on the correlation that exists between consecutive frames, a two-level UEP was proposed. Simulation results showed that the proposed two-level UEP scheme provides a significant improvement compared to the EEP technique and other UEP methods. In addition, a three-level UEP technique was proposed to optimize the overall code rate. The three-level UEP was applied to improve the performance of tiles that had lower motion density despite including part of a moving object. A low-complexity DBSCAN based object extraction method was implemented to identify such tiles. Results show that, in terms of average PSNR, the three-level UEP method outperforms the previous UEP scheme by 3.5 dB. Furthermore, improving the performance of the tiles that include parts of a moving object is expected to improve the subjective video quality as well.
As future work, a detailed analysis and comparison of uniform and non-uniform tile partitioning schemes will be carried out to investigate the best tile partitioning configuration. Similarly, the research will focus on extending the proposed scheme to the application layer, which requires protection against packet or tile losses occurring during transmission. In addition, a more effective technique for identifying moving objects in video frames will be investigated.

FIGURE 2. Motion densities of tiles in different frames.

Figure 3 shows the motion density based correlation of the Sunflower.yuv and Pedestrian.yuv videos. The values are shown for 200 frames.
1 and repeat steps 2 to 5. Otherwise, stop the algorithm.

FIGURE 4. PSNR results for different tile partitioning configurations.

Figure 5 shows the PSNR results of the 1080p Sunflower.yuv and Pedestrian.yuv video sequences. Overall, the video performance provided by the UEP schemes exceeds that of the EEP technique. This is because UEP schemes allocate higher protection to selected tiles in a video frame. For the Sunflower.yuv video, at $E_b/N_0 = 3.1$ dB, the performance of the video protected by the new technique is improved by 3.9, 4.8 and 5.6 dB compared to the UEP_OLD, UEP_PU and EEP techniques, respectively. A similar result is obtained for the Pedestrian.yuv video: at $E_b/N_0 = 3.1$ dB, the proposed method outperforms the other UEP techniques by 7 dB.

FIGURE 5. PSNR results for videos at different $E_b/N_0$ values.

FIGURE 6. PSNR results for videos at different bitrates.

FIGURE 7. (a) The 22nd frame of the Sunflower.yuv sequence. (b) The 43rd frame of the Pedestrian.yuv sequence.
(b). In the figure, CTUs belonging to $G^1_{CTU}$ are indicated in white, and other CTUs are represented in black.
The number of points in $N_\varepsilon(D_p)$, $|N_\varepsilon(D_p)|$, is greater than or equal to $\eta$. That is, $|N_\varepsilon(D_p)| \geq \eta$. In this case, $D_p$ is called a core point. $D_a$ is also called a core point if $|N_\varepsilon(D_a)| \geq \eta$. Otherwise, it is considered a border point. Another arbitrary point $D_b$, $1 \leq b \leq m$, $b \neq p$, $D_b \in X$, is called density reachable from $D_p$ if there exists a sequence of points $\{D_1, \cdots, D_n\}$ with $D_1 = D_p$ and $D_n = D_b$, such that each $D_j$, $2 \leq j \leq n$, is directly density reachable from $D_{j-1}$. Lastly, any two points $D_c$ and $D_d$, $D_c, D_d \in X$, are called density-connected if there is a point $D_p$ such that both $D_c$ and $D_d$ are density reachable from $D_p$.

Figure 8 provides an illustration of these concepts. For clarity, five points are labelled and shown in distinct colours, and a circle of the same colour represents the neighbourhood of each point. In the figure, point A is directly density reachable from point B. Similarly, points A and E are density reachable from point C through points B and D, respectively. Thus, they are all density connected. In this case, points B, C and D are called core points, and points A and E are called border points.
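The core/border distinction can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, counts a point as a member of its own neighbourhood, and labels points that are neither core nor in the neighbourhood of a core point as noise, following the usual DBSCAN convention. The function name `classify_points` and the sample coordinates are hypothetical.

```python
import numpy as np


def classify_points(points, eps, eta):
    """Label each point as 'core', 'border' or 'noise'.

    Assumptions: Euclidean distance, and the eps-neighbourhood of a point
    includes the point itself (the common DBSCAN convention).
    """
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    neighbours = dist <= eps
    # Core point: at least eta points inside its eps-neighbourhood.
    is_core = neighbours.sum(axis=1) >= eta
    labels = []
    for i in range(len(pts)):
        if is_core[i]:
            labels.append("core")
        elif (neighbours[i] & is_core).any():
            # Not core, but directly density reachable from a core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels
```

With five points on a line at unit spacing, `eps = 1` and `eta = 3`, the three inner points come out as core points and the two end points as border points, which mirrors the arrangement of points A to E described above; an isolated sixth point far from the others is labelled noise.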

Figure 9 shows the application of the above-mentioned steps to three different videos: 1080p Sunflower.yuv, 4CIF Soccer.yuv and CIF Hall monitor.yuv. During encoding, the CTU size is set to 64 × 64 pixels for the 1080p Sunflower.yuv video. By contrast, the CTU size is set to 16 × 16 pixels for the lower-resolution 4CIF Soccer.yuv and CIF Hall monitor.yuv videos. In Figure 9(b), two groups of CTUs can be seen. CTUs whose motion densities are higher than the threshold are indicated in white, whereas the other CTUs are black. CTUs with high motion densities may also appear in stationary areas. Such CTUs can be interpreted as noise. To filter them out, a density-based clustering algorithm is applied. Figure 9(c) shows the cluster of CTUs formed by the DBSCAN technique, which represents the detected region. These regions can spread across tile boundaries. Thus, tiles that encapsulate the detected region are prioritized. Based on the moving object detection technique described above, a three-level UEP of HEVC compressed frames can be formed. The method is summarized in the following algorithm, which is applied to each frame.

Algorithm 2
1) Evaluate tiles $T_l$, $1 \leq l \leq N_{tile}$, in a frame as low- or high-importance based on Algorithm 1.
2) Calculate the motion density of all CTUs in a frame ($MD^k_{CTU}$, $1 \leq k \leq N$).
3) Perform segmentation of CTUs.
   a) Initialize two empty groups of CTUs: $G^1_{CTU} = \emptyset$, $G^2_{CTU} = \emptyset$.
   b) Calculate $f(x)$ from equation (25).
   c) Set $\beta$ = first minimum of $f(x)$.
   d) [...]
4) [...]
   a) [...] of $X$.
   b) Initialize an empty cluster $C$, $C = \emptyset$.
   c) Calculate optimal values of $\varepsilon$ and $\eta$.
      i) Initialize: $\eta = 4$.
      ii) Obtain $Dist_i$, $A$ and $\gamma$.
      iii) Calculate $\rho_{nh}$.
      iv) If $\rho_{nh} < 0.5$, set $\eta = \eta + 1$ and go to step (ii).
      v) If $\rho_{nh} \geq 0.5$, set $\varepsilon = \gamma$.
   d) For each $CTU_k \in X$,
      i) If $CTU_k$ is a core point or a border point, it belongs to the cluster. That is, $CTU_k \to C$.
5) If a tile ($T_l$) is evaluated as low-important,
   a) If a $CTU_l \in C$, $1 \leq l \leq L_c$, is inside $T_l$, it is evaluated as a low-high important tile.
   b) If the tile does not include any $CTU_l \in C$, it is evaluated as a low-low important tile.
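Steps 1 and 5 of Algorithm 2 amount to a simple labelling rule once the two-level importance of each tile and the DBSCAN cluster are known. The sketch below illustrates only that final rule. The function name `three_level_labels`, the dictionary-based inputs and the label strings are assumptions made for illustration, and the two-level decision from Algorithm 1 and the cluster membership are taken as given.

```python
def three_level_labels(tile_is_high, ctu_to_tile, cluster_ctus):
    """Assign each tile one of three importance levels.

    tile_is_high : dict mapping tile index -> True if Algorithm 1 rated it
                   high-importance (assumed to be computed elsewhere).
    ctu_to_tile  : dict mapping CTU index -> index of the tile containing it.
    cluster_ctus : iterable of CTU indices that belong to the cluster C.
    """
    # Tiles that contain at least one CTU of the detected cluster.
    tiles_with_object = {ctu_to_tile[k] for k in cluster_ctus}
    labels = {}
    for tile, is_high in tile_is_high.items():
        if is_high:
            labels[tile] = "high"
        elif tile in tiles_with_object:
            # Low-importance tile that still contains part of the cluster.
            labels[tile] = "low-high"
        else:
            labels[tile] = "low-low"
    return labels
```

For instance, with three tiles where only tile 0 is high-importance and the cluster contains a single CTU located in tile 1, the result is `{0: "high", 1: "low-high", 2: "low-low"}`.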
UEP is achieved by a (5222, 3730) UEP QC-LDPC code, which has a code rate of 0.71. Out of the 3730 message bits, 1492 bits are transmitted with high priority, whereas the other 2238 bits are transmitted with low priority. Figures 10 and 11 show PSNR values for Sunflower.yuv and Pedestrian.yuv protected by two different UEP schemes in terms of $E_b/N_0$. UEP_Three_LEVEL_Proposed provides better performance than UEP_Three_LEVEL_CU, which is implemented based on the number of CU partitions inside tiles. It is also noticed that the peak value of PSNR is attained at an $E_b$

In this section, the performance of the proposed UEP technique (UEP_NEW) is evaluated through simulations. The evaluation is based on objective video quality measurement and visual comparison of uncompressed and received video frames.
A. SIMULATION SETUP
In all simulations, video sequences are compressed with the same configuration, including frame size, frame rate, tile specifications and number of frames. Two 1920 × 1080 video sequences of 200 frames each, in IBBBPBBBP... format, are encoded by the HM reference software [11]. The encoding is done at a rate of 24 frames per second.

TABLE 1. PSNR (dB) values for different configurations of tiles.

TABLE 2. Maximum number of CTUs in a neighbourhood.