Reversible Integer-to-Integer Wavelet Filter Design for Lossless Image Compression

In this paper, factors that inﬂuence the lossless compression efﬁciency of reversible integer-to-integer (ITI) wavelet ﬁlters are presented. It is observed that there is a converged region for the bitrate of the compressed subbands. Reaching the converged region, the bitrate increment is equal to log 2 K bps, where K is the increment in ratio of the coefﬁcient magnitude mean. The advantages of the 5/3 ﬁlters for lossless compression are then clearly explained. A new quantity, R mn , which measures the efﬁciency of reversible ITI transforms on ﬂat spectrum inputs, is deﬁned and is used in the new reversible ﬁlter design. First, keeping the distinctive merits of the 5/3 ﬁlters for lossless compression, the half-band product ﬁlter series of the Daubechies wavelet series is selected as the analysis high-pass ﬁlters. Next, the analysis low-pass ﬁlter is found by experimentally minimizing the R mn value. Finally, it is determined that the 17/11 ﬁlters, which employ the half-band product ﬁlter with 6 vanishing moments, have the highest compression efﬁciency. Coding results show that the new 17/11 ﬁlters lead to a bitrate reduction of 1.7% over the 5/3 ﬁlters for lossless image compression. With further enhancements on the transform part and combining the enhanced transform with more efﬁcient entropy coding, bitrate reductions of about 3.1% and 4% respectively over the state-of-the-art lossless compression algorithms JPEG-LS and JPEG2000 lossless mode are achieved.


I. INTRODUCTION
Today, lossless compression can be found everywhere. In certain applications, such as military and medical applications, lossless compression is indispensable. Lossless compression is also very often used in routine applications. Therefore, lossless compression has always been an important and hot research topic [1]- [6], even if it is not a new research area. In terms of compression efficiency, currently, JPEG2000 lossless mode [7] and JPEG-LS [8] are the stateof-the-art lossless image compression algorithms, and overall JPEG-LS has, perhaps, a slightly higher compression efficiency than JPEG2000 lossless mode.
Lossless image compression algorithms have essentially two parts. The first part is to remove the correlations among the pixels in the original image, and the second part is to entropy code the decorrelated residues or coefficients. For The associate editor coordinating the review of this manuscript and approving it for publication was Michele Nappi . the two state-of-the-art lossless image compression algorithms, JPEG2000 lossless mode uses the reversible integerto-integer (ITI) 5/3 wavelet filters for the first part and EBCOT for the second part [7], whereas JPEG-LS uses the LOGO-I predictor for the first part and Golomb-Rice coding for the second part [8]. This paper focuses on the first part. Instead of proposing new predictors, we try to improve the compression performance of the reversible ITI filters.
It is known that reversible ITI filters/filterbanks are obtained from the lifting scheme of exact-reconstruction filters/filterbanks. Once exact-reconstruction filters/filterbanks are implemented using lifting steps, their reversible ITI versions are obtained trivially by rounding the lifting steps such that the output of each lifting step is integers. Therefore, the task of finding reversible ITI filters/filterbanks is to find the lifting scheme of exact-reconstruction filters/filterbanks. As a result, a lot of research work on reversible ITI filters has been focused on how to realize the lifting scheme for exact-reconstruction filters/filterbanks and on the associated issues with the implementation of lifting schemes, such as symmetric extension, color transform, . . . etc. [9]- [13] The mechanism on how the nonlinear rounding of the lifting steps affects the compression efficiency is still not well understood. An easily accepted and thus widely used intuitive conjecture is that for a parent transform that has high compression performance for lossy compression, if, at the same time, fewer rounding approximations are made in constructing the reversible ITI version so that it is closer to the parent transform, then, the resulting ITI transform would have high compression performance for lossless compression [14], [15]. This leads to using the design method for lossy transform in designing the parent transform for ITI lossless compression, for example, using the method of optimizing the coding gain [16] for constructing the parent transform [11]. Using this method, some high-performance ITI filterbanks were found [11]. Nevertheless, some important features exclusively for reversible ITI filters on lossless compression were ignored. Indeed, the lossless compression performances of the most popular 9/7 and 5/3 filters cannot be explained using only the coding gain and the rounding errors.
It can be easily checked that for images, the coding gain of the 9/7 filters is higher than the 5/3 filters. Indeed, for lossy compression, real image coding results do show that the 9/7 filters have much higher compression performance. However, for lossless compression, the 5/3 filters achieve consistently higher compression efficiency than the 9/7 filters. Researchers might believe that this is caused by the fact that the 9/7 filters need 4 lifting steps whereas the 5/3 filters only need 2 lifting steps, and the poorer lossless performance of the 9/7 filters is caused by the larger accumulated rounding errors of the 9/7 ITI lifting version. However, in [15], the number of rounding operations is reduced by employing the 2D nonseparable decomposition. Coding results of [15] show that even if the number of rounding operations is cut by a half for the 9/7 filters, i.e., from 16 to 8 for a 2D decomposition, the compression performance is not improved. Furthermore, if the coding results from using the conventional 5/3-filter decomposition are compared with those from using the nonseparable 9/7-filter decomposition, i.e., both decompositions use the same number of 8 rounding operations for a 2D decomposition, the 5/3 filters still consistently achieve higher compression ratio. This shows that other than the coding gain and the accumulated rounding errors, there are still some important mechanisms that need to be considered in evaluating the compression performance for reversible ITI filters on lossless compression.
In this paper, factors affecting the lossless compression performance of reversible ITI filters are analyzed from the perspectives different from the conventional intuitions. Based on the analysis, new reversible ITI filters are designed, and their lossless compression performance is then tested on real images. For simplicity, we only focus on 2-channel biorthogonal exact-reconstruction filters in this paper.

II. REVERSIBLE ITI FILTER DESIGN FOR LOSSLESS COMPRESSION
It is known that all the FIR wavelet filters can be realized using the lifting scheme [17]. A general lifting scheme signal flow chart is shown in Fig. 1. Only the decomposition part, i.e., the forward transform, is shown, since the reconstruction part, i.e., the inverse transform, can be trivially obtained by reversing the order and changing the signs of the lifting steps. In general, the predict lifting steps P i (z) in the figure and the update lifting steps U i (z) in the figure do not have to appear in pairs, i.e., is allowed, and the predict P i (z) and the update U i (z) lifting steps can also appear in reversed order, i.e., the U i (z)'s lead the P i (z)'s. An exact-reconstruction filter set consists of 4 filters: the analysis low-and high-pass filters and the synthesis lowand high-pass filters. Apparently, the lossless compression performance of a reversible ITI filter set is closely related to the frequency response of its parent exact-reconstruction filter set. We first investigate the influence from the frequency response of the parent filters.

A. THE INFLUENCE FROM THE FREQUENCY RESPONSE OF THE ANALYSIS HIGH-PASS FILTER
We need to show that while the 9/7 filter frequency response leads to higher compression efficiency than the 5/3 filter frequency response for lossy compression, the 5/3 filter frequency response results in higher compression efficiency than the 9/7 filter frequency response for lossless compression.
In [18], the effect of the frequency response on compression performance for lossy compression was analyzed, which is briefly summarized below for readers' convenience. The energy of images is concentrated in the low-frequency region. Thus, after a 2-channel decomposition, the high-frequency band has much smaller coefficient magnitudes than the lowfrequency band. Furthermore, with the dramatic spectral 89118 VOLUME 8, 2020 rolloff, a significant percentage of the total output energy for the high-frequency band comes from aliasing. Therefore, reducing the aliasing energy would effectively reduce the total output energy for the high-frequency channel, and thus reduce the coefficient magnitudes in the high-frequency band. The reduction of aliasing in the high-frequency band is illustrated in Fig. 2, where the high-pass filter of sym6 is compared with the high-pass filter of sym2. The nomenclature is adopted from MATLAB, i.e., symN denotes Daubechies least asymmetric wavelets with N vanishing moments. For the dyadic decomposition structure, all the decomposed subbands are derived from the high-pass analysis filter, the only exception being the DC band. Thus, it can be formally shown that reducing the total energy of the high-frequency band improves the energy compaction among subbands, which enhances compression performance.
In [18], the coding results of real images decomposed with the symN filters were used to verify the above claim. Starting from the sym2 (N = 2), as N increases, higher compression is achieved due to the improved energy compaction resulting from the reduced high-frequency channel aliasing. However, after reaching a maximum N value, a further increment of N lowers down the compression performance slightly. As explained in [18], at very large N , the filters are very long, suggesting poor localization of the decomposed coefficients in the time/space domain. Therefore, a better way to reduce the high-frequency channel aliasing is to modify the frequency response while keeping N unchanged. This can be achieved with the asymmetric frequency response of the biorthogonal filters.
The popular classes of exact-reconstruction filters can be derived from factoring a half-band product filter [19], [20]. The popular biorthogonal 9/7 filters and the orthogonal sym4 filters are factorized from the same half-band product filter, i.e., both filter sets have exactly the same zeros except the distribution of the zeros into the analysis and synthesis filters is different [21]. In [18], it was shown that the 9/7 filter set has higher compression performance than the sym4 filter set. It was also shown that the higher compression performance of the 9/7 filters comes only a little from the linear-phase property of the 9/7 filters but mainly from the more advantageous asymmetric magnitude frequency response of the 9/7 filters. Fig. 3(a) compares analysis-filter magnitude frequency responses of the 9/7s' and the sym4s'. It can be seen that because of the asymmetry, the analysis low-and high-pass filters have a slightly higher cutoff frequency than 0.5π for the 9/7. The slightly higher cutoff frequency of the analysis high-pass filter reduces the high-frequency band aliasing, and thus the total energy of the high-frequency band, because a substantial percentage of the high-frequency band energy is from aliasing, see Fig. 2. It is this reduction of the highfrequency band energy that makes the 9/7 filters have higher compression performance than the sym4 filters.
The more dramatic example of this feature is the 5/3 filters. The 5/3 filter set and the sym2 filter set are factorized from the same half-band product filter, i.e., both filter sets have the same zeros except the distribution of the zeros into the analysis and synthesis filters is different. Fig. 3(b) shows analysis-filter magnitude responses of the 5/3s' and the sym2s'. It can be seen that the cutoff frequency of the 5/3's high-pass filter is considerably increased, and thus the aliasing of the high-frequency band is significantly reduced. Compared with the cutoff frequency shift of the 9/7 over the sym4, the shift of the 5/3 over the sym2 is much greater. However, the compression performance of the 5/3 filters is not as good as the 9/7 filters for lossy compression. The poor performance of the 5/3 filters comes from the amplification of the quantization errors during the synthesis process.
In Fig 3(c) the synthesis-filter magnitude response of the 5/3 filter set is shown. It can be seen that because of the asymmetry, the synthesis high-pass filter dramatically amplifies the input from its huge passband ripple. For lossy compression, we can view the quantized coefficients as the sum of the original coefficients plus the quantization errors. The amplification feature of the synthesis high-pass filter would significantly amplify the quantization errors, leading to lower PSNR in lossy compression, and thus lower compression efficiency. One can easily make a quantitative check that such amplification effect for the 9/7 filters is not as strong, which explains why the 9/7 filter set significantly outperforms the 5/3 filter set for lossy compression.
For lossless compression, the situations are completely different. Rounding of the lifting steps is of disadvantages, yet, it is also of advantages for compression performance.
The disadvantage side is more obvious and has already been observed by researchers, which is: the rounding of the lifting steps leads to deviations from the parent transform. Thus, coefficients from the reversible ITI transform contain errors. The integer coefficients obtained from rounding the parent transform coefficients are the most accurate integer coefficients. Rounding the parent transform coefficients is equivalent to uniformly quantize the output of the parent transform with a quantization step size of = 1. Unfortunately, the reversible ITI property is lost by this rounding/quantization, i.e., with these precise integer coefficients, the exact reconstruction of the original signal is not guaranteed. However, errors of the ITI transform can be obtained by comparing the reversible ITI coefficients with this precise integer coefficients and can be used to evaluate the compression efficiency loss of the ITI transform.
As we know, the magnitudes of the transformed coefficients are small. Thus, statistically, for the precise integer coefficients, the probability distribution monotonically decreases with the magnitude of the coefficients, and magnitude ''0'' has the highest appearing probability. Errors in the reversible ITI coefficients leads to exchanges of the coefficient magnitudes: 0's with ±1's, ±2's, . . . ; ±1's with ±2's, ±3's . . . ; etc. These exchanges lead to fewer 0's and fewer small coefficients, which results in a more uniform probability distribution of the coefficient magnitude and thus higher entropy for coding. Therefore, the closer the reversible ITI coefficients are to the precise integer coefficients, the higher compression performance the reversible ITI transform achieves. It can be seen that the high-frequency band of the 5/3 reversible ITI transform is very close to its precise integer version. Nevertheless, there is a difference. Consider the coefficient ''0''. When the inputs are integers, the outputs of the high-frequency band of the 5/3 filters are 0, ±0.5, ±1, ±1.5, . . . For the precise non-reversible integer coefficients, we can easily modify the rounding rule without changing the quantization error such that all the ±0.5 are rounded to 0, whereas for the reversible ITI transform, at most the rounding can be modified to x + 0.5 or x − 0.5 operation. As a result, if +0.5 is rounded to 0, then −0.5 has to be rounded to −1, or if −0.5 is rounded to 0, then +0.5 has to be rounded to 1. Thus, the reversible ITI coefficients have fewer 0's, and similarly fewer small coefficients, than the precise integer coefficients. Nevertheless, it can be easily checked that the reversible ITI coefficients are much closer to the precise integer coefficients for the 5/3 filters than for the 9/7 filters, because the 9/7 filters have more lifting steps and thus much larger accumulated rounding errors.
Having examined the disadvantage associated with rounding of the lifting steps, it is important to show that there are advantages from the reversible ITI transform, which has not been observed in existing literatures.
Recall that for lossy compression, the biorthogonal 9/7 filters achieve high compression performance from the asymmetric shift of the cutoff frequency. The higher cutoff frequency of the analysis high-pass filter leads to less aliasing in the decomposed high-frequency band and thus smaller decomposed coefficient magnitudes for the highfrequency band. For the 5/3 filters, the asymmetry of the frequency response is more prominent. However, the stronger asymmetry introduces a large passband ripple in the highpass synthesis filter, leading to substantial amplification on the quantization errors during reconstruction. The aliasing reduction effect, which improves the compression efficiency, is offset by the noise amplification effect, which deteriorates the compression efficiency. As a result, there is no significant advantage from the stronger skewing of the magnitude response of the 5/3 filters. However, for lossless compression, there is no error produced during reconstruction from the reversible ITI transform. This suggests that the aliasing reduction effect of the 5/3 filters, which improves the compression efficiency, is completely released, making the 5/3 filters more advantageous than the 9/7 filters in boosting the compression performance. This explains why the 5/3 filter set consistently outperforms the 9/7 filter set for lossless compression.

B. THE RELATIONSHIP BETWEEN THE BITRATE INCREMENT AND THE COEFFICIENT MAGNITUDE MEAN
In lifting implementation of exact-reconstruction filters, there is a scaling factor K as shown in Fig. 1. The scaling factor can also be implemented using lifting steps [17]. However, it has already been realized that in most cases, such as for the 5/3 filters, performing lifting steps for the scaling does not improve compression efficiency. Indeed, there could be a slight loss in compression efficiency. This is completely different from lossy compression, where the scaling factor can never be skipped.
To have a clear understanding of the influence from the scaling factor, we perform a lossy coding experiment first and then provide explanations. Consider an image decomposed with a conventional wavelet transform, say the familiar 512 × 512 test image Lena decomposed with the 9/7 filters. The decomposed coefficients are uniformly quantized with quantization step sizes 64 2 i , i = 0, 1, . . . , 9. In other words, the quantization step size decreases by a factor of 2 each time. The coding results for individual subbands are examined as follows.
In Table 1, coding results for subbands in decomposition levels 1 and 2 are shown since the other subbands have the same feature (see Fig. 4 for the 2D dyadic decomposition structure). For each subband, the coded bitrate in bps (bits per symbol/coefficient) for each quantization step size is shown in rows denoted ''Rate''. Rows ''Rate Inc.'' below rows ''Rate'' show the bitrate increment when the quantization step size is decreased by a factor of 2. For example, in Table 1, the coded bitrate of subband LH1 quantized at step size = 1 is 4.13 bps. In the next column, the coded bitrate of LH1    quantized at = 0.5 is 5.13 bps. So, the bitrate increment from = 1 to = 0.5 is 5.13 − 4.13 = 1.00 bps. In other words, at rate 4.13 bps, decreasing the quantization step size by a factor of 2 leads to a rate increment of 1.00 bps. Since rows ''Rate Inc.'' are obtained from subtracting the next column ''Rate'' from the current column ''Rate'', rows ''Rate Inc.'' have one less data point.
The mean of the coefficient magnitude of each subband, denoted ''Mag. Mean'', is shown in Table 1. Rows ''Mag. Ratio'' are the ratio of the next column mean to this column mean. For example, the magnitude mean of LH1 at = 1 is 3.78. Decreasing by a factor of 2 to the next column = 0.5, the magnitude mean is increased to 7.57, and the increment in ratio is 7.57/3.78 ≈ 2.00, which is shown in the row ''Mag. Ratio''.
As can be found from Table 1, as the quantization step size decreases, the rate increment ''Rate Inc.'' approaches ''1'', and the ''Mag. Ratio'' approaches ''2''. The converging point is at ''Rate'' being around 3 bps. To see this, ''Rate Inc.'' vs. ''Rate'' and ''Mag. Ratio'' vs. ''Rate'' are plotted in Fig. 5(a) and 5(b) respectively. Note that data points from all the subbands are plotted in the same figure to show that all the subbands have the same feature. In fact, in Fig. 6(a) and 6(b), results from averaging over 6 familiar test images, Lena, Barbara, Goldhill, Mandrill, Boat, and Zelda, are plotted. Even if the 6 images differ significantly, the averaged data points in Fig. 6 show the same feature as those in Fig. 5 with even smaller variations, suggesting that all the results from different images and different subbands have the same converging property for ''Rate Inc.'' and ''Mag. Ratio''.
The above converging property is not difficult to understand. When the quantization step size is large, most coefficients are quantized to zeros. Thus, reducing to /2 increases the number of non-zero coefficients significantly, and the magnitude mean is increased by more than a factor of 2. However, even if the magnitude mean is increased by more than a factor of 2, it is still much less than 0.5, suggesting most coefficients are still quantized to zeros, and the coded rate is still less than 1 bps, so the ''Rate Inc.'' is less than 1 bps.
Conversely, when the quantization step size is small, most quantized coefficients are not zero. Reducing by a factor of 2 would increase the magnitude mean of the quantized coefficients by a factor of 2 plus a small error caused by the quantization rounding. In other words, each quantized coefficient is increased by a factor of 2 on the average, and thus, 1 extra bit is needed for each coefficient. This is the converging property we see in Fig. 5 and 6.
Reducing by a factor of 2 can be viewed as increasing the coefficient magnitude by a factor of K = 2 before the quantization. Therefore, in the converged region, the increment on bitrate is 1 bps for a scaling of K = 2; and in the region before converging, the increment on bitrate is less than 1 bps for a scaling of K = 2. More experiments can be easily performed to obtain the general results for an arbitrary scaling factor of K = 2: in the converged region, the increment of bitrate is log 2 K bps; and in the region where the magnitude mean is less than that needed for converging, the increment of bitrate is less than log 2 K bps.
Theoretically, the above observation may be explained using the relationship between the differential entropy and the discrete entropy. In [22], it is shown that as the quantization step size → 0, where X is a continuous random variable, h (X ) is the differential entropy of X , X is the discrete random variable obtained from quantizing X with , and H X is the discrete entropy of X . Let = 2 −b , we have In a subband, each coefficient C i is a random variable. Thus, when the subband is quantized at step size , the total (discrete) entropy of the subband is where N is the total number of coefficients in the subband, and B corr is the correlation among the quantized coefficients C i . Assume B corr is not a function of in the region → 0, i.e., when is sufficiently small. Then, in the converged region → 0, the coding rate increment due to the quantization step size change from 1 to 2 is which is the converging property we observed. Now we can analyze the influence on lossless compression efficiency from omitting the scaling factor in the reversible ITI realization of exact reconstruction filters. Since the scaling factors are K and 1 K for the low-and high-frequency band respectively, i.e., one band gets amplified but the other attenuated, we need to compare the ''Rate'' of one band before scaling with the ''Rate'' of the other band after scaling. Comparing the ''Rate'' is equivalent to comparing the ''Mag. Mean'' as indicated in Fig. 6. Thus, we compare the magnitude mean of the low-frequency band after scaling, denoted mn lo , with the magnitude mean of the high-frequency band before scaling, denoted mn hi below. Consider K > 1.
Case 1: Suppose mn hi is not in the converged region but mn lo already is. When the scaling factor is skipped, the increased bitrate for coding the high-frequency band is less than log 2 K bps, but the decreased bitrate for coding the low-frequency band is equal to log 2 K bps. The overall coding rate is reduced, and thus the compression efficiency is improved.
Similarly, Case 2: Suppose mn hi is in the converged region but mn lo is not. When the scaling factor is skipped, the overall coding rate is increased, and thus the overall compression efficiency decreases.
Suppose both mn lo and mn hi are not in the converged region. Case 3: If mn lo > mn hi , the increased bitrate for the high-frequency band is less than the reduced bitrate for the low-frequency band, because, on the average, the bitrate increment is smaller at smaller magnitude mean (or equivalently smaller bitrate), see Fig 6. The overall compression efficiency is improved. Similarly, Case 4: If mn lo < mn hi , the increased bitrate for the high-frequency band is greater than the reduced bitrate for the low-frequency band. The overall compression efficiency decreases.
Finally, Case 5: both mn lo and mn hi are in the converged region, the increased the bitrate for coding the high-frequency band and the decreased the bitrate for coding the lowfrequency band are the same. Both are log 2 K bps. The overall compression efficiency does not change.
For the 5/3 filters, K = √ 2 > 1. In general, Case 2 and Case 4 only appear very occasionally, because image energy is concentrated in the low-frequency region, and thus the lowfrequency band has a much larger magnitude mean than the high-frequency band. Thus, overall, there is no efficiency loss from omitting the scaling factor for the 5/3 filters. Note, for some filters, K < 1, efficiency loss can be introduced from omitting the scaling. The analysis is very similar to the analysis above and is skipped.
There is an issue from the analysis above. If for the final decomposed coefficients, the magnitude means for two subbands are different, we can always perform an extra scaling K ex such that both subbands have the same magnitude mean after the scaling. The scaling can be performed using lifting steps [17]. It seems that if any or both of the subbands is/are not in the converged region, this scaling operation would lead to a gain in compression efficiency. This is not true, because performing reversible ITI scaling using lifting steps leads to rounding errors and these errors prevent the efficiency gain. Note, in the previous analysis, the scaling is omitted, not added.

C. THE EFFICIENCY LOSS OF BIORTHOGONAL FILTERS FOR LOSSY AND LOSSLESS COMPRESSION
For 2D decomposition, see Fig. 4, the efficiency of decomposing an H vertical band into the HL and HH band is different from the efficiency of decomposing an L vertical band into the LL and LH band. This is because the spectrum of the H vertical band has less strong rolloff feature, i.e., the spectrum is almost flat. For a flat spectrum, the total energy of the decomposed coefficients from a biorthogonal filter set is higher than the total energy of the input signal. The increased total energy leads to efficiency loss for lossy compression because higher energy suggests higher entropy of the quantized coefficients.
We define a quantity R e below to measure the energy magnifying property for exact-reconstruction filters: VOLUME 8, 2020 where L (ω) and H (ω) are the normalized frequency responses for the low-and high-pass analysis filters respectively. R e is a measurement of the input-output energy ratio when the spectrum of the input signal is a constant. For orthogonal filters, the total energy of the decomposed coefficients is the same as that of the input signal. Thus, R e = 1 for orthogonal filters. For biorthogonal filters, R e > 1. For example, R e = 1.012 for the 9/7 filters and R e = 1.094 for the 5/3 filters.
For lossy compression, R e of the 9/7 filters is slightly greater than 1, but the appropriate skew of the frequency responses of the 9/7 filters leads to much greater improvement over corresponding orthogonal filters when the input spectrum has the strong rolloff feature, as already discussed in II-A. Thus, the 9/7 filter set is an ideal high-performance filter set for lossy compression. Conversely, the R e value of the 5/3 filters is about 10% greater than 1. As a result, the 5/3 filters are much less efficient than the 9/7 filters for decomposing an H vertical band into the HL and HH band in lossy compression.
For lossless compression, the coefficient magnitude mean of every subband is large, suggesting that the coded bitrate of every subband is in or very close to the converged region as described in II-B. When the bitrate is in the converged region, it is determined by the magnitude mean. As a result, R e cannot be used to evaluate the efficiency, because the increased total output energy may not cause any efficiency loss. This can be illustrated as follows.
Suppose a signal with a flat spectrum is ITI decomposed by a reversible orthogonal filter set. Since the transform is orthogonal, the output energy is equal to that of the input signal, denoted E 0 here, and the energy of both the low-and high-frequency band is equal to E 0 /2 for the flat spectrum signal. When the bitrates of both subbands are in the converged region, an extra scaling factor does not change the overall coded bitrate as already shown in II-B, and thus there is no efficiency loss from scaling. However, the total output energy does change by scaling. To see this numerically, consider a scaling of K = 2. With the scaling, the total energy of the low-and high-frequency band is increased to K 2 (E 0 /2) + (1/K ) 2 (E 0 /2) = 2.25E 0 , which is much greater than the total input energy E 0 . However, since the bitrates of both subbands are in the converged region, for the scaling of K = 2, the low-frequency band bitrate increases by 1 bps and the high-frequency band bitrate decreases by 1 bps. Thus, the overall efficiency is not changed. As a result, the increase in the total output energy does not cause any efficiency loss in this case.
Therefore, a new quantity, R mn , is introduced to evaluate the efficiency loss for lossless compression. We still use a constant spectrum signal as the input and use an orthogonal filter set as the reference. Then, is defined to evaluate the efficiency, where C lo i and C hi i are the coefficients of the low-and high-frequency band decomposed from the filter set being evaluated, and C lo 0i and C hi 0i are the coefficients of the reference orthogonal filter set. Then, respectively for the low and high-frequency band, ( i C lo i ) ( i C lo 0i ) and ( i C hi i ) ( i C hi 0i ) are the ratios of the magnitude mean of the filter set being evaluated over the magnitude mean of the orthogonal filter set. As can be easily checked, with a scaling K , R e changes as illustrated earlier, but R mn0 does not change, indicating R mn0 is an appropriate quantity to evaluate the efficiency loss for lossless compression.
However, values and i C hi 0i , which are the L1 norm, are not related to the filters' frequency responses. Therefore, we use the L2 norm to approximate the L1 norm and thus modify the definition of (7) to: where L (ω) and H (ω) are the normalized frequency responses respectively for the low-and high-pass analysis filters of the filter set being evaluated, and L 0 (ω) and H 0 (ω) are those of the reference orthogonal filter set. Since we finally have for evaluating the efficiency loss for reversible ITI filters. Again, R mn = 1 for orthogonal filters. It can be easily checked that R mn = 1.011 for the 9/7 filters and R mn = 1.038 for the 5/3 filters. Compared with lossy compression, where R e = 1.094 for the 5/3 filters vs. R e = 1.012 for the 9/7 filters for a flat spectrum input, the efficiency loss of the 5/3 filters is greatly reduced for lossless compression. This efficiency loss difference also shows why the 5/3 filter set favors lossless compression whereas the 9/7 filter set favors lossy compression. Furthermore, based upon the R mn evaluation, the 5/3 filters can be modified to 9/3 and 13/3 filters, etc. to further reduce the R mn value, as we will show in the next part.

D. NEW REVERSIBLE ITI FILTERS FOR LOSSLESS COMPRESSION
From the above analysis, the 5/3 filters have the following two advantages for the ITI reversible version: (i) With stronger asymmetric skew of the frequency responses, the high-pass analysis filter H (ω) more effectively suppresses the aliasing from the low-frequency band. The degree of the asymmetric skew can be quantitatively measured by the value of |H (ω)| at ω = 0.5π , see Fig. 3. It can be easily verified that |H (ω)| ω=0.5π = 0.5 for the 5/3 filters, whereas |H (ω)| ω=0.5π is approximately 0.615 and 0.707 for the 9/7 filters and the orthogonal filters respectively. Both are larger than the 5/3 filters. (ii) The 5/3 filters require only one lifting step to obtain the output of the high-frequency channel, leading to the smallest errors from the lifting-step rounding.
The high-pass analysis filter of the 5/3 filters has 2 zeros, and all of them are at 1, leading to |H (ω)| ω=0.5π = 0.5. To effectively suppress the aliasing in designing a new filter set, one may easily think that the 4-tap high-pass filter with all the 3 zeros placed at 1, which leads to |H (ω)| ω=0.5π = 0.35, is a good candidate, for example, the 8/4 filters with the low-pass filter  Fig. 7, and |H (ω)| ω=0.5π for the 8/4 filters is much smaller than that of the 5/3 filters. Apparently, the 8/4 filters have much stronger aliasing rejection for the high-frequency band than the 5/3 filters. Unfortunately, R mn = 1.149 for the 8/4 filters, compared with R mn = 1.038 for the 5/3 filters. This suggests that the 8/4 filters would lead to severe efficiency loss in decomposing the H vertical bands into the HL and HH bands. Thus, this 8/4 filter set is immediately ruled out with no need for further examining on other features. Therefore, we consider using the half-band product filters for factoring the Daubechies wavelet series as the analysis high-pass filters, because all the filters in that series have |H (ω)| ω=0.5π = 0.5. We denote these filters as ''hbpN '' filters. An hbpN filter has N vanishing moments. A number of hbpN filters are listed in Table 2, and their magnitude frequency responses are shown in Fig. 8.
The hbpN high-pass filters are easily related to the Daubechies series. For example, the hbp4 in Table 2 can be formed by placing the 3 zeros of the db2 analysis high-pass filter 1, 1, (2 − √ 3) and the 3 zeros of the db2 synthesis high-   . Apparently, the hbp2 is formed by the high-pass filters of the Haar wavelet filters (db1), and the hbp2 is just the high-pass analysis filter of the 5/3 filters. It can be seen from Fig. 8 that as N increases, aliasing is reduced, suggesting that the hbpN filters with higher N would be more efficient filters than the hbp2 high-pass filter, which is the analysis high-pass filter of the 5/3 filters.
Another important feature of these hbpN filters is that because of the existence of the ''0''s in the filter taps, these high-pass filters can be implemented with one lifting step. For example, it can be easily checked that for the hbp6 high-pass filter in Table 2. This suggests that their reversible ITI realization achieves the minimum rounding errors, similar to the 5/3 filters situation. Next, we need to construct the low-pass filter. Note the natural ''low-pass'' filter is l i = {1}, i.e., the direct downsampling with no operations. However, R mn is too large for this simple choice. Thus, we do need one more lifting step for the low-pass filter. We first consider the hbp2, which is the high-pass analysis filter of the 5/3 filters. For the second lifting step of the 5/3 filters, VOLUME 8, 2020 which leads to R mn = 1.038. We know that the DC component of an image can be very large. Thus, we do need H (ω)| ω=0 = 0 for the frequency response of the analysis high-pass filter, because DC leaking significantly increases the magnitude as well as the number of the small non-zero coefficients. In other words, we need at least one zero placed at 1 for the high-pass analysis filter. However, for the lowpass analysis filter, L (ω)| ω=π = 0 is not needed, i.e., the low-pass filter can be designed with no vanishing moment. Thus, there is no restriction on the low-pass analysis filter. All we need is a smaller R mn . Consequently, we can improve the second lifting step for the 5/3 filters to This improved lifting step increases the low-pass filter length from 5 taps to 9 taps, and the quantity R mn is improved from 1.038 to 1.030. The frequency responses of this improved 9/3 filters are shown in Fig. 9(a). The high-pass analysis filter is the same as the 5/3 filters, but the low-pass analysis filter is improved to achieve a smaller R mn . It can be easily checked, using coding results on real test images, that the 9/3 filters do have a very slight improvement over the 5/3 filters. One can further increase the low-pass filter length to 13 taps, 17 taps, . . . etc., to further reduce R mn . However, the improvement on R mn becomes small and negligible. For the remaining hpbN filters, we can also use the lifting step (12) or the lifting step (13) to construct the low-pass analysis filter. However, improvements of R mn from using (12) to using (13) are from 1.051, 1.058, 1.062 to 1.034, 1.035, and 1.037 respectively for the hbp4, hbp6, and hbp8 filter. The improvements are more significant than that for the 5/3 filters.
Using coding results on real test images, we found that among the 4 sets of new filters, the set constructed from hbp6 achieves the highest compression performance. (To save space and highlight the main results, these secondary test data are not shown.) This can be explained as follows. As the number of N increases from hbp2 to hbp4, hbp6, and hbp8, high-frequency band aliasing is reduced, which contributes to improving the compression efficiency. However, the filter length also increases quickly from 3 to 7, 11 and 15 taps. The increased filter length leads to poorer localization property, which deteriorates compression efficiency. Also, the amount of reduced aliasing becomes smaller as N increases, see Fig. 8. Combining these effects, the overall improvement on compression efficiency stops at hbp6.
Finally, the new reversible ITI filter for lossless compression from this study is the exact-reconstruction filter set that uses the hbp6 filter. This exact-reconstruction filter set consists of 2 lifting steps -the first lift step given by (11) and the second lifting step given by (13), and a scaling factor K =  Fig. 9(b), where the frequency response of the 13/11 filters using (12) for the low-pass filter lifting step is also shown for comparison.

III. CODING RESULTS
Coding tests on real images were performed to test the new reversible ITI 17/11 filters. An efficient entropy coding algorithm that uses the symbol grouping method and the modified Golomb-Rice coding [23] was used to code the integer coefficients decomposed from the reversible ITI transforms.
In the first test, 6 very familiar small-size 512 × 512 8-bit grey-level test images were used. Test results are shown in Table 3. In this mini test, the newly constructed 17/11 filters consistently outperform the 5/3 filters. On the average, there is about a 1.2% reduction in bitrate by replacing the 5/3 filters with the 17/11 filters for these small-size images. The compression results from using the state-of-the-art algorithms JPEG2000 lossless mode and JPEG-LS are also shown in the Table. JPEG2000 can be found in MATLAB, and JPEG-LS we used were obtained from [24]. The combination of the 17/11 filters and the more efficient entropy coding leads to bitrate reductions of about 2.6% and 1.6% respectively over JPEG2000 lossless mode and JPEG-LS in this mini test.
Next, we performed a more formal test on large-size highquality modern digital images. Twenty-four images are used in the test. Of these images, eight are directly downloaded from websites; they are four 16-bit UHD images and four 10-bit HD images. The other sixteen images are obtained from the raw format image files taken with DSLR cameras. Adobe Raw is used for converting the raw files to images. Sharpening is turned off for the conversions, and the output images are 16 bits per component. Among the sixteen DSLR images, eight are taken by our Nikon D5300 DSLR; four are taken by a Nikon D850 -a state-of-the-art professional fullframe DSLR; and the last four taken by a Fujifilm GFX50Sa state-of-the-art professional media-format DSLR. The four Nikon D850 and four Fujifilm GFX50S raw files are downloaded from [25]. The Y components of the color images were obtained and quantized to 8-bit grey-level images for this test. Test results are shown in Table 4.
The results are similar to the mini test, except that the efficiency gain from the 17/11 filters is greater for the largesize high-quality images. The 17/11 filters achieve a bitrate reduction of about 1.7% over the 5/3 filters. To further extract more efficiency from the transform part, we made the following two enhancements. We know that for each level of decomposition, after the initial decomposition along the vertical direction, the spectrum of the H vertical band does not have the strong rolloff feature like the L vertical band. In fact, in many cases, further decomposition on the H vertical band is not necessary and leads to efficiency loss. Given the efficiency can be tested using the magnitude mean, we can check the magnitude mean and decide if the decomposition on H vertical is needed. This is the first enhancement. The complexity of performing this enhancement is very low, simply compute the magnitude mean before and after the decomposition of H vertical , and the computation of the magnitude mean only needs one addition per coefficient.
The other enhancement is from the non-separable property of the 2D reversible ITI transform. Because the 2D reversible ITI transforms are non-separable, the two image orientations (obtained from the transpose operation) lead to two different decomposition results. We use a simple method to determine the better image orientation as follows. For the two orientations, find the two magnitude means of the high-frequency band. The orientation with the smaller magnitude mean is the orientation for decomposition. Once the orientation is determined, there is no need to perform the check again for higher levels of decomposition. Thus, the complexity of this check is not high. Since the high-frequency band is from the first lifting step, the low-frequency band does not need to be computed for this check. In other words, with this check, we only need to compute one extra lifting step for the first level decomposition. Specifically, the original decomposition without the check needs 4 lifting steps for a 2D decomposition, while the decomposition with the check needs 4 + 1 lifting steps for a 2D decomposition. In addition, for large images, different regions have different features. Thus, the image is divided into smaller regions, about 1024 × 1024, for decomposition. This division does not increase complexity.
Test results from using the two enhancements are denoted ''17/11 +'' in Table 4. As can be seen, the enhancements lead to about 0.7% further reduction in bitrate. Overall, in this test, the highest lossless compression efficiency achieved leads to bitrate reductions of about 3.1% and 4% over the state-of-the-art lossless compression algorithms JPEG-LS and JPEG2000 lossless mode respectively.
In many applications, such as in military and medical applications, images can be very noisy. Thus, we tested the new filters in noisy condition. For all the images used, we added Gaussian noise to the original images, such that the PSNR of the noisy image is 22dB. Fig. 10 shows the images before and after adding the noise for Lena. As can be seen, a lot of details disappear from adding the noise. Table 5 shows the coding results of these noisy images. As expected, the number of bits is dramatically increased for all the images.
Next, the improvement from the designed new 17/11 filters over the 5/3 filters is reduced. This is because the added noise has a flat spectrum, and thus reducing aliasing does not improve energy compaction. Nevertheless, the combination of the new 17/11 filters with the more efficient entropy coding still has the highest efficiency and achieves bit rate reductions of about 3.1% and 3.8% over JPEG-LS and JPEG2000 lossless mode respectively. As described in [23], the entropy coding was designed to more efficiently handle the noise in decomposed coefficients.
In addition to the noisy images, we made an artificial image of pure noise. A 512 × 512 Gaussian noise array is generated using Matlab. The magnitude is normalized and rounded such that the maximum is equal to 255. The magnitudes are recorded as the pixel value of the image (i.e., the absolute values are used for negative values). The image is denoted ''Gaussian Noise'' and is shown in Fig. 11. It is interesting that even for this pure noise, the proposed 17/11 filters still have a slight gain over the 5/3 filters, and the new lossless algorithm still achieves the bitrate reductions over JPEG-LS and JPEG2000 lossless mode.
Readers are encouraged to verify the results and to perform their own tests with different images. The test codes and test images are available at [26].

IV. SUMMARY AND CONCLUSIONS
We presented the factors that influence the lossless compression efficiency of reversible ITI filters. (1) We showed the important relationship between the coded bitrate and the magnitude mean of decomposed subbands. That is: there is a converged region, after reaching that region, the bitrate increment follows a simple log 2 K bps rule, where K is the magnitude mean increment in ratio. (2) The effect from the frequency response and the effect from the scaling factor were carefully examined and compared for lossy and lossless compression. As a result, the advantages of the 5/3 filters for lossless compression are clearly understood. Next, a new quantity, R mn , which measures the efficiency of reversible ITI transforms on a flat spectrum, is defined. The compression efficiency on a flat spectrum is important for 2D decompositions, and R mn is directly used in the new reversible ITI filter design.
In designing the new filter, the half-band product filter series of the Daubechies wavelet series was selected as the analysis high-pass filters, since the filters have the same advantages as the analysis high-pass filter of the 5/3 filters. In fact, the analysis high-pass filter of the 5/3 filters is in the selected half-band product filter series -it is the one with N = 2. The analysis low-pass filter is designed by experimentally minimizing the R mn value. Finally, it is determined that the 17/11 exact-reconstruction filter set, which employs the half-band product filter with N = 6, is the filter set having the highest compression efficiency in this search.
Coding results show that the new 17/11 filter set achieves a bitrate reduction of 1.7% over the 5/3 filter set. With further enhancements on the transform part and combining the new 17/11 filters with the more efficient entropy coding, we achieved bitrate reductions of about 3.1% and 4% respectively over the state-of-the-art lossless compression algorithms JPEG-LS and JPEG2000 lossless mode.