Watermarking of HDR Images in the Spatial Domain With HVS-Imperceptibility

This paper presents a watermarking method in the spatial domain with HVS-imperceptibility for High Dynamic Range (HDR) images. The proposed method combines the content readability afforded by invisible watermarking with the visual ownership identification afforded by visible watermarking. The HVS-imperceptibility is guaranteed thanks to a Luma Variation Tolerance (LVT) curve, which is associated with the transfer function (TF) used for HDR encoding and provides the information needed to embed an imperceptible watermark in the spatial domain. The LVT curve is based on the inaccuracies between the non-linear digital representation of the linear luminance acquired by an HDR sensor and the brightness perceived by the Human Visual System (HVS) from the linear luminance displayed on an HDR screen. The embedded watermarks remain imperceptible to the HVS as long as the TF is not altered or the normal calibration and colorimetry conditions of the HDR screen remain unchanged. Extensive qualitative and quantitative evaluations on several HDR images encoded by two widely-used TFs confirm the strong HVS-imperceptibility capabilities of the method, as well as the robustness of the embedded watermarks to tone mapping, lossy compression, and common signal processing operations.


I. INTRODUCTION
HDR images are characterized by a wide range of visible luminance values that can accurately represent the radiance of the scene, ranging from direct sunlight to faint starlight. Thanks to its floating-point representation, this type of imaging data can depict more colors and cover a wider range of intensity values than its Standard Dynamic Range (SDR) counterpart. Acquiring, storing, and displaying HDR images is possible thanks to the use of Transfer Functions (TFs), which perform the mapping from the linear light components of the scene, to a non-linear digital signal, and eventually to a linear luminance signal to be radiated by an HDR screen. TFs can then emulate the Human Visual System (HVS) by using non-linear operations to quantize the values representing the visible luminance with minimal subjective distortions.
The associate editor coordinating the review of this manuscript and approving it for publication was Claudio Cusano.
As HDR images become widespread, their vulnerability to piracy, unauthorized distribution, modifications, and illegal copying is expected to increase. HDR imaging piracy may result in significant losses to the economy, harming content production firms and distribution companies. For the U.S. alone, a recent study estimates that global online piracy costs the economy at least $29.2 billion in lost revenue each year [1].
Watermarking is an effective tool not only for media ownership identification but also for auxiliary information delivery. The watermark, or auxiliary information, is usually embedded in the cover media as barcodes, Quick Response (QR) codes, logos, or copyright patterns. This embedded information may be visible or invisible depending on the watermarking process. It is well known that invisible watermarking does not seriously degrade the visual quality of the cover media because the embedding process is performed after a transformation, e.g., in the frequency domain. However, this type of watermarking usually requires the exchange of private keys or extra information about the embedding process to retrieve the watermark. Conversely, visible watermarking allows the media's ownership to be asserted visually without the need for such keys or extra information. This is usually achieved by performing the embedding process in the spatial domain, e.g., by altering pixel values. Visible watermarking is desirable when the copyrighted material is disseminated over channels where piracy control is not possible, e.g., the Internet, as the visible watermark can make the final user immediately aware of the media's ownership. However, this type of watermarking inevitably degrades the visual quality of the cover media.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To leverage the advantages of visible and invisible watermarking for HDR imaging, we propose a watermarking method in the spatial domain with HVS-imperceptibility capabilities. Our method, hereinafter called High Dynamic Range-Imperceptible Watermarking (HDR-IW), provides an easy way to recognize the media's ownership without the need for exchanging keys or any extra information about the embedding process, while minimizing the visual distortion that can be perceived by the HVS. The proposed method is based on the Unseen Visible Watermarking (UVW) technique [2], [3] and extends our work in [4]. Unlike the UVW technique, which embeds copyright information in the spatial domain of SDR regions with low visibility, the HDR-IW method embeds imperceptible watermarks in the spatial domain by exploiting the inaccuracies among the non-linear digital representation of the linear luminance acquired by an HDR sensor, the linear luminance radiated by an HDR screen by means of a TF, and the brightness perceived by the HVS from the displayed luminance. This is achieved by using the information provided by a Luma Variation Tolerance (LVT) curve [4]. This paper extends and complements [4] as follows:
1) The computation of the LVT curve is explained in detail for the two TFs widely used to encode HDR images. The LVT curve is a core component for determining the maximum variation in luma codes that a pixel can undergo before the change can be perceived by the HVS, according to the TF used for encoding.
2) An embedding region (ER) selection process is introduced to find the region with the highest tolerance to luma code variations according to the corresponding LVT curve.
3) A novel embedding payload metric is introduced to measure the embedding payload of the HDR-IW method by accounting for the characteristics of the HDR image and the corresponding LVT curve and TF.
The watermarks embedded by the HDR-IW method in the spatial domain are imperceptible to the HVS as long as the TF is not altered or the normal calibration and colorimetry conditions of the HDR screen remain unchanged. Hence, these watermarks can be easily identified without the need for private keys or any additional information about the embedding process.
We evaluate the proposed HDR-IW method for the embedding of binary watermarks in terms of embedding payload, imperceptibility (qualitatively and quantitatively), and robustness to tone-mapping operations (TMOs), which are widely used to display HDR images on SDR screens, lossy compression [5]-[7], and other common signal processing operations. To the best of our knowledge, there are no other watermarking methods for HDR images that also embed information in the spatial domain in an imperceptible manner. We therefore compare the imperceptibility capabilities and robustness of the HDR-IW method with those of two invisible watermarking methods that operate in the frequency domain [8], [9].
The rest of the paper is organized as follows. Section II reviews comparable watermarking methods for HDR images that embed invisible watermarks after transforming the cover media. Section III briefly describes the HDR acquisition and encoding process. Section IV explains the HDR-IW method in detail. Section V presents and discusses the performance evaluation results. Finally, Section VI concludes this work.

II. RELATED WORK
Although SDR watermarking is a mature area that has been extensively explored both in the spatial and frequency domains, HDR watermarking is still in the early stages. In the last few years, however, important watermarking methods for HDR imaging that embed invisible watermarks after transforming the cover media have been proposed. These methods can be classified into two main groups. The first group includes methods that embed the watermark after applying a frequency transformation. For example, Bakhsh and Moghaddam [8] employ an artificial bee colony algorithm to find the best region to embed a binary watermark in the first-level approximation sub-band of the Discrete Wavelet Transform (DWT). Maiorana and Campisi [9] present a blind-detectable multi-bit watermarking method that uses the DWT of the Just Noticeable Difference (JND)-scaled representation of the HDR image for embedding purposes, as well as a contrast sensitivity function to modulate the watermark intensity in each DWT sub-band according to its scale and orientation. Guerrini et al. [10] present a blind-detectable one-bit watermarking method that uses the approximation sub-band of the DWT of the LogLUV color space. Autrusseau and Goudia [11] propose a non-linear hybrid method that combines additive and multiplicative watermarking. The embedding process is done in the DWT domain of the RGB radiances of an RGBe-encoded HDR image. The work in [12] exploits the properties of the Radon-Discrete Cosine Transform (R-DCT) to derive an image representation whose coefficients can be watermarked with an insignificant effect on the visual quality. In [13], the authors propose a watermarking method robust to TMOs by successively performing a non-subsampled contourlet transform and singular value decomposition to extract the structural information that is invariant to tone-mapping.
The second group of HDR watermarking methods includes those that embed the watermark after applying a color decomposition or filtering process. The work in [14] proposes a method based on feature map extraction by means of the Tucker decomposition. This method divides an HDR RGB color image into the three color channels so that three feature maps are extracted. The method then embeds a watermark in the feature map that contains most of the image's energy. In [15], the authors decompose an HDR image into multiple SDR images by means of a bracketing process. Each SDR image is watermarked with a random key before being merged to produce the final watermarked HDR image. In [16], the authors propose a blind-detectable watermarking method that uses bilateral filtering to extract the small scale and texture parts of the HDR image, also known as the blue component of the detail layer. The watermark is embedded in this blue component to minimize quality degradations.
In summary, the previous watermarking methods have been shown to achieve strong performance. However, they may require the deployment of specific watermark detection and extraction modules. For example, the methods in [8], [10], and [16] require an explicit exchange of private keys to detect and extract the watermark. Although embedding watermarks in the spatial domain eliminates the trouble of deploying an extraction module, such an embedding technique is seldom explored because the embedded watermarks are visible and hence defeat the goal of providing a high-quality and realistic visual experience through HDR imaging. To the best of our knowledge, no watermarking method in the spatial domain with HVS-imperceptibility for HDR imaging has been previously proposed. Such methods have only been proposed for SDR images. For example, [17] and [18] propose to exploit the cover media's color histogram to embed the watermark in the spatial domain with HVS-imperceptibility. The method in [19], on the other hand, uses a JND criterion for embedding in the spatial domain, the DCT to share extraction parameters, and a binarization function for extraction. Although these watermarking methods have HVS-imperceptibility capabilities, they are not suitable for HDR images because the color and visibility ranges of SDR images differ from those of HDR images, which is a consequence of using distinct TFs to encode the luminance and color information [9].

III. HDR IMAGING
The abbreviations and acronyms used in this work are defined in Table 1.
Acquiring luminance from a scene in the form of an HDR image requires first mapping the scene's linear luminance to a non-linear digital signal in the form of code values. This mapping is done through an opto-electronic transfer function (OETF). To display HDR images, the code values are mapped back to a linear luminance signal to be radiated by an HDR screen by means of an electro-optical transfer function (EOTF).
Two TFs are currently used for HDR images: the Perceptual Quantization (PQ) EOTF and the Hybrid Log-Gamma (HLG) OETF. The PQ EOTF, also known as the SMPTE ST 2084 standard [20], maps 10-bit luma codes, $luma_{code} \in [0, 2^{10} - 1]$, to display luminance $L_d \in [10^{-4}, 10^{4}]$ cd/m$^2$. This EOTF is an absolute, display-referred TF, as the maximum possible $L_d$ value depends on the screen's display capabilities. However, this TF maps each luma code to the same absolute luminance value on every screen. HDR images encoded by the PQ EOTF are not directly backward compatible with SDR screens. Conversely, the HLG OETF preserves backward compatibility. This TF is a relative, scene-referred TF [21], since digital signals produced by this TF represent the intensity of the light relative to the peak output of the HDR sensor.
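The mapping from luma codes to display luminance performed by the PQ EOTF can be illustrated with a minimal Python sketch. The constants are those published in SMPTE ST 2084; the full-range normalization of the luma code (dividing by $2^{10} - 1$) and the function names are assumptions made for illustration only, not part of the proposed method.

```python
# Constants of the PQ EOTF as published in SMPTE ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK = 10000.0  # peak display luminance in cd/m^2


def pq_eotf(signal):
    """Map a normalized non-linear signal in [0, 1] to luminance in cd/m^2."""
    p = signal ** (1.0 / M2)
    num = max(p - C1, 0.0)
    den = C2 - C3 * p
    return PEAK * (num / den) ** (1.0 / M1)


def luma_code_to_luminance(code, bits=10):
    """ASSUMPTION: full-range normalization, i.e. code / (2^bits - 1)."""
    return pq_eotf(code / (2 ** bits - 1))


if __name__ == "__main__":
    for code in (0, 256, 520, 768, 1023):
        print(code, luma_code_to_luminance(code))
```

Under this normalization, roughly half of the 10-bit code range is spent on luminance values below about 100 cd/m$^2$, which is the non-linear allocation the rest of the paper relies on.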
Ideally, a TF should be a reversible function. Unfortunately, TFs are not reversible and the mapping between linear light components and non-linear codes is lossy. Fig. 1 plots the mapping of 10-bit luma codes, $luma_{code} \in [64, 940]$, to display luminance by the two EOTFs previously discussed. For the case of the HLG TF, Fig. 1 plots the inverse of the OETF, i.e., OETF$^{-1}$, as the EOTF. Note that each EOTF maps the same luma code to a slightly different display luminance value. This can be best appreciated in Fig. 2.
Contrast threshold curves are commonly used to study the HVS' ability to make contrast distinctions [22], [23]. Fig. 3 shows the contrast threshold curve proposed by Hecht et al. [22], where the luminance, $L$, is plotted from very dark to very bright conditions against the JND perceived by the HVS ($\Delta L/L$). The JND model in Fig. 3 shows the three regions used to describe the HVS' behaviour when detecting contrast: the scotopic region, $L \in [10^{-6}, 10^{-3}]$ cd/m$^2$, which follows the De Vries-Rose law; the photopic region, $L \in [10, 10^{8}]$ cd/m$^2$, which follows a relatively constant trend, i.e., the Weber-Fechner law; and the mesopic region, $L \in (10^{-3}, 10)$ cd/m$^2$, which combines the characteristics of the scotopic and photopic regions. JND models like the one in Fig. 3 are used to design TFs with smooth visual transitions between consecutive luma code values. This is achieved by establishing coding steps below the threshold of visibility [24].

IV. PROPOSED HDR-IW METHOD
The HDR-IW method embeds binary watermarks in the spatial domain of the Y-channel with HVS-imperceptibility. It comprises four main stages, as depicted in Fig. 4 and described next.

A. LUMA VARIATION THRESHOLD CALCULATION
When an initial low luminance stimulus is given to the HVS, very large variations in such a stimulus are required for the HVS to perceive any changes, as shown in Fig. 3. Designing a TF that accurately models the HVS' response to any luminance stimulus is a challenging task. Current TFs represent a trade-off between computational complexity and accuracy of the code assignment process. This trade-off usually results in representing low luminance values with a wide range of luma codes in order to minimize visible contouring artifacts at such low luminance levels. For example, for 10-bit signals, the PQ EOTF employs 100 luma codes to represent display luminance values $L_d \in [0.0001, 0.75)$ cd/m$^2$, 64 luma codes for $L_d \in [0.75, 2)$ cd/m$^2$, and only 22 luma codes for $L_d \in [2, 3)$ cd/m$^2$. Among the 100 luma codes used by this TF for $L_d \in [0.0001, 0.75)$ cd/m$^2$, there is some redundancy that results in a significant number of bits being wasted to encode small contrast changes that the HVS may not be capable of perceiving at such low luminance levels. A similar situation occurs with the HLG OETF$^{-1}$. In other words, there is a mismatch between the HVS' capacity to perceive differences in display luminance and the modeling used by an EOTF to represent display luminance as luma codes. Consequently, luma codes used to represent low display luminance values can be appropriately modified to embed a watermark in the spatial domain so that it is imperceptible to the HVS. The challenge here is to determine the regions that are most tolerant to luma code variations and the maximum variation that they can tolerate before these changes can be perceived by the HVS, i.e., their luma variation threshold, denoted by $\xi$. For a given EOTF, we propose to compute $\xi$ for a luma code, $luma_{code}$, based on the difference, or error, between the contrast sensitivity (CS) of the HVS and the CS modeling of an EOTF.
To this end, we first determine how the luma code assignment of an EOTF changes as the display luminance, $L_d$, increases linearly, and how the HVS' CS increases as $L_d$ increases linearly.

1) INCREASE IN $luma_{code}$ AS $L_d$ INCREASES LINEARLY
Let us recall that the end-to-end mapping of the linear light components of a real-life scene to the linear luminance values displayed by an HDR screen involves a non-linear quantization in the form of a digital signal. This means that if the luminance values displayed by an HDR screen increase in a linear trend, the corresponding luma codes do not increase linearly. To illustrate this, let us first define the increase in luma codes, $\Delta luma_{code}$, when the display luminance, $L_d$, increases linearly by 1 cd/m$^2$, as follows:
$$\Delta luma_{code}(L_d) = luma_{code}[L_d + 1] - luma_{code}[L_d], \tag{1}$$
where $luma_{code}[L_d]$ is the luma code assigned to the display luminance value, $L_d$. Fig. 5 plots Eq. (1) for the two HDR EOTFs for $L_d \in [0.5, 1000]$ cd/m$^2$. It is evident that when the display luminance values increase linearly by 1 cd/m$^2$, the luma codes do not increase linearly. Note that for the two EOTFs, Eq. (1) follows a trend similar to that shown in Fig. 3, especially for low display luminance values. In other words, there is a wide range of luma codes available to represent low $L_d$ values compared to the narrow range available for large $L_d$ values.
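The non-linear growth of luma codes described by Eq. (1) can be reproduced with a short Python sketch for the PQ case. It is a sketch under stated assumptions: the inverse EOTF constants come from SMPTE ST 2084, while the full-range 10-bit quantization, the forward-difference form of the increase, and all function names are illustrative choices rather than the exact procedure used to produce Fig. 5.

```python
# Constants of the PQ transfer function as published in SMPTE ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_inverse_eotf(luminance):
    """Map display luminance in cd/m^2 to a normalized signal in [0, 1]."""
    y = (luminance / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def luma_code(luminance, bits=10):
    """ASSUMPTION: full-range quantization, round((2^bits - 1) * signal)."""
    return round((2 ** bits - 1) * pq_inverse_eotf(luminance))


def delta_luma_code(luminance, step=1.0):
    """Increase in luma codes when luminance grows by `step` cd/m^2."""
    return luma_code(luminance + step) - luma_code(luminance)


if __name__ == "__main__":
    for l_d in (0.5, 1.0, 10.0, 100.0, 500.0, 999.0):
        print(l_d, delta_luma_code(l_d))
```

Printing the increments for a few luminance levels shows tens of codes per cd/m$^2$ near 1 cd/m$^2$ and at most one code per cd/m$^2$ near 500 cd/m$^2$, in line with the trend discussed above.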

2) INCREASE IN THE HVS' CS AS $L_d$ INCREASES LINEARLY
Part of the HVS' ability to discern information is attributed to its capacity to perceive differences in luminance within a field of vision [25]. Changes in luminance create a pattern of contrast that conveys the majority of visual information to the viewer. The HVS' sensitivity to detect contrast is given by the reciprocal of the JND value. The CS derived from this reciprocal, i.e., CS = 1/JND, is indeed the minimum perceived brightness by the HVS associated with a contrast threshold, $\Delta L/L$ [26]. To appropriately compare the HVS' CS with the display luminance encoded as luma codes, we apply the same $N$-bit quantization used by an EOTF to the HVS' CS [27]. This $N$-bit quantization, denoted by $CS_{N_{bit}}$, is given by Eq. (2), where $[x]$ denotes the rounding operation on $x$.
The increase in the HVS' CS after $N$-bit quantization can then be measured as the increase in $CS_{N_{bit}}$ values when the display luminance increases linearly by 1 cd/m$^2$, as follows:
$$\Delta CS_{N_{bit}}(L_d) = CS_{N_{bit}}[L_d + 1] - CS_{N_{bit}}[L_d], \tag{3}$$
where $CS_{N_{bit}}[L_d]$ is the $N$-bit representation of the HVS' CS associated with the display luminance value, $L_d$. Fig. 6 plots Eq. (3) for the case of 10-bit signals, i.e., $\Delta CS_{N_{bit}=10}(L_d)$.
Note that for the two EOTFs, Eq. (3) follows a trend similar to that shown in Fig. 5. However, there are differences between the values given by $\Delta CS_{10}(L_d)$ and those given by $\Delta luma_{code}(L_d)$ for the same EOTF. These differences are exploited to modify luma codes in the spatial domain in an imperceptible manner, as explained next.

3) LUMA VARIATION THRESHOLD AND THE LVT CURVE
Once the $\Delta luma_{code}$ and $\Delta CS_{N_{bit}}$ values are computed for a display luminance value, $L_d$, we can define the luma variation threshold, $\xi$, for $L_d$ as the absolute difference, or absolute error, between these two values:
$$\xi(L_d) = \left| \Delta luma_{code}(L_d) - \Delta CS_{N_{bit}}(L_d) \right|. \tag{4}$$
Fig. 7 plots $\xi(L_d)$ for 10-bit signals. These curves are the LVT curves, one for each EOTF. Note that according to these LVT curves, low $L_d$ values can tolerate large variations before the HVS is capable of perceiving them. This tolerance is relatively constant for all other $L_d$ values. This is better appreciated in Fig. 8, which shows the LVT curves for the lowest $L_d$ values plotted in Fig. 7. In this figure, one can note that for $L_d$ values within the boundaries of the scotopic and mesopic regions, there exists an important discrepancy between the CS modeling used by a TF and the brightness perceived by the HVS, i.e., the HVS' CS. The greatest differences are found for $L_d < 2.5$ cd/m$^2$, for both EOTFs.
It is important to note that the LVT curves in Fig. 7 can also be defined in terms of luma codes. Fig. 9 shows the LVT curves plotted as a function of $luma_{code}$, i.e., $\xi(luma_{code})$, for 10-bit signals. For a PQ-compatible system, one can see that a $luma_{code} = 100$ can be modified to any value $\in [75, 125]$ without being perceived by the HVS, since $\xi(100) = 50$. In the case of an HLG-compatible system, a $luma_{code} = 100$ can be modified to any value $\in [96, 104]$ without being perceived by the HVS, since $\xi(100) = 8$. For a given EOTF, there is then a target range of $luma_{code}$ values that are best suited to embed a watermark in the spatial domain without being perceived by the HVS. We denote this target range by $luma_{target}$.
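The two numerical examples above suggest that the tolerated interval is centered on the original luma code and has a total width of $\xi$. The following Python sketch encodes that reading together with the absolute-difference definition of the threshold. The function names and the centered-interval interpretation are assumptions drawn from the examples, not a definition given in the text.

```python
def luma_variation_threshold(delta_luma, delta_cs):
    """Absolute difference between the two increments, per luminance sample."""
    return [abs(a - b) for a, b in zip(delta_luma, delta_cs)]


def tolerance_range(code, xi):
    """ASSUMPTION: the tolerated interval is [code - xi/2, code + xi/2]."""
    half = xi / 2
    return (code - half, code + half)


def is_imperceptible(original_code, new_code, xi):
    """True when the modified code stays inside the tolerated interval."""
    low, high = tolerance_range(original_code, xi)
    return low <= new_code <= high


if __name__ == "__main__":
    print(tolerance_range(100, 50))  # PQ example from the text
    print(tolerance_range(100, 8))   # HLG example from the text
```

With the values quoted for Fig. 9, the sketch returns the interval [75, 125] for PQ and [96, 104] for HLG.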

B. EMBEDDING REGION SELECTION
To guarantee that the embedded watermark in the spatial domain is imperceptible to the HVS, the ER must be uniform with luma codes $\in luma_{target}$. Our approach to finding an ER that fulfils these criteria on the Y-channel is embodied in Algorithm 1.
In line 2 of Algorithm 1, function superpixelSeg is used to perform SLIC superpixel segmentation [28] on the Y-channel, which results in set $SP$ with $\eta$ superpixels (SPs). Superpixel segmentation divides the Y-channel into $\eta$ homogeneous regions in terms of texture, color and visual semantics, which is a desirable property for watermarking [29]. In lines 4-5, the average luma code ($luma_{SP_k}$) and area ($area_{SP_k}$) of the $k$-th SP $\in SP$ are computed, where $luma_{code}[p]$ is the $p$-th luma code and $P$ is the total number of pixels in the $k$-th SP. In line 8, $luma_{SP_k}$ is normalized to [0, 1], where 0 denotes the largest value in set $SP$ and 1 the smallest value in the set. In line 9, $area_{SP_k}$ is normalized to [0, 1], where 0 denotes the smallest value in set $SP$ and 1 the largest value in the set. In line 10, a global score, $GS_{SP_k}$, is computed for the $k$-th SP as a weighted average of $luma_{SP_k}$ and $area_{SP_k}$, with weights $w_l$ and $w_a$, where $w_l > w_a$ and $w_l + w_a = 1$. In other words, $GS_{SP_k}$ assigns higher importance to $luma_{SP_k}$, i.e., SPs with small luma code values are preferred over those with large areas (and possibly relatively large luma code values) to guarantee imperceptibility. In line 11, the $GS_{SP_k}$ value is placed in set $SP_{GS}$. In line 13, function rank organizes the elements in $SP_{GS}$ in descending order, where the first element, $SP_{GS_1}$, is the largest SP with the smallest $luma_{SP_k}$ value. Finally, in line 14, the ER is defined as the largest inscribed region within $SP_{GS_1}$ by means of function inscribe. Fig. 10 (rows 1-3) shows sample results of Algorithm 1 on the Y-channel of various HDR images.
Algorithm 1 (lines 5-9):
5: $area_{SP_k} = P$
6: end
7: for each SP $\in SP$ do
8: $luma_{SP_k} \leftarrow$ normalize($luma_{SP_k}$)
9: $area_{SP_k} \leftarrow$ normalize($area_{SP_k}$)
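The scoring and ranking steps described for lines 8-13 of Algorithm 1 can be sketched in Python as follows. The sketch assumes that the superpixel segmentation has already produced the average luma code and area of each SP; the function names are hypothetical, min-max scaling is assumed for the normalization, and the final inscribed-region step (function inscribe) is not covered.

```python
def normalize(values):
    """Min-max scale to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_superpixels(mean_lumas, areas, w_l=0.6, w_a=0.4):
    """Return SP indices sorted by descending global score, plus the scores."""
    # Inverted scale for luma: 0 = largest average luma, 1 = smallest.
    luma_n = [1.0 - v for v in normalize(mean_lumas)]
    # Direct scale for area: 0 = smallest area, 1 = largest.
    area_n = normalize(areas)
    scores = [w_l * l + w_a * a for l, a in zip(luma_n, area_n)]
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order, scores


if __name__ == "__main__":
    lumas = [100, 400, 120]    # hypothetical average luma code per SP
    areas = [500, 2000, 1800]  # hypothetical pixel count per SP
    order, scores = rank_superpixels(lumas, areas)
    print(order, scores)
```

In this toy example, the third SP ranks first: it is almost as dark as the first SP but much larger, which is the trade-off that the weights $w_l$ and $w_a$ control.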

C. WATERMARK EMBEDDING
The HDR-IW method embeds a binary watermark, $BW$, of size $m \times n$ into the ER of size $m \times n$ to produce a watermarked ER, denoted by $\widehat{ER}$, according to Eq. (5), where $\widehat{ER}_{i,j}$ and $BW_{i,j}$ are the values of the watermarked ER and the binary watermark at pixel location $(i, j)$, respectively, and $\Delta_{HDR}$ is the embedding factor of the cover image. It is important to mention that the human visual attention and the HVS' response to contrast variations depend not only on the target region but also on its surrounding region [23], [24]. For this reason, the HDR-IW method accounts for the $L_d$ values of the region surrounding the ER when embedding the watermark. The embedding factor of the cover image, $\Delta_{HDR}$, is then computed as a weighted sum of the average luma variation threshold of the ER, denoted by $\bar{\xi}_{ER}$; the average luma variation threshold of the region surrounding the ER, denoted by $\bar{\xi}_{SR}$; and the average luma variation threshold of the cover image, denoted by $\bar{\xi}_{HDR}$, as given by Eq. (6), where $w_0$ and $w_1$ are weights that establish the impact of the terms, with $w_0 + (2 \times w_1) = 1$, and $k$ is a strength factor. The average luma variation thresholds in Eq. (6) are computed by averaging the luma variation thresholds of all the pixel locations in the corresponding region. For example, for the $m \times n$ ER, $\bar{\xi}_{ER}$ is computed as follows:
$$\bar{\xi}_{ER} = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \xi_{i,j}(luma_{code}), \tag{7}$$
where $\xi_{i,j}(luma_{code})$ is the luma variation threshold of pixel location $(i, j)$ as given by the corresponding LVT curve (see Fig. 9). The region used to compute $\bar{\xi}_{SR}$ comprises the 8 blocks of size $m \times n$ surrounding the ER. To compute $\bar{\xi}_{HDR}$, all pixel locations of the cover image are used except for those in the ER and its surrounding region, as shown in Fig. 11. Fig. 10 (4th row) shows sample watermarked images in the 4:2:0 YUV color format after embedding the binary watermark in Fig. 12 in the Y-channel. Fig. 13 graphically illustrates the complete embedding process.
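The averaging in Eq. (7) and a spatial-domain embedding step can be sketched in Python as follows. Two points are assumptions and should be read as such: the LVT curve is represented by a hypothetical lookup table indexed by luma code, and the embedding uses a simple additive rule (watermarked pixel = original pixel + embedding factor × watermark bit) with clipping to the 10-bit range, which stands in for Eq. (5) and may differ from it.

```python
def mean_threshold(region, lvt):
    """Average luma variation threshold over all pixels of a region."""
    total = sum(lvt[code] for row in region for code in row)
    count = sum(len(row) for row in region)
    return total / count


def embed_additive(er, bw, delta, max_code=1023):
    """ASSUMPTION: additive rule with clipping; not necessarily Eq. (5)."""
    return [
        [min(max(p + delta * b, 0), max_code) for p, b in zip(er_row, bw_row)]
        for er_row, bw_row in zip(er, bw)
    ]


if __name__ == "__main__":
    lvt = {100: 50, 101: 48}       # hypothetical LVT lookup: code -> xi
    er = [[100, 101], [100, 101]]  # hypothetical 2 x 2 embedding region
    bw = [[1, 0], [0, 1]]          # binary watermark of the same size
    print(mean_threshold(er, lvt))
    print(embed_additive(er, bw, delta=5))
```

A negative embedding factor lowers the marked pixels instead of raising them, which is how the sketch would accommodate the signed factors reported later for HLG-encoded images.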

D. DETECTION
A watermark embedded as explained in Section IV-C remains imperceptible to the HVS as long as the TF is not altered or the normal calibration and colorimetry conditions of the HDR screen remain unchanged. To make the watermark perceptible to the HVS, i.e., to visually detect it, one of the following procedures must be applied:
1) Manual color calibration of the HDR screen. The EOTF, peak RGB gamut, luminance, black/white points, and greyscale settings of the HDR screen affect the screen's colorimetry. Therefore, manually modifying the HDR screen's colorimetry to display a brighter version of the watermarked HDR image highlights mid and bright tones, which enhances the current contrast. This contrast enhancement contributes to exaggerating the watermarked luma codes, thus making the watermark perceptible to the HVS. This is illustrated in Fig. 14 for the watermarked HDR images in Fig. 10 (4th row).
2) Applying a gamma TF to the tone-mapped version of the watermarked HDR image. This process consists of varying the gamma factor of the traditional gamma TF, which is typically set to $\gamma = 2.2$. Applying a lower $\gamma$ factor produces a brighter version of the tone-mapped image, thus making the watermark visible to the HVS.
3) Printing out the watermarked HDR image. The EOTF used by most printers is the dot gain compensation curve (DGCC), which is a variant of the traditional gamma function used by SDR screens [30]. The DGCC corresponds to luminance being reproduced as a power function of a code, where the exponent value is set to 1.75, instead of the traditional 2.2 value used for displaying purposes. Printing the watermarked HDR image involves applying a TMO, which is similar to the second procedure.
4) Using special software to handle color grading. Color grading aims to enhance the color of visual content by applying color correction and artistic color effects. Specialized color grading software performs a TMO and color correction with the traditional gamma TF, where $\gamma$ can be modified to make the watermark perceptible to the HVS. This procedure is analogous to procedures 2 and 3.
FIGURE 13. Block diagram of the embedding process. Blocks in green, red and blue denote inputs, outputs and processes, respectively.
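The effect of lowering $\gamma$ in procedure 2 can be illustrated with a small Python sketch. It assumes a normalized tone-mapped pixel value in [0, 1] and the plain power-law form output = value$^{\gamma}$; the pixel values and function names are hypothetical and chosen only to represent a dark embedding region.

```python
def apply_gamma(value, gamma=2.2):
    """Power-law gamma TF on a normalized value in [0, 1]."""
    return value ** gamma


def displayed_gap(original, watermarked, gamma):
    """Absolute difference between two pixels after the gamma TF."""
    return abs(apply_gamma(watermarked, gamma) - apply_gamma(original, gamma))


if __name__ == "__main__":
    original, watermarked = 0.02, 0.03  # hypothetical dark pixel values
    for gamma in (2.2, 1.75, 1.0):
        print(gamma, displayed_gap(original, watermarked, gamma))
```

For these dark values, the displayed difference between the original and watermarked pixels grows as $\gamma$ decreases from 2.2 towards 1.0, which is consistent with the watermark becoming visible in the brighter rendering.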

V. EVALUATION RESULTS
Five sets of experiments are conducted to evaluate the proposed watermarking method to embed imperceptible binary watermarks in the spatial domain. These experiments evaluate the method's embedding payload, imperceptibility, and robustness. A total of 51 HDR images are used for evaluation. These HDR images are frames from a large collection of real-life HDR video sequences captured in a wide variety of scenarios and lighting conditions, including indoor and outdoor scenes, natural scenes, sports scenes, urban scenes, daytime scenes, night scenes, and textured scenes. Each HDR image has a resolution of 1920 × 1080 and is coded using Rec.2020 + PQ EOTF$^{-1}$ or Rec.2020 + HLG OETF, as tabulated in the first four columns of Table 2 and illustrated in Fig. 15. The binary watermark in Fig. 12 is embedded in each test HDR image in all experiments. In all evaluations, the weights to compute $GS_{SP_k}$ in Algorithm 1 are set to $w_l = 0.6$ and $w_a = 0.4$. The weights to compute $\Delta_{HDR}$ in Eq. (6) are set to $w_0 = 0.6$ and $w_1 = 0.2$. Based on our evaluations, these values provide the strongest HVS-imperceptibility capabilities. This is confirmed in Figs. 16 and 17, which show the relationship between $w_l$ and $w_0$, respectively, and the imperceptibility of a watermark embedded in image ShowGirl2TeaserClip4000_25_12_P3ct2020_444i_300 [31], as tabulated in Table 2. We quantitatively measure the imperceptibility of the embedded watermark in terms of the HDR Visual Difference Predictor (HDR-VDP-2) [37]. This metric measures the visibility and quality of a pair of HDR images. The visibility describes the probability that an observer can distinguish differences between the two images and the quality measures the degradation that the original image suffers after watermarking. Both parameters are given in terms of a $u \times v$ probability map, $p(u, v) \in [0, 1]$, which is reduced to a single term by means of the Minkowski distance in Eq. (8), where $\beta = 2.4$ is an adjusting factor, and $u$ and $v$ are the coordinates of the current pixel location. To compare HDR-VDP-2 values with conventional metrics, Eq. (8) is converted to a dB scale [37]:
$$\text{HDR-VDP-2}_{dB} = 20 \cdot \log_{10}\left(\frac{\text{HDR-VDP-2}_{max}}{\text{HDR-VDP-2}}\right). \tag{9}$$
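The pooling of the probability map and the dB conversion of Eq. (9) can be sketched in Python as follows. The dB conversion follows Eq. (9) directly. The pooling function, in contrast, is only one common unnormalized form of the Minkowski distance and is an assumption; it may differ from Eq. (8), for instance in how the sum is normalized. Function names are hypothetical.

```python
import math


def minkowski_pool(prob_map, beta=2.4):
    """ASSUMPTION: unnormalized Minkowski pooling of a 2-D probability map."""
    return sum(p ** beta for row in prob_map for p in row) ** (1.0 / beta)


def to_db(value, max_value):
    """Convert a pooled score to a dB scale; value must be positive."""
    return 20.0 * math.log10(max_value / value)


if __name__ == "__main__":
    prob_map = [[0.0, 0.1], [0.2, 0.0]]  # hypothetical visibility map
    pooled = minkowski_pool(prob_map)
    print(pooled, to_db(pooled, max_value=1.0))
```

Under Eq. (9), a smaller pooled score relative to the maximum yields a larger dB value, so a less visible watermark scores higher.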
From Fig. 16, we can see that the imperceptibility is strongly affected for $w_l < 0.6$. Hence, to guarantee that an ER with the smallest luma code values is selected over others with large areas (and possibly relatively large luma code values), we use $w_l = 0.6$ and $w_a = 0.4$. From Fig. 17, we can see that values $w_0 < 0.6$ also decrease the imperceptibility. Therefore, we set $w_0 = 0.6$ and $w_1 = 0.2$.
The $\Delta_{HDR}$ values depend on both the image's content and the TF used. Namely, PQ-encoded images have positive $\Delta_{HDR}$ values and lower $luma_{ER}$ values than HLG-encoded images, which have negative $\Delta_{HDR}$ values. As shown in Fig. 1, the HLG TF uses a narrower range of codes than that used by the PQ TF to encode low luminance values. Therefore, low luminance regions of HLG-encoded images are expected to have a larger average luma code value than that of PQ-encoded images. To embed imperceptible watermarks in the spatial domain of HLG-encoded images, the $\Delta_{HDR}$ value should then be negative; otherwise, the embedded information may be perceived by the HVS as medium tones. On the other hand, to embed imperceptible watermarks in the spatial domain of PQ-encoded images, the $\Delta_{HDR}$ value should be positive. Based on our evaluations on the test images, such $\Delta_{HDR}$ values are achieved by setting the strength factor, $k$, to {5, 25} for PQ-encoded and HLG-encoded images, respectively [see Eq. (6)]. Additionally, as shown in Table 2, absolute $\Delta_{HDR}$ values of HLG-encoded images tend to be larger than those of PQ-encoded images. The HLG TF has a relatively low granularity of luma codes for low luminance values. Consequently, there is more room to modify these codes aggressively before the changes can be perceived by the HVS. This particular TF uses large coding steps in low luminance regions to code large luminance variations. Consequently, if a luma code is modified by a value $< \Delta_{HDR}$, the HVS may not be able to perceive the embedded watermark even after the TF is altered or the normal calibration and colorimetry conditions of the HDR screen are changed. This is because the ER's watermarked luma codes may still be within the range of values of the surrounding region. On the other hand, the PQ TF has a high granularity of luma codes for low luminance values. Therefore, modifying these codes aggressively increases the risk that the HVS can perceive the changes.

A. FIRST SET OF EXPERIMENTS: EMBEDDING CAPACITY
Based on the previous discussions, one can conclude that, in general, HLG-encoded images allow for larger imperceptible variations to low-valued luma codes than PQ-encoded images. Such variations, however, can only be applied if the ER has luma codes $\in luma_{target}$, i.e., the range of luma codes that are best suited to embed a watermark in the spatial domain that is imperceptible to the HVS.
Let us recall that the HDR-IW method combines the content readability afforded by invisible watermarking and the visual ownership identification afforded by visible watermarking. As with any other watermarking method in the spatial domain, determining the embedding payload is challenging, as watermarks may be embedded by altering the whole cover media or a small region of it. The embedding payload of a watermarking method in the spatial domain therefore depends on the content of the cover media and the level of distortion introduced by modifying pixel values. Since the HDR-IW method combines aspects of visible and invisible watermarking, we propose a new metric to quantitatively compute its embedding payload. Our metric, $EC_{HDR}$, accounts for the contents of the cover media and the TF. Specifically, it accounts for the size of the ER and the $\xi$ values, as defined in Eq. (10), where $ER_{size} \in [0, 1]$, $\max \xi[luma_{target}]$ is the maximum $\xi(luma_{code})$ value for the range $luma_{target}$ (see Fig. 9), $\{w_0, w_1\}$ are weights as defined before [see Eq. (6)], and $\{w_2, w_3\}$ are weights that establish the importance of each constituent term of the $EC_{HDR}$ metric, with $w_2 + w_3 = 1$. A value $EC_{HDR} = 1$ denotes the highest embedding payload, e.g., when the ER spans the entire cover image and the second term of Eq. (10) equals 1. Column 8 of Table 2 tabulates $EC_{HDR}$ values for the test images with $\{w_2 = 0.2, w_3 = 0.8\}$, i.e., by giving more importance to the second term, as ERs are, in general, relatively small and unlikely to span the entire cover image. Note that the $EC_{HDR}$ metric indeed accounts for the cover's content and the TF used. For example, image BF_100 has an embedding payload $EC_{HDR} = 0.0549$, which is less than the embedding payload of image BF_320 ($EC_{HDR} = 0.0826$), despite the fact that image BF_100 has a larger ER than that of image BF_320. Image BF_100 has, however, a lower HDR value; hence, its embedding payload is expected to be relatively small.
As expected, HLG-encoded images have the largest embedding payloads, with a maximum value of $EC_{HDR} = 0.1501$ for the test images.

B. SECOND SET OF EXPERIMENTS: IMPERCEPTIBILITY
Let us recall that the HDR-IW method operates in the spatial domain by modifying pixel values in the Y-channel. It is then expected that the visual quality of the cover media, both quantitative and qualitative, is disrupted. However, since the embedded watermarks cannot be perceived by the HVS, these disruptions are expected to be non-existent or minimal. To confirm that the embedded watermarks are imperceptible to the HVS, we use two quantitative metrics that measure imperceptibility: the HDR-VDP-2 metric and the multi-exposure Peak Signal-to-Noise Ratio (mPSNR) [38].
The mPSNR measures the error in a watermarked HDR image by first computing a series of exposure levels, which are tone-mapped by a gamma curve after exposure compensation. The tone-mapped version of an HDR image, $I$, is given by:

$$T(I, e) = \left[ 255 \left( 2^{e} I \right)^{1/\gamma} \right]_{0}^{255},$$

where $e$ is the current f-stop, which represents a variation in the aperture of a camera, $\gamma = 2.2$, and $[\cdot]_{0}^{255}$ indicates clamping to the integer interval [0, 255]. The mPSNR is then computed by using the mean square error (MSE) over a total of $E$ exposure levels:

$$\mathrm{MSE} = \frac{1}{E \cdot W \cdot H} \sum_{e} \sum_{x, y} \left( \Delta R_{xy}^{2} + \Delta G_{xy}^{2} + \Delta B_{xy}^{2} \right),$$

$$\mathrm{mPSNR} = 10 \cdot \log_{10} \left( \frac{3 \cdot 255^{2}}{\mathrm{MSE}} \right),$$

where $\{W, H\}$ are the width and height of $I$, respectively, and $\{\Delta R_{xy}, \Delta G_{xy}, \Delta B_{xy}\}$ are the errors in the R, G, and B components, respectively. For an f-stop, $e$, these errors are computed from the difference $T(I, e) - T(\tilde{I}, e)$, where $\tilde{I}$ is the watermarked image [38].
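The mPSNR computation can be sketched in a few lines of NumPy. This is a minimal illustration, not the evaluation code used in the paper: it assumes the commonly used tone-mapping form $T(I, e) = [255 (2^{e} I)^{1/\gamma}]_{0}^{255}$, linear RGB inputs stored as floating-point arrays, and rounding to the nearest integer before clamping. The function names `tone_map` and `mpsnr` and the choice of f-stops passed in `stops` are hypothetical.

```python
import numpy as np

def tone_map(img, e, gamma=2.2):
    """Exposure compensation by 2**e, gamma curve, then clamp to integers in [0, 255]."""
    exposed = np.maximum(img, 0.0) * (2.0 ** e)      # negative values are treated as 0 (assumption)
    scaled = 255.0 * np.power(exposed, 1.0 / gamma)
    return np.clip(np.rint(scaled), 0, 255)          # rounding mode is an assumption

def mpsnr(ref, wm, stops):
    """Multi-exposure PSNR between a cover image and its watermarked version.

    ref, wm: float arrays of shape (H, W, 3) holding linear RGB values.
    stops:   iterable of f-stops e; its length plays the role of E.
    """
    stops = list(stops)
    h, w, _ = ref.shape
    sq_err = 0.0
    for e in stops:
        diff = tone_map(ref, e) - tone_map(wm, e)    # per-pixel errors in R, G, and B
        sq_err += float(np.sum(diff ** 2))
    mse = sq_err / (len(stops) * w * h)
    if mse == 0.0:
        return float("inf")                          # identical tone-mapped images
    return 10.0 * np.log10(3.0 * 255.0 ** 2 / mse)
```

A call such as `mpsnr(cover, watermarked, range(-4, 5))` would average the error over nine exposures; the appropriate range of f-stops depends on the dynamic range of the images and is not specified here.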
To the best of our knowledge, no watermarking method for HDR imaging in the spatial domain with HVS-imperceptibility capabilities has been previously proposed. However, in this second set of experiments, we also evaluate the invisible watermarking methods in [8], [9], which are proposed for HDR images and operate in the frequency domain by applying the DWT.
HDR-VDP-2 and mPSNR values are tabulated in the last six columns of Table 2. For the HDR-IW method, images with large ERs, i.e., $ER_{size} > 2.5\%$, tend to have the lowest HDR-VDP-2 values. Note also that PQ-encoded images tend to be more robust to degradations introduced by watermarking, as HDR-VDP-2 values for these images are, on average, higher than those of HLG-encoded images. For the HDR-IW method, mPSNR values do not tend to vary significantly with the TF or the ER size. For the majority of the test HDR images, both metrics are within an acceptable range, which confirms that the HDR-IW method can indeed embed watermarks in the spatial domain that are imperceptible to the HVS. Overall, the HDR-IW method attains a higher imperceptibility, in terms of HDR-VDP-2 and mPSNR, than the methods in [8], [9]. The lower HDR-VDP-2 and mPSNR values attained by the methods in [8], [9] are due to the fact that these methods do not account for the EOTFs needed to display HDR images on a screen.
To qualitatively measure the imperceptibility of the embedded watermarks, we use the Mean Opinion Score (MOS). Specifically, fifteen observers with various levels of experience in HDR imaging have visually inspected each watermarked image on a laptop's built-in HDR screen, 17 inches wide, with Windows 10 HDR advanced color settings enabled. The observers are asked to identify the watermark in a variety of lighting conditions and are given the opportunity to analyze the watermarked images from any distance and viewing angle. Results from this evaluation are collected using four scores ranging from 1 to 4, where 1 corresponds to full perceptibility and 4 to full imperceptibility. In cases where the observer is able to perceive the watermark (scores 1-3), the observer is asked to determine if the watermark is visually disturbing. The percentage of watermarked HDR images assigned to each of the four scores is tabulated in Tables 3-5 for the HDR-IW method and the methods in [8], [9], respectively.
Results in Tables 3-5 further confirm that the HDR-IW method can embed watermarks in the spatial domain that are imperceptible to the HVS. In the few cases where the watermark can be barely perceived (score 3), only a very small percentage of images is found to be visually disturbing. Note that the lower MOS values assigned to the images watermarked by the methods in [8], [9] also show the importance of accounting for the EOTF in the embedding process, as this TF is needed to display the HDR image on a screen. Hence, visual distortions may be introduced if this TF is not accounted for, even if the watermark is embedded in the frequency domain.
It is worth further emphasizing the importance of the LVT curve in the computation of the luma variation threshold ($\xi$) and the embedding factor (HDR) to guarantee both imperceptibility and detection of the watermark in the HDR-IW method. For instance, in Fig. 18, the binary watermark is embedded using an arbitrary embedding factor, which leads to full perceptibility, even when the watermark is embedded in the ER selected by Algorithm 1. Similarly, if the binary watermark is embedded in a region different from the ER selected by Algorithm 1, but using the HDR for the appropriate ER, the watermark is also fully perceptible, as shown in Fig. 19.

TABLE 4. Qualitative evaluation of the method in [8] in terms of the MOS: percentage of watermarked HDR images assigned to each of the four scores.

TABLE 5. Qualitative evaluation of the method in [9] in terms of the MOS: percentage of watermarked HDR images assigned to each of the four scores.

C. THIRD SET OF EXPERIMENTS: ROBUSTNESS TO TMO
For this experiment, five TMOs are applied to the test HDR images watermarked by the HDR-IW method and the methods in [8], [9], namely, Clip (C-TM), Gamma (G-TM), Hable (H-TM), Mobius (M-TM), and Reinhard (R-TM) [39]. Let us recall that TMOs are designed to generate SDR images from HDR images by maintaining similar visual content. TMOs modify the contrast of an HDR image by modifying pixel values, including regions with low luma codes, which are the regions where the HDR-IW method operates. Table 6 presents the percentage of watermarked images that are assigned a score of 4 by the observers of Experiment 3 after applying a TMO. These results show that the HDR-IW method embeds watermarks that are more robust to TMOs than those embedded by the methods in [8], [9]. Tone mapping reduces the dynamic range of an HDR image by compressing the entire range of luminance that can be represented by luma codes. It is then expected that the images watermarked by the HDR-IW method with low $luma_{ER}$ values are assigned the full imperceptibility score (4) after applying a TMO.
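As a rough illustration of how a global TMO remaps luminance, the following sketch implements simplified textbook forms of three of the five operators (clip, gamma, and Reinhard). The function names, the normalization of the input to a peak of 1.0, and the exact formulas are assumptions made for illustration; they are not necessarily the implementations referenced in [39], and the Hable and Mobius curves are omitted.

```python
import numpy as np

def tm_clip(lum):
    """Hard clipping of normalized luminance to [0, 1]."""
    return np.clip(lum, 0.0, 1.0)

def tm_gamma(lum, peak=1.0, gamma=2.2):
    """Normalize by an assumed peak, clip to [0, 1], then apply a 1/gamma power curve."""
    return np.power(np.clip(lum / peak, 0.0, 1.0), 1.0 / gamma)

def tm_reinhard(lum):
    """Simple global Reinhard curve L / (1 + L), which maps [0, inf) into [0, 1)."""
    lum = np.maximum(lum, 0.0)
    return lum / (1.0 + lum)
```

Under these simplified curves, clipping leaves values below the peak unchanged and the Reinhard curve is close to the identity near zero, so dark regions are altered much less than highlights.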
To quantitatively evaluate the robustness to TMOs, we use the Bit Error Rate (BER), i.e., the fraction of bits that differ between the original binary watermark, BW, and the binary watermark extracted after tone mapping. BER values are tabulated in Table 7 for 20 of the most representative test HDR images in terms of color distribution, texture, variety of lighting conditions, and dominant contrast proportions. These results show that the HDR-IW method is more robust to TMOs than the methods in [8], [9], as BER values attained by this method are the lowest for all TMOs. It is important to recall that the HDR-IW method embeds the watermark in low luminance regions, whose values are less susceptible to aggressive tone mapping. Note that the method in [9] is particularly susceptible to TMOs for PQ-encoded images, with an average BER as high as 0.5036. Figure 20 shows sample binary watermarks extracted after applying a TMO to the HDR images watermarked by the HDR-IW method and the methods in [8], [9]. These visual results confirm the trend observed in the BER values tabulated in Table 7. Specifically, note that although the binary watermarks for the HDR-IW method have noticeable visual artifacts, they have a higher visual quality than those for the methods in [8], [9].
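The BER between two binary watermarks can be computed as sketched below, assuming the usual definition of BER as the fraction of bit positions in which the two watermarks differ. The function name `bit_error_rate` is hypothetical, and both watermarks are assumed to be arrays of the same shape.

```python
import numpy as np

def bit_error_rate(original, extracted):
    """Fraction of bit positions in which two binary watermarks differ."""
    a = np.asarray(original).astype(bool)
    b = np.asarray(extracted).astype(bool)
    if a.shape != b.shape:
        raise ValueError("watermarks must have the same shape")
    return float(np.mean(a != b))
```

With this definition, a BER of 0 means the extracted watermark is identical to the original, whereas a BER close to 0.5 means the extracted bits are no better than random guesses.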

D. FOURTH SET OF EXPERIMENTS: ROBUSTNESS TO LOSSY COMPRESSION
To evaluate the robustness to lossy compression, we use the HEVC compression standard reference software HM v.16.18 [40], which supports HDR compression. We employ intra-prediction coding with four different Quantization Parameters (QP), ranging from a low compression level, QP = 0, to a very high compression level, QP = 40. Table 8 tabulates the BER values of the decoded binary watermarks w.r.t. the original binary watermark after lossy compression, using the proposed HDR-IW method and the methods in [8], [9]. As expected, these results show that the robustness of all methods to lossy compression decreases as the compression becomes more aggressive. This is due to the fact that lossy compression mechanisms tend to compress smooth regions more aggressively, and these are the regions where watermarks are usually embedded in the pixel domain. When aggressive lossy compression is used, e.g., QP = 40, the maximum BER value for the HDR-IW method is 0.2840. In contrast, the maximum BER values for the methods in [8], [9] for QP = 40 are 0.7236 and 0.7246, respectively. We acknowledge that the sensitivity to aggressive lossy compression is one aspect of the proposed HDR-IW method that may limit its applicability for the distribution of HDR images in compressed format.

E. FIFTH SET OF EXPERIMENTS: ROBUSTNESS TO COMMON SIGNAL PROCESSING OPERATIONS
Watermarks embedded in the spatial domain can be easily modified by applying common signal processing operations such as noise addition (GN), blurring (BL), rotation (ROT), and downscaling (DS). To measure the robustness to these common operations, we apply each of them to the test watermarked images. Table 9 shows the BER values of the extracted binary watermarks w.r.t. the original binary watermark after applying these signal processing operations. These results confirm that the HDR-IW method is very robust to such operations. The largest BER values are obtained after adding Gaussian white noise; however, the average BER value for this operation is below 0.05. The methods in [8], [9] tend to be, on average, also robust to these signal processing operations. However, in general, the BER values for these methods are larger than those for the proposed method.
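The four operations can be sketched on a single luma channel as follows. This is only an illustration under stated assumptions: the noise standard deviation, the 3x3 box kernel, the 90-degree rotation, and the factor-of-two downscaling are hypothetical choices and do not reflect the parameters used in the experiments; all function names are likewise hypothetical.

```python
import numpy as np

def add_gaussian_noise(y, sigma, seed=0):
    """GN: add zero-mean white Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return y + rng.normal(0.0, sigma, size=y.shape)

def box_blur3(y):
    """BL: 3x3 box blur with edge replication at the borders."""
    h, w = y.shape
    padded = np.pad(y, 1, mode="edge")
    acc = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]   # sum the nine shifted copies
    return acc / 9.0

def rotate90(y):
    """ROT: counterclockwise rotation by 90 degrees."""
    return np.rot90(y)

def downscale2(y):
    """DS: downscale by a factor of two by averaging 2x2 blocks."""
    h, w = y.shape
    h2, w2 = h - h % 2, w - w % 2                 # drop an odd trailing row or column
    blocks = y[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    return blocks.mean(axis=(1, 3))
```

In an evaluation along these lines, each operation would be applied to the watermarked luma channel, the watermark would be extracted again, and the result would be compared against the original watermark in terms of BER.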
We finish this section with some comments about the computational complexity of the proposed HDR-IW method. For the evaluated HDR images tabulated in Table 2, our method takes, on average, 12.26 seconds to watermark each image on a PC with an Intel Core i7-7500U @2.90GHz CPU and 16GB of RAM. The methods in [8], [9] take, on average, 734.54 and 84.90 seconds, respectively, to watermark each of these HDR images on the same computer. Such a low average processing time makes the proposed method well-suited for real-life scenarios.

VI. CONCLUSION
In this paper, we proposed the HDR-IW method to protect HDR images by embedding binary watermarks in the spatial domain that are imperceptible to the HVS. The HDR-IW method is based on a thorough analysis of the modelling used by an OETF to represent HDR images as a non-linear digital signal, the linear luminance radiated by an HDR screen by means of an EOTF, and the brightness perceived by the HVS from the HDR screen. To this end, the method uses an LVT curve to determine not only the most appropriate ER, but also the maximum variation that luma codes within the ER can tolerate before any changes can be perceived by the HVS. The watermarks embedded by the HDR-IW method in the spatial domain remain imperceptible to the HVS as long as the TF is not altered or the normal calibration and colorimetry conditions of the HDR screen remain unchanged. Our evaluations on a wide range of real-life HDR images encoded by the PQ and HLG TFs confirmed the method's capacity to embed imperceptible watermarks and its robustness to various manipulations, including tone-mapping. The HDR-IW method is then an attractive option to merge the advantages of invisible and visible watermarking methods to protect HDR imaging. Our future work focuses on increasing the robustness of the HDR-IW method to very aggressive lossy compression.
FRANCISCO GARCIA-UGALDE was born in Mexico. He received the bachelor's degree in electronics and electrical system engineering from the National Autonomous University of Mexico, in 1977, the Diplome d'Ingénieur degree from SUPELEC, France, in 1980, and the Ph.D. degree in information processing from the Université de Rennes I, France, in 1982. Since 1983, he has been a Full-Time Professor with the National Autonomous University of Mexico. His research interests include video coding, image analysis, watermarking, theory and applications of error control coding, turbo coding, applications of cryptography, and parallel processing and data bases.
VICTOR SANCHEZ (Member, IEEE) received the M.Sc. degree from the University of Alberta, Canada, in 2003, and the Ph.D. degree from The University of British Columbia, Canada, in 2010. From 2011 to 2012, he was with the Video and Image Processing Laboratory, University of California at Berkeley, as a Postdoctoral Researcher. In 2012, he was a Visiting Lecturer with the Group on Interactive Coding of Images, Universitat Autonoma de Barcelona. From 2018 to 2019, he was a Visiting Scholar with the School of Electrical and Information Engineering, The University of Sydney, Australia. He is currently an Associate Professor with the Department of Computer Science, University of Warwick, U.K. His research has been funded by the Consejo Nacional de Ciencia y Tecnologia, Mexico, the Natural Sciences and Engineering Research Council, Canada, the Canadian Institutes of Health Research, the FP7 and the H2020 Programs of the European Union, the Engineering and Physical Sciences Research Council, U.K., and the Defence and Security Accelerator, U.K. He has authored several technical articles, book chapters, and a book in these areas. His main research interests include signal and information processing with applications to multimedia analysis and image and video coding, security, and communications. VOLUME 8, 2020