Self-designed colour chart and a multi-dimensional calibration approach for cultural heritage preventive preservation

In this paper we present a colour calibration framework for the preventive conservation of cultural heritage. It comprises a custom colour chart based on a regular division of the sRGB space and a multi-dimensional calibration approach based on hyper-surfaces that map the original colours to the calibrated space using information from all three channels. The solution has been designed to be automatic and to meet the requirements of our cultural heritage surveillance framework, such as non-invasiveness and ease of deployment. Its performance has been compared with publicly available calibration proposals and charts, and the results have been judged according to the context and restrictions of the intended goal, such as illumination. In our experiments, the proposed approach outperforms those earlier proposals. We draw conclusions on the need to adapt technological solutions to the context of the goal, on the cases in which generally available solutions remain valid, and on the use of self-designed methods. This work is part of the MIPAC-CM project.


I. INTRODUCTION
CULTURAL heritage preservation has been a major concern of the digital and information landscape in recent years, and it has been included in the agendas of governmental bodies [1]. Proposals for achieving successful cultural conservation with digital technology have been launched in large numbers [2], especially considering the broad meaning of the term and the different ways of fulfilling it.
Preventive conservation is a way of achieving preservation that focuses on safeguarding heritage integrity and opens the door to a multi-disciplinary approach [3]. Preventive maintenance means taking care of the piece concerned before any kind of damage happens. It is also desirable that the prevention system is as non-invasive to the piece as possible. This implies that its presence should by no means damage the piece or come into physical contact with it, and that it should be as unnoticeable to third parties as possible. This leaves room for any kind of monitoring deployment involving a variety of sensing-related approaches [4]. Therefore, preventive maintenance is likely one of the most favourable fields for blending cultural heritage with technology [5]. Sensors of ambient conditions, such as humidity, light, emissions or other chemicals that may affect the state of the pieces concerned, are amongst the most popular for cultural heritage control [6] [7]. In addition to these, image acquisition has also been proposed, as image data are considered a valid, non-invasive means of heritage preservation [8].
Colour is an important visual cue for assessing the state of many materials [9]. Metals like iron, copper, silver or lead change their surface colour when exposed to degrading ambient conditions [10] [11]. Thus, visually checking the appearance of any material prone to degradation is a valid sensor-related means of surveillance.
When collecting colour data as indicative cues, the disparity between cameras has to be considered. Due to subtle physical differences in their construction, no two cameras acquire colour information in the same way [12]. When it is not possible to use the same camera for all the work, a colour calibration scheme is needed so that the information can be homogenised regardless of its source. There exist well-known, reliable calibration systems that can be self-programmed or purchased and installed [13], alongside widely used open-source solutions [14]. Nevertheless, their conditions of use may not always suit the goals pursued, for example because they need manual handling or the installation of additional software or physical assets that are out of scope. Besides, the price of a commercial solution could be prohibitive if the monitoring system is large and requires a great number of assets.
In a previously published paper, we introduced a cultural heritage preservation framework based on image crowdsourcing [15] (figure 1). Visitors to the cultural heritage exhibition site are encouraged (and also stimulated through gamification strategies [16]) to take pictures of metal dosimeters and other degradation-sensitive materials placed next to the pieces under surveillance. Through their surface colour, these materials act as indicators of the presence of degrading agents in the site. Since all the third-party cameras will be different, a colour calibration is needed so that the information in each crowdsourced image is homogenised and reliable. In our previous work we proposed a calibration scheme based on linear interpolation in the sRGB layers of the acquired images, relying on a set of reference colours present on a chart placed alongside the dosimeters; it improved on other state-of-the-art proposals with a similar goal. This is a crucial part of the MIPAC-CM project. In this paper, we present the natural continuation of that framework. The linear calibration approach is replaced with a multi-dimensional scheme that maps the sRGB values present in an image to the calibrated space, using information from all three dimensions to calculate each channel and thus effectively forming a hyper-surface as transfer function. Its calculation is unique for every image concerned, and requires neither manual control nor the installation of a specialised software package.
In addition, we designed a colour calibration chart for use in the context of this project. It partitions the sRGB space into regular intervals so as to be mathematically general. It is conceived to be easily reproducible, cheap and ready to be deployed in large quantities, whilst allowing its invasiveness towards the piece under surveillance to be reduced when needed by employing a subset of the colour selection.
Besides presenting a valid improvement to our previous work, this paper also intends to show that generally available colour chart solutions do not always work well when they have to be adapted to the particular circumstances of certain applications, and that the design and deployment of self-made calibration techniques, specifically designed to deal with the problem in question, are feasible and offer good results.

II. RELATED WORK
When gathering data, crowdsourcing is an interesting approach due to its scientific possibilities [17], including ways of involving it with cultural heritage preservation [18].
Recent research on crowdsourcing aimed at helping preservation has yielded successful results [19] [6]. Control and monitoring can be achieved by having a constant feed of third-party data about the status of a piece. Third-party data coming from cellphones has also been considered [20]. This also fulfils the non-invasiveness requirement [6].
When cultural heritage is kept in controlled environments such as museums, lighting conditions are important for assessing the visual status of the exhibited materials. Whilst there exist some guidelines on optimal illumination, there is no standard that imposes a particular criterion [21]. This is a problem that calibration also addresses, given that different illuminants affect how colours, e.g. in a colour calibration chart, are perceived [22] [23]. Some approaches, especially agriculture-oriented ones, perform calibration to overcome the differences in sunlight at different times of the day when checking plants [24].
Solutions for colour calibration require the presence of two fundamental factors aside from illumination: the colour calibration chart and the mathematical basis for the proper calibration process [25]. Professional and widely available proposals may include both, offered by companies such as X-Rite [13] [26] or Imatest [27], or a specific software meant for fulfilling the operation, like Fiji [14].
However, some works take advantage of the mathematical basis of publicly available calibration schemes, such as the Color Calibration Matrix (CCM) [28], and adapt it to optimise the process for their goals. Published proposals to calculate the calibration matrix range from simple linear algebra approaches [24] to more elaborate ones involving least squares [29] or deep learning [30]. In addition, other works present custom colour charts for the contexts of their own projects [31].

III. MATERIALS AND METHODS
In this section, we describe the reasoning behind the approaches and assets we designed. These should deliver good results whilst remaining within the conditions needed to achieve the goal in its intended context.

A. COLOUR CHARTS
In order to perform a correct colour calibration we need to deploy a colour target that appears in the pictures to be corrected. Usually, targets of this kind offer a set of printed patches whose corresponding RGB values are the "anchors" or references that serve as the basis for building the mathematical calibration operation.
Most colour calibration applications in professional and amateur fields rely on commercially available charts that have proved to perform well in most cases. One of the most widely used charts is X-Rite's ColorChecker Classic (the Macbeth ColorChecker chart), conceived to represent colours that frequently occur in average real scenes [26]. There exist other variants of the ColorChecker Classic, such as the ColorChecker Digital SG, which include a larger number of luminance-oriented colour values and a more exhaustive shading of the basic featured tones.
However, even though these charts are conceived as multipurpose, the question remains whether commercially available targets like X-Rite's are the most suitable ones for any general or specific application. This question is especially interesting considering that factors like the source illumination and the selection of reference colours can be decisive [24]. It should also be considered that their array of featured colours is conceived with a perceptual criterion rather than a strictly mathematical one. Thus, we designed our own colour chart with the goal of displaying mathematical regularity. Considering that the sRGB colour space takes the form of a Cartesian "cubic" three-axis space, we divided each of its axes into regular segments, and the colour combinations resulting from that division form the chart. The selection of colours depends on the number of patches that can be deployed, which form the set derived from equation (1), and on the acceptable physical size of the chart, so that it is not invasive.
Here $step = 2^{n\_bits} / (\sqrt[3]{number\_of\_patches} - 1)$ and $n\_bits$ is the number of bits per colour used for quantization in the colour space. Note that $\sqrt[3]{number\_of\_patches}$ should return an integer, so that the space is covered homogeneously by the division in all three dimensions and all patches fit in the space.
If $number\_of\_patches$ is set to 64, it is possible to generate a physical colour chart big enough to host colour patches that are large enough to give reliable cues of their sRGB values, but small enough to be placed next to heritage pieces without hindering their inspection. Also, when this is combined with 8 bits per colour, $(R, G, B) \in \mathbb{Z}^3$. Lastly, if the equation returns decimal sRGB values, the (R, G, B) values should be quantized to the nearest value.
In effect, the chart consists of 64 patches that correspond to the colours resulting from the division described above, given the aforementioned advantages and the feasible dimension requirements. Our design also includes the "corners" and limits of the sRGB space, so all the RGB and CYM primaries are included, as well as black and white tones (figure 2).
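The regular division described above can be sketched in a few lines. This is only an illustration under stated assumptions: equation (1) is not reproduced here, so the sketch assumes that every axis takes the values i · step for i = 0, 1, ..., ∛number_of_patches − 1, rounded to the nearest integer and clipped to the largest code of the bit depth. The function name `chart_colours` is hypothetical.

```python
from itertools import product


def chart_colours(number_of_patches=64, n_bits=8):
    """Return the (R, G, B) triples of a regular division of the sRGB cube."""
    levels = round(number_of_patches ** (1 / 3))
    if levels < 2 or levels ** 3 != number_of_patches:
        raise ValueError("number_of_patches must be a perfect cube >= 8")
    # Step as given in the text: 2^n_bits / (cube_root(number_of_patches) - 1).
    step = 2 ** n_bits / (levels - 1)
    top = 2 ** n_bits - 1
    # Assumption: quantise to the nearest integer and clip to the top code.
    axis = [min(top, round(i * step)) for i in range(levels)]
    # Every combination of the axis values is one patch of the chart.
    return list(product(axis, repeat=3))


colours = chart_colours()
print(len(colours), sorted({c[0] for c in colours}))
```

With 64 patches and 8 bits this yields four levels per axis (0, 85, 171 and 255 after rounding and clipping), and the eight corners of the cube give black, white and the RGB and CYM primaries. If the intended numerator were $2^{n\_bits} - 1$ instead, the step would be exactly 85 and no rounding would be needed; the text as extracted does not settle this.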
There is another consideration for the chart design. When performing a colour calibration, the colour reference values are expected to be found in real environments when the pictures to be calibrated depict real scenes. This is especially important for the extremes of the domain, like the primaries or pure black and pure white. Normally, real scenes will feature a limited gamut, spanning the interior of the total mathematical range of perceivable colours in the space and leaving the extremes aside. Therefore, it is important to ensure that the hues featured in the designed chart also depict sensibly real colours.
VOLUME 4, 2016
In order to achieve this, the "real" values corresponding to the ideally conceived division have to be found, and the chart then needs to be built with them, so that a sensible and accurate calibration can be achieved (figure 3). To that end, the values of the aforementioned ideal sRGB space division were printed with the same materials and conditions that were chosen in reference [16] for the physical chart to be deployed in real environments, and only then were their spectral values measured with a Konica Minolta CM-700d spectrophotometer. These measurements were then transformed to sRGB (figure 2). The calibration chart with the appropriate reference values can thus be printed and deployed. Although the printing process will inevitably reduce the fraction of the sRGB space covered by the colour chart, having the sRGB range fully covered in its conception means that the range reduction of the real-life colours will be less pronounced than if only the interior fraction of sRGB had been considered in the initial steps of the design.
The self-designed colour chart has been printed on acid-free laminated paper using the industrial LED-UV Xtreme Pro ink set, as stated in the previous article [16]. In that work, we described why these materials are suitable for producing colour calibration charts for the aim of this project, namely cultural heritage monitoring in a controlled environment. This also allows quick, easy and cheap chart reproduction for many simultaneous uses and calibrations, so a large number of charts can be deployed or replaced at the same time. In this sense it has an advantage over commercial charts, whose price hinders their massive use, such as monitoring a large number of pieces. Commercial charts are thus more adequate for a reduced number of non-simultaneous calibrations.
In the following, however, the calibration performance of both colour selections, X-Rite's commercial one and our own mathematical criterion, is tested. In order to make the charts comparable, we have printed the colour hues of the ColorChecker Classic the same way we printed our self-designed chart, arranging them in an analogous way. Given that X-Rite's chart features fewer colours than our design, we added patches repeating the same colours, so that the same visual design of the chart can be achieved. When performing the calibration, all of the patches or only a subselection can be used. Furthermore, repeating reference values in the calibration process has been shown to have no influence.
The colour patches in the charts are arranged in a pattern of 9 rows and 11 columns. In their central part, a space of 5x7 patches has been reserved for the aforementioned dosimeter material tags. These will be the monitoring cues for performing an effective surveillance.
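As a quick consistency check of this layout, the sketch below enumerates the grid cells left for colour patches. It assumes that "5x7" means 5 rows by 7 columns and that the reserved space is centred; both are assumptions, and `patch_cells` is a hypothetical helper name.

```python
ROWS, COLS = 9, 11            # chart grid stated in the text
HOLE_ROWS, HOLE_COLS = 5, 7   # assumed orientation of the reserved 5x7 space


def patch_cells(rows=ROWS, cols=COLS, hole_rows=HOLE_ROWS, hole_cols=HOLE_COLS):
    """List the (row, column) cells that remain for colour patches."""
    top = (rows - hole_rows) // 2
    left = (cols - hole_cols) // 2
    cells = []
    for r in range(rows):
        for c in range(cols):
            in_hole = top <= r < top + hole_rows and left <= c < left + hole_cols
            if not in_hole:
                cells.append((r, c))
    return cells


print(len(patch_cells()))  # 9 * 11 - 5 * 7 cells remain
```

Under these assumptions the patches form a frame two cells wide around the dosimeter space, and 9 · 11 − 5 · 7 = 64 cells remain, which matches the number of colours of the self-designed chart.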

B. CALIBRATION TECHNIQUES
In our previous article [15], we presented a colour calibration approach specifically designed for this context and goal, based on linear interpolation over the range [0, 255] of the sRGB R, G and B layers separately. Given the assumed discordance between source cameras, it was conceived as an adaptive technique meant to overcome that discordance and represent the image information in a homogeneous domain, regardless of the source. Thus, it calculates its parameters from and for every single supplied image exclusively.
In this paper we present an improvement on the previous linear calibration. Following the same basic idea, the concept is expanded so that the sRGB information of every pixel in the R, G and B layers is taken into account to calculate each of the values for every coordinate in the calibrated space. This is equivalent to calculating three hyper-surfaces, each of which relates three-dimensional information to one dimension separately. Analogously to the previously proposed linear calibration, each hyper-surface acts as a transfer function that maps the whole sRGB domain to the calibrated RGB values. The resulting one-dimensional values of the three transfer functions are then stacked to form the new R, G and B layers of the calibrated image.
This multi-dimensional calibration is divided into two fundamental sequential steps. Firstly, the colour reference data are calculated from the values of the chart using a variant of the Reinhard color transfer technique [32], so that the position of the reference data in the sRGB cube and their relationship with their counterparts in the pictures to be processed can be described. Secondly, the R, G and B spatial transfer functions are constructed, so that the value mapping can be performed. Both steps are calculated entirely from the single image to be processed, with the set of reference colours known beforehand.
If we represented the colour patches of the chart in the sRGB space, ideally they would take the form of regularly scattered dots along the Cartesian "cube". However, in any image of the chart to be processed, each patch will feature some visual heterogeneity among its pixel values due to the effects of the inks when printed, the angle of light reflection on its surface, the digital camera sensor response, etc. What was originally conceived as a uniform, single colour value printed in a small physical region is perceived as an approximately Normal distribution surrounding an average value when digitised (see figure 4). Therefore, the colour patches of a digital image, when represented in the sRGB space, will be seen as a group of three-dimensional point clouds that have also lost the regular distribution intended with the initial sRGB division. In short, the colours of the chart are displaced and, instead of being described by a unique position $(R_{ori}, G_{ori}, B_{ori})$ on the R, G and B axes, they now possess a distribution with $\mathcal{N}(\mu, \sigma)$ characteristics in every coordinate.
Taking this phenomenon into account, the first step of the multi-dimensional calibration consists in recovering the spatial statistical characteristics of the ideal sRGB distribution, calculated from the colour patches in each image to be processed. This makes it possible to build the "skeleton" of the transfer functions as anchor points of the hyper-surfaces to be constructed. In order to achieve this, a modified version of the Reinhard color transfer technique is applied. The Reinhard color transfer consists in modifying the mean and the standard deviation of a numerical distribution, so that it is displaced in space and its width is modified, but without losing information on peaks and while maintaining proportions in space. When applied to the colour histogram of the R, G and B layers of any given image, it effectively produces a shift of its overall colour tone.
The effect of the Reinhard transfer can be used to produce an equivalent shift of a mathematical distribution in 3D. Instead of modifying an image layer histogram, each one of the digital colour patch distributions of the chart can be shifted in three dimensions (equation (2)), so that the positions in space of the ideally conceived chart can be regained starting from its appearance in the images to be processed.
Effectively, this matrix operation means a displacement and reshaping of every distribution in space. If the target colour distribution consists of a single point (σ = 0), the standard deviation is left untouched, so the value range of every patch distribution is respected and there is no loss of image pixel information in this sense (equation (3)).
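The displacement of one patch distribution can be illustrated as follows. Equations (2) and (3) are not reproduced here, so this sketch assumes the usual Reinhard form, x' = (x − μ_source) · (σ_target / σ_source) + μ_target per channel, which reduces to a pure shift x' = x − μ_source + μ_target when the target is a single point (σ = 0). The function name `shift_patch` and its parameters are hypothetical.

```python
from statistics import fmean, pstdev


def shift_patch(pixels, target, target_std=None):
    """Move a patch point cloud so that its mean lands on the ideal colour.

    pixels: list of (r, g, b) values sampled from one patch in the image.
    target: ideal (r, g, b) reference colour of that patch.
    target_std: optional per-channel target deviation; None or 0 keeps the
        original spread, as the text describes for a single-point target.
    """
    shifted_channels = []
    for k, values in enumerate(zip(*pixels)):
        mu = fmean(values)
        sigma = pstdev(values)
        if target_std is None or target_std[k] == 0 or sigma == 0:
            scale = 1.0  # single-point target: standard deviation untouched
        else:
            scale = target_std[k] / sigma
        shifted_channels.append([(v - mu) * scale + target[k] for v in values])
    # Values are not clipped here, so they may leave the [0, 255] range.
    return list(zip(*shifted_channels))


patch = [(100, 50, 20), (110, 54, 22), (90, 46, 18)]
print(shift_patch(patch, target=(85, 85, 0)))
```

Each shifted pixel keeps its offset from the patch mean, so the internal structure of the distribution survives the move and every original value still maps to a distinct anchor point.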
On finishing this first step, every pixel value corresponding to the colour patches in the chart will have a counterpart rearranged in the sRGB space. These will form the anchor points of the multi-dimensional transfer functions to be built, and thus the "skeleton" of the calibrated sRGB space to be calculated. The following process consists in finding the remaining values between the anchor points by connecting them in all three dimensions.
To achieve this, the hyper-surfaces $f(x_1, x_2, x_3)$ are calculated analogously to the originally proposed linear calibration (4), but extended to the number of dimensions needed for this application ($N_{dim} = 3$). This process is based on trilinear interpolation (5), where $P(r_k, g_k, b_k)$, $k = 1, 2$, stand for the anchor points of the cube that surrounds the region of the sRGB space where the hyper-surface is constructed. These anchor points are composed by combining two different r, g and b coordinates.
Lastly, having built the three transfer functions following this criterion, the R, G and B layers of the calibrated image are calculated by projecting every pixel of the images to be processed through them.
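A minimal sketch of the trilinear step for one output channel is given below. Equation (5) is not reproduced here, so the code uses the textbook form of trilinear interpolation inside an axis-aligned cell whose eight corners play the role of the anchor points $P(r_k, g_k, b_k)$. How the surrounding cell is selected from the anchor points is not specified in the text and is left out; the name `trilinear` and its arguments are hypothetical.

```python
def trilinear(point, lo, hi, corner_values):
    """Interpolate one calibrated channel inside an axis-aligned cell.

    point: original (r, g, b) value to map.
    lo, hi: (r1, g1, b1) and (r2, g2, b2), opposite corners of the cell,
        with hi > lo on every axis.
    corner_values: dict mapping (i, j, k), each 0 or 1, to the calibrated
        value of this channel at corner (r_{i+1}, g_{j+1}, b_{k+1}).
    """
    # Normalised position of the point inside the cell on each axis.
    t = [(p - a) / (b - a) for p, a, b in zip(point, lo, hi)]
    result = 0.0
    for (i, j, k), value in corner_values.items():
        weight = ((t[0] if i else 1 - t[0])
                  * (t[1] if j else 1 - t[1])
                  * (t[2] if k else 1 - t[2]))
        result += weight * value
    return result


# Corner values of a hypothetical transfer function on the cell [0, 85]^3.
lo, hi = (0, 0, 0), (85, 85, 85)
corners = {(i, j, k): 85 * (i + 2 * j + 3 * k)
           for i in (0, 1) for j in (0, 1) for k in (0, 1)}
print(trilinear((17, 34, 51), lo, hi, corners))
```

Running the function three times per pixel, once with the corner values of each calibrated channel, gives the three values that are stacked into the calibrated R, G and B layers.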
In the following sections, the performance of the multi-dimensional calibration is compared with the previous linear scheme and its stronger points are objectively evaluated. In addition, in order to achieve a more complete comparative experiment, the results obtained with a widely used scheme such as the Colour Correction Matrix (CCM) are also featured. Normally, the CCM calibration is preferably performed with the help of specialised software tools, which allow manual parameter selection for an elaborate optimization process [28].
Nevertheless, for the kind of cultural heritage monitoring sought here, this method is infeasible. Surveillance of the pieces in question should be automated and adaptive, considering that the number of crowdsourced pictures can be large and that involving permanent human effort would mean additional undesired costs. Given this restriction, the CCM calibration to be tested here needs to be adapted to the conditions of this project. Other works also take their own approaches to the basic CCM idea [30]. Thus, a resolution of the CCM based on linear algebra, following the reasoning in [24], is more feasible, quick and appropriate for the comparison presented in this paper.
The process for calculating the CCM here is thus unique for every image. First of all, representative values for the colour patch distributions for each image to be processed are extracted. These are the means of the distributions. Parallel to this, the ideal values of the colour reference patches are also extracted. Using both sets of values, a 3x3 matrix is deduced (6). The calibration is achieved by multiplying every image pixel with it.
In equation (6), it is stated that the set of reference values $R$ should be equivalent to the CCM multiplied by the set of the same colours in the original image, $O_r$. Elaborating on this, the CCM can be calculated by equation (7).
This method also ensures saving colour distribution information by assigning a different calibrated value to every original value. However, it should be performed in the mathematical linear RGB space, so the sRGB associated gamma non-linearity does not affect the results of an otherwise linear operation. Thus, the CCM calibration approach requires a gamma preprocessing and postprocessing so the sRGB information stays coherent with the calibration and the linear operations maintain their linearity.
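The per-image CCM procedure can be sketched as below. Equations (6) and (7) are not reproduced here, so the sketch assumes $R = CCM \cdot O_r$ and solves it with the right pseudo-inverse, $CCM = R\,O_r^T (O_r O_r^T)^{-1}$, which is one common linear-algebra choice and not necessarily the exact resolution used in this work. The gamma handling uses the standard sRGB transfer curve. All function names are hypothetical, and the code is written in plain Python to stay self-contained.

```python
def srgb_to_linear(c):
    """Gamma-decode one 8-bit sRGB channel value to linear [0, 1]."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Gamma-encode a linear value back to an 8-bit sRGB code (clipped)."""
    v = min(1.0, max(0.0, v))
    c = 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055
    return round(255 * c)


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse3(m):
    """Inverse of a 3x3 matrix through its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("reference colours do not span RGB space")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def fit_ccm(observed, reference):
    """Fit the 3x3 CCM from N >= 3 linear-RGB triples.

    observed: patch means measured in the image (the columns of O_r).
    reference: ideal values of the same patches (the columns of R).
    """
    o = [list(ch) for ch in zip(*observed)]    # 3 x N
    r = [list(ch) for ch in zip(*reference)]   # 3 x N
    # Assumed resolution: CCM = R O^T (O O^T)^-1; observed itself is O^T.
    return matmul(matmul(r, observed), inverse3(matmul(o, observed)))


def apply_ccm(ccm, pixel):
    """Calibrate one 8-bit sRGB pixel: decode, multiply, re-encode."""
    linear = [[srgb_to_linear(c)] for c in pixel]
    return tuple(linear_to_srgb(row[0]) for row in matmul(ccm, linear))
```

In use, the patch means and the reference colours would both be gamma-decoded with `srgb_to_linear` before calling `fit_ccm`, and `apply_ccm` would then be run over every pixel of the same image.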

IV. EXPERIMENTAL
The chart and the mathematical calibration basis, as the two critical parts of the scheme, have therefore been tested in order to measure their performance quality in the context of this work.
The case of usage consists in collecting crowdsourced pictures from different cameras of origin. Our last proposal addressing this topic [15] showed the robustness of the linear scheme when dealing with the inter-camera acquisition variability. Therefore, it is also adequate to test the other calibration schemes under different image conditions that might alter the colour and visual characteristics.
Hence, four sets of test images have been prepared to check the performance of the linear, multi-dimensional and CCM calibration schemes, employing the two previously prepared variants of the colour charts. The dosimeter placeholder space in the middle of the targets has been filled with uniform colour stripes that imitate the average tone of copper, silver and pH-sensitive paper, and with different textures taken from real photographs of copper and silver tags (figure 5). In this way, the effect and accuracy of the calibration can be measured on plain colours different from the anchor points, on the grey luminance-related region of the sRGB space, and on approximately realistic colour information of the real targets to be processed in field cases. Therefore, four different kinds of test charts have been printed.

FIGURE 5. The two types of test dosimeters employed
It is of the utmost importance to consider the effect of the calibration on the dosimeters. Their colour values, and especially their colorimetric differences from the original, non-degraded state, are essential for performing an effective surveillance of the state of the cultural heritage pieces. The calibration performance on their equivalent textures therefore has to be evaluated in a similar way. This is the reason why the success of the tests has been measured by objective means. Given that the monitoring of the colour change in the dosimeters is meant to be done in the measurement-oriented CIELAB space, the calibration performance on their placeholders is evaluated by taking the average CIEDE2000 [33] difference between the calibrated picture and the equivalent ideal dosimeters in the digitally designed colour charts. This is the main metric for judging quality, and the one used as a guide for the evaluation. In addition, the CIE94 [34] difference has been computed, to see the effects on an alternative metric. The results can be found in the additional tables (Results set 1, Results set 2, Results set 3 and Results set 4) supplied with this article.
Lastly, as a means to assess how accurate each calibration approach is, the same distance measures are taken on the reference patch distributions of the colour chart. This way, the closeness of the transfer functions to their initial "skeleton" made of the original anchor points can be evaluated.
The four image sets feature acquisitions of the charts under different lighting conditions, ordered from the lowest to the highest illuminant colour temperature. Set 1 consists of pictures taken under a common tungsten light bulb, with a yellow-orange light of low brightness. Set 2 was taken under direct sunlight at noon (white light with high brightness), whilst sets 3 and 4 were taken under common fluorescent illuminants of high colour temperature (blue light), of high and low brightness respectively. The particular values of illuminant temperature are left unknown, as the intention of the project is to test the performance under any condition, given the lack of standardization of interior lighting [21]. All of them were taken with a conventional cellphone camera that encodes the information in sRGB, as the vast majority of crowdsourced pictures from conventional cellphone cameras are expected to be [15].
Accordingly, it is reasonable to think that the physical placement of the charts under monitoring might entail slight illumination differences depending on the room or the exhibition, and even within the same room depending on their location. Thus, the selection of test images covers both the effects of inter-camera acquisition variability, with different perceptions of the same colours, and the visual aspect caused by non-regulated lighting sources. In addition, it is interesting to check the calibration performance when a reduced subset of the original anchor colours is used for building the transfer functions. Some papers [24] report possible benefits and better performance depending on the number of reference values used for calibration in a given context of use, namely when reducing it. This is also a fundamental question considering that the colour chart for monitoring cultural heritage is an invasive element among the exhibited pieces, so any justified reason for reducing its presence is worth checking, too. The reduced amount of material also implies an even lower cost of chart reproduction and deployment.
Hence, the tests have been performed for the complete set of colour patches, and for reduced subsets consisting of the CYM and/or RGB primaries (or their closest equivalents), adding black and/or white. This has been repeated for all images in the four sets of different characteristics, and for both kinds of produced charts with the two different dosimeter texture tests. Table 1 shows the variable factors that have been combined for experimenting with each set of images. Lastly, so that the effectiveness of our scheme can be assessed to its limits, the experiments described have also been executed on field pictures aside from the test sets. The self-designed chart was printed following the aforementioned procedure and deployed in a museum during a one-year test period, and some of the pictures taken there have served as a testing ground.
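To make the evaluation concrete, the sketch below converts sRGB pixels to CIELAB and averages a colour difference over a region. For brevity it implements the 1994 CIE formula, the alternative metric mentioned above, with the graphic-arts constants; CIEDE2000 follows the same pattern with a longer formula, and library implementations exist (for example `skimage.color.deltaE_ciede2000` in scikit-image). The D65 white point, the conversion constants and the function names are assumptions of this sketch.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB, assuming a D65 white point."""
    lin = []
    for c in rgb:
        c = c / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    # Linear sRGB to XYZ, normalised by the D65 white (0.95047, 1, 1.08883).
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    return (116 * f(y) - 16, 500 * (f(x) - f(y)), 200 * (f(y) - f(z)))


def delta_e_cie94(lab_ref, lab_test, k1=0.045, k2=0.015):
    """1994 CIE colour difference; lab_ref is treated as the reference."""
    l1, a1, b1 = lab_ref
    l2, a2, b2 = lab_test
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    d_h_sq = max(0.0, (a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2)
    s_c = 1 + k1 * c1
    s_h = 1 + k2 * c1
    return math.sqrt(d_l ** 2 + (d_c / s_c) ** 2 + d_h_sq / s_h ** 2)


def mean_delta_e(calibrated_pixels, reference_pixels):
    """Average difference between matching pixels of two sRGB regions."""
    pairs = list(zip(reference_pixels, calibrated_pixels))
    total = sum(delta_e_cie94(srgb_to_lab(ref), srgb_to_lab(cal))
                for ref, cal in pairs)
    return total / len(pairs)
```

Applied to a dosimeter placeholder, `mean_delta_e` would receive the calibrated pixels of the region and the corresponding pixels of the digitally designed chart; applied to a chart patch, it measures how closely the transfer functions follow their anchor points.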

V. RESULTS AND DISCUSSION
The tests have been performed on the pictures of the four considered sets and on the field images. The visual appearance of the non-calibrated pictures corresponds to what is expected from real information crowdsourced with conventional cellphone cameras: different illuminations and shadings, pixelation effects in what were initially conceived as uniform colour regions, and overall different colour tones.
These characteristics have been observed to influence the calibration process considerably. The effect is especially prominent in the CCM calibrations: after the process, the overall colour tone given by the illuminant is exaggerated (figure 8). Thus, all pictures from set 1 present an exaggerated yellow-green tone shift after going through the CCM, whilst sets 3 and 4 are shifted towards blue-violet. Set 2, given its white illuminant, does not feature that effect. When observing each picture in detail, it can be seen that the overall tone shift is slightly different for each image of the same set, which is indicative of the calibration being unique to each picture (figure 9). This is understandable: the CCM is calculated from the average value of each anchor colour distribution and, due to pixelation, that average will be slightly different in each picture, even within the same set under the same illuminant.
The colour tone shift does not happen with the self-conceived linear and multi-dimensional calibrations. However, these incur a different effect that is not found in the CCM results. Whilst CCM, being fundamentally a multiplication by a certain value, respects the variability of the pixel distribution, the other two techniques may lose that coherence in some cases. This is especially pronounced in the linear approach, where the transfer function of every layer, and therefore every coordinate, is calculated separately; the resulting pixel coordinate values might thus not keep the same proportions as in the original image, effectively assigning "false" colours to them. The multi-dimensional approach minimises this effect by extracting information from all layers for every transformation. Curiously, this effect is more prone to happen in tests where a larger number of reference colours is taken. This happens because the different values of a pixel distribution fall into different interpolation regions that are close in space, so the bigger the interpolation region, the less the distribution is altered.
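The loss of channel proportion described above can be illustrated with a minimal sketch. It assumes that the linear scheme behaves like a per-channel piecewise-linear interpolation between anchor values, which is a simplification and not necessarily the exact implementation used in this work; the anchor values and the name `channel_transfer` are hypothetical.

```python
from bisect import bisect_right


def channel_transfer(measured, reference):
    """Piecewise-linear transfer function for ONE channel, built from anchor pairs."""
    pairs = sorted(zip(measured, reference))
    xs = [m for m, _ in pairs]
    ys = [r for _, r in pairs]

    def apply(value):
        # Clamp values outside the range covered by the anchors.
        if value <= xs[0]:
            return ys[0]
        if value >= xs[-1]:
            return ys[-1]
        # Locate the interpolation segment that contains the value.
        i = bisect_right(xs, value) - 1
        t = (value - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return apply


# Hypothetical anchors: measured channel values and their ideal references.
red = channel_transfer([0, 90, 180, 255], [0, 85, 170, 255])
green = channel_transfer([0, 70, 160, 255], [0, 85, 170, 255])

# Each channel is mapped with no knowledge of the other one, so two channels
# that were equal in the input (100, 100) end up with different values.
pixel = (100, 100)
calibrated = (red(pixel[0]), green(pixel[1]))
```

Because every channel has its own set of segments, neighbouring pixel values can also land in different segments of one channel while sharing a segment in another, which is one way the distribution of a uniform patch may be altered.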
Nevertheless, given that the final intention is to measure the calibration quality by objective means, the ∆E distance metrics mentioned in the Experimental section are taken as representative of the success of the designed scheme. This is also the most reliable way to ensure that the calculated transfer functions are accurate, and to compare the performance between charts.
First of all, the accuracy of the calibration schemes needs to be measured. To do so, the mean CIEDE2000 ∆E colour differences between each colour patch of the reference data and of the calibrated images are evaluated. Table 2 shows the mean and the standard deviation of the difference metrics over all 64 colour patches, taking the calibrated images from all four sets, for every calibration approach. In all cases, the multi-dimensional calibration offers a mean metric between 0 and 1, indicating an extremely close match (an almost unnoticeable difference) with very slight variation. This indicates an accurate structuring of the "skeleton" of the transfer functions. These metrics are superior to those of the CCM and linear approaches, where the effects of the colour shift and of the false pixel colour assignments, respectively, can be observed in the larger differences and the stronger variability.
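As an illustration of this accuracy measurement, the sketch below implements the standard CIEDE2000 formula (with unit weighting factors) and summarises it over a set of patches. It assumes the patches are already available as CIELAB triplets; the helper name `patch_accuracy` and the choice of the population standard deviation are assumptions, not details taken from this work.

```python
import math
import statistics


def ciede2000(lab1, lab2):
    """CIEDE2000 colour difference between two CIELAB triplets (kL = kC = kH = 1)."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360

    # Lightness, chroma and hue differences.
    d_l = L2 - L1
    d_c = c2p - c1p
    if c1p * c2p == 0:
        d_h_angle = 0.0
    else:
        d_h_angle = h2p - h1p
        if d_h_angle > 180:
            d_h_angle -= 360
        elif d_h_angle < -180:
            d_h_angle += 360
    d_h = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_h_angle / 2))

    # Mean lightness, chroma and hue.
    l_bar = (L1 + L2) / 2
    c_bar_p = (c1p + c2p) / 2
    if c1p * c2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2

    # Weighting functions and rotation term.
    t = (1 - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_bar_p
    s_h = 1 + 0.015 * c_bar_p * t
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(c_bar_p**7 / (c_bar_p**7 + 25**7))
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c

    return math.sqrt((d_l / s_l) ** 2 + (d_c / s_c) ** 2 + (d_h / s_h) ** 2
                     + r_t * (d_c / s_c) * (d_h / s_h))


def patch_accuracy(reference_labs, calibrated_labs):
    """Mean and population standard deviation of the difference over paired patches."""
    diffs = [ciede2000(r, c) for r, c in zip(reference_labs, calibrated_labs)]
    return statistics.mean(diffs), statistics.pstdev(diffs)
```

Calling `patch_accuracy` with the 64 reference triplets and the 64 calibrated ones would give one mean and one deviation per calibration scheme, in the spirit of the figures reported in Table 2.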
Then, the mean difference values between the calibrated images and the reference data are calculated for the dosimeter textures. Here, a comparative study of all the images, both across and within sets, has been performed, observing the performance of the self-designed and the X-Rite-inspired colour charts. Two fundamental criteria have been considered: accuracy and its improvement between both kinds of charts, and robustness against the reduction of the anchor colours.
TABLE 2. Accuracy CIEDE2000 ∆E metrics of the 64 colour patches of the self-made colour chart for the three colour calibration schemes. It can be seen that the multi-dimensional scheme, in columns 2 and 9, offers the best similarity.
Intuitively, it may appear that the more anchor colour values are taken to build the calibration scheme, the better the performance should be. This conclusion is easy to reach when considering that more points mean a wider range and a finer division of the space, which should imply a more accurate value mapping when building the curves and a lower probability of an exaggerated interpolation error. Nevertheless, some studies [24] show that reducing the number of anchor colours to a specific subset not only does not degrade the calibration quality, but may also improve it when using CCM and X-Rite's colour chart. Such a result is less certain with the self-designed calibration schemes, given their different mathematical basis. The effect of reducing the number of anchor colours is measured with the standard deviation of the differences in the dosimeter textures across all the experiments with different numbers of reference colours. The closer to 0, the less effect the reduction has. This is especially important for the linear and multi-dimensional calibration schemes, considering the larger span of the interpolated regions. Table 3 reflects this variability study for sets 1, 2, 3 and 4, respectively, in different rows. It shows the standard deviation of the CIEDE2000 ∆E calibration accuracy values for every dosimeter in each reference variation experiment. It can be seen that, in all sets, the stability of the results is better for the multi-dimensional calibration scheme. Furthermore, our self-made colour chart generally shows even better stability than the X-Rite reference, with a standard deviation between 0 and 2 in most cases, especially in sets 3, 4 and 1.
The most variable results are produced by the CCM calibration, which is understandable considering the colour shift effect on the calibrated pictures. The linear calibration offers results of mixed and variable quality, which arguably come from the pixelation effect it is prone to provoke. Whilst the illumination difference seems to affect these experiments, its effect is seen to be minimised in the multi-dimensional calibration, where the resulting numbers stay in a similar range for all four sets.
This implies that the multi-dimensional calibration framework is the one whose results are least affected by the lighting conditions and by the reduction of the reference colour set.
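The robustness criterion used here can be sketched as follows: for every dosimeter, the ∆E values obtained with each reduced set of reference colours are collected and their standard deviation is taken. The function name, the dictionary layout and the numbers are hypothetical and serve only as an illustration; they are not values from Table 3, and the population standard deviation is an assumption.

```python
import statistics


def reduction_stability(delta_e_by_experiment):
    """Standard deviation, per dosimeter, of the difference metric across experiments.

    `delta_e_by_experiment` maps a dosimeter name to the list of mean colour
    differences obtained with each reference-colour subset.
    """
    return {name: statistics.pstdev(values)
            for name, values in delta_e_by_experiment.items()}


# Made-up numbers for illustration only.
example = {
    "dosimeter_a": [3.0, 3.0, 3.0, 3.0],  # unaffected by the reduction
    "dosimeter_b": [2.0, 4.0, 6.0, 8.0],  # changes as anchors are removed
}
stability = reduction_stability(example)
```

A value close to 0, as for the first hypothetical dosimeter, means that removing anchor colours barely changes the calibration accuracy.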
In the tables supplied as additional data (Results set 1, Results set 2, Results set 3 and Results set 4), one for each set of images, the average ∆E measurements are given for every target dosimeter in each experiment with the different calibration techniques, as indicated by the tags of the tables (CCM, Multi-dimensional and Linear), using both charts as references. The two column groups on the right give the percentage difference obtained by using the self-made chart instead of X-Rite's selection. A negative value means a worse performance, whilst a positive one indicates a better performance.
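The sign convention of these percentage columns can be sketched as below. The exact formula is not stated in the text, so this assumes the difference is taken relative to the ∆E obtained with X-Rite's selection; the function name is hypothetical.

```python
def improvement_percent(delta_e_xrite, delta_e_self_made):
    """Relative change of the self-made chart versus X-Rite's selection, in percent.

    Positive: the self-made chart gives a smaller colour difference (better).
    Negative: it gives a larger colour difference (worse).
    Assumption: the X-Rite result is used as the baseline.
    """
    if delta_e_xrite == 0:
        raise ValueError("baseline colour difference must be non-zero")
    return 100.0 * (delta_e_xrite - delta_e_self_made) / delta_e_xrite
```

For instance, a hypothetical dosimeter with ∆E 10 under X-Rite's selection and ∆E 8 under the self-made chart would be reported as a 20% improvement.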
When inspecting the CIEDE2000 results for set 1 (low temperature and brightness), it can be seen that performance is generally better when employing X-Rite's chart, in some cases by a wide margin, whilst in the multi-dimensional calibration the self-made chart gives better results with reduced reference sets, in line with the observations above. The range of differences stays mostly between 2 and 14, with some dosimeters surpassing 20. The worst metrics appear in dosimeters with values closer to the extremes of the sRGB cube, like the green or the yellow uniform tone, whilst the greyer tones in the interior feature better ones. Overall, even though there is a tendency as to which reference values perform better, the variability among the improvement/worsening metrics indicates that colour values are treated differently depending on their position in space.
When inspecting set 2 (white light and high brightness), the observed effect is the opposite. The predominant tendency is an improvement of the quality when employing the self-made chart. Also, it can be observed that, for the three calibrations, the metrics generally worsen the more the set of reference colours is reduced. The linear calibration, however, is more stable to reduction, whilst its improvement or worsening varies more depending on the dosimeter. The most affected dosimeter is, again, the one that corresponds to the green tone, whilst the innermost ones show measures of similar proportion. The multi-dimensional calibration, again, shows the best quality, with ∆E between 2 and 10 in the vast majority of cases, and CCM shows larger errors.
Sets 3 and 4 are shot under a high illuminant temperature (blue light), under high and low brightness respectively. One might first expect the objective results to be parallel, with slight differences. In CCM, the evolution of the metrics over the experiments that reduce the number of values is similar, staying in the same proportion. Nevertheless, set 3 shows more exaggerated differences with fewer colours, as do the improvement percentages. The dosimeters that seem the least affected by the brightness difference between sets 3 and 4 are the three uniform brown ones, which show analogous improvements in both, whereas in the others the metric is more exaggerated. In the multi-dimensional calibration, this exaggeration of the metrics in the images with higher brightness is observed again. The metrics stay in the same range for both the uniform and textured dosimeters with the self-made chart, whereas the calibration with X-Rite's chart seems to affect them differently. Nevertheless, X-Rite's chart offers better results in general, with ∆E staying on average between 2 and 10, except when involving a smaller set of reference colours. The linear calibration, as before, shows a large variability among the resulting metrics and among the improvements and worsenings, revealing the effect of pixel discolouring.
The same information can be drawn from inspecting the CIEDE94 metrics, which follow a similar tendency, given that they provide an alternative measure in the same uniform colour space, CIELAB.
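For reference, the 1994 CIE colour-difference formula operates on the same CIELAB coordinates but with simpler weighting terms than CIEDE2000. The following is a minimal sketch using the graphic-arts constants (K1 = 0.045, K2 = 0.015) and unit parametric factors; whether these are the constants used in this work is an assumption.

```python
import math


def cie94(lab_ref, lab_sample, k1=0.045, k2=0.015):
    """CIE 1994 colour difference (graphic-arts weights, kL = kC = kH = 1)."""
    L1, a1, b1 = lab_ref
    L2, a2, b2 = lab_sample
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = L1 - L2
    d_c = c1 - c2
    # Squared hue difference; clamp tiny negative values caused by rounding.
    d_h_sq = max(0.0, (a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2)
    # The weights depend on the chroma of the reference colour only.
    s_c = 1 + k1 * c1
    s_h = 1 + k2 * c1
    return math.sqrt(d_l ** 2 + (d_c / s_c) ** 2 + d_h_sq / s_h ** 2)
```

Unlike CIEDE2000, this formula is not symmetric: swapping the reference and the sample changes the weights, so the reference data should always be passed first.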
The same experiments have been run over the four considered field images. However, since pictures of this kind are examples of what the whole system will analyse when deployed, the comparison between dosimeters has no place in this case. When testing over real field images, there are no reference dosimeters, given that the dosimeters themselves are the subject of the colour change analysis. Therefore, the accuracy of the calibration is demonstrated here by calculating the CIEDE2000 difference between the calibrated colour patches and the ideal hues only. Table 4 shows the extracted CIEDE2000 values between the colour patches of every field image and the reference for the three calibration strategies. It is noticeable that, in spite of these being non-simulated test subjects, the objective results follow the same pattern as with the other sets of pictures. The multi-dimensional calibration scheme offers the best results, with CIEDE2000 values between 0 and 1, whilst the linear scheme stays between 0 and 10, and CCM exceeds that range by far, up to 40 in some cases.
The results for all experiments using different numbers of reference colours can be found in the supplemental table Field images. There it can be seen that, when executing the calibration via CCM, there is a slight variability in the results from one execution of the process to the next. Meanwhile, the metrics for the linear and multi-dimensional schemes stay consistent for each colour across all experiments, meaning that the mapping is unique for every image, with no possibility of miscalculation when using the exact same reference values.
This shows that the conceived scheme is robust and performs well in the application context of this work.
From the information given by the performed experiments, the following results can be deduced:
• The usage of CCM for calibration is not recommended for monitoring with crowdsourced pictures, where pixelation in the acquired pictures and illumination shifts occur frequently and introduce a highly undesirable variability in the results. This is especially important when monitoring similar content under different illuminations using the same intended reference. Furthermore, pixelation effects are prone to appear when the chart is printed on paper, even by professional means. Printing is a requirement in this project, since charts need to be placed in many locations and buying a large number of professional commercial charts would be prohibitively expensive. Ease of reproduction is mandatory in this case.
• The multi-dimensional calibration approach achieves a better objectively measured performance overall than the linear proposal. Calculating the calibrated coordinates with information from all three colour channels, rather than one at a time, helps reduce the pixel alteration that may affect the colour distributions when using an interpolation-based calibration scheme. Its metrics show that it builds the transfer functions accurately, mapping the anchor values correctly to the calibrated space and building a solid interpolated region.
• In good lighting conditions, with a neutral temperature and high brightness, its measures indicate an excellent calibration quality. It is also robust to the reduction of reference colours in variable lighting conditions, which makes it an excellent candidate for calibration in museums and other indoor heritage exhibition room lighting setups.
• Comparing our choice of anchor colours with X-Rite's, it can be observed that the commercial chart selection generally performs well, whilst ours offers more accurate results under the right conditions.
This arguably has its explanation in the physical presence of the anchor colours in scenes taken from real life. X-Rite's selection of colours is intended to reflect hues that are likely to occur in real life (such as skin tones), which means that they are contained in the interior areas of the sRGB cube. That implies a closer distance between them, and thus they form a calibration hyper-surface with a finer assignment in the interior, less saturated regions. That also explains the larger error in the dosimeters featuring colours in more exterior spatial positions, such as greens, and the different effect of illuminant temperatures on each hue. The uniform sparse sampling of the sRGB space we designed, even after taking into account the effects of reality through the spectral measurements of the printed ideal colours, was produced with generality in mind, instead of focusing on a finer calibration within a certain subset of the space. Nevertheless, its regular approach allows the set of anchor colours to be reduced without incurring noticeable errors, and even with improved performance, in most cases. Given the cube-like distribution of the anchors, when the interior values are removed whilst the corners are maintained, the shape of the hyper-surface remains approximately the same from one extreme of the cube to the other, effectively giving a similar transfer function.
This structure is important to consider when deploying the charts in real-life scenarios. In order to reduce the invasiveness of the monitoring mechanism and its assets, which is an important factor to keep in mind when designing similar systems, the chart placed alongside the heritage piece should be as small as possible, especially the closer it is to the exposed item. The size also influences the cost of reproducing the chart, so the smaller it is, the more possibilities of deployment it has.
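The cube-like anchor distribution and its reduction can be sketched as follows. The sketch assumes that the 64 patches correspond to four evenly spaced levels per channel (0, 85, 170 and 255), which is a hypothetical choice made for illustration; the actual patch values and the reduced subsets used in the experiments may differ.

```python
from itertools import product

LEVELS = (0, 85, 170, 255)  # assumed: four evenly spaced values per channel


def regular_chart(levels=LEVELS):
    """All sRGB triplets of a regular grid: 4 levels per channel give 64 anchors."""
    return list(product(levels, repeat=3))


def corner_subset(levels=LEVELS):
    """Keep only the 8 cube corners, dropping every interior and edge value."""
    extremes = (levels[0], levels[-1])
    return list(product(extremes, repeat=3))


full = regular_chart()
corners = corner_subset()
```

Since the corners span the same extent of the cube as the full grid, a hyper-surface anchored on them still covers the whole space from one extreme to the other, which is the property the reduction relies on.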
In the end, our proposed colour calibration chart offers broader possibilities for the kind of cultural heritage monitoring system concerned. Whilst more sensitive to the lighting conditions than X-Rite's colour selection, it offers better results in cases of adequate brightness, with improvements in the range of 11-36% in these situations, and fulfils the invasiveness, cost, reproducibility and robustness requirements expected from the application it is designed for.

VI. CONCLUSIONS
We have outlined a colour calibration technique for sRGB pictures gathered from commercially available cameras, and a colour calibration chart with a mathematically regular and generalist choice of colour values. Their performance and validity for building a crowdsourcing-oriented colour calibration scheme for cultural heritage monitoring in controlled environments have been tested, their advantages stated, and the results compared with other alternative and widely used variants.
Whilst universal solutions for calibrating images already exist, many of them implemented in ready-to-use hardware or in commercial or open-source software packages, specific applications may require targeted modifications of these solutions or of their variants. However, such adaptations to a particular use case sometimes fail or are not suitable for implementation, delivering results worse than expected.
In this paper, the aforementioned fact has been acknowledged. Available means, such as the X-Rite chart and the CCM calibration, have been adapted to match the project's requirements, and have proved not to be completely adequate for the conditions sought, falling short in some situations. Self-designed proposals have been conceived, and have been found to deliver more satisfactory results in these cases while being tailored to the project from scratch.
In settings such as cultural heritage exhibition rooms, exposition conditions such as lighting can be a decisive factor, marking the boundary beyond which custom solutions perform better; in this particular case study, that boundary corresponds to high brightness and direct lighting.
The self-designed sRGB division in our chart offers equal or better results than the full range when its anchors are reduced to make it less invasive to the piece, and the multi-dimensional calibration scheme offers an accurate value mapping from the original sRGB to the calibrated space, with many possibilities for refinement over specific value ranges in further work.
When the context of application of a colour calibration system and its requirements are known, developing an adequate framework and assets is a feasible and reasonable option whenever the conditions of the pursued aim do not allow the direct implementation of available solutions.
When monitoring any number of cultural heritage pieces in a controlled environment, it is desirable to use the least invasive means possible. Crowdsourcing pictures removes the need to install a camera system. The calibration chart should be easy and cheap to reproduce, possibly in large numbers if many pieces need to be put under surveillance, and reducible in size (number of colour patches) if less invasiveness is needed, without worsening the performance. The calibration scheme should be automatic and adaptive, so that it can be calculated from each picture without manual intervention, and easy to deploy without any specialised software while delivering satisfactory results.
The proposed scheme and chart are adequate for the pursued aim in cases where other methods may fail, and are also cheap and easy to modify should future improvements be needed.
The work presented here lays the groundwork for further exploration of calibration chart design, specifically for cultural heritage preservation. Knowing which range of the sRGB space the values of the monitored material can reach when measured, different proposals of anchor colours, encompassing a smaller region of the space around the values of interest, can be conceived. A finer calibration can therefore be achieved in these regions, given the effectively shorter distance between anchor points in space. In addition, a means of generating valid calibration charts depending on the size, number of anchor colours and context of usage can be derived for further improvement.

ACKNOWLEDGMENT
Work partially funded by the project MIPAC-CM (Monitorización por procesado de imagen y ciencia ciudadana para la conservación de materiales del patrimonio cultural; Monitoring by image processing and citizen science for conservation of cultural heritage materials), project code Y2018/NMT-4913. Thanks are due to Ignacio García, from Once34, for his generous collaboration in the printing of the charts.