Colorimetric Observer Categories for Young and Aged Using Paired-Comparison Experiments

Observers with normal color vision perceive colors differently because their retinal cones have different spectral responses, especially when the two colors of a pair are produced with different primary colors. To categorize the color-matching functions (CMFs) of young and aged observers, seven devices with different primary colors were used in a cluster analysis of observers' CMFs. Four CMFs were generated at the Beijing Institute of Graphic Communication (BIGC) from paired-comparison color matching experiments in which young and aged observers assessed five printed colors under a fluorescent light source (Experiment I). In addition, color comparison experiments based on three printed colors under a light-emitting diode (LED) light source were carried out (Experiment II). The existing CMFs, including Commission Internationale de l'Eclairage 1964 (CIE 1964), the CIE 1989 standard deviate observer (CIE 1989 SDO), CIE 2006, Sarkar 1 to Sarkar 8, Asano 1 to Asano 10, and BIGC 1 to BIGC 4, were then tested against the results of Experiments I and II. The results indicated that the visual data from young and aged observers differed considerably. Based on the standardized residual sum of squares (STRESS) values and the number of observers contributing to the minimum STRESS values, two new CMFs were proposed to represent the young and aged observers who participated in our experiments when viewing printed materials illuminated by fluorescent and LED light sources in a field of view larger than 10°.


I. INTRODUCTION
The color sensation of an observer is usually determined by three factors: the spectral power distribution (SPD) of the light source, the spectral characteristics of the object and the spectral responses of the observer's retinal cones. With the development of modern color science and technology, there are more techniques and methods to reproduce colors, such as light-emitting diodes (LEDs) and lasers; these are spectrally narrow-band and therefore magnify the individual differences in color-matching functions (CMFs). In that case, observers with normal color vision will have different color perceptions (named color inconstancy) [1] and perceive different color differences (named observer metamerism) [2], [3] under the same viewing conditions.
The associate editor coordinating the review of this manuscript and approving it for publication was Chao Zuo.
The CIE 1931 and CIE 1964 standard colorimetric observers proposed by the Commission Internationale de l'Eclairage (CIE) [4], [5], also known as the 2° and 10° (standard) observers, took different viewing fields into consideration and represented the average cone fundamentals of a population with normal color vision. However, with the development of devices with different spectral characteristics, these standard observers may lead to failures in color reproduction [6]-[8]. To describe such failures in detail, in 2006 the CIE proposed a physiological observer model (called CIE 2006) that provides cone fundamentals for a specified observer age and field size [9]. This model enables the CMFs of different theoretical observers to be generated and used to evaluate observer metamerism.
Some research has indicated that physiological parameters affect the spectral response of observers during aging. Pokorny et al. [10] pointed out that the optical density of the ocular media varies widely among observers, that this variability is more obvious in older age groups, and that it manifests more significantly in the blue region of color space. Meanwhile, in Rich and Jalijali's study [11], large inter-observer variances were found among 26 observers aged from 20 to 50. Swanson and Fish [12] and Zagers and van Norren [13] have proposed that the peak optical density of the visual pigments decreases gradually as a function of age. CIE TC1-86 reported [9] that the main difference between the elderly and the young was the difference in the absorption of the ocular media (especially the lens). In subsequent work, Sarkar et al. [14] indicated that, compared with other physiological factors, the peak optical density of the photopigment had the greatest impact on the color perception of a display. Therefore, it is necessary to investigate the variation among observers.
Sarkar et al. [15], [16] proposed different observer categories by using a cluster analysis method. The analysis started from the 47 Stiles-Burch individual CMFs [5] and 61 CMFs computed with the CIE 2006 model for ages 20 to 80 in one-year steps, in a 10° viewing field. In the simulation experiment, the 47 human observers were classified into nine categories (including the CIE 1964 standard observer as one category), two of which were assigned to aged observers. However, in Sarkar's work only the 47 Stiles-Burch observers were regarded as the real observers to be categorized, which leaves the age range of the categorical observers unclear and limited. In 2015, Asano et al. generated 10000 sets of lms-CMFs from an individual colorimetric observer model using Monte Carlo simulation and performed cluster analysis with a modified k-medoids method [17]. In that study, a simulation workflow was run for eight different combinations (a reference spectrum vs. a set of matching primaries) built from actual display primaries, and 10 categorical observers were derived iteratively; the observer ages used in the calculation ranged from 30 to 68. In Asano's work, the observers' functions were categorized by simulation experiments only. The performance of the categorical observers for different ages and for different numbers of categories still requires further investigation.
The goal of the present work is to categorize young and aged observers into different categorical observers. The work is divided into three parts. Firstly, categorical observers for observers of different ages in a 10° field of view were generated iteratively using a cluster analysis method. Secondly, our previous paired-comparison color matching experiments conducted by 30 young and 26 aged observers were used to generate four combinations of Beijing Institute of Graphic Communication (BIGC) CMFs. Finally, 26 CMFs were tested with two groups of paired-comparison color matching experiments conducted by young and aged observers in the previous studies, and two observer functions were recommended for the young and aged groups, respectively.

II. THE ESTABLISHMENT OF OBSERVER CATEGORIES
A. CLUSTER ANALYSIS METHOD IN SARKAR'S WORK
In Sarkar's work [16], [18], the cluster analysis [19] was performed on 108 CMFs (= 47 real + 61 simulated observers), consisting of the 47 Stiles-Burch individual observers and 61 CMFs computed from CIE 2006 for ages 20 to 80, in a 10° field of view. The clustering was performed separately for each of the x̄(λ), ȳ(λ) and z̄(λ) channels of the CMFs. The minimal number of clusters was sought based on a CIE DE2000 criterion. The number of clusters was determined to be five for each channel of the color matching functions, which resulted in 125 (= 5 × 5 × 5) possible combinations of model CMFs. To derive the categorical observers from the 125 combinations, the Macbeth 240 Color Checker samples viewed under D65 illumination were selected as the reference colors, and the CIE DE2000 color differences were computed between the real Stiles-Burch observer CMFs and the 125 possible combinations of model CMFs. Among the 125 CMFs, the set of model CMFs that covered as many Stiles-Burch individual observers as possible within a certain color difference was sought iteratively. The algorithm was repeated, excluding the already-covered Stiles-Burch observers, until all Stiles-Burch individual observers were covered. In the end, eight observer categories and the corresponding CMFs were derived.
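The iterative covering step described above can be illustrated with a greedy sketch. It is only an assumption about how such a search might be organized: the function name `greedy_cover`, the threshold value and the toy matrix are hypothetical, and the exact selection rule used in [16], [18] is not given in this text.

```python
import numpy as np

def greedy_cover(delta_e, threshold):
    """Pick model CMFs until every observer is covered.

    delta_e[i, j] is the mean colour difference between observer i and
    model CMF j; observer i counts as covered by model j when
    delta_e[i, j] <= threshold (assumed coverage rule).
    """
    covered_by = delta_e <= threshold            # boolean, observers x models
    if not covered_by.any(axis=1).all():
        raise ValueError("some observer is not covered by any model CMF")
    remaining = np.ones(delta_e.shape[0], dtype=bool)
    chosen = []
    while remaining.any():
        # number of still-uncovered observers that each model would cover
        gains = (covered_by & remaining[:, None]).sum(axis=0)
        best = int(np.argmax(gains))
        chosen.append(best)
        remaining &= ~covered_by[:, best]        # drop observers just covered
    return chosen

# Toy data: 4 observers, 3 candidate model CMFs (values are made up).
toy = np.array([[0.2, 1.5, 2.0],
                [0.4, 1.2, 1.9],
                [1.8, 0.3, 2.2],
                [2.5, 2.1, 0.1]])
selected = greedy_cover(toy, threshold=0.5)
print(selected)
```

Each pass selects the model that covers the most uncovered observers, which mirrors the "cover as many observers as possible, then repeat on the rest" description.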

B. CLUSTER ANALYSIS METHOD IN PRESENT WORK
Sarkar's work has some limitations, and the modifications made in the present work are as follows:
1) The age distribution of the 47 Stiles-Burch CMFs in Sarkar's work was not uniform but heavily concentrated in the twenties. Therefore, in the calculation of the CIE DE2000 color differences, the 61 CIE 2006 CMFs were added as individual observers to extend the age range to 80. The calculations were repeated until all 108 CMFs were covered, instead of covering only the 47 real Stiles-Burch observers.
2) Because the primary colors have a great influence on observer metamerism [7], [20], [21], and the reference colors used by Sarkar have relatively flat spectral distributions, 17 color centers were selected as the reference color stimuli; they were either presented on five different displays with three-channel primary colors or printed and illuminated by different light sources. The 17 color centers were recommended by the CIE for further coordinated research on color-difference evaluation [22] and are uniformly distributed in CIELAB color space.
During the cluster analysis, five displays with different spectral primary colors were used; their relative power distributions are shown in Fig.1. The color stimuli presented on displays No.1 to No.5 therefore have different primary spectra, and these displays were used in our previous study to investigate observer metamerism [20]. For devices No.6 and No.7, the 17 printed samples were illuminated by two light sources with different SPDs: an artificial daylight (AD) provided by a Gretag Macbeth Judge II viewing cabinet fitted with a D65 simulator, and the CIE recommended standard D65 (SD) illuminant. The relative power of the light sources and the spectral curves of the 17 printed samples are illustrated in Fig.2. Detailed information on the different devices is given in Table 1.
Meanwhile, to allow a comparison with Sarkar's work, five categories were derived for each of the x̄(λ), ȳ(λ) and z̄(λ) color matching functions using the k-medoids algorithm of cluster analysis with the squared Euclidean distance measure; 125 (= 5 × 5 × 5) categories were then derived from the combinations of the 5 x̄(λ), 5 ȳ(λ) and 5 z̄(λ) functions via an iterative algorithm. The 108 CMFs (black lines) and the five representative functions in each of the x̄(λ), ȳ(λ) and z̄(λ) channels (red, green and blue lines, respectively) are plotted in Fig.3.
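A minimal sketch of the per-channel clustering and the 5 × 5 × 5 combination step is given below. It uses a simple alternating k-medoids loop with squared Euclidean distance on synthetic Gaussian curves; the function `k_medoids`, the synthetic data and all parameter values are illustrative assumptions, not the implementation or the CMF data used in the paper.

```python
import itertools
import numpy as np

def k_medoids(curves, k, n_iter=100, seed=0):
    """Alternating k-medoids on the rows of `curves` (squared Euclidean distance)."""
    rng = np.random.default_rng(seed)
    diff = curves[:, None, :] - curves[None, :, :]
    dist = (diff ** 2).sum(axis=2)                 # pairwise squared distances
    medoids = rng.choice(len(curves), size=k, replace=False)
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)   # assign each curve to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # the member with the smallest total distance to its cluster
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

# Synthetic stand-ins for 108 observers' curves in each channel (not real CMFs).
rng = np.random.default_rng(1)
wavelengths = np.arange(390, 731, 5)
channels = {}
for name, peak in (("x", 595.0), ("y", 555.0), ("z", 445.0)):
    shifts = rng.normal(0.0, 4.0, size=108)        # made-up observer variation
    curves = np.exp(-0.5 * ((wavelengths[None, :] - peak - shifts[:, None]) / 40.0) ** 2)
    channels[name] = k_medoids(curves, k=5)[0]     # indices of 5 representative curves

# Every combination of one representative per channel: 5 x 5 x 5 candidates.
combos = list(itertools.product(channels["x"], channels["y"], channels["z"]))
print(len(combos))
```

A medoid is always one of the input curves, so each of the 125 candidates is assembled from actual observer functions rather than from averaged curves.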
The XYZ values were computed for all 125 CMF combinations and for each of the 108 individual observers (= 47 + 61). The CIE DE2000 color differences were computed between the XYZ values from the 108 observers and those from the predicted CMFs (i.e. the 125 combinations of the 5 x̄(λ), 5 ȳ(λ) and 5 z̄(λ) CMFs), as shown in (1) and (2). The calculation of color differences in (2) requires a common (unique) reference white. Two sets of CMFs correspond to two reference whites, so a chromatic adaptation is usually needed before computing color differences. For this transformation, the CAT16 chromatic adaptation transform proposed by Li et al. [23] was applied to the 108 individual observers with reference to each of the 125 CMF combinations. In (1) and (2), i_max is 108, j_max is 125 and n ranges from 1 to 17. ϕ_n(λ) indicates the spectral characteristic of the n-th reference color, and (XYZ)_in and (XYZ)_jn represent the XYZ values of the n-th reference color computed with each of the 108 individual observers and each of the 125 predicted combinations. For individual observer i, 125 averaged ΔE00 color differences can be computed in turn from (2); the CMF j corresponding to the minimum ΔE00,ij value is assigned to individual observer i.
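Equations (1) and (2) are referenced but not reproduced in this text. A form consistent with the definitions above might look as follows; the normalizing constants k_i and k_j, the vector notation CMF_i(λ) for the three channels of observer i, and the overall layout are assumptions rather than the authors' exact notation.

```latex
\begin{align}
(XYZ)_{in} &= k_i \int_{\lambda} \phi_n(\lambda)\,\mathrm{CMF}_i(\lambda)\,\mathrm{d}\lambda,
\qquad
(XYZ)_{jn} = k_j \int_{\lambda} \phi_n(\lambda)\,\mathrm{CMF}_j(\lambda)\,\mathrm{d}\lambda, \tag{1}\\
\overline{\Delta E}_{00,ij} &= \frac{1}{17}\sum_{n=1}^{17}
\Delta E_{00}\!\left[(XYZ)_{in},\,(XYZ)_{jn}\right],
\qquad i = 1,\dots,108,\quad j = 1,\dots,125. \tag{2}
\end{align}
```

Under this reading, the CMF assigned to observer i is the index j that minimizes the averaged difference in (2), with the CAT16 transform applied to bring both sets of XYZ values to a common reference white before ΔE00 is evaluated.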

C. CATEGORICAL OBSERVERS FROM PRESENT WORK
The reference colors reproduced by the different devices are listed in Table 1. The minimum ΔE00 computed from (1) and (2) was taken as the criterion for assigning a CMF to a given observer. For each device, 10 reduced sets of color matching functions were retained from the abovementioned 125 observer categories and sorted as CMF-1, CMF-2, ..., CMF-10. Table 2 shows the features of the 10 CMFs of each device and the categorization results. In order to gather as many observers as possible in a limited number of categories, the accumulative probability (AP%) of the observers assigned to the top ten observer categories was required to be no less than 75%. Note that the top-ten and 75% thresholds serve only to select the primary CMFs that represent most normal observers. It can be seen from Table 2 that several identical combinations were selected by the seven devices with different primary colors. For example, CMF-1 of the iPad is the combination of 4-x̄(λ), 3-ȳ(λ) and 1-z̄(λ), and this combination also occurred in the computations for the other six devices. The distinct combinations from the seven devices were gathered in sequence (italic font on a gray background), and 19 unique combinations were selected from all 70 combinations of the seven devices and named BIGC-1, BIGC-2, ..., BIGC-19 (hereinafter abbreviated as B1 to B19). To confirm the effectiveness of these 19 observer categories and assign some of them to individual observers, the paired-comparison experiments carried out in our previous studies are used in the following sections, and the performance of the categories for observers of different ages is tested.
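The selection of the top ten categories per device and the merging into a list of unique combinations can be sketched as follows. The device names, the assignment lists and the helper `top_categories` are hypothetical; only the top-ten rule and the accumulative-probability check come from the text.

```python
from collections import Counter

def top_categories(assignments, top_n=10):
    """Return the top_n most frequent combinations and their accumulative probability (%)."""
    counts = Counter(assignments)
    top = counts.most_common(top_n)
    ap = 100.0 * sum(c for _, c in top) / len(assignments)
    return [combo for combo, _ in top], ap

# Hypothetical assignments for two devices: each entry is the (x, y, z)
# cluster-index combination that gave one observer the minimum mean difference.
device_assignments = {
    "device_A": [(4, 3, 1)] * 5 + [(2, 2, 2)] * 3 + [(1, 5, 4)] * 2,
    "device_B": [(4, 3, 1)] * 6 + [(3, 1, 2)] * 4,
}

unique = []
for device, assigned in device_assignments.items():
    top, ap = top_categories(assigned, top_n=10)
    print(device, f"AP = {ap:.0f}%")
    for combo in top:
        if combo not in unique:        # keep the first occurrence, in order
            unique.append(combo)

print(unique)
```

In the paper the same bookkeeping over seven devices reduces 70 top-ten entries to 19 unique combinations; the toy data here only show the mechanism.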

III. THE PAIRED-COMPARISON EXPERIMENTS
Two groups of paired-comparison color matching experiments, performed by He (Experiment I, hereinafter abbreviated as Exp.I) [24] and Xi (Experiment II, hereinafter abbreviated as Exp.II) [25] in previous studies, were used to test the performance of the generated observer functions and to assign the real observers to the corresponding CMFs. Detailed information on the visual experiments is given in Table 3.

A. COLOR PAIRS
In Exp.I, a Gretag Macbeth Judge II viewing cabinet fitted with a D65 simulator provided light source L1; in Exp.II, a spectrally tunable lighting system with nine narrow-band and two broad-band LEDs provided light source L2. L1 and L2 had illuminances of 936.3 lx and 1117 lx and correlated color temperatures (CCTs) of 6744 K and 5014 K, respectively. Their IES TM-30-18 R_f and R_g values [26] and CIE color rendering indices (R_a) [27] were 95, 104 and 93.5 for L1 and 97, 98 and 98.2 for L2. The spectral power distributions of the two light sources are shown in Fig.4.
Experiments I and II used five target color samples (gray, brown, blue-green, blue and purple) and three target color samples (gray-2, brown-2 and light brown), respectively. All of them were selected from the practical name color chart (published by the Fashion Color Association of China). The target and comparison samples have different reflectivity. Hundreds of color samples with small color differences (usually less than 5.0 ΔE00 units) around each target color were printed with an Epson Stylus 7908 inkjet printer on matt paper at a size of 5 cm × 5 cm.
Two sets of CMFs (CIE2006-20y and CIE2006-70y), computed from the CIE 2006 model for ages 20 and 70 in a 10° field of view, were used to represent the CMFs of the young and aged groups; the XYZ values and the ΔE00 color differences between the target samples and their compared samples were then computed. In order to enlarge the color discrimination differences between young and aged observers, the requirement for choosing compared samples was that the ΔE00 values computed by the two sets of CMFs  Fig.5 and Fig.6).
Meanwhile, the largest ΔE00 values were made as large as possible within 5.0 ΔE00 units, and the smallest ΔE00 values were required to be as small as possible. Finally, four compared color samples surrounding each target color center were prepared in Exp.I, giving 20 pairs of color samples around five color centers; correspondingly, six compared color samples surrounding each target color center were prepared in Exp.II, giving 18 pairs of color samples around three color centers.

B. VISUAL EXPERIMENTS
The visual assessments were conducted in a dark room using a viewing cabinet fitted with the L1 or L2 light source. The color samples were placed without gaps at the center of the cabinet floor. A gray mask with CIELAB L*10, a*10, b*10 values of 59.86, -1.60, 0.78 was used as the cover; it had an open window of 15 cm × 5 cm to present the target sample and two compared samples (each color sample measured 5 cm × 5 cm). For each color center, the target sample was placed in the middle, and two samples randomly selected from the comparison samples were placed on the left and right sides. Before the visual experiments, all observers passed the Ishihara test and had normal color vision.
In previous work, Rinner and Gegenfurtner [28] proposed that chromatic adaptation is basically complete after two minutes and that observers reach more than 90% chromatic adaptation after 1 min. Earlier, the time course measured by Fairchild and Lennie [29] was described by an exponential function with a time constant of 8.4 s, so that adaptation took almost one minute to complete. Later, Fairchild and Reniff [30] revealed the roles of a fast mechanism and a slow mechanism, with time constants of 1 s for the fast mechanism and 40-50 s for the slow mechanism. Werner [31] indicated that adaptation gradually approaches a steady state after a few seconds (about 30 s) to 1 min. Considering the time consumed by the experiments, the observers were allowed to adapt for 0.5-1 min before each experiment.
The observers were required to view the color samples from a distance of approximately 25 to 30 cm, so that the visual field size of the color patches was 28.1°-33.4°, and the viewing geometry was 0°/45°. They were asked to judge which of the two compared samples, the left one or the right one, showed the larger color difference from the middle target sample. To avoid any influence of the presentation sequence on the experimental results, the trials were presented to each participant in random order. All the young observers were from the Beijing Institute of Graphic Communication (BIGC) and majored in printing engineering. They had a color/lighting background and had participated in similar color experiments before. The elderly observers were retirees from different fields, and most of them had no knowledge of color science.
In Exp.I, the paired-comparison experiments were performed by 30 young observers aged from 19 to 25 years and 26 elderly observers aged from 60 to 74 years. For each target color sample, each observer made 4 × (4 - 1) / 2 = 6 assessments, and all observers were asked to make 12 repetitions at different intervals, giving 20160 (= 6 assessments × 5 target colors × 12 repetitions × 56 observers) judgments in total.
In Exp.II, the paired-comparison experiments were performed by 26 young observers aged from 20 to 25 years and 14 elderly observers aged from 62 to 75 years. For each target color sample, each observer made 6 × (6 - 1) / 2 = 15 assessments, and all observers were asked to make 10 repetitions at different intervals. It took about 12-15 minutes to complete one experiment (30/45 assessments).
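The assessment counts quoted above follow from the number of unordered pairs of compared samples per target color. A short check, using a hypothetical helper `judgments`, reproduces the per-session figures (30 and 45) and the Exp.I total:

```python
from math import comb

def judgments(n_compared, n_targets, n_repetitions=1, n_observers=1):
    """Each unordered pair of compared samples is shown once per target colour."""
    return comb(n_compared, 2) * n_targets * n_repetitions * n_observers

print(judgments(4, 5))            # Exp.I assessments per session -> 30
print(judgments(6, 3))            # Exp.II assessments per session -> 45
print(judgments(4, 5, 12, 56))    # Exp.I total judgments -> 20160
```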

C. VISUAL COLOR DIFFERENCE
The judgments collected in Exp.I and Exp.II, in which each pair was evaluated 12 or 10 times by every observer, were converted to probabilities [32]-[34] by dividing the number of times a sample was chosen by the total number of observations of that pair by that observer; the probabilities were then converted to Z-scores. Finally, the visual color differences for each color center were obtained by eliminating negative values. Fig.7 and Fig.8 show the results averaged over the groups of young and aged observers, which were taken as the final results.
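One common way to turn paired-comparison counts into scale values is Thurstone's Case V, sketched below. Whether [32]-[34] use exactly this variant is not stated here, so the averaging over opponents, the clipping of unanimous proportions and the shift that removes negative values are all assumptions, and the `wins` matrix is made up.

```python
from statistics import NormalDist
import numpy as np

def scale_values(wins, n_trials):
    """Thurstone Case V style scale values from a paired-comparison win matrix.

    wins[a, b] = number of times sample a was judged to differ more from the
    target than sample b, out of n_trials presentations of the pair (a, b).
    """
    n = wins.shape[0]
    p = wins / n_trials
    # avoid infinite z-scores for unanimous judgments (assumed clipping rule)
    p = np.clip(p, 0.5 / n_trials, 1.0 - 0.5 / n_trials)
    inv = NormalDist().inv_cdf
    z = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            if a != b:
                z[a, b] = inv(float(p[a, b]))
    scores = z.sum(axis=1) / (n - 1)     # mean z-score against the other samples
    return scores - scores.min()         # shift so that no value is negative

# Hypothetical counts for four compared samples, each pair judged 12 times.
wins = np.array([[0, 3, 1, 0],
                 [9, 0, 4, 2],
                 [11, 8, 0, 5],
                 [12, 10, 7, 0]])
visual = scale_values(wins, n_trials=12)
print(np.round(visual, 2))
```

The resulting values are on an interval scale, so only their differences and ordering are meaningful until they are compared with computed ΔE00 values through a scale-invariant measure such as STRESS.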

IV. RESULTS AND DISCUSSIONS
A. VALIDATION OF 19 BIGC OBSERVER CATEGORIES
In order to assign the 19 BIGC observer categories to individual observers and test their performance for observers of different ages, the paired-comparison experiment Exp.I was used. The correlation between the color differences computed with different CMFs and the visual color differences of the 20 color pairs assessed by individual observer i was characterized by the standardized residual sum of squares (STRESS) value [35], which is defined by (3).
For a given CMF, i indexes the observers: for the group of young observers i_max is 30, while for the aged observers i_max is 26; n indexes the color pairs, ranging from 1 to 20. STRESS_in refers to the STRESS value computed from the visual color differences (ΔV_i values) of observer i and the computed CIE DE2000 values (ΔE00 values) over the n color pairs. For a given observer, 19 STRESS values can be obtained from the 19 BIGC observer categories and the observer's visual data. STRESS ranges from 0 to 100; the smaller the STRESS value, the better the computed differences agree with the visual data and the better the CMF performs. For perfect agreement with the observers' assessments STRESS is zero, and a higher STRESS value indicates a larger dissimilarity between the two datasets. Table 4 lists the average STRESS values obtained from the groups of young and aged observers in Exp.I.
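Equation (3) is not reproduced in this text. The sketch below uses the usual definition of STRESS from the color-difference literature, STRESS = 100 · sqrt( Σ(ΔE_n - F·ΔV_n)² / Σ(F·ΔV_n)² ) with F = ΣΔE_n² / Σ(ΔE_n·ΔV_n); it is assumed, not confirmed, that (3) has this form, and the function name `stress` is illustrative.

```python
import numpy as np

def stress(delta_e, delta_v):
    """STRESS (0-100) between computed (delta_e) and visual (delta_v) differences."""
    delta_e = np.asarray(delta_e, dtype=float)
    delta_v = np.asarray(delta_v, dtype=float)
    # scaling factor F from the usual STRESS definition
    f = np.sum(delta_e ** 2) / np.sum(delta_e * delta_v)
    num = np.sum((delta_e - f * delta_v) ** 2)
    den = np.sum((f * delta_v) ** 2)
    return 100.0 * np.sqrt(num / den)

# Proportional data agree perfectly; reversed data do not.
print(stress([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(stress([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

Because of the factor F, STRESS is unchanged when either dataset is multiplied by a constant, which is why visual scale values and ΔE00 values can be compared without first matching their units.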
It can be seen from Table 4 that the STRESS values of the different observer categories show almost opposite trends for young and aged observers: the smaller the STRESS value for the young observers, the larger the value for the aged observers. The paired-comparison results from Exp.I indicated that for young observers BIGC 17 has the smallest STRESS value, followed by BIGC 1, and for aged observers BIGC 14 has the smallest value, followed by BIGC 5.
For a given observer, the minimum of the 19 STRESS values was selected, and the observer category corresponding to this minimum was taken as that observer's function. Fig.9 shows, for the two groups, the number of observers contributing to the minimum STRESS values for each observer category. Considering both the STRESS values and the number of observers contributing to the minimum STRESS values, 12 and 7 of the 30 young observers were assigned to BIGC 17 and BIGC 6, and 14 and 10 of the 26 aged observers to BIGC 5 and BIGC 14, respectively. Finally, categories B6 and B17 were recommended for young observers and B5 and B14 for aged observers.
In Table 5, B17 and B14 perform best for young and aged observers in Exp.I, whereas S2 and A10 have the minimum STRESS values for young and aged observers in Exp.II. These differing results also indicate that the categories of individual observers depend on the primary colors of the color pairs. The F-test was also used to test the difference between each of the other 25 CMFs and CIE2006 (22) for young observers, and CIE2006 (68) for aged observers. The results are given in Table 6, where significant differences between the tested CMFs and CIE 2006 with the corresponding average age are marked in bold and italics.
From Table 6, S2, S3, B6 and B17 significantly outperformed CIE2006 (22) for young observers in Exp.II. Fig.10 shows the number of observers contributing to the minimum STRESS values among the 26 color matching functions, counted for the two groups in Exp.I and Exp.II. Considering both the STRESS values and the number of observers contributing to the minimum STRESS values, S2 and B5 outperformed the others for the young and aged observers, respectively. In total, 18 (= 9 + 9) young observers from Exp.I and Exp.II were assigned to S2 and 19 (= 14 + 5) aged observers from Exp.I and Exp.II were assigned to B5. Finally, S2 and B5 are recommended for young and aged observers, respectively.
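The counting behind Fig.9 and Fig.10 amounts to giving each observer the CMF with the minimum STRESS and tallying the winners. A minimal sketch with made-up STRESS values and an illustrative subset of CMF names (the helper `assign_observers` is hypothetical):

```python
from collections import Counter
import numpy as np

def assign_observers(stress_table, names):
    """Give each observer the CMF with the minimum STRESS and count observers per CMF."""
    best = np.argmin(stress_table, axis=1)     # one winning column per observer (row)
    return Counter(names[j] for j in best)

names = ["CIE2006", "S2", "B5"]                # illustrative subset of the 26 CMFs
# Hypothetical STRESS values: rows are observers, columns follow `names`.
stress_table = np.array([[32.0, 25.1, 40.3],
                         [28.4, 30.2, 45.0],
                         [35.7, 22.9, 38.8],
                         [41.2, 39.5, 27.6]])
counts = assign_observers(stress_table, names)
print(counts.most_common())
```

A CMF can have a low group-average STRESS without winning for many individual observers, which is why the text considers both the average values and these counts.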

C. PROPERTIES OF CATEGORICAL OBSERVERS
The two recommended CMFs, S2 and B5, are plotted in Fig.11 together with CIE2006 (22) and CIE2006 (68). CIE2006 (22) and S2 are plotted with blue lines and marked with circles and diamonds, respectively; CIE2006 (68) and B5 are plotted with red lines and marked with circles and diamonds, respectively.
The properties of the two CMFs, including the peak wavelength positions, the full width at half-maximum (FWHM) and the maximum spectral tristimulus values (STVmax), are compared with those of the CIE2006 (22) and CIE2006 (68) CMFs, which are regarded as the standard observers for the given ages. The results are summarized in Table 7.
As shown in Fig.11 and Table 7, each CMF has two peak wavelengths in the x̄(λ) channel. Compared with the CIE2006 (22) and S2 CMFs for young observers, the CIE2006 (68) and B5 CMFs for aged observers are mostly shifted towards longer wavelengths. The peak wavelengths of the S2 CMFs in the x̄(λ), ȳ(λ) and z̄(λ) channels are similar to those of CIE2006 (22), with values of 594 nm, 556 nm and 446 nm, respectively. Only the second peak wavelength of the x̄(λ) channel is shifted by 4 nm (= 445 nm - 441 nm) towards longer wavelengths. The main peak wavelengths of the B5 CMFs in the x̄(λ) and z̄(λ) channels are almost the same as those of CIE2006 (68); only the peak wavelength in the ȳ(λ) channel is shifted by 13 nm (= 569 nm - 556 nm) towards longer wavelengths. Table 7 indicates that CIE2006 (68) and B5 for aged observers have narrower FWHMs than CIE2006 (22) and S2 for young observers.
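The peak wavelength and FWHM reported in Table 7 can be extracted from a sampled curve as sketched below. The function `peak_and_fwhm` and the synthetic Gaussian are illustrative assumptions; the sketch handles a single-lobed curve, so a two-peaked x̄(λ) channel would need each lobe to be analyzed over its own wavelength range.

```python
import numpy as np

def peak_and_fwhm(wavelengths, values):
    """Peak wavelength and full width at half-maximum of a single-lobed curve."""
    i_peak = int(np.argmax(values))
    half = values[i_peak] / 2.0

    def crossing(i_from, step):
        # walk away from the peak until the curve drops below half-maximum,
        # then interpolate linearly between the two bracketing samples
        i = i_from
        while 0 <= i + step < len(values) and values[i + step] >= half:
            i += step
        j = i + step
        if not 0 <= j < len(values):
            return float(wavelengths[i])       # curve never falls below half here
        t = (values[i] - half) / (values[i] - values[j])
        return float(wavelengths[i] + t * (wavelengths[j] - wavelengths[i]))

    left = crossing(i_peak, -1)
    right = crossing(i_peak, +1)
    return float(wavelengths[i_peak]), right - left

wl = np.arange(380.0, 781.0, 1.0)
sigma = 40.0
curve = np.exp(-0.5 * ((wl - 556.0) / sigma) ** 2)   # synthetic Gaussian, not real CMF data
peak, fwhm = peak_and_fwhm(wl, curve)
print(peak, round(fwhm, 1))
```

For a Gaussian the FWHM equals 2·sqrt(2·ln 2)·σ, which gives a convenient check of the interpolation on synthetic data before applying the function to measured CMFs.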
The differences in cone fundamentals between young and aged observers cause failures in color reproduction and color evaluation, the extent of which also depends on the primary colors of the color stimuli. A color pair may appear very similar to the group of young observers yet extremely different to the group of aged observers, as observed in our two groups of paired-comparison experiments. Therefore, in cross-media color reproduction and digital color image transmission, it is necessary to establish and categorize CMFs that represent the cone fundamentals of observers in different age groups, so that colors with different spectra can be compared accurately.

V. CONCLUSION
Observers with normal color vision have individual differences in their CMFs, which give rise to different color perceptions. It is therefore necessary to classify observers into different categories, especially observers of different ages. A cluster analysis was performed in which 17 CIE recommended colors were selected and reproduced as reference colors by seven devices with different primary spectra. Nineteen combinations were generated and named the B1, B2, ..., B19 observer categories.
A paired-comparison experiment (Exp.I) was designed in which 56 color-normal observers aged from 19 to 74 evaluated the color differences of 20 color pairs with different primary colors; the visual results were used to choose four combinations of BIGC observer categories. To test the performance of 26 CMFs, including the CIE CMFs (CIE 1964, CIE 1989 SDO, CIE2006-22y, CIE2006-68y), Sarkar 1 to Sarkar 8, Asano 1 to Asano 10, and four BIGC observer categories (B5, B6, B14, B17), another paired-comparison experiment performed by Xi (Exp.II), using 18 color pairs with different primary colors evaluated by 40 observers with normal color vision aged from 20 to 75, was used together with Exp.I. The results indicate that when viewing printed materials in a field size larger than 10°, S2 and B5 are representative of young and aged observers, respectively.
Furthermore, the spectral properties of S2 and B5 were compared with those of CIE2006 (22) and CIE2006 (68), including the peak wavelength positions, the full width at half-maximum (FWHM) and the maximum spectral tristimulus values (STVmax). The main differences occurred in the peak wavelength positions: the recommended S2 for young observers is shifted by 4 nm towards longer wavelengths at the second peak of the x̄(λ) channel, while B5 for aged observers is shifted by 13 nm towards longer wavelengths at the peak of the ȳ(λ) channel. In further work, more visual experiments are required to establish and test the cone responses of observers of other ages and with color stimuli produced by different primary colors. In addition, observers of different ages could be asked to compare the standard and comparison samples using the methods of fuzzy color spaces and fuzzy colors, so that the results can be discussed not only in the ordinary (crisp) color space, as done here with colorimetric values, but also in terms of fuzzy colors. The subjectivity of color can be treated with fuzzy logic, which may bring new insights into the study of observers of different ages.