Calibrating Oxygen Saturation Measurements for Different Skin Colors Using the Individual Typology Angle

Since the start of the SARS-CoV-2 pandemic, wearable devices featuring oxygen-saturation measurements have gradually attracted public attention. The US Food and Drug Administration (FDA) has, however, raised doubts about the accuracy of watch-type oximeters (e.g., Apple Watch and Fitbit Sense) for darker-skinned users. That is, the accuracy of oxygen-saturation measurements is affected by skin tone. Accordingly, this article proposes a method of calibrating the bias of the oxygen-saturation measurement caused by differences in skin tone. We integrate a color sensor into a wearable device featuring the function of oxygen-saturation measurement. We also use the individual typology angle (ITA) to quantify the user’s skin color and the skin’s ITA quantization value to calibrate the oxygen saturation value of the pulse oximeter sensor. The oxygen-saturation-calibration algorithm of the ITA-quantified value is suitable for determining the $R$-value bias caused by skin color. Our experimental findings derive from testing the $R$-values of subjects with different skin colors and simulating and verifying oxygen saturation ranges from 70% to 100%. The findings suggest that it is possible for the oxygen saturation bias of darker-skinned subjects to be reduced from an $A_{\text{rms}}$ error of 5.44% to an $A_{\text{rms}}$ error of 0.82%; that is, using ITA-quantified value for calibration, the accuracy of oxygen saturation measurements (OSMs) has been significantly improved.
The proposed method enables the oxygen-saturation measurements of dark-skinned subjects to comply with the FDA guidance and ISO 80601-2-61:2017 standards, meaning that this study’s method can effectively improve the accuracy of the oxygen-saturation measurements of watch-type oximeters.


I. INTRODUCTION
The SARS-CoV-2 pandemic has substantially impacted the medical systems of countless countries. During the early stages of the pandemic, medical testing resources were insufficient to conduct large-scale screening effectively, preventing the timely detection of potential infections. In addition to hospital-based polymerase chain reaction (PCR) tests for determining whether a person has been infected with SARS-CoV-2, the body's vital signs can also support diagnosis. For example, early in the pandemic, researchers recognized that core body temperature and oxygen saturation could serve as a basis for infection diagnosis [1], [2]. The advantage of using vital signs is that they allow a quick determination of whether a person may have been infected, after which the person can go to the hospital for a PCR or other medical test, effectively reducing the demand for limited medical resources. Oxygen saturation was also used as a diagnostic criterion during the pandemic [3], with the critical threshold defined as an oxygen saturation of 94% or lower. This enables potential infections to be diagnosed at an early stage. Accordingly, wearable devices that measure oxygen saturation have been widely used [4]. Monitoring oxygen saturation levels with wearable devices can warn users of potential infection, alerting individuals that they may be infected and reducing the risk of sudden disease deterioration due to silent hypoxia [5].
However, according to a 2022 study [6] cited by the US Food and Drug Administration (FDA), current wearable devices with oxygen saturation measurement (OSM) are usually worn on the wrist. The skin color of the wrist affects the accuracy of OSM, risking overestimation of the OSM of people of color [7], [8], with one study observing that black patients recorded occult hypoxemia undetected by pulse oximetry at nearly three times the frequency of white patients. Furthermore, OSM bias due to skin color also exists in infants [9]. Therefore, the FDA has questioned the reliability of watch-type oxygen-saturation monitoring wearables, such as the Apple Watch or Fitbit Sense, for dark-skinned users. The underlying reason is that skin melanin has a specific absorbance for light of a specific wavelength [10], and noninvasive oxygen-saturation monitoring uses red light and infrared light to detect the ratio of oxyhemoglobin (HbO2) and hemoglobin (Hb) in the subject to calculate oxygen saturation. The skin's melanin content affects the absorption of red light [11], which is the primary factor responsible for the overestimation of oxygen saturation levels during hypoxia among patients with high levels of skin melanin [12]. Some studies suggest that the absorption of red light by melanin in the rest state can be compensated for by increasing the light intensity [13], [14]. The measurement error of photoplethysmography (PPG) is, however, positively correlated with the skin's melanin levels [15], meaning that skin melanin may still affect the accuracy of OSM. To resolve the issue of biased OSM due to the skin color of the user of an OSM-enabled watch-type device, it is necessary to incorporate a color sensor to detect the individual's skin color; however, using skin color to calibrate OSMs involves two issues. The first is converting the color values detected by the color sensor into standard skin color classification and quantization values.
We use the skin color quantization technique known as the individual typology angle (ITA) method to address this issue. Second, before the manufacturer puts the wearable device on the market, it is necessary to perform clinical experiments to establish the initial calibration of the instrument's pulse oximeter component. The initial calibration curve may be a linear or nonlinear function. If the manufacturer's clinical data derive from light-skinned patients, the OSM values of dark-skinned users may be disadvantaged by the oxygen-saturation curve set before the device leaves the factory; however, requiring manufacturers to conduct subgroup clinical trials on people with various skin colors is not economically feasible. Therefore, we propose a method of premodeling the ITA skin color quantization function to calibrate the oxygen-saturation curve. This can allow OSM-enabled watch-type wearables to calibrate for skin color.
II. OUR CONTRIBUTION
This study's contribution is the design and development of a watch-type oximeter with skin tone calibration. We propose a method of calibrating the oximeter's initial oxygen-saturation curve function by using the ITA skin color quantization value. This method can effectively calibrate the oxygen-saturation curve according to skin color. In practice, using the method proposed in this article, manufacturers of OSM-enabled watch-type devices can model the skin color quantization values of lighter and darker skin for the pulse oximeter function of their product. This means using our method to calibrate the initial oxygen-saturation curve such that it better fits the oxygen saturation value of users with different skin colors. This can eliminate the cost and time associated with conducting subgroup clinical trials for specific skin-color groups.

III. SYSTEM DESIGN AND DEVELOPMENT
This study required collecting skin-color data from users with different skin colors and using the sampled data to build a skin-color-based calibration model. Therefore, we developed a watch-type wearable device integrating an oxygen-saturation sensor and a skin-color sensor for data collection and validation. Fig. 1 shows the design of the wearable device used in the experiment. Fig. 1(a) is our proposed experimental device, which is worn on the wrist. Fig. 1(b) is the pulse oximeter sensor, which integrates infrared and red LEDs and a photodiode for receiving diffuse light signals to measure the user's oxygen saturation. Fig. 1(c) shows the analog-to-digital converter (ADC), which converts the color into a digital format by integrating photodiodes and optical filters to measure the skin surface's red, green, and blue (RGB) values. This data is used with the ITA quantization method to identify the user's skin color. Fig. 2 is the measurement method proposed by this study. When the user wears the device on their wrist, the pulse oximeter sensor in Fig. 2(a) will alternately excite the infrared light and the red-light LED to irradiate the blood vessels beneath the skin tissue of the wrist between the LED and the photodiode. The shape of the photon's path resembles the shape of a banana [16]. Using the integrated ADC to sample the photodiode's analog voltage allows for the sampling of the ac and dc changes generated by the patient's pulse. The pulse oximeter sensor used in this study is MAX30101 [17] (a MAX30101 produced by Maxim Integrated Inc., San Jose, CA, USA). The housing design of the pulse oximeter sensor needs to block ambient light, and the structure uses matting materials to reduce the impact of reflected light. Fig. 2(b) is the TCS34725 color sensor [18] (a TCS34725 produced by ams-OSRAM AG., Premstaetten, Austria). The white light LED is used to irradiate the user's skin surface. 
After an optical filter filters the reflected light signal, the RGB value of the skin can be measured [19] and converted into an ITA-derived skin color quantization via an algorithm.

A. Using ITA to Quantize Skin Tone
ITA is a method of classifying Fitzpatrick skin type (FST) [20] into six groups ranging from very light to dark skin: very light > 55 [21], [22]. To quantize the user's skin color, we use the ITA quantization value; however, the data measured by the color sensor is not the ITA value. It is necessary to convert the data to obtain the ITA value. As Fig. 3 shows, the microprocessor will turn on the white light LED to illuminate the skin's surface, and the color sensor can sample the skin-color information via the reflected light. Next, the microprocessor runs the skin-color quantization algorithm to calculate the skin's ITA value [23].
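The RGB-to-ITA conversion used in this subsection can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' firmware: the sensor RGB values are assumed to be already normalized to 0-1; the standard sRGB (D65) matrix and the sRGB linear-toe constants (0.04045, 12.92) stand in for the matrix operation of (1); γ = 2.2, α = 0.055, and the white point 95.047, 100.0, 108.883 are the values quoted in the text; and the function names (`linearize`, `rgb_to_lab`, `ita_degrees`, `classify_ita`) are hypothetical. The class boundaries in `classify_ita` are the commonly cited ITA thresholds (55°, 41°, 28°, 10°, −30°) and are an assumption here.

```python
import math

# D65 reference white, CIE 1931 2-degree observer (values quoted in the text).
WHITE_D65 = (95.047, 100.0, 108.883)

# Assumption: standard sRGB (D65) linear-RGB -> XYZ matrix.
RGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def linearize(channel, gamma=2.2, alpha=0.055):
    """Undo the display gamma of one channel already normalized to 0-1."""
    if channel <= 0.04045:  # linear toe of the sRGB curve (assumption)
        return channel / 12.92
    return ((channel + alpha) / (1.0 + alpha)) ** gamma


def _lab_f(t):
    """Piecewise cube-root function from the CIE L*a*b* definition."""
    if t > (6.0 / 29.0) ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * (6.0 / 29.0) ** 2) + 4.0 / 29.0


def rgb_to_lab(r, g, b):
    """Convert normalized RGB (0-1) to CIE L*a*b* under D65."""
    lin = [linearize(c) for c in (r, g, b)]
    xyz = [100.0 * sum(m * c for m, c in zip(row, lin)) for row in RGB_TO_XYZ]
    fx, fy, fz = (_lab_f(v / w) for v, w in zip(xyz, WHITE_D65))
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def ita_degrees(lightness, b_star):
    """ITA = arctan((L* - 50) / b*) in degrees; atan2 equals this for b* > 0."""
    return math.degrees(math.atan2(lightness - 50.0, b_star))


def classify_ita(ita):
    """Map an ITA value to a skin-color group (assumed common thresholds)."""
    groups = (("very light", 55), ("light", 41), ("intermediate", 28),
              ("tan", 10), ("brown", -30))
    for name, lower in groups:
        if ita > lower:
            return name
    return "dark"
```

For a measured skin sample, `L, a, b = rgb_to_lab(r, g, b)` followed by `ita_degrees(L, b)` yields the quantization value that the later calibration step consumes.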
Before the RGB values are converted into the XYZ color space, they must be normalized to the range 0-1; the matrix operation of (1) [24] is then performed, where γ is 2.2 and α is 0.055. According to the chromaticity of the Commission Internationale de L'Eclairage (CIE) standard illuminant D65 [25] based on the CIE 1931 XYZ standard, before the XYZ color space is converted to CIE L*a*b*, it must be adjusted with coefficients that account for differences in the human eye's sensitivity to different colors in daylight according to specific illuminance and observer parameters. The observer parameters have been standardized as mathematical functions known as the 2° standard observer [26]. For the 2° observer, the X, Y, and Z values are divided by the coefficients 95.047, 100.0, and 108.883, respectively. Then, XYZ can be converted into CIE L*a*b* [27], [28] by calculating (2). Next, we can use (3) [29] to convert the CIE L*a*b* of the skin color measured by the color sensor into an ITA quantization value, which allows for the calibration of the oxygen-saturation curve of darker-skinned users.

B. Noninvasive OSM
The principle of noninvasive OSM is based on the Beer-Lambert law [30]. Because HbO2 and Hb in the blood absorb infrared light and red light in different ratios, these absorbances can be used to estimate the oxygen saturation level of the subject. This technology is also called PPG. The PPG signal changes with the blood volume: the dc component of the signal is contributed by human tissues, whereas the ac component is the periodic change in blood volume caused by systole and diastole [31]. In measuring oxygen saturation levels, dynamically detecting the dc and ac components of the PPG signal is the key to noninvasive OSM. In our previous research [32], we developed an algorithm that can dynamically detect the PPG signal's dc and ac components.
Algorithm 1 is the pseudocode for detecting the local maximum $I_H$ and local minimum $I_L$ of the PPG signal [33]. $I_H$ and $I_L$ are produced by the systolic and diastolic phases of the artery [34]; the ac component is obtained by subtracting $I_L$ from $I_H$, while $I_L$ is the dc component [35]. The covariance of the signal determines the threshold [36], which we calculate and update every second so that the detection algorithm can dynamically detect the signal features. As shown in Fig. 4, this dynamic feature detection algorithm can detect the feature points of the PPG signal. In this study, we use PPG to measure the dc-to-ac ratios of users with different skin colors. We assume that, at the same oxygen saturation level, the dc-to-ac ratios in the infrared light and red light will differ with skin color. These differences occur because red light is affected by the absorption of melanin [37], [38].
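The feature detection described above can be illustrated with a short Python sketch. It is not Algorithm 1 itself, whose pseudocode is not reproduced here: it is a generic hysteresis-based extrema detector, and it assumes a threshold equal to `k` times the standard deviation of each one-second window as a simple stand-in for the covariance-based threshold of the text. The names `find_extrema`, `ac_dc_per_beat`, and the parameter `k` are hypothetical, and the input is a synthetic 1.2-Hz sine rather than real PPG data.

```python
import math
import statistics


def find_extrema(samples, fs, k=0.5):
    """Return (peaks, valleys), each a list of (index, value) pairs.

    The hysteresis threshold is refreshed once per second (every `fs`
    samples) as k times the standard deviation of that one-second window.
    """
    peaks, valleys = [], []
    hi_i = lo_i = 0            # indices of the running maximum / minimum
    looking_for_peak = True
    delta = 0.0
    for i, x in enumerate(samples):
        if i % fs == 0:
            window = samples[i:i + fs]
            if len(window) > 1:
                delta = k * statistics.pstdev(window)
        if x > samples[hi_i]:
            hi_i = i
        if x < samples[lo_i]:
            lo_i = i
        if looking_for_peak and x < samples[hi_i] - delta:
            peaks.append((hi_i, samples[hi_i]))    # local maximum I_H confirmed
            lo_i = i
            looking_for_peak = False
        elif not looking_for_peak and x > samples[lo_i] + delta:
            valleys.append((lo_i, samples[lo_i]))  # local minimum I_L confirmed
            hi_i = i
            looking_for_peak = True
    return peaks, valleys


def ac_dc_per_beat(peaks, valleys):
    """Pair each peak with the valley that follows it: ac = I_H - I_L, dc = I_L."""
    return [(hi - lo, lo) for (_, hi), (_, lo) in zip(peaks, valleys)]


# Synthetic test signal: baseline 1000, amplitude 10, 1.2 Hz, 5 s at 50 Hz.
fs = 50
samples = [1000.0 + 10.0 * math.sin(2.0 * math.pi * 1.2 * n / fs)
           for n in range(5 * fs)]
peaks, valleys = find_extrema(samples, fs)
beats = ac_dc_per_beat(peaks, valleys)
```

Each entry of `beats` is an `(ac, dc)` pair for one pulse; running the same detector on the red and infrared channels gives the four quantities needed for the R-value.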

C. Calibrated Oxygen Saturation Based on Skin Tone
The pulse oximeter design requires calibrating the oxygen-saturation curve based on clinical trial results. This process involves regression models of the R-value and the oxygen saturation level. The R-value is the ratio of red light and infrared light calculated using (4) [39], [40], [41], [42]. The wavelength of the red LED of the pulse oximeter sensor is 660 nm, and the wavelength of the infrared LED is 940 nm. Fig. 5 shows an R-value curve produced by the regression model of oxygen saturation (SpO2) and R-value, showing an inverse correlation. The optical structure of a specific device and the population of the clinical experiments determine the output of the regression models. Therefore, this regression model is called the characteristic oxygen-saturation curve or initial R-value curve. When the user's skin color differs substantially from the skin color of the clinical trial population of a specific device, the regression model will produce different measurement errors at different oxygen saturation levels. That is, the initial R-value curve is modeled on data from clinical experiments, and when the user's skin color is darker than the skin color of the trial population, the ITA value can be used to calibrate based on the initial R-value curve. The oxygen-saturation curve is impacted by darker skin because red light is affected by the absorption of melanin, changing the R-value, meaning we first need to calibrate the R-value. Equation (5) shows that the calibration function is established by multiple linear regression (MLR) models [43], [44]. An MLR model can take several parameters into account when calibrating the R-value; in this case, they are the ITA quantization value and the R-value before calibration. The calibrated R-value is called c, where $\beta_0$ is the intercept term, and $\beta_1$ and $\beta_2$ are coefficient terms. The result c obtained from (5) is incorporated into the calculation of (6).
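The calibration chain described in this subsection can be sketched in Python as follows. This is an illustrative sketch, not the authors' equations: `r_value` uses the conventional ratio-of-ratios definition of the R-value; `calibrate_r` assumes the MLR of (5) has the form c = β0 + β1·ITA + β2·R, where the assignment of β1 to the ITA value and β2 to the R-value is an assumption; and `spo2_from_r` uses the generic textbook line 110 − 25c purely as a placeholder for the device-specific curve of (6). All function names and default coefficients are hypothetical.

```python
def r_value(ac_red, dc_red, ac_ir, dc_ir):
    """Ratio of ratios: (ac/dc of the 660-nm channel) / (ac/dc of the 940-nm channel)."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def calibrate_r(r, ita, beta0, beta1, beta2):
    """Assumed MLR form: c = beta0 + beta1 * ITA + beta2 * R."""
    return beta0 + beta1 * ita + beta2 * r


def spo2_from_r(c, intercept=110.0, slope=25.0):
    """Placeholder linear curve (not the paper's fitted curve): SpO2 = intercept - slope * c."""
    return intercept - slope * c


# Example with made-up signal amplitudes and identity MLR coefficients.
r = r_value(ac_red=10.0, dc_red=1000.0, ac_ir=20.0, dc_ir=1000.0)
c = calibrate_r(r, ita=-53.0, beta0=0.0, beta1=0.0, beta2=1.0)
spo2 = spo2_from_r(c)
```

In a real device, the β coefficients would come from fitting the MLR on measurements from darker-skinned participants, and `spo2_from_r` would be replaced by the manufacturer's initial R-value curve.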
The SpO2 value can be calibrated using the ITA-derived quantized skin-color value.

IV. RESULTS
This research involved positioning the device on the wrist of the participants, as shown in Fig. 6. The holder supports the wrist of the participants to improve the stability of data collection. We recruited volunteers from different subgroups, including ten lighter-skinned participants with ITA values ranging from 8 to 58 and five darker-skinned participants with ITA values ranging from −3 to −56. The experiment was approved by the Wenzao Ursuline University of Languages and executed in compliance with the guidelines. The recruitment of dark-skinned participants in this study follows the ISO 80601-2-61:2017 guideline for testing the accuracy of pulse oximeters, which requires at least ten participants [45]. Only participants whose wrist hair density is nil or sparse are included in this study [46]. As shown in Fig. 7, we can use the tissue optics properties measured from lighter-skinned participants to model the initial R-value curve used as the oxygen-saturation curve by our OSM device. The tissue optics properties of participants with darker skin (see Fig. 8) are used to model a characteristic simulation model of the skin color and oxygen-saturation curve. Because the melanin of participants with darker skin absorbs red light differently from that of lighter-skinned participants, the oxygen saturation value is overestimated when oxygen-saturation curves based on data from lighter-skinned participants are used to estimate oxygen saturation among darker-skinned participants. Therefore, this experiment verifies the extent to which overestimation of oxygen saturation due to skin color can be mitigated using the ITA-derived quantized value for skin color and our proposed MLR method to calibrate the oxygen-saturation curve.
Before this experiment was performed, we used the SkinColorCatch (Delfin Technologies Ltd., Kuopio, Finland) skin-color meter [47] to calibrate the ITA measurement value of the color sensor of our OSM-enabled device. SkinColorCatch can reliably discriminate between skin erythema and melanin without any cross-contamination. Owing to its accuracy and reproducibility in skin-color measurement, it is used in skin-related clinical research and in the development of pharmaceuticals and medical materials [48], [49], [50]. Furthermore, we used the FDA-approved Masimo MightySat pulse oximeter (Masimo, Irvine, CA, USA) as a reference pulse oximeter [51].

A. Experiment Result
As Fig. 9 shows, we use data from lighter-skinned participants to model the initial R-value curve. This curve is produced by the regression model described in Section III-C based on data from the oximeter manufacturers. Usually, the characteristics of such regression models are determined by the optical structure of a specific device and the population of clinical experiments. Fig. 9 reveals that the lower the oxygen-saturation level of dark-skinned participants, the larger the difference between the observed R-value and the initial R-value curve. Likewise, according to the ITA quantization value, the darker the participant's skin, the larger the difference between the R-value and the initial R-value curve. If the oxygen-saturation curve of the initial R-value curve were used to estimate the oxygen saturation of dark-skinned participants, a diagnosis of occult hypoxemia might be incorrectly given [52]. Fig. 10 demonstrates the use of data from participants with an ITA of −53 to model the calibration results of the MLR model based on ITA. After the MLR calibrates the R-value of participants with darker skin, the overestimation based on the initial R-value curve is significantly reduced. After the MLR model calibrates the R-value, the difference between the R-value and the initial R-value curve becomes larger only for the participant with an ITA of −3. Table I presents the accuracy of the results according to the root-mean-square ($A_{\text{rms}}$) difference of dark-skinned participants. The $A_{\text{rms}}$ calculation is as in (7) [53], where $\mathrm{predict}_i$ is the predicted SpO2 of dark-skinned participants, $\mathrm{reference}_i$ is the SpO2 of the initial R-value curve, and $n$ is the number of samples. According to FDA guidance [54], [55]. Before using the MLR model to calibrate the R-value, it is difficult to achieve acceptable accuracy for the dark-skinned population, especially the participants with an ITA of −56 and −53.
After using the MLR model to calibrate the R-value with the ITA value, the difference between the R-value of dark-skinned participants and the initial R-value curve was significantly reduced, so that the $A_{\text{rms}}$ error reached an acceptable level of accuracy; the exception was the participant with an ITA value of −3, for whom the error became larger. In other words, after MLR calibration, the R-value of the participant with an ITA of −3 does not fit the initial R-value curve, which indicates that the MLR used for R-value calibration has an effective calibration threshold. In this study, for ITA values below −26, the $A_{\text{rms}}$ errors of the estimated oxygen saturation value were lower after using the MLR to calibrate R-values than before calibration. Thus, according to our verification, the method proposed in this study to calibrate the oxygen-saturation curve with the quantitative value of ITA skin color is effective, especially for participants with darker skin. If a manufacturer of an OSM-enabled device wants to add the ability to calibrate oxygen saturation with skin color to their devices, they only need to model the initial R-value curve and the MLR based on the ITA-derived skin-color quantized values. Notably, it is the collection and modeling of experimental data that makes OSM-enabled wearables capable of skin-color calibration. An important alternative is to model the initial R-value curve on the dark-skinned population, in which case the MLR would need to be modeled on the light-skinned population.
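The accuracy metric used in this section can be sketched in Python as follows, assuming that (7) is the usual root-mean-square difference between the predicted and reference SpO2 values over n samples. The function name `a_rms` and the sample values are hypothetical and are not data from Table I.

```python
import math


def a_rms(predicted, reference):
    """Root-mean-square difference between predicted and reference SpO2 values."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Made-up readings (percent SpO2) for illustration only.
error = a_rms([97.0, 93.0], [94.0, 89.0])
```

Here the two differences are 3 and 4 percentage points, so `error` equals the square root of 12.5, roughly 3.54%.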

V. LIMITATIONS AND FUTURE WORKS
Although this study proposes a watch-type oximeter with a skin-color calibration function, there remain some issues worthy of research attention. First, although the method of calibrating oxygen saturation using the ITA value as the parameter of the MLR is effective, it depends on the parameters of the MLR. Because the MLR employed in this study uses R-value and ITA quantized values as parameters, the weight of the ITA value is large. If the ITA values of the participants are calculated at the time of the collection of the data used for the MLR, more accurate OSMs may be obtained; however, if the timing differs substantially, significant errors may occur. Future researchers can introduce more parameters or use other algorithms to resolve this issue. At present, according to this article's results, we recommend that the MLR model be used for calibration when the ITA quantization value reaches the effective calibration threshold. Second, further research is needed on the optical properties of the skin tissue of the dark-skinned population. This study is based on the notion that red light is affected by the absorption of melanin, producing biased R-values, which causes significant errors in oxygen-saturation estimation. Whether factors other than absorption, such as reflection and diffusion, can cause R-value errors sufficient to affect oxygen saturation estimation, however, demands further investigation. Future researchers could conduct studies with skin melanin phantoms to explore the extent to which the optical properties of these tissues affect watch-type oximeters. Finally, the effect of wrist hair density on the pulse oximeter sensor's signal-to-noise ratio (SNR) must also be considered. In this study, only participants with hair densities of nil and sparse were recruited; however, some studies show that participants with dense hair may not be measured accurately [56], [57], [58], while others report that the measurement is not affected [59], [60].
Nevertheless, SNR depends on the hair density and the LED light intensity of the pulse oximeter sensor. Therefore, more quantitative studies on the effect of hair are still needed.

VI. CONCLUSION
This study has proposed a watch-type oximeter with skin-color calibration capabilities. We have used this device to conduct experiments on people with lighter and darker skin. We have modeled the characteristic oxygen-saturation curve based on the optical tissue characteristics of a light-skinned population. Using an MLR with ITA quantization values as a parameter, the R-value of participants with darker skin has been calibrated based on the characteristic oxygen-saturation curve. According to the results, the calibrated OSM values of the dark-skinned population produce $A_{\text{rms}}$ error values that meet the FDA guidance and the ISO 80601-2-61:2017 criteria. Therefore, the method we propose for calibrating oxygen saturation by quantizing skin color is feasible. In the future, watch-type oximeter manufacturers should consider using our method to reduce errors in the estimation of OSM for different populations. When watch-type oximeter manufacturers integrate the skin color and SpO2 calibration technology proposed in this study, they only need to establish a calibration function for the bias caused by skin color on the oximeter SpO2 curve of the original product. We have shown a calibration function modeled with skin color as a calibration parameter, which spares oximeter manufacturers from repeating the clinical trials behind the original product's SpO2 curve; they only need to establish the skin color and SpO2 calibration function for the skin color of the target group. Therefore, the proposed method reduces the consumption of clinical resources and improves the measurement performance of watch-type oximeters.