Loss-Based Control Charts for Monitoring Non-Normal Process Data

Quality and loss of products are crucial factors in competitive companies, and firms widely adopt a loss function to measure the loss caused by a deviation in the quality variable from the target value. From Taguchi’s viewpoint, it is important to monitor any deviation from the process target value. While most existing studies assume that the quality variable follows a normal distribution, in practice the distribution can be skewed or otherwise depart from normality. This paper thus proposes loss-based control charts for monitoring the quality loss location, or equivalently the deviation of the quality variable from the target value, under a skew-normal distribution. We consider the exponentially weighted moving average (EWMA) average loss control chart, which shows its best performance in detecting an out-of-control loss location for a process with a left-skewed distribution. Numerical analysis demonstrates that the proposed EWMA average loss chart always performs better than the existing median loss chart for both left-skewed and right-skewed distributions. A numerical example illustrates the application of the proposed EWMA average loss control chart.


I. INTRODUCTION
Control charts are commonly used tools in process signal detection to improve the quality of manufacturing and service processes, and in the past few years increasing attention has been paid to their application in service industries. See, for example, Tsung et al. [1], Ning et al. [2], and Yang and Wu [3]. While a normal distribution has been widely employed in practice to fit data, real data in psychology, reliability, telecommunications, environment, climatology, sciences services, education, finance, and health insurance often exhibit moderate to strong asymmetry as well as light or heavy tails (for example, see Bono Cabré [4]). In most situations, quality data from the service sector do not follow a normal distribution. Clearly, fitting a normal distribution to such data is not appropriate, and the commonly used Shewhart variables control charts that depend on a normality assumption are not suitable. Statistical process control research dealing with non-normal data includes Amin et al. [5], Chakraborti et al. [6], Altukife [7], Bakir [8], Li et al. [9], Zou and Tsung [10], Graham et al. [11], Yang and Wu [3], and Abbas et al. [12].
The associate editor coordinating the review of this manuscript and approving it for publication was Roberto Pietrantuono.
The quality loss function is a popular method for measuring the loss of quality caused by variations in a product or service. Sullivan [13] emphasized the importance of monitoring deviations from the target value. Taguchi [14] proposed a quadratic loss function of the deviation of the quality variable from the target value. Changes in the process mean and/or dispersion lead to a variation of the loss. A few loss control charts have been proposed to monitor process loss. For example, Wu and Tian [15] and Wu et al. [16] suggested the weighted loss function chart and adaptive loss-function-based control charts, but they assumed that the in-control process mean equals the target, and that the quality variable follows a normal distribution. Yang [17] and Lu [18] examined loss-based control charts, assuming that the in-control process mean may not equal the target, under normal and non-normal distributions when simultaneously monitoring the process mean and dispersion.
A major drawback of loss-based control charts is that most of them assume the quality variable follows a normal distribution. Hence, this paper focuses on loss-based control charts under a skew-normal distribution. Yang et al. [19] proposed using the median loss instead of the average loss to simultaneously monitor changes in the process location and/or dispersion when the distribution of a process is not symmetric, but rather left-skewed or right-skewed. Their median loss (ML) chart showed its best out-of-control detection performance for the left-skewed process. Even under a normal distribution, they showed that the ML chart has better out-of-control detection performance than the average loss (AL) chart in Yang [17], except for very small shifts in the process mean. Yang and Lu [20] proposed the median loss control chart with an unbiased average run length in order to monitor the process loss center. The exponentially weighted moving average (EWMA) chart is an effective alternative to the Shewhart-type control chart, and may be used when small shifts occur in the process parameter (for example, see Montgomery [21]). However, the properties of the average loss control chart have not been discussed for the skew-normal distributed process. We are interested in knowing whether the EWMA average loss (EWMA-ALSN) control charts have better out-of-control detection performance than the ML chart for a skew-normal distributed process. Our paper considers a fixed sample size and sampling time interval.
The rest of the paper runs as follows. Section II derives the sampling distribution of the average loss for a process with a skew-normal distribution. Section III designs the EWMA-ALSN charts and lists their control limits for various sample sizes and shape parameters of a skew-normal distribution under a predetermined in-control average run length ($ARL_0$). Their out-of-control detection performance for small to moderate shifts in the difference between the process location and the target and/or in the dispersion is then evaluated using the out-of-control average run length ($ARL_1$) under the specified process shifts. Section III also compares the out-of-control detection performance of the EWMA-ALSN chart and the existing ML chart in Yang et al. [19]. Section IV illustrates the application of the proposed charts using the Roberts IQ score data. Section V summarizes the findings and provides a recommendation.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

II. THE DERIVATION OF AVERAGE LOSS DISTRIBUTION
The skew-normal (SN) distribution is an extension of the normal distribution, allowing for the presence of skewness. Helguero [22] pioneered the SN distribution and formulated the genesis of non-normal distributions via a selection mechanism that leads to a departure from normality. Many researchers, for example, Azzalini [23], Azzalini and Valle [24], Azzalini [25], Azzalini [26], and Azzalini and Capitanio [27], contributed to the development of the theory of SN distributions.
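Assume the random variable $X$ follows an in-control skew-normal distribution with location parameter $\xi_0 \in (-\infty, \infty)$, scale parameter $a_0 \in (0, \infty)$, and shape parameter $b \in (-\infty, \infty)$. In other words, $X \sim SN(\xi_0, a_0, b)$. From Azzalini [26], the probability density function (pdf) of $X$ is

$$f_X(x) = \frac{2}{a_0}\,\phi\!\left(\frac{x-\xi_0}{a_0}\right)\Phi\!\left(b\,\frac{x-\xi_0}{a_0}\right), \quad -\infty < x < \infty,$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are respectively the pdf and cumulative distribution function (cdf) of the standard normal distribution. If $b = 0$, then the skew-normal distribution reduces to the traditional normal distribution with mean $\xi_0$ and standard deviation $a_0$. The distribution is right-skewed for $b > 0$ and left-skewed for $b < 0$. The cdf of the skew-normal random variable $X$ is

$$F_X(x) = \int_{-\infty}^{x} \frac{2}{a_0}\,\phi\!\left(\frac{t-\xi_0}{a_0}\right)\Phi\!\left(b\,\frac{t-\xi_0}{a_0}\right)dt.$$

The expectation and variance of $X$ are

$$E(X) = \xi_0 + a_0\,\frac{b}{\sqrt{1+b^2}}\sqrt{\frac{2}{\pi}}, \qquad Var(X) = a_0^2\left(1 - \frac{2b^2}{\pi(1+b^2)}\right).$$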
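The skew-normal quantities above can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function names (`sn_pdf`, `sn_cdf`, `sn_mean`, `sn_var`) and the trapezoidal integration settings are illustrative assumptions, not part of the original study.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: provides pdf() and cdf()


def sn_pdf(x, xi, a, b):
    """Density of SN(xi, a, b): (2/a) * phi(z) * Phi(b*z) with z = (x - xi)/a."""
    z = (x - xi) / a
    return 2.0 / a * _STD_NORMAL.pdf(z) * _STD_NORMAL.cdf(b * z)


def sn_mean(xi, a, b):
    """E(X) = xi + a * delta * sqrt(2/pi), where delta = b / sqrt(1 + b^2)."""
    delta = b / math.sqrt(1.0 + b * b)
    return xi + a * delta * math.sqrt(2.0 / math.pi)


def sn_var(a, b):
    """Var(X) = a^2 * (1 - 2*b^2 / (pi * (1 + b^2)))."""
    return a * a * (1.0 - 2.0 * b * b / (math.pi * (1.0 + b * b)))


def sn_cdf(x, xi, a, b, tail=12.0, steps=4000):
    """cdf by trapezoidal integration of the pdf (a crude numerical sketch).

    Integration starts `tail` scale units below the location, where the
    density is negligible.
    """
    lo = xi - tail * a
    if x <= lo:
        return 0.0
    h = (x - lo) / steps
    total = 0.5 * (sn_pdf(lo, xi, a, b) + sn_pdf(x, xi, a, b))
    for i in range(1, steps):
        total += sn_pdf(lo + i * h, xi, a, b)
    return total * h
```

With `b = 0` these functions reduce to the normal case, which gives a quick sanity check: `sn_mean(xi, a, 0)` returns `xi` and `sn_var(a, 0)` returns `a**2`.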
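Writing $\mu_0$ and $\sigma_0$ for the in-control mean and standard deviation of $X$, we thus have:

$$\xi_0 = \mu_0 - a_0\,\frac{b}{\sqrt{1+b^2}}\sqrt{\frac{2}{\pi}}, \qquad a_0 = \sigma_0\left(1 - \frac{2b^2}{\pi(1+b^2)}\right)^{-1/2}.$$

We define the Taguchi loss function as $L = k(X - T)^2$, where $T$ is the target value. Without loss of generality, we set $k = 1$. Let $X_i$, $i = 1, 2, \ldots, n$, be a random sample from the in-control distribution $SN(\xi_0, a_0, b)$. We further define the sample average loss (AL) as

$$AL = \frac{1}{n}\sum_{i=1}^{n}(X_i - T)^2.$$

Edgeworth [28] derived the Edgeworth expansion, which relates the cdf of a random variable having expectation zero and variance one to the cdf of the standard normal distribution using the Chebyshev-Hermite polynomials.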
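We obtain the $r$th moment of $L$ as

$$E(L^r) = \int_{-\infty}^{\infty}(x - T)^{2r}\,\frac{2}{a_0}\,\phi\!\left(\frac{x-\xi_0}{a_0}\right)\Phi\!\left(b\,\frac{x-\xi_0}{a_0}\right)dx, \quad r = 1, 2, \ldots,$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the pdf and cdf of the standard normal distribution, respectively. The expectation and the standard deviation of $L$ ($\mu_L$ and $\sigma_L$) can be obtained from the moments of $L$, that is,

$$\mu_L = E(L), \qquad \sigma_L = \sqrt{E(L^2) - [E(L)]^2}.$$

Let $Z_n = \sqrt{n}\,(AL - \mu_L)/\sigma_L$ denote the standardized average loss. We approximate the cdf of $Z_n$ by the Edgeworth expansion, whose leading terms are:

$$F_{Z_n}(z) \approx \Phi(z) - \frac{\lambda_3}{6\sqrt{n}}\,\Phi^{(3)}(z) + \frac{1}{n}\left[\frac{\lambda_4}{24}\,\Phi^{(4)}(z) + \frac{\lambda_3^2}{72}\,\Phi^{(6)}(z)\right] - \cdots. \tag{7}$$

Note that $\Phi^{(r)}(z) = (-1)^{r-1} He_{r-1}(z)\,\phi(z)$ is the $r$th derivative of $\Phi(z)$, where $He_{r-1}(z)$ is the Chebyshev-Hermite polynomial. One can obtain $He_{r-1}(\cdot)$ by

$$He_{r-1}(z) = (-1)^{r-1} e^{z^2/2}\,\frac{d^{\,r-1}}{dz^{\,r-1}}\,e^{-z^2/2}. \tag{8}$$

Table 1 lists the Hermite polynomials obtained from (8).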
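The Edgeworth approximation to the cdf of the standardized average loss can be sketched in code as follows. This is an illustrative sketch, not the authors' implementation: it assumes the expansion is truncated after the $1/n$ term, generates the Chebyshev-Hermite polynomials by the standard three-term recurrence rather than by differentiation, and takes the standardized cumulants `lam3` and `lam4` as inputs. All function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def hermite(order, z):
    """Chebyshev-Hermite polynomial He_order(z).

    Uses the recurrence He_{k+1}(z) = z*He_k(z) - k*He_{k-1}(z),
    starting from He_0(z) = 1 and He_1(z) = z.
    """
    prev, curr = 1.0, z
    if order == 0:
        return prev
    for k in range(1, order):
        prev, curr = curr, z * curr - k * prev
    return curr


def phi_deriv(r, z):
    """r-th derivative of the standard normal cdf: (-1)^(r-1) He_{r-1}(z) phi(z)."""
    return (-1) ** (r - 1) * hermite(r - 1, z) * _STD_NORMAL.pdf(z)


def edgeworth_cdf(z, n, lam3, lam4):
    """Edgeworth approximation to P(Z_n <= z), truncated after the 1/n term."""
    first = lam3 / (6.0 * math.sqrt(n)) * phi_deriv(3, z)
    second = (lam4 / 24.0 * phi_deriv(4, z)
              + lam3 ** 2 / 72.0 * phi_deriv(6, z)) / n
    return _STD_NORMAL.cdf(z) - first + second


def al_cdf(x, n, mu_l, sigma_l, lam3, lam4):
    """Approximate cdf of AL: standardize x, then apply the Edgeworth cdf."""
    z = math.sqrt(n) * (x - mu_l) / sigma_l
    return edgeworth_cdf(z, n, lam3, lam4)
```

As a rough check in a case where the exact answer is known: if $X$ is standard normal and $T = 0$, then $n \cdot AL$ is chi-square with $n$ degrees of freedom, $\mu_L = 1$, $\sigma_L = \sqrt{2}$, $\lambda_3 = 2\sqrt{2}$ and $\lambda_4 = 12$. For $n = 5$, `al_cdf(1.0, 5, 1.0, math.sqrt(2), 2 * math.sqrt(2), 12.0)` gives about 0.5841, close to the chi-square probability $P(\chi^2_5 \le 5) \approx 0.5841$.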
To obtain $\lambda_r$, one can use the relation $\lambda_r = \kappa_r/\sigma_L^r$, where $\kappa_r$ is the $r$th cumulant of $L$. From Hall [29], the cumulants of $L$ can be obtained from the moments of $L$ as shown in Table 2.
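The step from moments to standardized cumulants can be illustrated with a short sketch. It assumes the standard relations between the first four raw moments and cumulants, and computes the raw moments of $L$ by simple trapezoidal integration against the skew-normal density. The function names and integration settings are illustrative assumptions; for very large $|b|$ the density has a sharp edge and a finer grid may be needed.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def loss_raw_moment(r, xi, a, b, target, half_width=12.0, steps=20000):
    """E[(X - T)^(2r)] for X ~ SN(xi, a, b), by trapezoidal integration
    of (x - T)^(2r) * f_X(x) over xi +/- half_width * a."""
    lo = xi - half_width * a
    hi = xi + half_width * a
    h = (hi - lo) / steps

    def integrand(x):
        z = (x - xi) / a
        density = 2.0 / a * _STD_NORMAL.pdf(z) * _STD_NORMAL.cdf(b * z)
        return (x - target) ** (2 * r) * density

    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h


def cumulants_from_raw_moments(m1, m2, m3, m4):
    """First four cumulants from the first four raw moments."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3
    k4 = (m4 - 4.0 * m3 * m1 - 3.0 * m2 ** 2
          + 12.0 * m2 * m1 ** 2 - 6.0 * m1 ** 4)
    return k1, k2, k3, k4


def standardized_cumulants(m1, m2, m3, m4):
    """Return (lambda_3, lambda_4) with lambda_r = kappa_r / sigma_L^r."""
    _, k2, k3, k4 = cumulants_from_raw_moments(m1, m2, m3, m4)
    sigma = math.sqrt(k2)
    return k3 / sigma ** 3, k4 / sigma ** 4
```

For example, when $X$ is standard normal and $T = 0$, the raw moments of $L = X^2$ are 1, 3, 15 and 105, which give cumulants 1, 2, 8 and 48, so that $\lambda_3 = 2\sqrt{2}$ and $\lambda_4 = 12$.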
The first step in constructing the ALSN chart is to find the distribution of AL when $X$ follows a skew-normal distribution. However, the exact distribution of AL is not available. Our study therefore uses the Edgeworth expansion (for example, see Hall [29]) to approximate the distribution of AL.
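The approximate pdf of $Z_n$ can be obtained by differentiating (7) as:

$$f_{Z_n}(z) \approx \phi(z) - \frac{\lambda_3}{6\sqrt{n}}\,\Phi^{(4)}(z) + \frac{1}{n}\left[\frac{\lambda_4}{24}\,\Phi^{(5)}(z) + \frac{\lambda_3^2}{72}\,\Phi^{(7)}(z)\right] - \cdots. \tag{9}$$

The cdf and pdf of AL can therefore be obtained as

$$F_{AL}(x) = F_{Z_n}\!\left(\frac{\sqrt{n}\,(x - \mu_L)}{\sigma_L}\right) \tag{10}$$

as well as:

$$f_{AL}(x) = \frac{\sqrt{n}}{\sigma_L}\,f_{Z_n}\!\left(\frac{\sqrt{n}\,(x - \mu_L)}{\sigma_L}\right). \tag{11}$$

Let $\mu_{AL}$ and $\sigma_{AL}$ be the expectation and standard deviation of AL, respectively. Thus, we arrive at:

$$\mu_{AL} = \mu_L \tag{12}$$

and

$$\sigma_{AL} = \frac{\sigma_L}{\sqrt{n}}. \tag{13}$$

The accuracy of this approximation is examined by Pearson's $\chi^2$ goodness-of-fit test. We consider sample sizes $n = 5, 11$, $\delta_3 = 1$, $\mu_0 = 0$, $\sigma_0 = 1$, and $b = -500, 0, 500$, and simulate $m$ ($= 1000, 2000$) samples from $SN(\xi_0, a_0, b)$ for each $n$, so as to calculate $m$ random values of AL and then fit them with the approximated cdf given in (10). Table 3 lists the p-values of the $\chi^2$ test. The test reveals no significant difference between the approximated cdf and the empirical cdf obtained by Monte Carlo simulation. Fig. 1 illustrates that the two cdf curves are very close for $n = 11$, $m = 2000$, and $b = -500$. Moreover, the accuracy improves for larger sample sizes.

As noted before, $X^*$ is the quality characteristic from the out-of-control process, with $X^* \sim SN(\xi^*, a^*, b)$, mean $\mu_1 = \mu_0 + \delta_1\sigma_0$ and standard deviation $\sigma_1 = \delta_2\sigma_0$. That is,

$$\xi^* = \mu_1 - a^*\,\frac{b}{\sqrt{1+b^2}}\sqrt{\frac{2}{\pi}}, \qquad a^* = \sigma_1\left(1 - \frac{2b^2}{\pi(1+b^2)}\right)^{-1/2}.$$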
We denote the out-of-control average loss as $AL^* = \sum_{i=1}^{n}(X_i^* - T)^2/n$. Following the same derivation as for the cdf of AL in (10), we can derive the cdf of $AL^*$.
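The simulation of in-control and out-of-control average losses described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: skew-normal variates are generated through the well-known representation $X = \xi + a(\delta|U_0| + \sqrt{1-\delta^2}\,U_1)$ with $\delta = b/\sqrt{1+b^2}$ and independent standard normal $U_0$, $U_1$; the target $T$ is passed in explicitly; and all function names are hypothetical.

```python
import math
import random


def sn_params_from_moments(mu, sigma, b):
    """Map (mean, standard deviation, shape) to the SN location xi and scale a."""
    delta = b / math.sqrt(1.0 + b * b)
    a = sigma / math.sqrt(1.0 - 2.0 * delta * delta / math.pi)
    xi = mu - a * delta * math.sqrt(2.0 / math.pi)
    return xi, a


def sn_sample(rng, xi, a, b):
    """Draw one SN(xi, a, b) variate from two independent standard normals."""
    delta = b / math.sqrt(1.0 + b * b)
    root = math.sqrt(1.0 / (1.0 + b * b))  # equals sqrt(1 - delta^2)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return xi + a * (delta * abs(u0) + root * u1)


def average_loss(xs, target):
    """Sample average loss AL = sum((x - T)^2) / n with k = 1."""
    return sum((x - target) ** 2 for x in xs) / len(xs)


def simulate_al(rng, m, n, mu0, sigma0, b, target, delta1=0.0, delta2=1.0):
    """Simulate m average losses from samples of size n.

    (delta1, delta2) = (0, 1) gives the in-control AL; other values give
    AL* with mean mu0 + delta1*sigma0 and standard deviation delta2*sigma0.
    """
    xi, a = sn_params_from_moments(mu0 + delta1 * sigma0, delta2 * sigma0, b)
    return [
        average_loss([sn_sample(rng, xi, a, b) for _ in range(n)], target)
        for _ in range(m)
    ]
```

For instance, `simulate_al(random.Random(1), 2000, 5, 0.0, 1.0, -500.0, -1.0)` draws 2000 in-control average losses for a strongly left-skewed process. The target of $-1$ used here is an assumption chosen so that the in-control mean lies one standard deviation from the target; the expected average loss is then $\sigma_0^2 + (\mu_0 - T)^2 = 2$.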

III. DERIVATION OF THE EXPONENTIALLY WEIGHTED MOVING AVERAGE ALSN CONTROL CHART
To better detect small and moderate shifts in the process average loss, we propose an EWMA average loss (EWMA-ALSN) control chart. Define $EWMA_{AL,t}$ as the monitoring statistic of the EWMA-ALSN chart at time $t$, which takes the form

$$EWMA_{AL,t} = \lambda\,AL_t + (1 - \lambda)\,EWMA_{AL,t-1},$$

where $AL_t$ is the average loss of the sample taken at time $t$ and $\lambda \in (0, 1]$ is the smoothing parameter. When $\lambda = 1$, the EWMA-ALSN chart reduces to the average loss (ALSN) chart.
The in-control mean of $EWMA_{AL,t}$ is $\mu_{AL}$, and its standard deviation approaches $\sqrt{\lambda/(2-\lambda)}\,\sigma_{AL}$ as time $t$ approaches infinity.
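The EWMA recursion and its standard deviation can be illustrated with a short sketch. The starting value `start` is an assumption (a common choice is the in-control mean $\mu_{AL}$), the finite-time standard deviation uses the usual EWMA result for independent observations, and the function names are hypothetical.

```python
import math


def ewma_path(al_values, lam, start):
    """Return the EWMA sequence for a list of average losses.

    EWMA_t = lam * AL_t + (1 - lam) * EWMA_{t-1}, with EWMA_0 = start
    (assumed here to be the in-control mean of AL).
    """
    path = []
    current = start
    for al in al_values:
        current = lam * al + (1.0 - lam) * current
        path.append(current)
    return path


def ewma_sd(sigma_al, lam, t=None):
    """Standard deviation of the EWMA statistic for independent AL_t.

    With t given, returns the exact value at time t; with t omitted,
    returns the asymptotic value sqrt(lam / (2 - lam)) * sigma_al.
    """
    factor = lam / (2.0 - lam)
    if t is not None:
        factor *= 1.0 - (1.0 - lam) ** (2 * t)
    return sigma_al * math.sqrt(factor)
```

Setting `lam = 1.0` reproduces the input sequence, which corresponds to the ALSN chart as a special case of the EWMA-ALSN chart.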

A. THE CONTROL LIMITS OF THE EWMA-ALSN CHART
Based on the mean and variance of the monitoring statistic, $EWMA_{AL,t}$, we can construct the EWMA-ALSN chart. The design parameters of the proposed chart are the sample size $n$, the sampling time interval $h$, and the coefficients $(k_1, k_2)$ of the upper and lower control limits. We take $n$ to be fixed and $h$ to be one time unit. Given the derived $\mu_{AL}$ in (12) and $\sigma_{AL}$ in (13), and fixing a predetermined in-control average run length $ARL_0$, we may determine the upper and lower control limits (UCL, LCL) of the proposed EWMA-ALSN control chart as follows:

$$UCL = \mu_{AL} + k_1\sqrt{\frac{\lambda}{2-\lambda}}\,\sigma_{AL}, \qquad LCL = \mu_{AL} - k_2\sqrt{\frac{\lambda}{2-\lambda}}\,\sigma_{AL},$$

where, for a given $\lambda$, the chart parameters $k_1$ and $k_2$ are determined by Monte Carlo simulation so as to meet the predetermined $ARL_0$. If the monitoring statistic, $EWMA_{AL,t}$, falls above the UCL or below the LCL, then the process is deemed to be out-of-control. Table 4 lists the coefficients $(k_1, k_2)$ and the corresponding LCL and UCL of the EWMA-ALSN chart for $b = -500, -2, 0, 2, 500$, $\mu_0 = 0$, $\sigma_0 = 1$, $\delta_3 = 1$, $\lambda = 0.05, 0.2, 0.4, 1.0$, $n = 5$ and $ARL_0 = 370.4$. We find that the control limits are more symmetric when $\lambda$ is small, for example, $\lambda = 0.05$. Furthermore, the width of the chart increases when $\lambda$ or $b$ increases.
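The construction of the limits and the signalling rule can be sketched as follows. This sketch assumes limits of the asymptotic form $\mu_{AL} \pm k\sqrt{\lambda/(2-\lambda)}\,\sigma_{AL}$, with $k_1$ attached to the upper limit and $k_2$ to the lower limit; the function names and the numerical values in the usage note are illustrative only and are not taken from Table 4.

```python
import math


def ewma_limits(mu_al, sigma_al, lam, k1, k2):
    """Return (LCL, UCL), assuming asymptotic EWMA limits.

    UCL = mu_al + k1 * sqrt(lam / (2 - lam)) * sigma_al
    LCL = mu_al - k2 * sqrt(lam / (2 - lam)) * sigma_al
    """
    width = math.sqrt(lam / (2.0 - lam)) * sigma_al
    return mu_al - k2 * width, mu_al + k1 * width


def first_signal(ewma_values, lcl, ucl):
    """Return the 1-based time of the first EWMA value outside [lcl, ucl].

    Returns None when no value falls outside the limits.
    """
    for t, value in enumerate(ewma_values, start=1):
        if value < lcl or value > ucl:
            return t
    return None
```

For example, with the hypothetical inputs `ewma_limits(2.0, 3.0, 0.2, 3.0, 2.0)` the common width factor is $\sqrt{0.2/1.8} \times 3 = 1$, so the limits are approximately $(0, 5)$. Allowing $k_1 \neq k_2$ lets the chart accommodate the asymmetry of the average loss distribution.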

B. PERFORMANCE COMPARISON AMONG THE EWMA-ALSN, ALSN AND ML CHARTS
Using the resulting control limits in Table 4, we adopt $ARL_1$ to measure the out-of-control detection performance of the proposed EWMA-ALSN chart. In order to measure the spread of the run length distribution, we also consider the standard deviation of the run length (SDRL). Using Monte Carlo simulation, we calculate the $ARL_1$ and SDRL values. Here, we assume the process shifts ($\delta_1$ and $\delta_2$) are known or specified. When $\delta_1$ and $\delta_2$ are unknown or cannot be specified, the expected ARL (EARL) can be employed as a performance metric (for example, see Teoh et al. [30]).
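A Monte Carlo estimate of the ARL and SDRL for given control limits can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the skew-normal sampler, the starting value of the EWMA statistic, the cap on the run length, and all function names are assumptions, and the control limits must be supplied by the user.

```python
import math
import random
import statistics


def sn_params_from_moments(mu, sigma, b):
    """Map (mean, standard deviation, shape) to the SN location xi and scale a."""
    delta = b / math.sqrt(1.0 + b * b)
    a = sigma / math.sqrt(1.0 - 2.0 * delta * delta / math.pi)
    return mu - a * delta * math.sqrt(2.0 / math.pi), a


def sn_sample(rng, xi, a, b):
    """Draw one SN(xi, a, b) variate from two independent standard normals."""
    delta = b / math.sqrt(1.0 + b * b)
    root = math.sqrt(1.0 / (1.0 + b * b))  # equals sqrt(1 - delta^2)
    return xi + a * (delta * abs(rng.gauss(0.0, 1.0)) + root * rng.gauss(0.0, 1.0))


def run_length(rng, lcl, ucl, lam, start, n, xi, a, b, target, max_steps=100000):
    """Number of samples until the EWMA of AL first leaves [lcl, ucl].

    Runs that have not signalled by max_steps are censored at max_steps.
    """
    ewma = start
    for t in range(1, max_steps + 1):
        al = sum((sn_sample(rng, xi, a, b) - target) ** 2 for _ in range(n)) / n
        ewma = lam * al + (1.0 - lam) * ewma
        if ewma < lcl or ewma > ucl:
            return t
    return max_steps


def arl_sdrl(reps, lcl, ucl, lam, start, n, mu0, sigma0, b, target,
             delta1=0.0, delta2=1.0, seed=0):
    """Estimate (ARL, SDRL) from `reps` simulated run lengths.

    The process has mean mu0 + delta1*sigma0 and standard deviation
    delta2*sigma0; (delta1, delta2) = (0, 1) is the in-control case.
    """
    rng = random.Random(seed)
    xi, a = sn_params_from_moments(mu0 + delta1 * sigma0, delta2 * sigma0, b)
    lengths = [
        run_length(rng, lcl, ucl, lam, start, n, xi, a, b, target)
        for _ in range(reps)
    ]
    return statistics.mean(lengths), statistics.stdev(lengths)
```

Calling `arl_sdrl` with `delta1=0.0, delta2=1.0` estimates the in-control ARL for a candidate pair of limits, while other values of `delta1` and `delta2` give $ARL_1$ and SDRL under the specified shift. In practice a large number of replications is needed for stable estimates, and the limits themselves would be tuned until the in-control estimate matches the desired $ARL_0$.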
Based on the results in Table 5, we conclude that the EWMA-ALSN chart performs the best among the three control charts when the process follows a skew-normal distribution. In particular, the EWMA-ALSN chart performs better than the ALSN chart for small changes in process location and dispersion, and better than the ML chart for small changes in location and small to large changes in dispersion. Hence, we recommend using the proposed EWMA-ALSN chart, in place of the ALSN chart or the existing ML chart, to monitor the loss location or the process location and/or dispersion when the process follows a skew-normal distribution.

IV. ILLUSTRATIVE EXAMPLE
This section demonstrates an application of our proposed control charts using a real data set collected from the Roberts IQ score [31]. The IQ score data give the Otis IQ scores for 87 white males and 52 non-white males hired by a large insurance company in 1971. Brown [32] showed that the IQ data of the 87 white males follow a skew-normal distribution with estimated location $\hat{\xi}_0 = 105.78$, scale $\hat{a}_0 = 11.94$ and shape $\hat{b} = 1.14$. In other words, the mean is 118.39 and standard deviation is 9.53. However, the IQ data of the 52 non-white males follow a skew-normal distribution with estimated $\hat{\xi}_0 = 106.62$, $\hat{a}_0 = 8.266$, and $\hat{b} = 0$. Hence, they follow a normal distribution.
We take 85 observations from the IQ data of the 87 white males hired and regard them as the in-control data with population mean 118.39 and standard deviation 9.53. The 85 in-control observations are grouped into 17 subgroups of size 5 (Table 6). Furthermore, we take 50 IQ observations of the non-white males, grouped into 10 subgroups of size 5 (Table 7). We set the target of the IQ score ($T$) to be 109.39, and so the scale of the deviation from the target value is $\delta_3 = 1$. We define the loss as the squared deviation of the IQ score from the target value, $L = (X - T)^2$. Hence, using the proposed loss location monitoring approach in Section III, we determine UCL = 3.367 and LCL = 1.143 of the EWMA-ALSN chart with $\lambda = 0.2$ and $ARL_0 = 370.4$. The in-control subgroup statistics, $EWMA_{AL,t}$, of the seventeen subgroups are listed in Table 6 and plotted in Fig. 2. Although the statistics of in-control subgroups 10, 11, and 13 fall slightly below the LCL, these subgroups come from the in-control data, so the signals are false alarms. Furthermore, we calculate the out-of-control subgroup statistics, $EWMA_{AL^*,t}$, of the ten subgroups of IQ score data for the non-white males. These statistics are listed in Table 7 and plotted in Fig. 3. We find that nine out of ten fall below the LCL. This indicates that the deviation of the IQ score from the target, or the loss location, for the non-white males is significantly different from that of the white males hired.
In this example, the IQ scores of the white males hired follow a skew-normal distribution with $b > 0$, whereas those of the non-white males hired do not. The example shows that the proposed EWMA-ALSN chart can effectively detect the out-of-control loss location, or equivalently the deviation of the IQ score from the target, for the non-white male IQ scores, which follow a normal distribution (that is, a skew-normal distribution with $b = 0$). The EWMA-ALSN chart is thus recommended for monitoring the loss location, or the deviation of the quality variable from the target, for a process with a skewed distribution.

V. CONCLUSIONS
This study has proposed ALSN and EWMA-ALSN charts, based on the derived approximate distributions of an average loss statistic, in order to monitor the process loss location, or equivalently, the deviation of the quality variable from the target when the process exhibits a skew-normal distribution. Compared to the ALSN chart, we find that the EWMA-ALSN chart has better out-of-control detection performance for small changes in process location and/or dispersion. Furthermore, compared to the existing loss control chart, such as the median loss (ML) chart, when the process has a skew-normal distribution, the newly proposed ALSN and EWMA-ALSN charts always perform better for detecting the out-of-control process.
In a real example of detecting out-of-control IQ scores with a skew-normal distribution for non-white males hired in a large insurance company, we demonstrate that the proposed EWMA-ALSN control chart performs well. We thus recommend using this new EWMA-ALSN chart to efficiently monitor shifts in process location and/or dispersion when the process variable follows a skew-normal distribution.