Efficient Control Charts for Monitoring Process CV Using Auxiliary Information

Coefficient of variation (CV) control charts are a suitable choice for monitoring variation when the process standard deviation is proportional to the process mean. This study aims to enhance the detection ability of the usual CV chart by incorporating auxiliary information. In recent years, researchers have investigated the use of auxiliary information for improving the sensitivity of location and dispersion charts. However, no study has investigated CV control charts in the presence of auxiliary information. In this study, I propose and investigate a set of auxiliary information based charts for efficient monitoring of process CV by considering a variety of CV estimators. The auxiliary information is used in regression, ratio and hybrid forms. A real-life example concerning the monitoring of air quality is also presented to illustrate the application of the proposed charts.


I. INTRODUCTION
Control charts are the most important tool in the statistical process control (SPC) tool-kit. The main purpose of their implementation is the timely detection of assignable causes that can affect the quality of a product or the state of a process. Quick detection of these assignable causes can greatly improve the quality standards of a product/process (cf. [1]). Walter A. Shewhart did the pioneering work by proposing control charts for the monitoring of manufacturing processes, but their use was soon extended to the monitoring of other processes, such as in nuclear engineering, health care, analytical laboratories and environmental monitoring (cf. [2]).
Control charts are mostly used for the monitoring of process location and dispersion. When the process mean level is constant and the process standard deviation is independent of the mean, the variability of the process is usually monitored using range (R) or standard deviation (S) charts. When the process location is not stable and the process standard deviation is proportional to the mean, the process CV is mostly constant, and hence CV control charts are a preferred choice for the monitoring of process variability (cf. Kang et al. [3]). This phenomenon can be observed in many real-life processes. For example, Castagliola et al. [4] showed that the standard deviation of the pressure test drop time in a sintering process (that manufactures mechanical parts) is proportional to its mean. Abbasi and Adegoke [5] indicate a proportional relationship (in a multivariate setup) between the covariance matrix and the mean vector for the monitoring of the inner diameter and average length of carbon fiber tubes in a pultrusion process. Moreover, Nguyen et al. [6] observed that, in the sanitary sector, the standard deviation of the weight of scrap zinc alloy material is proportional to its mean.
The associate editor coordinating the review of this manuscript and approving it for publication was Jenny Mahoney.
Kang et al. [3] did the initial work on monitoring the process CV by presenting a Shewhart-type CV chart. They provided control limits using Monte Carlo simulations as well as quantile points of the non-central t distribution. A number of studies then enhanced the performance of the CV chart proposed by Kang et al. [3] by considering a variety of design structures. Castagliola et al. [7] and Yeong et al. [8] proposed the adaptive Shewhart CV and EWMA CV control charts, respectively, based on a variable sampling interval (VSI), and compared their performance with the SH-CV, synthetic CV (Syn-CV) and VSI-CV charts. Calzada and Scariano [9] proposed a synthetic control chart for the monitoring of process CV. Amdouni et al. [10] proposed the use of a variable sample size (VSS) to monitor CV in short production runs. Menzefricke [11] proposed a CV control chart for a log-normally distributed quality characteristic within a Bayesian framework. Zhang et al. [12] proposed a modified EWMA chart to further enhance the sensitivity of the EWMA chart originally proposed by [7]. Hong et al. [13] proposed the generally weighted moving average (CV-GWMA) chart and showed that it performs better than the CV-EWMA and CV-DEWMA (double exponentially weighted moving average) charts (cf. [14]), particularly for the detection of small shifts. Recently, many researchers have proposed new structures for efficient monitoring of process CV. Dawod et al. [15] and Abbasi and Adegoke [5] investigated univariate and multivariate CV control charts, respectively, for Phase I of SPC. They considered both diffuse symmetric and localized CV disturbance scenarios. Abbasi et al. [16] proposed the design of a CV control chart using the progressive mean technique. Chen et al. [17] developed a generally weighted moving average control chart for the monitoring of process CV. Chew et al. [18] proposed a variable parameter control chart for monitoring the multivariate coefficient of variation. Nguyen et al. [6] investigated the performance of a Shewhart-type VSI control chart for CV monitoring in the presence of measurement error. Haq and Khoo [19] proposed new adaptive EWMA control charts for monitoring both univariate and multivariate CV. Moreover, Abbasi et al. [20] enhanced the performance of the CV control chart by incorporating ranked set sampling schemes in the design structures.
Researchers have recently used different methods to enhance the detection ability of control charts. One such method is to use auxiliary information, which can help in better estimation of process parameters and, in turn, enhance the detection ability of control charts. Riaz [21] proposed the $V_r$ chart for efficient monitoring of process variability by using the regression-type auxiliary estimator of the population variance. Riaz et al. [22] investigated the estimation effects of auxiliary information based location charts considering normal and non-normal processes. Ahmad et al. [23] investigated a variety of auxiliary information based variability charts. Abbas et al. [24] proposed an efficient EWMA location chart by making use of auxiliary information. Recently, Sanusi et al. [25] and Adegoke et al. [26] enhanced the performance of location EWMA and HWMA charts by making use of auxiliary information based ratio and regression estimators, respectively. Adegoke et al. [27] enhanced the EWMA location chart by using auxiliary information with different ranked set sampling schemes. Further studies on the use of auxiliary information based control charts can be seen in [28]-[30], and references therein.
VOLUME 8, 2020
All these studies use auxiliary information to enhance the efficiency of location or dispersion charts. No study, as yet, has enhanced the efficiency of CV charts with the use of auxiliary information. The purpose of this study is to enhance the detection ability of process CV charts by making use of auxiliary information in ratio, regression and hybrid forms. The rest of the article is organized as follows: Section II presents a set of auxiliary information based CV estimators, and a general control chart structure is presented in Section III. The performance comparison of the different auxiliary information based CV charts is presented in Section IV. The proposed CV charts are compared with some existing CV charts in Section V. An illustrative example is presented in Section VI, and conclusions are given in Section VII.

II. AUXILIARY INFORMATION BASED CV ESTIMATORS
The purpose of this study is to enhance the detection ability of the usual CV chart by incorporating auxiliary information. I use a set of auxiliary information based estimators to serve this purpose.
Let $Y$ represent the quality characteristic of interest, which is correlated with an auxiliary variable $X$. The pairs $(Y_i, X_i)$ are assumed to follow a bivariate normal distribution with mean vector $\boldsymbol{\mu}$ and variance-covariance matrix $\boldsymbol{\Sigma}$:

$$\boldsymbol{\mu} = \begin{pmatrix} \mu_y \\ \mu_x \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_y^2 & \rho\,\sigma_y\sigma_x \\ \rho\,\sigma_y\sigma_x & \sigma_x^2 \end{pmatrix}.$$

Here $\rho$ represents the correlation coefficient between the study variable $Y$ and the auxiliary variable $X$. Let $(x_1, y_1), \ldots, (x_n, y_n)$ represent a sample of size $n$ from the bivariate normal distribution. Let $\bar{y}$ and $\bar{x}$ represent the sample means, $s_y^2$ and $s_x^2$ the sample variances, $s_{xy}$ the sample covariance, $c_y$ and $c_x$ the sample CVs, and $r_{xy}$ the sample correlation coefficient. Based on this notation, a set of estimators of the process CV is defined below.

Usual CV Estimator ($CV_U$)
All the existing CV charts are based on the usual definition of CV, as given below:

$$CV_U = \frac{s_y}{\bar{y}},$$

where $\bar{y}$ and $s_y$ respectively represent the sample mean and sample standard deviation of the study variable $Y$, defined as:

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}.$$

A set of auxiliary information based estimators of the process CV has been proposed in [31]-[34]. Below, I describe those of their best estimators that are used in this study:

Regression Estimator 1 ($CV_{Reg_1}$)
This estimator of CV is the ratio of the auxiliary information based regression estimators of process variance and process mean:
Under normal distribution, the expressions for $b_1$ and $b_2$ simplify to (cf. [34]):

Regression Estimator 2 ($CV_{Reg_2}$)
The $CV_{Reg_2}$ estimator uses the population CV of the auxiliary variable ($X$). The estimator is defined as:
Under normal distribution, $b_3$ simplifies to (cf. [34]):

Regression Estimator 3 ($CV_{Reg_3}$)
Archaana and Rao [34] proposed a new regression estimator of CV by improving the CV estimator of [32].

Ratio Estimator ($CV_R$)
The ratio estimator of CV uses the ratio of the population mean of the auxiliary variable to its sample mean ([33]):

A class of hybrid estimators, based on a mixture of ratio and regression estimators, was proposed by Tripathi et al. [32]. In this study, I also use some of these hybrid estimators:

Hybrid Estimator 1 ($CV_{H_1}$)
where b 4 = s xy s xx and b 5 = s xy s xx 2 .
The next section describes a general control chart structure for evaluating the performance of CV charts based on these estimators.
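The sample quantities that the estimators in this section are built from can be computed directly from a paired sample. The following is a minimal sketch, assuming NumPy is available; the function names `sample_statistics` and `cv_usual` are illustrative and do not come from the study, and only the usual estimator $CV_U = s_y/\bar{y}$ is implemented.

```python
import numpy as np

def sample_statistics(y, x):
    """Return the sample quantities defined in Section II for one paired sample."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    y_bar, x_bar = y.mean(), x.mean()
    s_y, s_x = y.std(ddof=1), x.std(ddof=1)   # n - 1 in the denominator
    s_xy = np.cov(y, x, ddof=1)[0, 1]         # sample covariance of (y, x)
    return {
        "y_bar": y_bar, "x_bar": x_bar,
        "s_y": s_y, "s_x": s_x, "s_xy": s_xy,
        "c_y": s_y / y_bar, "c_x": s_x / x_bar,   # sample CVs
        "r_xy": s_xy / (s_y * s_x),               # sample correlation
    }

def cv_usual(y):
    """Usual CV estimator: sample standard deviation over sample mean."""
    y = np.asarray(y, dtype=float)
    return y.std(ddof=1) / y.mean()

# Toy paired sample (illustrative values only)
stats = sample_statistics([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(stats["c_y"], stats["r_xy"], cv_usual([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The auxiliary information based estimators would take the same dictionary of statistics as input, together with the known population quantities of $X$.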

III. CONTROL CHART STRUCTURE
To obtain the control limits of the auxiliary information based CV charts, I define a general statistic $CV_A$, for $A = U, Reg_1, Reg_2, Reg_3, R, H_1, H_2, H_3$, i.e. $CV_A$ can be any of the CV estimators defined in Section II. Using $CV_A$, a standardized CV statistic can be defined as

$$V_A = \frac{CV_A}{\gamma},$$

where $\gamma$ is the process coefficient of variation. Taking the expectation on both sides of this expression gives $E(CV_A) = d_{2,A,n,\rho}\,\gamma$, where $d_{2,A,n,\rho} = E(V_A)$. For a specific $CV_A$ estimator, $d_2$ and $d_3$ depend entirely on the sample size $n$ and the correlation coefficient $\rho$. $E(CV_A)$ can be replaced with the average of the sample $CV_A$ estimates, i.e. $\overline{CV}_A = \sum_{j=1}^{m} CV_{A_j}/m$. Hence, an unbiased estimator of the process coefficient of variation ($\gamma$) can be defined as $\hat{\gamma}_A = \overline{CV}_A / d_{2,A,n,\rho}$. Moreover, to maintain the false alarm rate $\alpha$, the lower and upper quantile points of the distribution of the $V_A$ estimates can be defined as $V_{A,\alpha/2}$ and $V_{A,(1-\alpha/2)}$, respectively. Using this notation, the probability limits for the $CV_A$ charts are defined as:

After fixing the control limits for a specific $CV_A$ estimator, the corresponding $CV_A$ estimates are plotted to identify the state of the process. If all the $CV_A$ statistics fall inside the probability limits, the process is declared to be in control. If any plotting statistic falls outside the limits, the process is said to be out of control.
For the rest of the study, the control charts based on the choices of $A$ as $U$, $Reg_1$, $Reg_2$, $Reg_3$, $R$, $H_1$, $H_2$ and $H_3$ are named the $CV_U$, $CV_{Reg_1}$, $CV_{Reg_2}$, $CV_{Reg_3}$, $CV_R$, $CV_{H_1}$, $CV_{H_2}$ and $CV_{H_3}$ charts, respectively.
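The chain from the standardized statistic to the estimator of $\gamma$ can be written out step by step. This is a sketch under two assumptions that are consistent with $\hat{\gamma}_A = \overline{CV}_A/d_{2,A,n,\rho}$ but are not stated explicitly in the text available here: that $V_A = CV_A/\gamma$ and that $d_{2,A,n,\rho} = E(V_A)$. The last line is likewise an assumed form of the limits, obtained by substituting $\hat{\gamma}_A$ for $\gamma$ in the probability statement.

```latex
\begin{align*}
V_A &= \frac{CV_A}{\gamma}
  && \text{(assumed standardization)} \\
E(V_A) &= d_{2,A,n,\rho}
  \;\Longrightarrow\; E(CV_A) = d_{2,A,n,\rho}\,\gamma \\
\hat{\gamma}_A &= \frac{\overline{CV}_A}{d_{2,A,n,\rho}},
  \qquad \overline{CV}_A = \frac{1}{m}\sum_{j=1}^{m} CV_{A_j} \\
1-\alpha &= P\!\left(V_{A,\alpha/2} \le \frac{CV_A}{\gamma} \le V_{A,(1-\alpha/2)}\right)
  && \text{(definition of the quantile points)} \\
\text{signal if}\quad \frac{CV_A}{\hat{\gamma}_A} &\notin \left[V_{A,\alpha/2},\, V_{A,(1-\alpha/2)}\right]
  \;\Longleftrightarrow\;
  CV_A \notin \left[V_{A,\alpha/2}\,\hat{\gamma}_A,\; V_{A,(1-\alpha/2)}\,\hat{\gamma}_A\right]
  && \text{(assumed form of the limits)}
\end{align*}
```

The two forms in the last line are equivalent because $\hat{\gamma}_A > 0$, so the chart can be drawn either on the standardized scale or on the CV scale.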

IV. PERFORMANCE EVALUATION AND COMPARISON
To evaluate the performance of the CV charts based on the estimators described in Section II, power of detection is used as a performance measure. Power of detection is defined as the probability of detecting an out-of-control signal when the process CV shifts from an in-control level $\gamma_0$ to a shifted level $\gamma_1$, where $\gamma_1 = \delta\gamma_0$. The power of detection may vary with the design parameters, namely the shift $\delta$, the sample size $n$ and the correlation coefficient $\rho$. I noticed that changing $\gamma_0$ does not affect the performance of the CV charts, as the general control chart structure is based on the standardized CV statistic. For the performance evaluation of all the CV charts, I used $\delta = 1.0, 1.1, 1.2, \ldots, 3$, $n = 5, 10, 15$, $\rho = 0.3, 0.5, 0.7, 0.9$ and $\gamma_x = \gamma_y = \gamma_0 = 0.10$. This helps in identifying the best chart for a variety of design parameters. Note that $\delta = 1.0$ represents no shift in the process CV. A detailed Monte Carlo simulation study is conducted for the performance evaluation of the different CV charts. The steps taken in the simulation study are described below. Firstly, to get the control chart constants ($d_2$ and $d_3$) and the quantiles ($V_{A,\alpha/2}$ and $V_{A,(1-\alpha/2)}$), the following procedure is adopted.
• One million samples of size $n$ are generated from an in-control bivariate normal distribution $N_2(\boldsymbol{\mu}_0, \gamma_0\boldsymbol{\Sigma}_0)$.
• The sample means $\bar{y}$, $\bar{x}$, the sample standard deviations $s_y$, $s_x$, the sample coefficients of variation $c_y$, $c_x$ and the sample covariance $s_{xy}$ are computed from each sample.

Tables 3-4, respectively, for varying levels of $n$ and $\rho$. After finding the control chart constants and the probability points for the $CV_A$ charts, I first compared the efficiency of the different CV estimators, as described below:
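The in-control simulation steps above can be sketched as follows for the usual estimator only. This is an illustration under stated assumptions, not the study's implementation: the in-control means are set to a hypothetical value of 1 for both variables, the standard deviations are set to CV times mean, `d3` is taken to be the standard deviation of $V_A$ (the text does not define it), and 200,000 replications are used instead of one million to keep the run short. The function names are illustrative.

```python
import numpy as np

def simulate_v_usual(n, gamma0, rho, reps, rng, delta=1.0, mu=(1.0, 1.0)):
    """Simulate V_U = CV_U / gamma0 for `reps` bivariate normal samples of size n."""
    mu_y, mu_x = mu                                      # hypothetical means
    sd_y, sd_x = delta * gamma0 * mu_y, gamma0 * mu_x    # CV = sd / mean
    cov = [[sd_y**2, rho * sd_y * sd_x],
           [rho * sd_y * sd_x, sd_x**2]]
    data = rng.multivariate_normal([mu_y, mu_x], cov, size=(reps, n))
    y = data[..., 0]                 # study variable, shape (reps, n)
    # x = data[..., 1] would feed the auxiliary information based estimators
    cv_u = y.std(axis=1, ddof=1) / y.mean(axis=1)
    return cv_u / gamma0

def chart_constants(v, alpha=0.0027):
    """Empirical constants and probability points from simulated V values."""
    return {
        "d2": v.mean(),
        "d3": v.std(ddof=1),         # assumption: d3 is the std. deviation of V
        "lower": np.quantile(v, alpha / 2),
        "upper": np.quantile(v, 1 - alpha / 2),
    }

rng = np.random.default_rng(2020)
v_in_control = simulate_v_usual(n=5, gamma0=0.10, rho=0.5, reps=200_000, rng=rng)
constants = chart_constants(v_in_control)
print(constants)
```

The same loop would be repeated for each estimator $A$ and each combination of $n$ and $\rho$, replacing the `cv_u` line with the estimator of interest.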

A. EFFICIENCY COMPARISON
The relative efficiency (RE) of the different CV estimators is computed to evaluate their precision, following Rousseeuw and Croux [35] and Abbasi [36]. RE is defined as the ratio of the minimum standardized variance ($SV_{min}$) to the standardized variance ($SV$) of a specific estimator. Mathematically, we can define RE as:

$$RE = \frac{SV_{min}}{SV_{CV_A}},$$

where $SV_{CV_A}$ is defined as:

The relative efficiency of the different estimators is compared graphically in Figure 1, using one million samples of size $n = 5, 7, 12, 15$ and considering $\rho = 0.3, 0.5, 0.7$ and $0.9$. In each plot, relative efficiency is plotted on the y-axis and the sample size on the x-axis. The plots indicate the following.

For low correlation levels (i.e. $\rho = 0.3$ and $0.5$):
• The best way of estimating CV is by using the usual estimator (i.e. $CV_U$).
• The $CV_R$ estimator also maintains high relative efficiency.
• The $CV_{H_1}$ estimator is the least efficient estimator.

As the correlation between the study and auxiliary variables increases, the auxiliary information based CV estimators become more efficient compared to the usual CV estimator.
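The relative efficiency computation can be sketched in a few lines. The exact form of the standardized variance is not recoverable from the text here, so the sketch assumes the variance of the simulated estimates divided by their squared mean; the two arrays are placeholder draws standing in for simulated estimates of two hypothetical estimators, not results from the study.

```python
import numpy as np

def standardized_variance(estimates):
    """Assumed form: variance of the estimates divided by their squared mean."""
    estimates = np.asarray(estimates, dtype=float)
    return estimates.var(ddof=1) / estimates.mean() ** 2

def relative_efficiency(estimates_by_name):
    """RE = SV_min / SV for each estimator; the most precise estimator gets RE = 1."""
    sv = {name: standardized_variance(e) for name, e in estimates_by_name.items()}
    sv_min = min(sv.values())
    return {name: sv_min / value for name, value in sv.items()}

rng = np.random.default_rng(1)
# Placeholder draws (hypothetical), centred on gamma = 0.10 with different spreads
demo = {
    "estimator_a": rng.normal(0.10, 0.030, size=50_000),
    "estimator_b": rng.normal(0.10, 0.020, size=50_000),
}
rel_eff = relative_efficiency(demo)
print(rel_eff)
```

Because RE is a ratio to the smallest standardized variance, values lie in $(0, 1]$ and a curve close to 1 across sample sizes marks an estimator that is near the best in the comparison set.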
At moderate to high correlation levels:
• The usual CV estimator quickly loses its efficiency compared to the auxiliary information based CV estimators.
• The most efficient way of estimating CV is by using the $CV_{H_2}$ estimator.
• At large sample sizes, the $CV_{H_3}$ and $CV_{Reg_3}$ estimators also perform relatively well.

After the comparison of the relative efficiency, the performance of the charts is evaluated in terms of the power of detecting shifts in the process CV. The power of the charts is computed using the following procedure:
• One million samples of size $n$ are generated from $N_2(\boldsymbol{\mu}_0, \delta\gamma_0\boldsymbol{\Sigma}_0)$, where $\delta$ represents the shift in the process CV level.
• The $CV_A$ estimates are computed for each sample and are plotted against the respective probability limits.
• The detection power of the charts is computed as the proportion of sample CVs plotted outside the probability limits.

The chart with the higher power of detection is considered better than the competing charts. The power of detection is computed for all the $CV_A$ charts considering varying levels of the design parameters $n$, $\delta$ and $\rho$. To save space, the power results for $n = 10$ at varying levels of $\rho$ are provided in Table 5. For other combinations of $n$ and $\rho$, the results are presented graphically in Figures 2-4. The comparison indicates that:
• The power of detection increases with an increase in the level of $\delta$ for all the charts.
• The detection ability of the auxiliary information based charts increases with an increase in the level of $\rho$.
• At low correlation levels (i.e. $\rho = 0.3$), the usual CV chart based on $CV_U$ is the best performing chart.
• As $\rho$ increases, the performance of the CV charts based on auxiliary information improves steadily relative to the $CV_U$ chart.
• At moderate to high correlation levels, the $CV_{H_2}$ chart is the best performing chart across the different sample sizes.
• The $CV_{H_1}$ and $CV_{Reg_2}$ charts are the second best choices for small and large sample sizes, respectively, at high correlation levels.
• The comparative performance of the $CV_{H_1}$ chart deteriorates significantly for large sample sizes, particularly at low correlation levels.

Detection of positive shifts in the process CV is usually of more interest, as such shifts indicate a reduction in the quality of the process/products. On the other hand, detection of negative shifts in the process CV is also important, as such shifts reflect an improvement in the process or in the quality of the products. Table 6 reports the power of detection for all the charts considering positive and negative shifts when $\rho = 0.90$ and $n = 10$. It can be observed from Table 6 that:
• The detection power of almost all the proposed charts is greater for negative shifts than for positive shifts of the same magnitude.
• The $CV_{H_2}$ chart is the best performing chart for the detection of both negative and positive shifts in the process CV.
• For the detection of negative shifts, the $CV_{H_3}$ and $CV_{Reg_1}$ charts show better detection ability than the $CV_{H_1}$, $CV_{Reg_2}$ and $CV_{Reg_3}$ charts.
• The $CV_R$ and $CV_U$ charts are the worst performing charts for the detection of negative shifts.

The power comparisons in Figures 2-4 and Table 6 indicate that the $CV_{H_2}$ chart is the best performing chart at moderate to high correlation between the study and the auxiliary variable.
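The power computation described in this section can be illustrated for the usual CV chart. The sketch below is not the study's code: it assumes hypothetical in-control means of 1, models the shift $\gamma_1 = \delta\gamma_0$ as a change in the standard deviation of $Y$ with the mean held fixed, estimates the probability limits empirically from in-control draws, and uses 200,000 replications rather than one million. Function names are illustrative.

```python
import numpy as np

def simulate_v_usual(n, gamma0, rho, reps, rng, delta=1.0, mu=(1.0, 1.0)):
    """Simulate V_U = CV_U / gamma0 when the CV of Y is delta * gamma0."""
    mu_y, mu_x = mu                                      # hypothetical means
    sd_y, sd_x = delta * gamma0 * mu_y, gamma0 * mu_x    # assumed shift model
    cov = [[sd_y**2, rho * sd_y * sd_x],
           [rho * sd_y * sd_x, sd_x**2]]
    data = rng.multivariate_normal([mu_y, mu_x], cov, size=(reps, n))
    y = data[..., 0]
    return y.std(axis=1, ddof=1) / y.mean(axis=1) / gamma0

def power_curve(n, gamma0, rho, deltas, reps=200_000, alpha=0.0027, seed=8):
    """Proportion of plotted statistics outside the probability limits, per shift."""
    rng = np.random.default_rng(seed)
    v0 = simulate_v_usual(n, gamma0, rho, reps, rng)     # in-control draws
    lower, upper = np.quantile(v0, [alpha / 2, 1 - alpha / 2])
    power = {}
    for delta in deltas:
        v1 = simulate_v_usual(n, gamma0, rho, reps, rng, delta=delta)
        power[delta] = float(np.mean((v1 < lower) | (v1 > upper)))
    return power

power = power_curve(n=10, gamma0=0.10, rho=0.9, deltas=[1.0, 1.5, 2.0])
print(power)
```

At $\delta = 1.0$ the estimated power should sit close to the nominal false alarm rate $\alpha$, which is a useful sanity check before comparing charts at larger shifts.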

V. COMPARISON WITH EXISTING CHARTS
In this section, the performance of the auxiliary information based CV charts is compared with some existing CV charts proposed in the SPC literature. This highlights the benefit of using auxiliary information for efficient monitoring of process CV. Specifically, the performance of the proposed charts is compared with the $CV_U$ chart proposed by Kang et al. [3], the $CV_{EWMA}$ chart proposed by Hong et al. [37], the synthetic CV ($CV_{sync}$) control chart proposed by Calzada and Scariano [9], and the improved CV-EWMA ($ICV_{EWMA}$) chart proposed by Park et al. [38]. Average run length (ARL) is used as the performance measure for the comparison of these charts. For a fair comparison, the in-control ARL ($ARL_0$) of all the charts is kept at a fixed level of 370 using $\gamma = 0.1$ and $n = 5, 10, 15$. At a fixed $ARL_0$, the chart with the smallest out-of-control ARL ($ARL_1$) is considered better than the other charts. Table 7 reports the average run length comparison of the proposed CV charts with the $CV_U$, $CV_{EWMA}$ and $ICV_{EWMA}$ charts, whereas Table 8 provides the comparison with the $CV_{sync}$ chart. The comparison reveals the following:
• The proposed $CV_{H_1}$ and $CV_{H_2}$ charts perform better than the existing $CV_U$, $CV_{EWMA}$ and $ICV_{EWMA}$ charts for all choices of $n$ with $\delta \ge 1.2$.
• The proposed $CV_{H_1}$ and $CV_{H_2}$ charts perform better than the $CV_{sync}$ chart at all levels of $n$ and $\delta$.
• The proposed $CV_{Reg_2}$ chart is better than the $CV_{sync}$ chart for $\delta \ge 2$.

The comparisons reveal that the proposed $CV_{H_1}$ and $CV_{H_2}$ charts perform better than all the other competing charts. Moreover, it is to be noted that the proposed charts in this study use the Shewhart structure. EWMA or CUSUM versions of these charts are expected to be even more efficient for the detection of shifts in the process CV.
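For a Shewhart-type chart with independent plotted statistics, the run length is geometric, so the ARL is the reciprocal of the signalling probability. The short sketch below uses that standard relation to show why $ARL_0 = 370$ corresponds to a false alarm rate of about $0.0027$; the function name and the out-of-control power value are illustrative.

```python
def shewhart_arl(signal_probability):
    """ARL of a Shewhart-type chart: reciprocal of the per-sample signal probability."""
    if not 0.0 < signal_probability <= 1.0:
        raise ValueError("signal_probability must lie in (0, 1]")
    return 1.0 / signal_probability

arl0 = shewhart_arl(0.0027)   # in-control: signal probability equals alpha
arl1 = shewhart_arl(0.25)     # hypothetical out-of-control power of 0.25
print(round(arl0, 1), arl1)
```

This relation does not carry over to EWMA or synthetic charts, whose plotted statistics are not independent, which is why their ARL values are usually obtained by simulation or Markov chain methods.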

VI. ILLUSTRATIVE EXAMPLE
In this section, I present an illustrative example using a real data set on air quality to show the application of the auxiliary information based CV charts. The atmosphere is a thick layer of air that exists around the globe. Air pollution directly and indirectly affects the environment through different sources, such as: i) the combustion of fossil fuels, like coal and oil, for electricity and road transport, which produces nitrogen and sulfur dioxide; ii) the release of large amounts of carbon monoxide, hydrocarbons, chemicals and organic compounds into the atmosphere by industries and factories; and iii) emissions from the excessive use of pesticides, insecticides and fertilizers on agricultural land.
As a result, polluted air seriously affects human health. For example, Carbon Monoxide (CO) reduces the amount of hemoglobin in the blood; Nitrogen Oxide affects the respiratory system; Sulfur Dioxide (SO2) creates smog and is a cause of global warming; and Lead (Pb) may cause neurological damage, including nervous system failures, cardiovascular dysfunctions and skin problems. In such situations, it is necessary to monitor and control the pollution levels in air to improve its quality. For the illustration of this study, I consider a real-life Air Quality data set that contains average responses recorded on an hourly basis from an array of 5 metal oxide chemical sensors embedded in an air quality chemical multi-sensor device. Originally, this data set contains 9358 observations, but after discarding missing values (which are tagged with the value −200), I considered 827 valid observations. The data set consists of 13 process variables, out of which I considered the true hourly averaged nitrogen oxides (NO$_x$) concentration (in ppb) as the study variable ($Y$) and the true hourly averaged nitrogen dioxide (NO$_2$) concentration (in µg/m$^3$) as the auxiliary variable ($X$). Further details about the other variables of the data set may be seen in De et al. [39] or on the website https://archive.ics.uci.edu/ml/datasets/Air+quality.
The overall estimates for this data set are taken as the process parameters. I found that $\mu_y = 143.502$, $\mu_x = 100.26$, $\gamma_x = 0.21788$, $\gamma_y = 0.37817$ and $\rho = 0.8574$. Moreover, the 827 observations are distributed into 165 samples of size 5. Based on these process parameters, the control limits for all the $CV_A$ charts are computed so that the false alarm rate is fixed at $\alpha = 0.0027$. From the set of these 165 samples, 80 samples are selected at random to represent the in-control process. As recommended by [3] and [5], the first step should be to check the constancy of the CV and the suitability of the proposed charts. For the illustrative example, I use three CV charts, namely the $CV_U$, $CV_{Reg_3}$ and $CV_{H_2}$ charts. I plotted $\hat{\gamma}_A^{2}$ against the sample means using all three CV estimators, and the plots are provided in Figure 5. All the plots in Figure 5 show that the CV is constant against the means for the three estimators ($CV_U$, $CV_{Reg_3}$ and $CV_{H_2}$). Moreover, to test this formally, the regression results are provided in Table 9. The reported p-values are all greater than 0.05, indicating no significant linear relationship between the CV and the mean levels.
To fix the false alarm rate at $\alpha = 0.0027$ for the three charts, the selected control limits are (0.161, 2.673) for the $CV_U$ chart, (0.202, 4.742) for the $CV_{Reg_3}$ chart and (0.295, 2.429) for the $CV_{H_2}$ chart. Further, to check the detection ability of the three charts, a shift of magnitude $\delta = 2.5$ is introduced in the next 20 randomly selected samples. The plotting statistics for the three charts are reported in Table 10. These statistics are plotted against their respective control limits in Figure 6. The plots show that the $CV_U$ chart detects 5 (at sample points 85-87, 91, 92 and 98), the $CV_{Reg_3}$ chart detects 2 (at sample points 81 and 97), whereas the $CV_{H_2}$ chart detects 10 (at sample points 83, 85-88, 91, 92, 94, 97 and 98) out-of-control signals after the occurrence of the shift. This clearly shows the superiority of the $CV_{H_2}$ chart over the other competing charts. This is also in accordance with the findings of Section IV.
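The mechanics of the example (forming samples of size 5, checking that the CV does not depend on the mean, and flagging points outside the limits) can be sketched as below. This is an illustration on synthetic placeholder data, not the Air Quality data: the subgroups are formed from consecutive observations (an assumption, since the text does not say how observations were allocated), only the usual estimator is used, and the function names are illustrative.

```python
import numpy as np

def subgroup_cv(values, n):
    """Split a series into consecutive subgroups of size n (leftovers dropped)
    and return the subgroup means and usual CV estimates."""
    values = np.asarray(values, dtype=float)
    m = values.size // n
    groups = values[: m * n].reshape(m, n)
    means = groups.mean(axis=1)
    cvs = groups.std(axis=1, ddof=1) / means
    return means, cvs

def slope_t_statistic(x, y):
    """t statistic for a zero slope in the simple linear regression of y on x."""
    r = np.corrcoef(x, y)[0, 1]
    m = len(x)
    return r * np.sqrt((m - 2) / (1.0 - r**2))

def signals(statistic, lower, upper):
    """1-based sample numbers whose plotting statistic falls outside the limits."""
    statistic = np.asarray(statistic, dtype=float)
    return (np.flatnonzero((statistic < lower) | (statistic > upper)) + 1).tolist()

rng = np.random.default_rng(5)
# Synthetic placeholder series with mean and CV close to the reported mu_y, gamma_y
synthetic = rng.normal(143.5, 0.378 * 143.5, size=827)
means, cvs = subgroup_cv(synthetic, n=5)
t_stat = slope_t_statistic(means, cvs)      # compare with a t distribution, m - 2 df
flagged = signals(cvs / 0.378, lower=0.161, upper=2.673)
print(len(means), t_stat, flagged)
```

With 827 observations and samples of size 5, the split yields 165 complete samples and leaves two observations unused, which matches the sample count reported above.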

VII. CONCLUSION
In this study, a set of auxiliary information based control charts is proposed for efficient monitoring of process CV. The auxiliary information is used in regression, ratio and hybrid forms. The performance of these charts is evaluated using power of detection as the performance measure. Further, the performance of the auxiliary information based charts is compared with the $CV_U$, $CV_{EWMA}$, $ICV_{EWMA}$ and $CV_{sync}$ charts. It has been observed that the performance of a wide range of the auxiliary information based CV charts improves with an increase in the correlation between the auxiliary and study variables. The newly proposed charts, based on auxiliary information, are significantly more efficient than the competing charts, particularly for moderate to high correlation levels.
The best performing chart is based on the $H_2$ auxiliary estimator (i.e. the $CV_{H_2}$ chart). This study will help quality practitioners to choose an efficient control chart for the monitoring of process CV. $CV_{EWMA}$ and $CV_{CUSUM}$ charts based on auxiliary information can be investigated as future research directions.