Efficient Homogeneously Weighted Moving Average Chart for Monitoring Process Mean Using an Auxiliary Variable

In this paper, we propose an efficient control chart for monitoring small shifts in a process mean for scenarios where the process variable is observed with a correlated auxiliary variable. The proposed chart, called an auxiliary homogeneously weighted moving average (AHWMA) chart, is a homogeneously weighted moving average type control chart that uses both the process and auxiliary variables in the form of a regression estimator to provide an efficient and unbiased estimate of the mean of the process variable. We provide the design structure of the chart and examine its performance in terms of its run length properties. Using a simulation study, we compare its run length performance with several existing methods for detecting a small shift in the process mean. Our simulation results show that the proposed chart is more efficient in detecting a small shift in the process mean than its competitors. We provide a detailed study of the chart’s robustness to non-normal distributions and show that the chart may also be designed to be less sensitive to non-normality. We give some recommendations on the application of the chart when the process parameters are unknown and provide an example to show the implementation of the proposed new technique.


I. INTRODUCTION
Monitoring programs are designed to detect unnatural changes in process variables for a wide variety of applications, particularly in industrial and manufacturing settings. Control charts are popular tools for tracking processes of interest, ensuring they are kept in control by monitoring essential quality characteristics [1]. To date, several univariate control charts have been proposed in the statistical process control (SPC) literature; they are classified into (i) memoryless control charts and (ii) memory-type control charts, which target large and small-to-moderate shifts in the process, respectively. For example, the Shewhart chart is a memoryless control chart that uses only the current process information and not the past behavior of the process. It is very effective for detecting large shifts in the process mean (i.e., δ ≥ 2, where δ is the size of the shift in standard deviation units [2]). The homogeneously weighted moving average (HWMA) control chart by [3] is a memory-type chart proposed for efficient monitoring of small (i.e., δ ≤ 0.5) to moderate (i.e., 0.5 < δ < 2) shifts in the process mean. Other memory-type charts include the EWMA chart by [4], the CUSUM chart by [5], and the mixed EWMA-CUSUM chart proposed by [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhiwu Li.
These univariate classical charts are widely used in most of today's industries; their attractiveness is motivated by the simplicity of their construction, implementation, and interpretation, as well as their prompt detection of small, moderate, or large shifts in a process mean. These techniques have been implemented by [7] to monitor the quality of garments produced on the sewing floor, by [8] to monitor and control steam boiler generation for vacuum degassing processes, and by [9] to evaluate critical control point hygiene data. Also, see [10]-[13] for some other industrial applications of these classical charts.
Several applications of classical charts focus on monitoring the process in situations where the process variable is independent of other variables; however, in some cases, the process variable may be observed along with another correlated auxiliary variable. The concept of using supplemental information to provide an efficient estimate of a population parameter is popular in the field of survey sampling [14].
VOLUME 7, 2019. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Several researchers have studied and recommended the use of auxiliary variables in the monitoring of a process variable of interest, and have proposed a variety of control chart tools for this purpose. For example, [15] proposed a regression control chart, while [16] proposed a cause-selecting control chart. Recently, [17] proposed a Shewhart-type chart in the form of a regression-based estimator, called a V_r chart, for monitoring process variability. The author compared the proposed V_r chart with some other existing charts for the same purpose (specifically, the R, S and S^2 charts), and showed that the V_r chart was effective in detecting moderate to large shifts in the process variability under certain conditions on the correlation between the process variable and auxiliary variable. Similarly, a Shewhart-type control chart using a regression-based estimator (the M_r chart) for monitoring a process mean, proposed by [18], was shown to be more powerful at detecting shifts in the process mean. This work was later extended to an EWMA chart for detecting small-to-moderate changes in the process mean under different correlation structures between the process and auxiliary variables (see [19]-[24]).
Here, we propose a more efficient control chart for monitoring the process mean when the process variable is observed along with an auxiliary variable. The proposed chart, called an auxiliary homogeneously weighted moving average (AHWMA) chart, is an HWMA-type control chart that uses both the deviation of the process mean from its target value (known a priori or estimated from historical reference samples) and a regression estimator for the process mean provided through its relationship (or estimated relationship) with an auxiliary variable with which it is known to be correlated.
The rest of the article is organized as follows. In Section II, we outline the structure of the chart. Section III compares the run length performance of the AHWMA chart in detecting a small shift in the process mean with that of several existing charts. Section IV gives a detailed study of the chart's robustness to non-normality. Section V gives recommendations regarding the application of the chart when the process parameters are unknown. Section VI provides an example to demonstrate practical implementation of the chart, followed by a conclusion and discussion in Section VII.

II. THE AHWMA CONTROL CHART
Consider a control chart based on observations $z_{ij}$ of the quality characteristic $Z_{ij}$, for each of i = 1, ..., m time-points and j = 1, ..., n sampling units per time-point (i.e., n is the sample size). Assume that the $Z_{ij}$, which represent the main process variable, are identically distributed as normal random variables with a known in-control mean ($\mu_Z$) and standard deviation ($\sigma_Z$), i.e., $Z_{ij} \sim N(\mu_Z, \sigma_Z^2)$. The HWMA statistic, $H_i$, at time-point i gives a specific weight to the current sample and distributes the remaining weight equally among the previous samples. It is given by:
$$H_i = w\,\bar{z}_i + (1 - w)\,\bar{\bar{z}}_{i-1}, \tag{1}$$
where $\bar{z}_i$ is the sample average for the ith sample, and w is a smoothing constant (also called the sensitivity parameter) selected such that 0 < w ≤ 1. The HWMA structure reduces to the Shewhart plotting structure whenever w = 1. The quantity $\bar{\bar{z}}_{i-1}$ is the average of the sample means of all of the previous samples (i.e., up to and including the (i − 1)th sample), and is given by $\bar{\bar{z}}_{i-1} = \frac{1}{i-1}\sum_{k=1}^{i-1}\bar{z}_k$. The mean and variance of the HWMA statistic in Equation (1) are given by $\mu_H = \mu_Z$ and
$$\sigma_H^2 = \begin{cases} \dfrac{w^2\sigma_Z^2}{n}, & i = 1, \\[2ex] \dfrac{w^2\sigma_Z^2}{n} + \dfrac{(1-w)^2\sigma_Z^2}{n(i-1)}, & i > 1, \end{cases} \tag{2}$$
where $\mu_Z$ and $\sigma_Z^2$ are the mean and variance of the normally distributed random variable Z [3].
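To make the structure of Equation (1) concrete, the following Python sketch computes the HWMA statistic for a sequence of sample means, together with its variance. It is an illustration rather than the authors' implementation. The function names are hypothetical, and the convention that the "previous average" equals the in-control mean at i = 1 is an assumption (it follows the usual HWMA setup and is not stated explicitly above).

```python
from typing import List


def hwma_statistics(sample_means: List[float], w: float, mu0: float) -> List[float]:
    """Return H_1, ..., H_m for a sequence of sample means.

    Assumption: for i = 1 the average of previous sample means is taken
    to be the in-control mean mu0.
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must satisfy 0 < w <= 1")
    stats = []
    running_sum = 0.0  # sum of the sample means seen so far
    for i, zbar in enumerate(sample_means, start=1):
        prev_avg = mu0 if i == 1 else running_sum / (i - 1)
        stats.append(w * zbar + (1.0 - w) * prev_avg)
        running_sum += zbar
    return stats


def hwma_variance(i: int, w: float, sigma: float, n: int) -> float:
    """Variance of H_i for independent samples of size n."""
    base = w * w * sigma * sigma / n
    if i == 1:
        return base
    return base + (1.0 - w) ** 2 * sigma * sigma / (n * (i - 1))
```

With w = 1 the function returns the sample means unchanged, which matches the remark that the HWMA structure reduces to the Shewhart structure in that case.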
Let an auxiliary variable, $Y_{ij}$, be correlated with the main variable of interest, $Z_{ij}$, with correlation ρ. We assume that $Z_{ij}$ and $Y_{ij}$ are observed in pairs from a bivariate normal distribution, i.e., $(Z, Y) \sim N_2(\mu_Z, \mu_Y, \sigma_Z^2, \sigma_Y^2, \rho)$, where $N_2$ is the bivariate normal distribution, and $\mu_Y$ and $\sigma_Y^2$ are the population mean and variance of Y, respectively. We assume the linear relationship between the variables can be modeled using linear least squares, and we adjust the sample mean of the process variable at time i, $\bar{z}_i$, to reflect its known relationship with the auxiliary variable. This yields the regression-informed estimator (i.e., $R_i$) for the process mean, given by:
$$R_i = \bar{z}_i + b\,(\mu_Y - \bar{y}_i), \tag{3}$$
where $\bar{y}_i$ is the sample average of the auxiliary variable for the ith sample, and $b = \rho\,\sigma_Z/\sigma_Y$ is the slope of the regression line, i.e., the change in the process variable, Z, due to a unit change in the auxiliary variable, Y [14]. The mean and variance of R are given as:
$$\mu_R = \mu_Z, \qquad \sigma_R^2 = \frac{\sigma_Z^2}{n}\,(1 - \rho^2), \tag{4}$$
respectively.
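The mean and variance of the regression estimator can be checked with a few lines of variance algebra. The sketch below assumes the usual regression-estimator form $R_i = \bar{z}_i + b(\mu_Y - \bar{y}_i)$ with $b = \rho\sigma_Z/\sigma_Y$, and independent pairs within a sample, so that $\operatorname{Var}(\bar{z}_i) = \sigma_Z^2/n$, $\operatorname{Var}(\bar{y}_i) = \sigma_Y^2/n$ and $\operatorname{Cov}(\bar{z}_i, \bar{y}_i) = \rho\sigma_Z\sigma_Y/n$.

```latex
\begin{aligned}
E(R_i) &= \mu_Z + b\,(\mu_Y - \mu_Y) = \mu_Z, \\
\operatorname{Var}(R_i)
  &= \operatorname{Var}(\bar{z}_i) + b^2\,\operatorname{Var}(\bar{y}_i)
     - 2b\,\operatorname{Cov}(\bar{z}_i, \bar{y}_i) \\
  &= \frac{\sigma_Z^2}{n}
     + \frac{\rho^2\sigma_Z^2}{\sigma_Y^2}\cdot\frac{\sigma_Y^2}{n}
     - 2\,\frac{\rho\,\sigma_Z}{\sigma_Y}\cdot\frac{\rho\,\sigma_Z\sigma_Y}{n} \\
  &= \frac{\sigma_Z^2}{n}\,\bigl(1 - \rho^2\bigr).
\end{aligned}
```

The factor $1 - \rho^2$ shows why the estimator is never less precise than the plain sample mean, and why the gain grows with the strength of the correlation.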
Using Equation (3), the plotting statistic ($T_i$) of the AHWMA chart is given as:
$$T_i = w\,R_i + (1 - w)\,\bar{R}_{i-1}, \tag{5}$$
where w is the smoothing parameter of the chart (selected such that 0 < w ≤ 1), $R_i$ is the regression-informed estimate of the process mean given in Equation (3) for the ith sample, and $\bar{R}_{i-1}$ is the average of the regression-informed estimates of all of the previous samples (i.e., up to and including the (i − 1)th sample), given as $\bar{R}_{i-1} = \frac{1}{i-1}\sum_{k=1}^{i-1} R_k$. The mean and variance of the plotting statistic in Equation (5) are given as $\mu_T = \mu_Z$ (also called the center line of the AHWMA chart), and
$$\sigma_{T_i}^2 = \begin{cases} w^2\,\dfrac{\sigma_Z^2}{n}\,(1-\rho^2), & i = 1, \\[2ex] \left[w^2 + \dfrac{(1-w)^2}{i-1}\right]\dfrac{\sigma_Z^2}{n}\,(1-\rho^2), & i > 1, \end{cases} \tag{6}$$
respectively. The time-varying lower ($L_i$) and upper ($U_i$) control limits of the plotting statistic given in Equation (5) are given as:
$$L_i = \mu_Z - C\sqrt{\sigma_{T_i}^2}, \tag{7}$$
and
$$U_i = \mu_Z + C\sqrt{\sigma_{T_i}^2}, \tag{8}$$
respectively, where C determines the width of the control limits; the values of C and w are chosen to achieve a desired in-control average run length (ARL) for the chart. The ARL is the average number of samples plotted on the control chart before it signals. We provide R-code [25] (in the supplementary material) which practitioners can use to obtain the value of C, given w, that fixes the in-control ARL of the chart at a desired value. We adopted the ARL numerics algorithm for the EWMA chart [26], implemented in the spc (R) package [27], to obtain an arbitrary start value ($C_{start}$) of the AHWMA chart limit, and used a binary search algorithm to determine C for the chart.
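As an illustration of how the plotting statistic and its time-varying limits fit together, the following Python sketch computes $(T_i, L_i, U_i)$ for a sequence of paired sample means when all parameters are known. It is a minimal sketch, not the supplementary R-code: the function name is hypothetical, the regression estimator is taken in the standard form $\bar{z}_i + b(\mu_Y - \bar{y}_i)$, and the previous average at i = 1 is assumed to equal $\mu_Z$.

```python
import math
from typing import List, Sequence, Tuple


def ahwma_chart(zbars: Sequence[float], ybars: Sequence[float], w: float, C: float,
                mu_z: float, mu_y: float, sigma_z: float, sigma_y: float,
                rho: float, n: int = 1) -> List[Tuple[float, float, float]]:
    """Return a list of (T_i, L_i, U_i) tuples for known process parameters."""
    b = rho * sigma_z / sigma_y                  # regression slope
    var_r = sigma_z ** 2 * (1.0 - rho ** 2) / n  # variance of the regression estimator
    rows = []
    running_sum = 0.0                            # sum of R_1, ..., R_{i-1}
    for i, (zbar, ybar) in enumerate(zip(zbars, ybars), start=1):
        r_i = zbar + b * (mu_y - ybar)
        # Assumption: the previous average is mu_z when no earlier sample exists.
        prev_avg = mu_z if i == 1 else running_sum / (i - 1)
        t_i = w * r_i + (1.0 - w) * prev_avg
        if i == 1:
            var_t = w * w * var_r
        else:
            var_t = (w * w + (1.0 - w) ** 2 / (i - 1)) * var_r
        half_width = C * math.sqrt(var_t)
        rows.append((t_i, mu_z - half_width, mu_z + half_width))
        running_sum += r_i
    return rows
```

A point signals when $T_i$ falls outside $[L_i, U_i]$. Because the variance term depends on i, the limits are recomputed at every time-point rather than fixed once.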

III. PERFORMANCE ASSESSMENTS AND COMPARISONS
A. PERFORMANCE ASSESSMENTS
Here, we provide a comprehensive assessment of the AHWMA chart in detecting a shift in the process mean in terms of the chart's ARL and standard deviation of run length (SDRL). The value of the ARL when a process is in control is denoted by ARL_0, while ARL_1 denotes the value of the ARL when the process is out of control. The SDRL measures the variation of the run length distribution for a given value of shift. Similarly, SDRL_0 and SDRL_1 are defined as the SDRL for the in-control and out-of-control process, respectively. When comparing two charts, the ARL_0 is fixed to a specific value, and a chart having a smaller value of ARL_1 than another is said to be more efficient in detecting the shift in the process [28]-[31].
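The run length quantities defined above can be expressed compactly in code. The Python sketch below is illustrative only; the function names are hypothetical. It extracts the run length from a sequence of (statistic, lower limit, upper limit) triples and summarizes a collection of simulated run lengths by their mean (ARL) and sample standard deviation (SDRL).

```python
import statistics
from typing import Iterable, Optional, Sequence, Tuple


def run_length(points: Iterable[Tuple[float, float, float]]) -> Optional[int]:
    """1-based index of the first statistic outside its limits, or None if no signal."""
    for i, (t, lower, upper) in enumerate(points, start=1):
        if t < lower or t > upper:
            return i
    return None


def arl_sdrl(run_lengths: Sequence[int]) -> Tuple[float, float]:
    """Average run length and (sample) standard deviation of run length."""
    return statistics.fmean(run_lengths), statistics.stdev(run_lengths)
```

Run lengths generated with no shift give estimates of ARL_0 and SDRL_0; run lengths generated under a shift give ARL_1 and SDRL_1.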
To ensure a fair comparison of the AHWMA chart with existing charts of the same ARL_0, we examined the performance of the chart with w ∈ {0.03, 0.05, 0.10, 0.25, 0.5, 0.75}, using the corresponding values of C that fix ARL_0 at 500. The R-code provided in the supplementary material finds the value of C (for each value of w) that fixes ARL_0 at 500. We examined the ARL performance of the chart under different correlation values between the process and the auxiliary variables. Specifically, we considered ρ ∈ {0.05, 0.25, 0.5, 0.75, 0.95}. The ARL values of the AHWMA chart are given in Tables 1-5. In these tables, δ is the size of the shift, calculated as
$$\delta = \frac{\sqrt{n}\,\lvert \mu_1 - \mu_Z \rvert}{\sigma_Z}, \tag{9}$$
where n is the sample size at each time i (here, we assume n = 1 across i), and $\mu_Z$ and $\mu_1$ are the in-control and out-of-control means, respectively. The main findings for the AHWMA chart (cf. Tables 1-5) are:
• For fixed values of δ and ρ, the chart is more efficient for smaller values of w. For example, when ρ = 0.05 (Table 1) and δ = 0.5, the ARL_1 values for w = 0.03 and w = 0.75 were 20.05 and 132.08, respectively. Thus, the chart detects a shift in the process mean faster when a small value of w is used.
• For fixed values of δ, w and C, the chart is more efficient when large values of ρ are used. For example, when w = 0.03, C = 2.272, and δ = 0.5, the ARL_1 values were 20.05 and 3.43 (in Tables 1 and 5) for ρ = 0.05 and ρ = 0.95, respectively. Thus, an increase in the correlation between the process variable and the auxiliary variable leads to an increase in the chart's ability to detect a shift.
• The chart is ARL unbiased. That is, the ARL_1 values never exceed the corresponding ARL_0 for any choice of δ examined.
• As δ increases, the ARL_1 and SDRL_1 values approach 1 and 0, respectively, especially for large values of ρ; that is, the chart detects large shifts promptly.
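The kind of simulation behind these findings can be sketched in Python as follows. This is a small Monte Carlo illustration under simplifying assumptions, not the code used to produce Tables 1-5: it fixes $\mu_Z = \mu_Y = 0$, $\sigma_Z = \sigma_Y = 1$ and n = 1 (so the shifted mean equals δ), takes the previous average at i = 1 to be $\mu_Z$, and uses hypothetical function names. Far more replications than in a quick check would be needed to reproduce tabulated values.

```python
import math
import random
import statistics
from typing import Tuple


def ahwma_run_length(w: float, C: float, rho: float, delta: float,
                     rng: random.Random, max_len: int = 10_000) -> int:
    """Simulate one AHWMA run length for standardized bivariate normal data."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    var_r = 1.0 - rho ** 2            # regression-estimator variance (sigma_Z = 1, n = 1)
    running_sum = 0.0
    for i in range(1, max_len + 1):
        y = rng.gauss(0.0, 1.0)
        z = delta + rho * y + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        r_i = z + rho * (0.0 - y)     # slope b = rho because sigma_Z = sigma_Y = 1
        prev_avg = 0.0 if i == 1 else running_sum / (i - 1)
        t_i = w * r_i + (1.0 - w) * prev_avg
        if i == 1:
            var_t = w * w * var_r
        else:
            var_t = (w * w + (1.0 - w) ** 2 / (i - 1)) * var_r
        if abs(t_i) > C * math.sqrt(var_t):
            return i
        running_sum += r_i
    return max_len                    # censored: no signal within max_len samples


def simulate_arl(w: float, C: float, rho: float, delta: float,
                 reps: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimates of (ARL, SDRL) from `reps` simulated run lengths."""
    rng = random.Random(seed)
    lengths = [ahwma_run_length(w, C, rho, delta, rng) for _ in range(reps)]
    return statistics.fmean(lengths), statistics.stdev(lengths)
```

Calling `simulate_arl(0.03, 2.272, 0.5, 0.25)` and `simulate_arl(0.03, 2.272, 0.5, 2.0)` should show the pattern described in the last bullet: the estimated ARL_1 shrinks as δ grows. Setting `delta=0.0` gives an estimate of ARL_0, which could be wrapped in a search over C to calibrate the chart.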

B. COMPARISONS
We provide detailed comparisons of the proposed AHWMA chart with some existing control charts in terms of their ARL values: the classical HWMA chart by [3], the classical EWMA chart by [4], the classical CUSUM chart by [5], the auxiliary-based EWMA chart (i.e., M_X EWMA) by [19], and the auxiliary-based CUSUM chart (i.e., AuxCUSUM_2) by [22]. The auxiliary-based EWMA and CUSUM charts are also based on a regression estimator; they provide efficient applications of the classical EWMA and CUSUM charts, respectively, in those situations where the process variable is observed along with an auxiliary variable. For comparison with the M_X EWMA and AuxCUSUM_2 charts, we considered three different values of ρ, namely ρ ∈ {0.05, 0.5, 0.95}. In all cases, the charts' parameters were set to values that fix ARL_0 at 500. We provide the charts' ARL results that optimized δ at w ∈ {0.05, 0.1, 0.2}. The results of the comparisons are provided in Table 6. As shown in the table, the AHWMA chart outperformed the classical CUSUM, EWMA and HWMA charts in detecting shifts in the mean, especially when ρ > 0.05. For fixed values of w and ρ, the AHWMA chart was more efficient than the AuxCUSUM_2 chart, especially for small-to-moderate values of δ (i.e., δ < 2). For fixed values of w and ρ, the AHWMA chart was less efficient than the M_X EWMA chart in detecting moderate-to-large shifts (i.e., δ > 0.5) in the process mean. However, the AHWMA chart showed greater efficiency than the M_X EWMA chart in detecting small shifts (i.e., δ ≤ 0.5) in the mean.

IV. ROBUSTNESS TO NON-NORMALITY OF THE CHART
The AHWMA chart described in Section II relies on the assumption that the process variable and the auxiliary variable are bivariate normally distributed. In practice, this assumption does not always hold. Non-normality is not a major concern with a large sample size because the central limit theorem ensures that the sample mean will be approximately normally distributed for any continuous variables [32]. When n = 1, however, it is important to check the sensitivity of control charts to departures from normality [2]. We refer readers to [33]-[36] for detailed studies on the robustness of the EWMA control chart to non-normality.
Here, we investigate the robustness of the AHWMA chart to non-normality. As mentioned by [33], "a control chart is robust if its in-control run-length distribution remains stable (unchanged or nearly unchanged) when the underlying distributional assumption(s) (e.g. normality) are violated". Following previous investigators [33]-[37], we considered a heavy-tailed bivariate distribution (the bivariate Student's t-distribution), and a skewed distribution (the bivariate gamma distribution). We denote the bivariate t-distribution with v degrees of freedom by $t_2(v)$. The probability density
TABLE 3. ARL and SDRL values of the AHWMA chart when the correlation between the variables is ρ = 0.5. The values of C are chosen to fix the chart's ARL_0 at 500 for each chosen value of w.

TABLE 4. ARL and SDRL values of the AHWMA chart when the correlation between the variables is ρ = 0.75. The values of C are chosen to fix the chart's ARL_0 at 500 for each chosen value of w.
The ARL_0 results in Tables 7 and 8 are summarized below:
• For a fixed value of w, the ARL_0 values under the bivariate t-distribution are the same for all the correlation values (i.e., ρ_ZY = 0.25, 0.5, or 0.95) examined. This result is due to the symmetry of the t-distribution.
• However, for a fixed value of w, the ARL_0 values under the bivariate gamma distribution differ across the values of ρ_ZY examined. Here, the chart appears to be more robust to non-normality only for smaller values of ρ_ZY.
• For both non-normal distributions, as expected, the ARL_0 value increases and tends to converge to the nominal ARL_0 of the AHWMA chart for large degrees of freedom (v) or large values of the shape parameter (i.e., α ≥ 50), especially when w = 0.03 or 0.05 is used.
• Importantly, the chart's ARL_0 value is robust to non-normality only when a small value of w (i.e., w = 0.03 or 0.05) is used. This implies that small values of w are useful when the underlying distribution is not normal.
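For readers who wish to repeat the heavy-tailed part of this robustness study, the Python sketch below draws correlated pairs from a bivariate t-distribution using the standard construction of a bivariate normal vector divided by the square root of an independent scaled chi-square variable. This is an assumption about how such data can be generated, not a description of the authors' code. The function name is hypothetical, and the draws are not rescaled to unit variance (each component has variance v/(v − 2) for v > 2), which may differ from the scaling used for Tables 7 and 8.

```python
import math
import random
from typing import Tuple


def bivariate_t_sample(v: float, rho: float, rng: random.Random) -> Tuple[float, float]:
    """One (Z, Y) draw from a bivariate t with v degrees of freedom.

    Construction: (X1, X2) bivariate standard normal with correlation rho,
    divided by sqrt(W / v) where W is chi-square with v degrees of freedom.
    """
    x1 = rng.gauss(0.0, 1.0)
    x2 = rho * x1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(v / 2.0, 2.0)   # chi-square(v) as Gamma(shape v/2, scale 2)
    scale = math.sqrt(v / chi2)
    return x1 * scale, x2 * scale
```

Feeding such pairs into the chart with limits calibrated under normality, and recording the in-control run lengths, gives the kind of ARL_0 comparison summarized above. As v grows the draws approach the bivariate normal case.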
Table 9 displays the ARL_1 values for the AHWMA chart under bivariate normal, t and gamma distributions for various values of δ when w = 0.03 or 0.75, and ρ_ZY = 0.25. The results in Table 9 indicate that the chart's ARL_1 values tend to approach the values obtained for bivariate normal data when a smaller value of w (i.e., w = 0.03) is used. For example, when w = 0.03, v = 50, β = 50, and δ = 0.5, the ARL_1 values for the AHWMA chart were 19.02 (normal distribution), 19.65 (t-distribution), and 19.18 (gamma distribution). The percentage deviations of the ARL_1 values obtained under the t and gamma distributions from the ARL_1 value obtained under the normal distribution are 3.31% and 0.84%, respectively. On the other hand, when w is large (i.e., w = 0.75) and the other parameters are unchanged (i.e., v = 50, β = 50, and δ = 0.5), the ARL_1 values for the AHWMA chart under the normal, t and gamma distributions were 125.44, 118.63, and 68.10, respectively; the percentage deviations of these ARL_1 values from the value obtained under the normal distribution were −5.43% and −45.71% for the t and gamma distributions, respectively.
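The percentage deviations can be recomputed directly from the ARL_1 values quoted in this paragraph. The short Python sketch below does so; the function name is illustrative, and the deviation is taken relative to the value obtained under the normal distribution.

```python
def percent_deviation(arl: float, arl_normal: float) -> float:
    """Percentage deviation of an ARL value from the normal-theory ARL value."""
    return 100.0 * (arl - arl_normal) / arl_normal


# ARL_1 values quoted in the text for delta = 0.5, v = 50, beta = 50.
small_w = {"t": percent_deviation(19.65, 19.02), "gamma": percent_deviation(19.18, 19.02)}
large_w = {"t": percent_deviation(118.63, 125.44), "gamma": percent_deviation(68.10, 125.44)}
```

Rounded to two decimals, the deviations for w = 0.03 stay within a few percent, whereas for w = 0.75 the gamma case deviates by more than 45 percent, which is the basis for recommending small w under non-normality.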

V. STEP-BY-STEP ALGORITHM FOR CONSTRUCTING THE AHWMA CHART WHEN PARAMETERS ARE UNKNOWN
The AHWMA chart in Section II was formulated based on the assumption that the parameters associated with the process variable and auxiliary variable are all known. However, these parameters are generally unknown in practice and need to be estimated. In this case, the regression estimator in Equation (3) is based on estimated parameters, and is given as:
$$\hat{R}_i = \bar{Z}_i + \hat{b}\,(\hat{\mu}_Y - \bar{Y}_i), \tag{11}$$
where $\hat{b}$ is the estimated slope of the regression line, i.e., the estimated change in the process variable Z due to a unit change in the auxiliary variable Y [14], and $\hat{\mu}_Y = \frac{1}{m}\sum_{i=1}^{m}\bar{Y}_i$ is the unbiased estimate of the mean of the auxiliary variable (i.e., $\mu_Y$). The estimated mean and variance of R are given as $\hat{\mu}_R = \hat{\mu}_Z$ and $\hat{\sigma}_R^2 = \frac{\hat{\sigma}_Z^2}{n}(1 - r^2)$, where r is the estimated correlation between the variables, and $\hat{\mu}_Z$ and $\hat{\sigma}_Z^2$ are the unbiased estimates of $\mu_Z$ and $\sigma_Z^2$, respectively. Both $\hat{\mu}_Z$ and $\hat{\sigma}_Z^2$ are calculated from a specified set of sample values measured when the process was known to be in control: $\hat{\mu}_Z = \frac{1}{m}\sum_{i=1}^{m}\bar{Z}_i$, and $\hat{\sigma}_Z^2$ is obtained from the same samples with the aid of an unbiasing constant [3], [40].
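Using Equation (11), the plotting statistic for the AHWMA control chart based on estimated parameters is given as:
$$\hat{T}_i = w\,\hat{R}_i + (1 - w)\,\bar{\hat{R}}_{i-1}, \tag{12}$$
where $\bar{\hat{R}}_{i-1}$ is the average of the estimated regression estimators of all previous samples. The estimated mean and variance of the plotting statistic in Equation (12) are given by $\hat{\mu}_T = \hat{\mu}_Z$ and
$$\hat{\sigma}_{T_i}^2 = \begin{cases} w^2\,\dfrac{\hat{\sigma}_Z^2}{n}\,(1-r^2), & i = 1, \\[2ex] \left[w^2 + \dfrac{(1-w)^2}{i-1}\right]\dfrac{\hat{\sigma}_Z^2}{n}\,(1-r^2), & i > 1. \end{cases} \tag{13}$$
The estimated time-varying lower and upper control limits for the plotting statistic given in Equation (12) are:
$$\hat{L}_i = \hat{\mu}_Z - C\sqrt{\hat{\sigma}_{T_i}^2}, \tag{14}$$
$$\hat{U}_i = \hat{\mu}_Z + C\sqrt{\hat{\sigma}_{T_i}^2}, \tag{15}$$
where C determines the width of the estimated control limits. Also, the estimated center line (CL) of the AHWMA chart is given by:
$$\widehat{CL} = \hat{\mu}_Z. \tag{16}$$
When the chart is based on estimated parameters, implementation occurs in two phases. In phase I (the retrospective phase), a historical reference sample is studied to establish the in-control state and to evaluate the stability of the process [41], [42]. Once the in-control reference sample is characterized, the process parameters are estimated from phase I and control chart limits are obtained for use in phase II. Phase II involves regular monitoring of the process.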
If successive observed values obtained at the beginning of phase II fall within the in-control limits calculated from phase I, the process is considered to be in control. In contrast, any observed values during phase II that fall outside the control limits indicate that the process may be out of control, and remedial responses are then required [43], [44]. A shift in a process parameter needs to be detected quickly so that corrective actions can be taken as early as possible.
We give below a step-by-step algorithm to implement the chart in phase I and phase II [3], [45]. In phase I, the process parameters are estimated from the in-control reference sample (step 3); these estimates are used to set the control limits in phase II.
• Phase II
4. At each time-point i, draw (or, in a simulation study, simulate) a bivariate sample (Z, Y) of size n from the process.
5. Compute the regression estimator in Equation (11), and use this to compute the chart's plotting statistic, $\hat{T}_i$, in Equation (12).
6. Use the estimated parameters from phase I (from step 3 above) to construct the estimated control limits given in Equations (14)-(15). Compare $\hat{T}_i$ against these control limits.
7. If $\hat{T}_i$ falls within the control limits, the process is considered to be in control. Alternatively, if $\hat{T}_i$ falls outside the control limits, the process is declared to be in an out-of-control state.
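The two-phase procedure can be sketched in Python as follows for individual observations (n = 1). This is a simplified illustration, not the authors' algorithm: the function names are hypothetical, the standard deviations are plain sample standard deviations (the unbiasing constant mentioned in the text is omitted as an assumption), the slope is estimated as r times the ratio of sample standard deviations, and the previous average at i = 1 is taken to be the estimated in-control mean.

```python
import math
import statistics
from typing import Dict, Optional, Sequence


def phase1_estimates(z: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """Estimate means, standard deviation, correlation and slope from in-control data."""
    mu_z, mu_y = statistics.fmean(z), statistics.fmean(y)
    s_z, s_y = statistics.stdev(z), statistics.stdev(y)
    cov = sum((a - mu_z) * (b - mu_y) for a, b in zip(z, y)) / (len(z) - 1)
    r = cov / (s_z * s_y)
    return {"mu_z": mu_z, "mu_y": mu_y, "s_z": s_z, "r": r, "b": r * s_z / s_y}


def phase2_first_signal(z: Sequence[float], y: Sequence[float],
                        est: Dict[str, float], w: float, C: float) -> Optional[int]:
    """Return the 1-based index of the first out-of-control signal, or None."""
    var_r = est["s_z"] ** 2 * (1.0 - est["r"] ** 2)   # n = 1
    running_sum = 0.0
    for i, (zi, yi) in enumerate(zip(z, y), start=1):
        r_i = zi + est["b"] * (est["mu_y"] - yi)      # estimated regression estimator
        prev_avg = est["mu_z"] if i == 1 else running_sum / (i - 1)
        t_i = w * r_i + (1.0 - w) * prev_avg
        if i == 1:
            var_t = w * w * var_r
        else:
            var_t = (w * w + (1.0 - w) ** 2 / (i - 1)) * var_r
        if abs(t_i - est["mu_z"]) > C * math.sqrt(var_t):
            return i
        running_sum += r_i
    return None
```

In use, `phase1_estimates` would be run once on the reference sample, and `phase2_first_signal` would be applied to the monitoring data with the chosen w and C.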

VI. ILLUSTRATIVE EXAMPLE
In this section, we provide an example to illustrate the implementation of the AHWMA chart, using the simulated dataset provided in [19]. The data were obtained by simulating m = 20 bivariate samples, each of size n The values of the parameters used for the simulation were: Y = 1, ρ = 0.5, and δ = 0.5, where δ is the size of the shift applied to the in-control mean, $\mu_Z$, of the process variable of interest, and ρ is the correlation between the process variable and the auxiliary variable. The bivariate dataset is given in the first two columns of Table 10. We examined the ability of the AHWMA chart to detect a shift in the process variable and compared this to the M_X EWMA, the classical EWMA and the HWMA charts. In all cases, the chart parameters, w and C, were chosen to fix ARL_0 at 500. The parameters for the classical EWMA and M_X EWMA charts were w = 0.03 and C = 2.483 (see Table 6). For the classical HWMA and AHWMA charts, we used w = 0.03 and C = 2.272 (see Table 6). We give the calculations for the AHWMA chart in Table 10, and the results for all the control charts are shown graphically in Figure 1.
The AHWMA chart detected the shift in the process mean faster than any of the other methods. In particular, it detected the shift after the 14th sample, whereas the M_X EWMA chart detected the shift after the 15th sample, and the classical EWMA and HWMA charts both detected the shift only after the 18th sample.

VII. CONCLUSION AND DISCUSSION
We have proposed a new, efficient control chart for monitoring small shifts in the process mean when the process variable of interest is correlated with, and observed alongside, an auxiliary variable. Based on the homogeneously weighted moving average, the proposed chart uses both the process and auxiliary variables to form a regression estimator that yields an efficient and unbiased estimate of the mean of the process variable. We provided the design structure of the chart and examined its performance in terms of its run length properties. Our simulation results showed that the chart detects a shift in the process mean more rapidly than other methods. Also, the ARL comparisons showed that the proposed chart is more efficient than existing control charts used for the same purpose, especially when interest lies in detecting a small shift in the process mean. We provided a detailed study of the chart's robustness to non-normality. The chart's ARL values showed that the chart is more robust to non-normality when a smaller value of w is used. In particular, when a small value is chosen for the chart's smoothing parameter (for example, w ≤ 0.05), the proposed chart can be designed to have an in-control ARL that is reasonably close to the ARL for the chart under a normally distributed process. We gave some recommendations on the application of the chart when the process parameters are unknown, and provided a step-by-step algorithm to construct the chart for phase I and phase II of SPC. Also, we applied the chart to a simulated dataset and showed that it detected a small shift in the process mean faster than other examined charts, including the EWMA, HWMA, M_X EWMA, and AuxCUSUM_2 methods. We consider that the effect of estimating parameters during phase I of the process on subsequent performance of the AHWMA chart warrants further study.

TABLE 1. ARL and SDRL values of the AHWMA chart when the correlation between the variables is ρ = 0.05. The values of C are chosen to fix the chart's ARL_0 at 500 for each chosen value of w.
TABLE 2. ARL and SDRL values of the AHWMA chart when the correlation between the variables is ρ = 0.25. The values of C are chosen to fix the chart's ARL_0 at 500 for each chosen value of w.

TABLE 5. ARL and SDRL values of the AHWMA chart when the correlation between the variables is ρ = 0.95. The values of C are chosen to fix the chart's ARL_0 at 500 for each chosen value of w.

MARTI J. ANDERSON is currently a Distinguished Professor with the New Zealand Institute for Advanced Study (NZIAS), Massey University, Auckland, New Zealand. Her current research interests include multivariate analysis, community ecology, biodiversity, ecological monitoring, experimental design, and resampling methods.

RIDWAN ADEYEMI SANUSI received the B.Sc. degree in statistics from the University of Ibadan, Ibadan, Nigeria, and the M.Sc. degree in applied statistics from the King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. He is currently pursuing the Ph.D. degree with the Department of Systems Engineering and Engineering Management, City University of Hong Kong, Hong Kong. His research interests include statistical process monitoring and applied statistics.

MATTHEW D. M. PAWLEY is currently a Senior Lecturer in statistics with the School of Natural and Computational Sciences, Massey University, Auckland, New Zealand. His research interests include high-dimensional data monitoring, statistical process control, data mining and analytics, and applied statistics.

TABLE 6. ARL comparisons of the charts.

TABLE 9. ARL_1 with bivariate t and gamma distributions.

TABLE 10. Calculation of the AHWMA chart statistic and its limits.