High-Speed Monitoring of Multidimensional Processes Using Bayesian Updates

The advent of modern data acquisition and computing techniques has enabled high-speed monitoring of high-dimensional processes. The short sampling interval makes the samples temporally correlated, even if there is no underlying autocorrelation among covariates. In this study, we introduce a new process monitoring scheme in a Bayesian framework. The key strategy of this study is to incorporate sequential observations into the estimation procedure for the parameters of interest to update the prior distribution. Based on the updated prior, we obtain the most appropriate estimation of the process parameters at each sampling epoch by maximizing the posterior probability. In addition, conventional statistical process control and monitoring methodologies suffer from the “curse of dimensionality.” The closed form of the estimate developed in this study through Bayesian updates enables the proposed method to be effective for high-dimensional process monitoring. Various simulation studies demonstrate the superiority of the proposed scheme in the high-speed monitoring of high-dimensional processes. Moreover, a few sample paths of the estimated mean in a procedure of the proposed method are illustrated to provide practitioners with insights into the monitoring and control of the process. Finally, we provide a real-life application to illustrate the proposed method.

INDEX TERMS Autocorrelated process, Bayesian update, high-dimensional process, process mean monitoring, statistical process control.

I. INTRODUCTION
17 Statistical process control (SPC) and monitoring techniques 18 have been widely used to detect process changes by mon- 19 itoring quality characteristics or process parameters such 20 as mean and covariance. When there are multiple quality 21 characteristics, a standard approach is to consider simulta-22 neous monitoring of the mean by taking correlation into 23 consideration in a chart statistic. Hotelling's T 2 , multivari- 24 ate exponentially weighted moving average (MEWMA), and 25 multivariate CUSUM (MCUSUM) charts are good examples 26 of simultaneous monitoring charts used for multivariate SPC 27 (MSPC) [1], [2]. The advent of modern data acquisition 28 The associate editor coordinating the review of this manuscript and approving it for publication was Yu Liu . and computing techniques has enabled high-speed moni- 29 toring of high-dimensional processes. Although the tradi- 30 tional MSPC methods can be applied to monitoring relatively 31 high-dimensional processes, they mostly suffer from high-32 dimensional settings; this phenomenon is called the ''curse of 33 dimensionality'' [3], [4]. 34 Accordingly, many charts intended for high-dimensional 35 processes have been developed. Principal component analysis 36 (PCA)-based approaches are well suited for monitoring high-37 dimensional processes by focusing on only a few principal 38 components (PCs) [5], [6], [7]. Although they enjoy com-39 putational efficiency, the chart performance may deteriorate 40 depending on the shift direction, that is, they are directionally 41 variant [8]. In addition, because the principal component is 42 a linear combination of all variables, it is not expected to 43 slightly over time after the shift occurs, which is not neces-100 sarily a sudden jump. For example, in a milling machining 101 process, the milling tool starts oscillating gradually as it wears 102 out, resulting in an oscillating pattern of the out-of-control 103 signal [8]. Another example can be found in chemical pro-104 cesses. In the literature of the Tennessee Eastman Industrial 105 Challenge Problem (TE problem [13]), it is shown that most 106 of the measurements show insignificant autocorrelation when 107 the process is operating in normal mode. However, the sys-108 tem shows severe dynamic behavior when disturbances are 109 added [14]. Figure 1 illustrates the in-control data pattern of 110 one of the variables in the TE process, reactor coolant temper-111 ature, and Figures 2 and 3 show the out-of-control situation 112 after the disturbances are added. Figures 1-3 demonstrate that 113 the mean of the process in the out-of-control state is patterned 114 dynamically by time, while the process has a stable constant 115 mean in a normal operating state. The proposed method considers the probability distribution 117 of the process mean µ t in a Bayesian framework. In particu-118 lar, µ t can be observed on a stochastic basis and determined 119 through the path µ t−1 , µ t−2 , . . . , µ 1 , that is, the Bayesian 120 theorem updates the posterior distribution of µ t sequentially. 121 The rationale behind the underlying dynamics is that the 122 current process mean is essentially associated with previous 123 means in a high-speed monitoring environment. This suf-124 fices considering the possible autocorrelation in a high-speed 125 sampling process, and therefore, the model would explain 126 the out-of-control dynamics more appropriately, although the 127 in-control process mean is assumed to be constant over time. 128 In addition, we focus on the detection of the process mean 129 change, while the process variability remains unchanged.

130
Another main challenge is to overcome the computa-131 tional issue in multi-dimensional processes without dimen-132 sion reduction, which requires extensive computational 133 efforts as the dimension (p) increases. The recently devel-134 oped VS-based methods that adopt penalized likelihood 135 demand a large amount of computation as p increases. 136 VOLUME 10, 2022  propriate to be incorporated into process monitoring as a 154 pre-diagnosis procedure, especially in high-speed monitoring 155 processes.

156
The proposed method enjoys the computational benefit 157 in that the process mean at a sampling point, t, that is, 158μ t is obtained theoretically as a closed form via Bayesian 159 updates. The closed form of the estimate of the mean 160 makes process monitoring very effective when samples are 161 taken frequently in multi-dimensional processes, compared to 162 VS-based methods, which demand heavy computation. One 163 may argue that VS-based methods provide diagnostic infor-164 mation about faulty variables, although they are computa-165 tionally inefficient. Indeed, the proposed method does not 166 explicitly identify faulty variables, unlike VS-based meth-167 ods. However, the method provides meaningful probabilities 168 indicating which variables can potentially be out of control. 169 This would be attractive for practitioners to anticipate how 170 the process proceeds at the next sampling point.

171
The Bayesian framework has paid attention and been 172 widely used in statistical control of processes where the 173 use of Bayesian methods takes the advantage of the process 174 knowledge incorporated into the methodology through the 175 prior distributions. Many of them focus on the optimality of 176 the chart design to determine control policy [17], [18], [19], 177 [20], [21]. In addition, the Bayesian theory has also been 178 used to determine the time to signal of fault [22]. However, 179 quite a few Bayesian methods have been studied in process 180 monitoring, most of which are univariate process monitoring 181 with theoretical and practical progress [23], [24], [25], [26], 182 [27], [28]. Reference [29] mentioned that few researchers 183 have developed the extension to the multivariate setting. 184 Reference [30] developed the change point estimation proce-185 dure in Bayesian context given that an out-of-control signal 186 was raised. Reference [31] developed a Bayesian hierarchical 187 model to determine the means and directions of the shifts 188 with a given suspected out-of-control sample. Reference [32] 189 applied dynamic Bayesian estimation to detect machine fail-190 ures in a discrete multivariate situation. Recently, [33] applied 191 Bayesian sequential update when the assignable causes are 192 sparse. This paper proposes a Bayesian method to monitor 193 the process mean in multivariate online processes and shows 194 the advantage of using updated prior distribution in high-195 dimensional process monitoring.

196
The proposed method provides simplicity in using 197 Bayesian updates in process monitoring and opens further 198 investigation in statistical process monitoring. The novelty of 199 the work can be summarized into two major aspects. First, 200 the updated parameters at every sampling point in a high-201 speed sampling process enabled the monitoring statistic to be 202 more efficient than conventional control charts. Few studies 203 have been conducted in this circumstance. Although [33] 204 applied the Bayesian sequential update in high-dimensional 205 processes, it assumed sparsity in the out-of-control pro-206 cess. The proposed model is developed without sparsity 207 assumption, thereby generalizing [33]. Second, the proposed 208 method is computationally inexpensive. The closed-form 209 estimate developed in later sections enables effective in 210 high-dimensional process monitoring in contrast to many 211 previous studies that are computationally expensive and hard 212 to apply in practice.

213
The remainder of this paper is organized as follows. In the 214 next section, we describe our proposed method based on 215 a Bayesian update from the perspective of process moni-216 toring and control. In Section 3, we compare the proposed 217 through various simulations. In Section 4, we discuss the pos-219 terior probability distribution of the process mean and present 220 a predictive analysis that would be useful in monitoring and 221 diagnosing faults. Section 5 illustrates the proposed method 222 with a real-life example, followed by concluding remarks and 223 a discussion of future research in Section 6.
where Z t is a basis matrix to convert the unobservable mean to 236 a measurable observation, and ε t represents the randomness 237 that has zero mean with a covariance . Because the variabil-238 ity of the process measurements is represented by ε t statisti-239 cally, we assume that the variability remains unchanged over  high-speed monitoring process as follows: where the matrices T t and R t are the parameters used to 246 model the autocorrelation and moving errors [34]. η t is 247 the randomization term in the state of the mean and fol-248 lows N (0, Q). In the context of SPC, we consider the ran-249 domness to be serially independent and independent of ε t . 250 We also assume that the errors are identically distributed 251 over the sampling time, which makes sense to describe the 252 complete randomness of the in-control process. Some litera-253 ture on monitoring autocorrelated measurements attempts to 254 estimate the autocorrelation parameters first [33], [34], [35],

255
[36], [37], [38], [39] or at least assumes the autocorrelation 256 parameters to be known, that is, the matrices T and R are 257 estimated or given. Throughout the paper, we assume Marko-258 vian state dynamics with the identity matrices for T and R to 259 generalize the model under the assumption that the process 260 parameter is unknown and non-measurable. 261 We now incorporate the Bayesian probability into the Using the Bayesian rule for conditional 271 distribution, we can obtain the probability as can be obtained recursively as where P µ t , . . . , µ 1 can also be obtained similarly as using a Markovian property. By plugging (5) into (4), 283 we obtain the joint probability as Then, the posterior distribution in (3) can be derived as When the observations are independent and are only deter-292 mined by the current mean µ t , (7) can be written as The result from (8) provides important findings for the 296 Bayesian framework. The first term P x t | µ t is the likeli-297 hood probability, similar to the classical decomposition of the 298 Bayesian posterior distribution. However, the second term is 299 not a prior probability, P(µ), but a prior probability given 300 previous means. This is a significant finding that can be 301 interpreted as an updated prior distribution. By letting P(µ) 302 be a probability density function of µ as a prior distribution, 303 it can be seen as a conditional distribution given all previous 304 means by incorporating Markovian state dynamics as and the distribution is P µ 0 at time t = 0.

307
In fact, this can be observed in Gaussian Kalman update 308 as well. Let the location and scale parameters be θ and K, 309 respectively. As shown in [33] and [34], these two parameters 310 VOLUME 10, 2022 are determined recursively at every sampling epoch. More 311 importantly with the Gaussian error distribution, not only the 312 location parameter but also the scale parameter is updated 313 clearly as a closed form (see [34] for detailed derivation 314 of θ t and K t ). 315 Thus, we can conclude that the posterior probability is 316 proportional to the likelihood and updated prior probabilities 317 in the Bayesian context. 318 Now, we obtain the most probable process mean,μ t by 319 maximizing the posterior probability (MAP) as With the Gaussian errors for η and ε, the estimate is identical We set the predictive mean 327 to be determined using the previous estimate. Then, the mini-328 mization problem in (10) can be solved analytically by setting 329 the derivative with respect to x t equal to zero, that is, The solution of (10) has the following recursive form: which is the same solution obtained from a Gaussian Kalman 334 filter. It is interesting to note that the Kalman filter attempts 335 to find the most likely cause of the measurement given the 336 approximation made by a flawed estimation and has a solu-337 tion identical to that of the Bayesian approach [33], [34], [41].

355
The propagated estimate of the covariance matrix P t|t H t ] and also has a solution with 357 a recursive form: In (12), the exact covariance matrix is used to monitor 362 the process. However, with the assumption of stable vari-363 ability over time in a steady-state process, it is reasonable to 364 assume that the covariance of the mean is also stable. Thus, 365 we replace P t|t with an asymptotic covariance P ∞ , given 366 below: The equations (13) and (14) are called the discrete-370 time algebraic Riccati equation, which can be numerically 371 solved-this can be done by using the idare function in 372 MATLAB version R2019b or later. Finally, the chart triggers 373 an alarm when Q = μ t − µ 0 where 374 c is a predetermined threshold associated with a signifi-375 cance level and is obtained through Monte Carlo simulations. 376 We name our proposed chart the Bayesian SPC (BSPC). In this section, we report various simulations to demonstrate 381 the superiority of the proposed chart. The performance of 382 the control charts is commonly measured by the average run 383 length (ARL), which is the time to detect the out-of-control 384 process. The ARLs in the in-control and out-of-control states, 385 denoted as ARL0 and ARL1, respectively, are calculated as 386 follows:

III. PERFORMANCE ANALYSIS THROUGH NUMERICAL
. 388 We set ARL 0 to 200, which is equivalent to the significance 389 level of the test being set as 0.005. The basis matrix Z t is 390 considered to be an identity matrix in that the measurement x t 391 is observed directly from the mean µ t . One of the utilizations 392 of the basis matrix is to determine it as a square root of , 393 which is an orthogonal transformation of the measurement x t . 394 This will be useful when a practitioner considers the principal 395 components to monitor the process. The parameter matrices 396 T t and R t are also set as identity matrices, as stated in the 397 previous section. Lastly, the covariance matrices of errors η 398 and ε are assumed to be identical. In fact, the error distri-399 butions are not necessarily identical. Specifically, the covari-400 ance of the mean can be in any form based on engineering 401 knowledge or the structure of the process dynamics. However, 402 in this study, we let Q ≡ to represent that the variability 403 of the measurement is identical to that of the underlying 404 mean. Moreover, we consider the covariance matrix as 405 a correlation matrix without loss of generality and assume 406 ρ x ij , x kl = r √ (i−k) 2 +(j−l) 2 for i, k = 1, 2, . . . , p X and 409 j, l = 1, 2, . . . , p Y , where p X and p Y represent the hori-410 zontal and vertical dimensions for one frame of spatial data 411 information. Thus, p X × p Y = p, and 0 < r < 1 [8], 412 [42], [43], [44]. We consider a strong correlation by setting 413 r = 0.9.

414
In the first simulation, we compared BSPC with a tra-415 ditional Hotelling T 2 . The shift size δ was considered 416 through a noncentrality parameter, that is, . Without loss of generality, we assume µ 0 = 0. Typically, ARL 1 increases as the dimension increases, and  When a small shift size is expected in a process, practition-447 ers may consider EWMA for the measurement instead of x t , 448 that is, an EWMA vector z t = (1 − λ) z t−1 + λx t [2], [45]. 449 A CUSUM may also be the one of candidates for detecting 450 the small change in the process, but herein, we only apply 451 EWMA for simplicity. Based on the asymptotic Gaussian dis-452 tribution ofμ t , we can derive the posterior distribution in the 453 manner described in Section 2. Then, the monitoring statistic 454 becomes Q z = μ z,t − µ z,0 T P −1 z,∞ μ z,t − µ z,0 , where the 455 subscript z represents EWMA. The EWMA version of BSPC 456 is referred to as multivariate Bayesian EWMA (BEWMA) 457 throughout the paper. For the comparison models, we con-458 sider MEWMA and VSMEWMA, which are the EWMA 459 versions of T 2 and VSMSPC. The shift δ is determined as 460 an additive shift, and we only consider δ = 0.2, 0.4, 0.6, 0.8, 461 and 1 because EWMA-based control charts focus on small 462 shift sizes. The EWMA parameter λ is set to 0.2. In addi-463 tion, we consider the steady-state process in which the shift 464 occurs after the process is stabilized. To implement the 465 steady-state process, we generated the out-of-control signals 466 after 100 samplings and ignored any false alarms during the 467 100 samplings.

468
Tables 3 and 4 show the results of two different out-of-469 control scenarios in terms of sparsity. In that VSMEWMA 470 pursues high-dimensional process monitoring with sparsity, 471 we set the same sparsity as that in the previous experiment, 472 that is, p 0 = 2 for Table 3. In the Table 4 scenario, we set 473 p 0 = 8, which is less sparse than the value used for Table 3. 474 For both experiments, we set s = 2, which is the overall 475 best parameter for VSMEWMA. Table 3 presents the same 476 pattern of results as in Table 2. The proposed BEWMA out-477 performs MEWMA for all shift sizes. In extreme cases such 478 as δ = 0.2, BEWMA outperforms VSMEWMA because the 479 performance of VS-based charts is expected to decrease when 480 the shift size is small owing to the possible misidentification 481 of the potentially changed variables. However, as the shift 482 size increases, BEWMA performs similarly to VSMEWMA 483 and starts comparatively underperforming. In Table 4, with a 484 less sparse setup, we can see that the proposed BEWMA out-485 performs MEWMA and VSMEWMA in all shift scenarios. 486 In addition to the case with p 0 = 8, we can conjecture that 487 the performance of VSMEWMA deteriorates as p 0 increases 488 because the chart selects only two variables (s = 2) and loses 489 too much information about the process, whereas BEWMA 490 still performs well because the chart incorporates the proba-491 bilities of all variables into the monitoring process.

492
In the next experiments, we attempted various out-of-493 control scenarios with varying shift directions. This is worth 494 checking because most of the high-dimensional aiming charts 495 that have been recently proposed are directionally vari-496 ant, that is, the performance differs according to the shift 497 direction. Recently, [8] proposed a ridge-regularization-based 498 chart, showed the performance in various shift directions, 499 and calculated the relative mean index (RMI) to measure the 500 VOLUME 10, 2022   performance, which is given as where N is the number of out-of-control scenarios, better performance. The experiment was conducted under 516 the condition of p = 10, and a moderate size of selec-517 tion for VSMSPC was chosen, resulting in the overall best 518 performance under the scenarios we consider. In Table 5, 519 δ i represents the change in the ith variable; for example, 520 δ 1 = δ 3 = 0.25 means the shift occurs in the first and 521 the third variables with a shift amount of 0.25. In addi-522 tion, the bold values represent the best of the considered 523 charts. Table 5 shows that BSPC outperforms VSMSPC, 524 RMSPC, and T 2 in most of the scenarios. Accordingly, 525 BSPC shows the minimum RMI, which demonstrates the 526 superiority of BSPC among the charts under various shift 527 directions.

528
One of the advantages of the proposed chart is that BSPC is 529 directionally invariant. With the Gaussian error distributions 530 for the measurement and the state (i.e., mean),μ follows a 531 Gaussian distribution asymptotically with the covariance P ∞ . 532 Thus, the chart statistic measures the Mahalanobis distance 533 TABLE 6. Directional invariance of BSPC (p = 10, µ 1 0 = 2, 5, 10 for Cases 1, 2, and 3).
normal conditions in different ways, as discussed in Section 1. 555 In this section, we consider several patterns of out-of-control 556 signals, including a sudden jump, sigmoid, oscillating, and 557 multiple jumps. Moreover, we attempt to interpret the chart 558 from a different angle rather than focusing only on the out-of-559 control ARL. The idea of the proposed method is to maximize 560 the posterior probability at every sampling point based on 561 the rationale that the estimation with the past means and 562  where T is the sampling period and µ pling period T , we set T = 70 for illustration, and we 573 assume that the process is in-control for the first 20 samples. 574 The experiment was conducted with p = 25, and the shift 575 occurred for five variables, including the first variable. Let 576 µ i (t) for i = 1, . . . , 5 be the out-of-control signal of the 577 first variable at sampling time t. The following out-of-control 578 signals are considered.
In addition to the MSE, we also mark the first detection in  Table 7 presents the MSE and ARL1 for the four cases consid-596 ered, and Figures 3, 4, 5, and 6 illustrate sample paths for each 597 pattern, where we plot only the first variable. In the figures, 598 the solid and bold lines represent the true mean (µ 1,1 (t)) and 599 the estimated mean through Bayesian update (BSPC,μ 1,t ), 600 respectively. A dotted line with an asterisk represents the 601 observation, that is, T 2 as the measurement x t is the best 602 estimate at each sampling time.

603
From Figures 3-6, it is clear that the estimated mean 604 through Bayesian update tracks the true mean well even 605 after the process is changed, while the observations fluctuate 606 significantly. Accordingly, the results of MSE and ARL 1 607 in Table 7 demonstrate the superior performance of the 608 proposed chart to that of the conventional T 2 chart, both 609 in tracking the mean and in detecting the change in the 610 process. 611 VOLUME 10, 2022 FIGURE 7. Continuous stirred tank heater [41].

612
In this section, we apply the proposed chart to an industrial 613 application that was introduced by [46]. In a tank heater, 614 hot and cold water were mixed inside the tank, and the 615 mixed water was heated using a heating coil, as shown in pipe, and the outlet flow rate could be adjusted using the 625 valve on the outlet pipe. In addition, the steam coil in the 626 tank had a control valve on the steam line to adjust the 627 heat.

628
The main aim of the continuous stirred tank heater is to 629 maintain the same temperature in the tank and the outflow 630 while maintaining the volume of the water in the tank at the 631 desired value. Thus, it is reasonable to monitor the tempera-632 ture in the tank, outflow, and volume as quality characteris-633 tics. The tank is equipped with level and outlet flow sensors 634 that measure the water level in the tank (i.e., volume, m 3 /s), 635 temperature ( • C), and water flow rates (mA). The measure-636 ments were obtained using a differential pressure instrument 637

642
There is a controller that adjusts the setting of the level 643 measurement subsystem and cold-water valve position, called 644 the proportional integral controller (PI). The process con-645 tinues operating as normal while the PI functions prop-646 erly and goes out to control if there is any damage or 647 VOLUME 10, 2022

667
In this study, we apply a Bayesian approach to the monitoring 668 of high-dimensional and high-speed monitoring processes.

669
The main contributions of the proposed method can be sum-  In addition, the out-of-control pattern in many manufactur-679 ing and service processes is unknown, and the process may   In addition to SPC perspectives, the chart provides insight 690 by estimating the process mean sequentially. Moreover, the 691 directionally invariant property of the proposed chart guar-692 antees simplicity under various situations with correlated 693 covariates.

694
Through various simulation studies and a real-life applica-695 tion, we demonstrate the superiority of the proposed BSPC 696 in high-dimensional and high-speed monitoring processes. 697 Moreover, a simple extension by replacing the measurement 698 with an EWMA-transformed measurement would also be 699 attractive when the process change is expected to be small 700 based on engineering knowledge and experience. Although 701 the proposed method generally performs well under various 702 process scenarios, it can be tuned more appropriately using 703 engineering knowledge. For example, the state transition 704 matrix, autocorrelation parameters, and moving errors in a 705 specific process can be properly determined, which paves the 706 way for future research.

707
One of the key points in the BSPC procedure is updating 708 the covariance of the mean. Thus, future research should 709 be conducted to monitor process variability as well as the 710 mean. A large body of literature deals with variability mon-711 itoring [48], [49], [50], [51], where the estimation of the 712 covariance matrix is the key to the methods. In addition, the 713 estimation procedures are mostly computationally expensive, 714 whereas the propagation of the covariance in the proposed 715 charting procedure is obtained relatively simply as a recursive 716 form, as shown in (13). Thus, it will be interesting to develop a 717 chart to monitor process variability using the updated covari-718 ance matrix.

719
Another interesting research topic is to consider any spe-720 cial settings in high-dimensional processes. Although BSPC 721 generally performs well in most settings, including sparsity 722 cases, the method can narrow a case down to a specific 723 high-dimensional process with expected sparse changes. For 724 example, the error distribution of the process mean can be a 725 sparsity-advocated distribution rather than a Gaussian distri-726 bution as shown in [33], so that the distribution would repre-727 sent the sparse change in the mean more appropriately than 728 the generalized BSPC. In this case, however, updating the 729 mean and covariance may not be computationally easy, mak-730 ing it challenging to monitor high-dimensional processes. 731 Thus, [33] developed an algorithmic procedure to estimate the 732 sparse mean and propagated covariance matrix in such a case. 733 As such, in other special settings, the error distribution can be 734 specifically determined to make the methodology suitable to 735 the process setting.

736
Although the proposed method applies the Bayesian 737 update to estimate the quality characteristics, other popular 738 algorithms such as meta-heuristics, e.g., data envelopment 739 analysis, non-dominated sorting genetic algorithm, grey wolf 740 optimizer, and so on can also be used to obtain the optimal 741 parameters with given in-control and out-of-control circum-742 stances. Such optimizers are expected to be beneficial in 743 quality control and monitoring aspects, e.g., cost of sampling, 744 run length, and lessening false alarms.