Uncertain Box-Cox Regression Analysis With Rescaled Least Squares Estimation

Under the uncertain statistical framework of Liu [19], there is still no effective fitting method for uncertain linear models whose response variables undergo a Box-Cox transformation. For example, for the transformation parameter $\lambda$, the uncertain least squares estimation produces a severely low estimate, with the estimated $\lambda$ diverging to $-\infty$. In this paper, we propose uncertain Box-Cox regression analysis, utilizing uncertainty theory to model the imprecise data and applying a generalized Box-Cox transformation indexed by its parameter so that the classic regression assumptions hold. We use rescaled least squares to estimate the unknown parameters, provide an estimate of the noise, and perform residual analysis for these uncertain Box-Cox regression models. We also give forecast values and confidence intervals and use numerical examples to demonstrate our methodology. Our work sets a unified framework for the Box-Cox transformation in uncertain regression and extends such regression from linear to nonlinear cases, taking the Johnson-Schumacher growth model as an example.


I. INTRODUCTION
Regression analysis is a useful tool for studying how a response variable is affected by other explanatory variables, and this mathematical model has become prevalent in all scientific branches. The essential part of the model lies in its prescribed assumptions, specifically that the response variable, also called the dependent variable, depends linearly on the explanatory variables, also called independent variables or predictors. Such linear regression models are often helpful both in estimating the underlying relationships among these variables and in predicting the response given new values of the predictors.
Classical statistical inference mainly comprises estimation of unknown quantities and significance tests of hypotheses. The most seminal estimation methods are least squares, by Legendre [14] and Gauss [12], and maximum likelihood, by Wilks [30]. However, data collected from the real world are often not suitable for the direct application of statistical models because of their inherent imprecision and fuzziness [19]. To handle such imprecise or fuzzy observations, people at first turned to fuzzy set theory. Influential literature on fuzzy sets in this respect includes Tanaka et al. [28], Corral and Gil [4], [5], Sakawa and Yano [26], and Casals et al. [2], [3], to name a few. (The associate editor coordinating the review of this manuscript and approving it for publication was Saeid Nahavandi.)
An important improvement over fuzzy variables was the uncertainty theory proposed by Liu [18], which specifically targets uncertain-data problems. Similar to the classic probability framework, uncertainty theory views imprecise observations as uncertain variables, which follow various uncertainty distributions (Liu [15]). Uncertain statistics is a new axiomatic system established by Liu [19], which can be used to analyze expert reliability data and other data types not suitable for classical statistical inference methods. Data envelopment analysis (DEA) is a field where many applied advances of uncertainty theory have happened. For instance, Wen et al. [29] proposed some new ranking criteria and the uncertain DEA model, Lio and Liu [21] considered the case of imprecise inputs and outputs, and Nejad and Ghaffari-Hadigheh [22] further refined the model so that the decision-making units could be evaluated efficiently. Important progress in uncertain statistical inference started from Yao [32]. Yao and Liu [33] then extended the principle of least squares and thus proposed uncertain regression analysis, and Lio and Liu [20] developed residual analysis as well as prediction for these regression models. Classic linear regression requires the normality assumption, which is often violated by real data; the Box-Cox transformation was therefore recommended by Box and Cox [1] as a pre-treatment and has since become a popular and standard method in statistical inference. For imprecise observations, Fang and Hong [9] adopted uncertain regression analysis and considered the three most frequently used transformations of the dependent variable. We notice that these three transformations, namely the logarithmic, square root, and reciprocal transformations, can all be regarded as special cases of the Box-Cox transformation; thus, in this article, we generalize their results to any Box-Cox transformation indexed by its parameter.
Furthermore, we point out that the Box-Cox transformation can also be applied to some nonlinear models.
To be specific, we propose generalized uncertain Box-Cox regression analysis to model the quantitative relationship between uncertain dependent variables after a Box-Cox transformation and uncertain independent variables with imprecise observations, which relaxes the assumptions required by classic regression analysis. Since the Box-Cox transformation is applicable to both linear models and nonlinear models, such as the Johnson-Schumacher growth model, which was first noted by Fang et al. [10], we also show the estimation of the relevant noises and describe how to perform residual analysis. In addition, the estimation of forecast values and confidence intervals is explained in detail.
This paper on uncertain observed data is organized as follows: the generalized uncertain Box-Cox regression models and the uncertain Box-Cox Johnson-Schumacher growth models are defined in Section 2, followed by the rescaled least squares approach to the estimation problems. Residual analysis is handled in Section 3. The corresponding forecast values and confidence intervals are constructed in Section 4. In Section 5, we simulate relevant numerical examples to show how our theories and methods work. Finally, we make some concluding remarks in Section 6.

II. UNCERTAIN BOX-COX REGRESSION MODELS
In this section, the uncertain Box-Cox regression model is established. The biggest difference between uncertainty analysis and classical statistics lies in the definition of independence between variables; for a brief introduction to uncertainty theory, including imprecise observations, uncertain variables, uncertainty distributions, etc., see Chapter 2 in [19]. The vector $(x_1, x_2, \cdots, x_q)$ consists of the uncertain explanatory variables, and the scalar $y$ is the uncertain response variable. Following the traditional assumptions, we use a function $h$ to characterize how $y$ depends on $(x_1, x_2, \cdots, x_q)$; the uncertain Box-Cox regression model can then be defined by
$$L(y; \lambda) = h(x_1, x_2, \cdots, x_q; \beta) + \epsilon, \tag{1}$$
where
$$L(y; \lambda) = \begin{cases} \dfrac{y^\lambda - 1}{\lambda}, & \lambda \neq 0, \\ \log y, & \lambda = 0, \end{cases}$$
is the Box-Cox transformation with respect to the transformation parameter $\lambda$. Here $h$ is a given function, the vector $\beta$ contains the regression coefficients, and $\epsilon$ represents the white noise, i.e., noise with zero mean and finite variance, as in the classic regression setting, except that the variables are uncertain variables instead of random variables. When $h$ is linear, this model is the uncertain Box-Cox linear regression model. With a transformation applied to the response $y$, this model offers a more flexible parametrization of the relationship between the response and the covariates than the uncertain regression model proposed by Lio and Liu [20]. Fang and Hong [9] considered the special case of model (1) where $\lambda$ is predetermined, such as $\lambda = -1$, $0$, or $0.5$. For imprecise observations $(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}, \tilde{y}_j)$, $j = 1, 2, \cdots, n$, assuming that $\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}, \tilde{y}_j$ are uncertain variables following uncertainty distributions $\Phi_{j1}, \Phi_{j2}, \cdots, \Phi_{jq}, \Psi_j$, $j = 1, 2, \cdots, n$, respectively, they proposed minimizing
$$L_n(\beta, \lambda) = \sum_{j=1}^n E\Big[\big(L(\tilde{y}_j; \lambda) - h(\tilde{x}_{j1}, \cdots, \tilde{x}_{jq}; \beta)\big)^2\Big], \tag{2}$$
generalizing the least squares estimation (LSE) of Yao and Liu [33]. However, when $\lambda$ is unknown, the LSE easily fails. For instance, in the uncertain Box-Cox linear regression model (1),
$$L_n(0, \lambda) = \sum_{j=1}^n E\big[L(\tilde{y}_j; \lambda)^2\big] \to 0 \quad \text{as } \lambda \to -\infty \tag{3}$$
for $\beta = 0$, provided that $\tilde{y}_j > 1$ for all $j$, since if $y > 1$, then $L(y; \lambda) \to 0$ as $\lambda \to -\infty$.
In other words, the infimum of $L_n(\beta, \lambda)$ is attained at $(\beta^*, \lambda^*) = (0, -\infty)$, which is not a reasonable estimate.
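This degeneracy is easy to check numerically. The following sketch uses crisp numbers standing in for the uncertain observations (an illustrative simplification, not the expectation-based objective itself) and evaluates the unrescaled objective at $\beta = 0$:

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transformation L(y; lambda)."""
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam

# Crisp stand-ins for observations with y_j > 1.
y = np.array([1.5, 2.0, 3.0, 5.0])

# With beta = 0, the unrescaled objective is sum_j L(y_j; lam)^2,
# which shrinks toward zero as lam -> -infinity, so the LSE of
# lambda degenerates to -infinity.
for lam in (-1.0, -10.0, -100.0):
    print(lam, np.sum(box_cox(y, lam)**2))
```

The printed objective values decrease monotonically as $\lambda$ moves toward $-\infty$, matching the limiting behavior (3).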
Powell [25] pointed out that a similar problem occurs when a generalized method-of-moments (GMM) estimator is applied to the Box-Cox regression model in the statistical framework. Therefore, a rescaled method-of-moments estimation was introduced there for the Box-Cox regression model, replacing the objective function $S_n(\beta, \lambda)$ with $Q_n(\beta, \lambda) = S_n(\beta, \lambda)/\dot{y}^{2\lambda}$, where
$$\dot{y} = \exp\Big(\frac{1}{n}\sum_{j=1}^n \log|y_j|\Big) \tag{4}$$
is the geometric mean of the absolute values of the observations $y_j$, $j = 1, 2, \cdots, n$, in the statistical sense ($Q_n$ and $S_n$ are defined in Powell [25]). Similarly, in the framework of uncertainty theory, we consider rescaling $L_n(\beta, \lambda)$ in (2) to get the new objective function
$$R_n(\beta, \lambda) = \frac{L_n(\beta, \lambda)}{\exp\Big(\dfrac{2\lambda}{n}\sum_{j=1}^n E[\log \tilde{y}_j]\Big)}, \tag{5}$$
whose denominator results from replacing $\log|y_j|$ in (4) by the uncertain observations, taking the logarithm and then the expectation. Hereby, we propose the rescaled least squares estimation (RLSE) of the parameters $(\beta, \lambda)$ in the uncertain Box-Cox regression model (1) as
$$(\beta^*, \lambda^*) = \arg\min_{\beta, \lambda} R_n(\beta, \lambda), \tag{6}$$
or equivalently,
$$(\beta^*, \lambda^*) = \arg\min_{\beta, \lambda} \log R_n(\beta, \lambda). \tag{7}$$
The fitted model is then denoted by $L(y; \lambda^*) = h(x_1, x_2, \cdots, x_q; \beta^*)$. The expectations in (5) can be computed using the inverse uncertainty distributions of the data, with the help of Theorem 2.15 in [17]. The following Theorem 1 reveals the explicit form of the log-objective function $\log R_n(\beta, \lambda)$ in terms of the inverse uncertainty distributions for two special models corresponding to different choices of $h(x_1, x_2, \cdots, x_q; \beta)$, i.e., the linear model and the Johnson-Schumacher growth model. Note that we assume $\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}, \tilde{y}_j$ are independent uncertain variables for every $j = 1, \cdots, n$ in the following theorems.
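For intuition, here is a minimal sketch of the rescaling idea on crisp stand-in data (the uncertain case replaces the sample geometric mean by the expectation-based denominator in (5); the data and the simple linear $h$ are illustrative assumptions):

```python
import numpy as np

def box_cox(y, lam):
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam

def rescaled_obj(y, x, beta, lam):
    """Crisp-data analogue of the rescaled objective: divide the squared-error
    sum by the geometric mean of the observations raised to the power 2*lambda."""
    resid = box_cox(y, lam) - (beta[0] + beta[1] * x)
    gmean = np.exp(np.mean(np.log(y)))
    return np.sum(resid**2) / gmean**(2 * lam)

y = np.array([1.5, 2.0, 3.0, 5.0])
x = np.array([1.0, 2.0, 3.0, 4.0])
# At beta = 0 the rescaled objective blows up rather than vanishing
# as lam -> -infinity, so the degenerate minimiser is ruled out.
for lam in (-1.0, -10.0, -100.0):
    print(lam, rescaled_obj(y, x, np.zeros(2), lam))
```

Because all $y_j > 1$, the geometric mean exceeds one and the denominator decays exponentially in $\lambda$, which penalizes the degenerate limit $\lambda \to -\infty$.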
Theorem 1: For imprecise observations $(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}, \tilde{y}_j)$, $j = 1, 2, \cdots, n$, where $\tilde{x}_{j1}, \cdots, \tilde{x}_{jq}, \tilde{y}_j$ are independent uncertain variables with regular uncertainty distributions $\Phi_{j1}, \cdots, \Phi_{jq}, \Psi_j$, $j = 1, 2, \cdots, n$, respectively, the RLSE of $\beta_0, \beta_1, \cdots, \beta_q, \lambda$ in the uncertain Box-Cox linear regression model is the solution of the minimization (7), written explicitly in terms of the inverse uncertainty distributions $\Phi_{jk}^{-1}$ and $\Psi_j^{-1}$.
Proof: The RLSE of $\beta_0, \beta_1, \cdots, \beta_q, \lambda$ in the Box-Cox linear model can be viewed as the solution of the optimization problem (7). For any given $j$, it follows from Theorem 2.15 in [17] that the inverse uncertainty distribution of the residual term $L(\tilde{y}_j; \lambda) - h(\tilde{x}_{j1}, \cdots, \tilde{x}_{jq}; \beta)$ admits an explicit form. Then, from equations (2.163) and (2.206) in [17], the expectations can be written as integrals with respect to $\alpha$ over $(0, 1)$. Hence, the minimization formula (7) becomes the stated explicit form, and the theorem is proved.
Theorem 2: For imprecise observations $(\tilde{x}_j, \tilde{y}_j)$, $j = 1, 2, \cdots, n$, where $\tilde{x}_j, \tilde{y}_j$ are independent uncertain variables with regular uncertainty distributions $\Phi_j, \Psi_j$, $j = 1, 2, \cdots, n$, respectively, the RLSE of $\beta_0, \beta_1, \beta_2, \lambda$ in the uncertain Box-Cox Johnson-Schumacher growth model is the solution of the optimization formula (9), where $\epsilon$ is the white noise.
Proof: The RLSE of $\beta_0, \beta_1, \beta_2, \lambda$ in the uncertain Box-Cox Johnson-Schumacher growth model can be viewed as the solution of the optimization formula (9). For any given $j$, it follows from Theorem 2.15 in [17] that the inverse uncertainty distribution of the residual term admits an explicit form. Then, from equations (2.163) and (2.206) in [17], the expectations can be written as integrals with respect to $\alpha$ over $(0, 1)$, so (9) is equivalent to the stated explicit form. Thus the theorem is proved.
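Both proofs reduce expectations of functions of uncertain variables to one-dimensional integrals of inverse uncertainty distributions over $\alpha \in (0, 1)$. A minimal numerical sketch of this device, using a linear uncertain variable $\mathcal{L}(a, b)$, whose inverse distribution is $a + \alpha(b - a)$, as an illustrative example:

```python
import numpy as np

def expect_from_inverse(inv_dist, n=100_000):
    """E[xi] = integral over (0, 1) of inv_dist(alpha) d(alpha),
    approximated by the midpoint rule."""
    alphas = (np.arange(n) + 0.5) / n
    return np.mean(inv_dist(alphas))

# Linear uncertain variable L(a, b): inverse distribution a + alpha*(b - a).
a, b = 1.0, 3.0
mean = expect_from_inverse(lambda al: a + al * (b - a))
second_moment = expect_from_inverse(lambda al: (a + al * (b - a))**2)
print(mean, second_moment)  # analytically (a + b)/2 and (a^2 + a*b + b^2)/3
```

The same one-dimensional quadrature evaluates every expectation appearing in the optimization formulas, whatever the (regular) uncertainty distributions of the data.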

III. NOISE ESTIMATION AND RESIDUAL ANALYSIS
This section focuses on the noise $\epsilon$ in the uncertain Box-Cox regression model (1), linear or nonlinear. We propose an estimate of the variance of this term and perform residual analysis by estimating the mean and inspecting the residual plot. For the imprecise observations $(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}, \tilde{y}_j)$, $j = 1, 2, \cdots, n$, and the parameter estimates $(\beta^*, \lambda^*)$, the unobserved noise can be approximately estimated by the residuals
$$\hat{\epsilon}_j = L(\tilde{y}_j; \lambda^*) - h(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq}; \beta^*), \quad j = 1, 2, \cdots, n.$$
For the estimation of the moments of $\epsilon$, we have
$$\hat{e} = \frac{1}{n}\sum_{j=1}^n E[\hat{\epsilon}_j], \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^n E\big[(\hat{\epsilon}_j - \hat{e})^2\big],$$
that is, the expected noise is naturally estimated by the average of the expected residuals, and the variance by the average expected squared deviation of the residuals from $\hat{e}$, where $\hat{\epsilon}_j$ is the $j$-th residual, $j = 1, 2, \cdots, n$. Note that the assumption $E(\epsilon) = 0$ can be validated by the estimate $\hat{e}$. Next we give the following theorems on computing the estimates $\hat{e}$ and $\hat{\sigma}^2$ for two special cases of the uncertain Box-Cox regression models, namely the linear model and the Johnson-Schumacher growth model.
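A small sketch of these two estimates, under the illustrative assumption that each residual $\hat{\epsilon}_j$ is a linear uncertain variable $\mathcal{L}(a_j, b_j)$, with made-up bounds:

```python
import numpy as np

def moments_of_residuals(bounds, n=100_000):
    """Estimate e-hat and sigma^2-hat when each residual is modelled as a
    linear uncertain variable L(a_j, b_j), whose inverse distribution is
    a_j + alpha*(b_j - a_j). The bounds are illustrative assumptions."""
    alphas = (np.arange(n) + 0.5) / n
    exp_res = [a + 0.5 * (b - a) for a, b in bounds]   # E[eps_j]
    e_hat = float(np.mean(exp_res))
    # E[(eps_j - e_hat)^2] as an integral over alpha, then averaged over j.
    var_terms = [np.mean((a + alphas * (b - a) - e_hat)**2) for a, b in bounds]
    sigma2_hat = float(np.mean(var_terms))
    return e_hat, sigma2_hat

e_hat, sigma2_hat = moments_of_residuals([(-1.0, 1.0), (-0.5, 0.5), (-2.0, 2.0)])
print(e_hat, sigma2_hat)  # e_hat = 0; sigma2_hat = (1/3 + 1/12 + 4/3)/3 = 7/12
```

Here $\hat{e} = 0$, consistent with the zero-mean assumption on $\epsilon$; a nonzero $\hat{e}$ would cast doubt on the fitted model.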
Proof: For any given $j$, it follows from Theorem 2.15 in [17] that the inverse uncertainty distribution of the $j$-th residual admits an explicit form. The theorem then follows immediately from equations (2.163) and (2.206) in [17]. Theorem 4: For imprecise observations $(\tilde{x}_j, \tilde{y}_j)$, $j = 1, 2, \cdots, n$, where $\tilde{x}_j, \tilde{y}_j$ are independent uncertain variables following uncertainty distributions $\Phi_j, \Psi_j$, $j = 1, 2, \cdots, n$, respectively, the estimated expected value and variance of the noise in the fitted uncertain Box-Cox Johnson-Schumacher growth model are given by the corresponding integral formulas. Proof: For any given $j$, it follows from Theorem 2.15 in [17] that the inverse uncertainty distribution of the $j$-th residual admits an explicit form. The theorem then follows immediately from equations (2.163) and (2.206) in [17].
We conclude this section by proposing a residual plot, which gives an intuitive assessment of how well the model fits the data. This plot can also be used to verify the homoskedasticity assumption, namely, that the variance of the noise is unrelated to the fitted uncertain variable $h(x_1, x_2, \cdots, x_q | \beta^*)$.
Similar to the residual plot in statistics, we plot the residuals $\hat{\epsilon}_j$ on the vertical axis against the fitted variables $h(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq} | \beta^*)$ on the horizontal axis. Unlike the plot in statistics, where the residuals and fitted values are real numbers, here both are uncertain variables with distributions, so a more elaborate visualization is needed. For observation $j$, we first plot the expected value of the residual $E(\hat{\epsilon}_j)$ against the expected value of the fitted variable $E(h(\tilde{x}_{j1}, \tilde{x}_{j2}, \cdots, \tilde{x}_{jq} | \beta^*))$, and then add a rectangle around this point to show the bounds of this pair of uncertain variables. The bottom-left corner of a rectangle shows the lower bounds of the residual and the corresponding fitted variable, while the top-right corner shows their upper bounds. Two examples of the residual plot are shown in Figure 1 and Figure 2. In each, the residuals are evenly distributed around zero without clear patterns, which suggests that our model fits the data well and the homoskedasticity assumption holds.
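The construction of such a plot can be sketched as follows, assuming each fitted variable and residual is summarized by a (lower bound, expected value, upper bound) triple; the numbers and the file name are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

def residual_plot(fitted, residuals, path="residual_plot.png"):
    """fitted, residuals: per-observation (lower, mean, upper) triples
    summarising the uncertain fitted variable and residual."""
    fig, ax = plt.subplots()
    for (fl, fm, fu), (rl, rm, ru) in zip(fitted, residuals):
        ax.plot(fm, rm, "ko")  # expected-value point
        # Rectangle from the pair of lower bounds to the pair of upper bounds.
        ax.add_patch(Rectangle((fl, rl), fu - fl, ru - rl,
                               fill=False, edgecolor="gray"))
    ax.axhline(0.0, linestyle="--", color="gray")
    ax.set_xlabel("fitted variable (expected value and bounds)")
    ax.set_ylabel("residual (expected value and bounds)")
    fig.savefig(path)
    plt.close(fig)
    return path

residual_plot([(0.5, 1.0, 1.5), (1.5, 2.0, 2.5), (2.5, 3.1, 3.6)],
              [(-0.4, 0.0, 0.4), (-0.3, 0.1, 0.5), (-0.5, -0.1, 0.3)])
```

A pattern in the rectangle heights across the horizontal axis would indicate heteroskedasticity.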

IV. FORECASTING AND INTERVAL ESTIMATION
Using the vector $(\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_q)$ to denote the new uncertain explanatory variables, each following uncertainty distribution $\Phi_1, \Phi_2, \cdots, \Phi_q$, respectively, the goal is often to predict the corresponding response variable. A typical solution is via the fitted model, where the noise $\hat{\epsilon}$ has estimated expected value $\hat{e}$ and variance $\hat{\sigma}^2$ and is independent of $\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_q$. The forecast uncertain variable of $y$ with respect to $\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_q$ becomes
$$\hat{y} = L^{-1}\big(h(\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_q; \beta^*) + \hat{\epsilon};\ \lambda^*\big).$$
If normality of the noise $\hat{\epsilon}$ is further assumed, then the inverse uncertainty distribution of $\hat{y}$ can be written in terms of the inverse distributions $\Phi_k^{-1}(\alpha)$, $k = 1, 2, \cdots, q$, and the inverse uncertainty distribution of $\hat{\epsilon}$. The uncertainty distribution of $\hat{y}$, denoted $\hat{\Psi}$, can thus be derived from $\hat{\Psi}^{-1}$, and we have the forecast value of $y$ as
$$\mu = E[\hat{y}] = \int_0^1 \hat{\Psi}^{-1}(\alpha)\, d\alpha.$$
Notice that the forecast value $\mu$ is the point estimate of $y$. When an interval estimate is required, the confidence interval with a specified confidence level $\alpha$, for instance $95\%$, is
$$\big[\hat{\Psi}^{-1}(b),\ \hat{\Psi}^{-1}(b + \alpha)\big], \tag{10}$$
where $b$ can be any value satisfying $b \in [0, 1 - \alpha]$. Following classic statistical inference, $b$ is usually chosen to give the interval with the minimum length. We illustrate this in the next section.
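For a concrete sketch of the minimum-length choice of $b$, take the forecast distribution to be that of a normal uncertain variable $\mathcal{N}(e, \sigma)$ in Liu's sense, whose inverse uncertainty distribution is $e + (\sigma\sqrt{3}/\pi)\ln(\alpha/(1-\alpha))$, and search the admissible $b$ on a grid (using this distribution directly in place of $\hat{\Psi}$ is a simplifying assumption):

```python
import numpy as np

def normal_inv(alpha, e, sigma):
    """Inverse uncertainty distribution of a normal uncertain variable
    N(e, sigma) in Liu's uncertainty theory."""
    return e + (sigma * np.sqrt(3.0) / np.pi) * np.log(alpha / (1.0 - alpha))

def min_length_interval(inv, level=0.95, grid=10_000):
    """Search b in (0, 1 - level) for the confidence interval
    [inv(b), inv(b + level)] of minimum length."""
    bs = np.linspace(1e-6, 1.0 - level - 1e-6, grid)
    lengths = inv(bs + level) - inv(bs)
    k = int(np.argmin(lengths))
    return inv(bs[k]), inv(bs[k] + level)

lo, hi = min_length_interval(lambda a: normal_inv(a, 0.0, 1.0))
print(lo, hi)  # symmetric about 0 for this symmetric distribution
```

For a symmetric distribution the optimal $b$ is $(1 - \alpha)/2$, giving an interval centered at the expected value; for skewed forecast distributions the grid search finds the asymmetric optimum.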

V. RELEVANT NUMERICAL EXAMPLES
In this section, we fit the uncertain Box-Cox linear model and the uncertain Box-Cox Johnson-Schumacher growth model to two sets of imprecise observations, providing the forecast value and the confidence interval for a given new uncertain explanatory vector, respectively. First we consider the uncertain Box-Cox linear regression model. The imprecisely observed data $(\tilde{x}_{j1}, \tilde{x}_{j2}, \tilde{x}_{j3}, \tilde{y}_j)$, $j = 1, \cdots, n$ with $n = 24$, are taken as in Fang and Hong [9]; they are independent uncertain variables following linear uncertainty distributions $\Phi_{j1}, \Phi_{j2}, \Phi_{j3}, \Psi_j$, respectively. The true $\beta$ is set as $(2, 1, 0, -0.5)$ and $\lambda$ as $0.5$.
Firstly, we fit the uncertain Box-Cox linear regression model. The rescaled least squares estimates of $\beta = (\beta_0, \cdots, \beta_3)$ and $\lambda$ can be calculated from Theorem 1, which gives the fitted model. Then we apply residual analysis. According to Theorem 3, the estimated expected noise is $\hat{e} = 0$ with variance $\hat{\sigma}^2 = 1.1471$. We can see that the mean of $\epsilon$ is almost zero, which validates the zero-mean assumption. Figure 1 is the plot of the residuals versus the fitted variables. We can see that the points are evenly distributed around the horizontal line at zero without outliers, and the rectangles show no clear patterns, which supports the homoskedasticity assumption very well.
We then fit the uncertain Box-Cox Johnson-Schumacher growth model. The rescaled least squares estimates of $\beta = (\beta_0, \beta_1, \beta_2)$ and $\lambda$ can be calculated by Theorem 2. Next we perform residual analysis. According to Theorem 4, the estimated expected noise is $\hat{e} = 0.0003$ with estimated variance $\hat{\sigma}^2 = 0.7408$. We can see that the mean of $\epsilon$ is almost zero, which validates the zero-mean assumption. Figure 2 is the corresponding plot of the residuals versus the fitted variables. Note that the minimization of the interval length in (10) is computed by the function optimize, with tolerance accuracy $10^{-8}$, and the numerical integration is calculated by the function integrate, both in the R software.
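Although our computations used R's optimize and integrate, the estimation workflow can be sketched in Python on crisp stand-in data: profile the rescaled objective over a $\lambda$ grid, solving ordinary least squares for $\beta$ at each $\lambda$ (the simulated data, the grid, and the crisp simplification are illustrative assumptions, not the uncertain-variable computation of Theorem 1):

```python
import numpy as np

def box_cox(y, lam):
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def fit_rlse(y, X, lams):
    """Profile the rescaled objective over a lambda grid; for each lambda,
    beta is the ordinary least-squares fit of L(y; lambda) on X."""
    gmean = np.exp(np.mean(np.log(y)))
    best = None
    for lam in lams:
        z = box_cox(y, lam)
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        obj = np.sum((z - X @ beta)**2) / gmean**(2.0 * lam)
        if best is None or obj < best[0]:
            best = (obj, lam, beta)
    return best

# Simulated crisp data with true lambda = 0.5 and beta = (2, 1):
# L(y; 0.5) = 2*(sqrt(y) - 1) = 2 + x + noise.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, 200)
y = (0.5 * (2.0 + x + rng.normal(0.0, 0.005, 200)) + 1.0)**2
X = np.column_stack([np.ones_like(x), x])
obj, lam_hat, beta_hat = fit_rlse(y, X, np.linspace(0.1, 1.0, 19))
print(lam_hat, beta_hat)
```

On this low-noise simulation the profiled objective recovers $\lambda$ near the true value $0.5$ and $\beta$ near $(2, 1)$; a continuous optimizer would replace the grid in practice.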

VI. CONCLUSION AND DISCUSSION
We introduce a generalized uncertain Box-Cox regression model for uncertain observed data, which can be adapted to both linear and nonlinear regression models. The unknown parameters are estimated via the rescaled least squares method, which overcomes the problem that the original least squares estimation may fail to estimate $\lambda$ because of the limiting behavior (3). We then perform residual analysis and carry out prediction and interval estimation for the dependent variable given new independent variables. Finally, we provide relevant numerical examples to illustrate how our theories work. Our proposed methods are suitable for fitting uncertain datasets, such as expert reliability data, under the uncertain statistical framework. In future work, we may make use of standardized residuals in addition to the raw residuals. We will also consider introducing a penalty term in the uncertain maximum likelihood estimation.
SHIQIN LIU received the degree from the Information Science Department, Guangxi University, in 2008. Since her graduation, she has been a Teacher with the Department of Mathematics and Computer Science, Hengshui University. Her research interests are wide-ranging, mainly topology, statistics, and differential equations.
LIANG FANG was born in Anhui, China, in 1984. He received the Ph.D. degree in statistics from Tsinghua University, Beijing, China, in 2019. He has been a Lecturer with the School of Economics and Management, Beijing Forestry University. His research interests include the uncertainty theory, forestry economic statistics, and spatial statistics.
ZAIYING ZHOU received the degree from the Department of Mathematical Sciences, Tsinghua University, in 2017. She has been a Teacher with the Center for Statistical Science, Tsinghua University. Her research interests include mathematical statistics and applications of statistical methods.
YIPING HONG was born in Tianjin, China, in 1992. He received the B.S. degree in mathematics and applied mathematics from Tsinghua University, Beijing, China, in 2014, where he is currently pursuing the Ph.D. degree in statistics. His research interests include the statistical inference on the spatial and spatio-temporal covariance models, the network model analysis combined with spatial data, and the uncertainty theory. His awards and honors include the Excellent Undergraduate Thesis Award (Tsinghua University).