AG-LRTR: An Adaptive and Generic Low-rank Tensor-based Recovery for IIoT Network Traffic Factors Denoising

The ubiquitous 5G-enable industrial Internet of Things interconnects a great number of intelligent sensors and actors. Network management becomes challenging due to massive traffic data generated by industrial equipment. However, the conventional single traffic factor is insufficient for the increasingly complicated network engineering tasks due to the poor representation capability. Besides, the insecure equipment with open communication access easily brings irregular network fluctuations to network traffic which interferes with the primary traffic factor. The simple and interfered traffic factor decreases the network management efficiency and misleads the operators. Motivated by that, we construct a comprehensive tensor model representing multi-dimension traffic factors to describe the network traffic beneficial characteristics. Meanwhile, an adaptive and generic low-rank tensor recovery (AG-LRTR) algorithm in the tensor singular value decomposition (t-SVD) framework is proposed for denoising. For effective tensor recovery, the alternating direction method of multipliers is employed to theoretically solve the partial augmented Lagrangian function of our objective with a closed-form solution. Numerical experiments on both synthetic data and real-world traffic data in IIoT validate that our proposed algorithm outperforms other state-of-the-art of tensor recovery algorithms.


I. INTRODUCTION
T HE Industrial Internet of Things (IIoT) is rapidly developing with the emerging 5G-enable communication technology. The 5G communication technique provides coinstantaneous broadband access [1], and thus the IIoT network expands the scope of various intelligent equipment extensively. Effective network management becomes increasingly critical for operators. However, with the tremendous growth of traffic data, one challenge is to abstract the beneficial characteristics from the massive traffic data. Nowadays, the traffic volume is still the basic factor in the IIoT network management [2], [3] and only reflects the flow-based correlations in successive slots. It loses the other intrinsic correlations in the traffic data, such as the packet-based correlation. For example, the packet interval arrival time is preferable to the routing program than traffic volume. The single factor model is insufficient to more complicated network engineering tasks gradually. Besides, another challenge is the irregular network fluctuations due to the insecure equipment with open communication access. Parts of the network engineering tasks are susceptible to noise interference, so the irregular network fluctuations degenerate network management efficiency. For example, the network intrinsic resilient capacity should have avoided route reprogramming in instant network congestion, but this fluctuation is likely to cause once redundant reprogramming. To enhance the representation capability and improve the network management efficiency, a new network traffic model is needed to represent the valuable factors, and precise denoising is essential to reduce unnecessary operations.
The supervisory control and data acquisition (SCADA) system is the central platform in the industry for aggregating and coordinating the network traffic transaction from edge equipment. The irregular network fluctuations are caused by network congestion, equipment failure or operator error as Fig. 1 shows. Many researchers have proposed network traffic modeling and denoising algorithms. The vector [4], [5] ,matrix [6], [7] and tensor [8] models are proposed respectively to recover the original traffic volume in which the tensor-based model outperformed the other models. As mentioned above, the volume only represents one single traffic factor, and the model representation capability is deficient. Inspired by the tensor recovery performance, we naturally choose the tensor to construct a novel traffic factor model. The essence of successful recovery is the low-rank property due to the multiple types of correlations in the traffic. Tensor recovery aims to realize low-rank tensor approximation from the noisy traffic factors. The regular optimization object of tensor recovery is to minimize the sum of the tensor nuclear norm and the reconstruction error. The same amplitude shrinkage for the tensor nuclear norm introduces additional bias and variance by the thresholding estimator in the recovery procedures. The reconstruction error only assumes the noise satisfies the zero-mean and ignores the non-zero mean influence. Both deficiencies result in the sub-optimal solution for the regular optimization object. To improve the optimization performance, our tensor recovery strategies in this paper are summarized as follows: 1) To enhance the tensor representation capability, we construct a compact and comprehensive traffic tensor model with ten factors: traffic volume, packet number, inter-arrival time (IAT), etc. Moreover, we further reveal that such a tensor satisfies the low-rank property.
2) We propose an adaptive and generic optimization object for precise denoising by minimizing the sum of the weighted tensor nuclear norm and the noise variance. The weighted tensor nuclear and generic noise Frobenius norm respectively alleviate the influence of the estimator and the non-zero mean noise.
3) The adaptive and generic optimization object is solved by the alternating direction method of multipliers (ADMM) algorithm. Each optimization procedure has a closed-form solution in ADMM. We further perform numerical experiments on synthetic data and a real SCADA traffic dataset as an IIoT example to validate the effectiveness of our algorithm.
The rest of this paper is structured as follows. Section II introduces the preliminaries of an effective low-rank tensor recovery procedure and the tensor singular value decomposition (t-SVD) framework. Section III details the tensor model, adaptive nuclear norm formulation, and solver algorithm for ADMM. Numerical experiments conducted on synthetic and real-world data are presented in Section IV, and we conclude this work in Section V.

II. PRELIMINARY OF TENSOR RECOVERY
Tensor recovery is realized on the basis of the tensor lowrank property and decomposition approach. The low-rank property is embedded in the structure of entity arrangement, and the decomposition approach affects the recovery performance. This section summarizes an effective low-rank tensor recovery procedure and then introduces the tensor singular value decomposition framework involved in this paper.

A. TENSOR RECOVERY
Multi-dimensional data is becoming prevalent in many areas, such as computer vision [9], [10] and information science [11]. Tensor as a multi-dimensional extension of the matrix is a natural choice in these cases and has the capability of capturing these underlying multi-linear structures. Although often residing in extremely high-dimensional spaces, the tensor of interest is frequently of low rank, or approximately so [12]. Lying at the core of high-dimensional data analysis, tensor decomposition serves as a valuable tool for revealing when a tensor can be modeled as lying close to a lowdimensional subspace [13].
As for data analysis by tensor decomposition, the first step is to construct an appropriate tensor that could contain intrinsic correlations in the data. Except for some kinds of natural tensor data, such as the hyperspectral data, the seismic data, or the colorful picture, other data need to be rearranged as some regulation based on the intrinsic correlation. For example, in some applications for video background model in [9], [14], the original 4-dimensions video data was reshaped to a 3-dimension tensor through matricization of the colorful frame along the time mode. However it would degenerate the performance due to the information loss of frame matricization. Furthermore, the IT traffic tensor constructed in [8] gains periodic pattern in addition to temporal stability and spatial correlation only for traffic volume. Although the tensor entries can be substituted by other network traffic factors, this model intrinsically ignores the multiple correlations between the factors and could not represent the complete traffic properties. Therefore, the constructed tensor model should contain the necessary correlations as much as possible.
Then the second step is to select a practical tensor decomposition approach to approximate the low-rank tensor. The two most popular tensor decomposition approaches, namely CANDECOMP/PARAFAC(CP) [15], [16] and Tucker de-composition [17] are known that the truncated CP or truncated Tucker is not the best low-rank approximation [18]. Compared with them, tensor singular value decomposition(t-SVD) [19] is based on the tensor-tensor product operator [20] and the calculation procedure needs to transform a tensor from the original domain to the Fourier domain along a fixed mode by discrete Fourier transform(DFT).As a newly emerged tensor decomposition paradigm, it has several properties similar to the traditional matrix SVD and decomposes any tensor with less prior information, so it is the optimality of the truncated t-SVD for data approximation. Therefore in this paper, the task of IIoT traffic denoising lies in the t-SVD framework.
The last and most important step is defining the tensor rank based on the tensor decomposition approach and solving the optimization object by corresponding rank relaxation as the tensor nuclear norm. In the t-SVD framework, the basic tensor rank is called tubal rank [20] which is defined as the number of nonzero singular tubes in the original domain and relaxed by the sum of all singular values in the Fourier domain. Due to the lack of considering the relations of the singular values in the original and Fourier domain, the basic algebra calculation is more complex and computational. For simplicity and elegance, the average tubal rank is defined as the mean rank of the block circulant matrix in [14]. Its rank relaxation is to sum the singular values of the first frontal slice in the original domain. It is rigorously deduced theoretically as a new tensor nuclear norm and has similar theorems with the matrix SVD. The other approaches are either extension [10], [21] or combination [22] underlain by the rank definition and relaxation in [20] or [14]. In this paper, we have the same rank definition and relaxation with [14], because the adaptive coefficients to shrink the nuclear norm needs to be calculated by the unique singular values.

B. TENSOR SINGULAR VALUE DECOMPOSITION
T-SVD operation as an extension to matrix SVD is based on the tensor-tensor product(t-product). For simplicity, we mainly introduce the correlation between the original domain and the Fourier domain caused by DFT. The basic definitions related to the t-SVD framework are given in Appendix. (1) where f old(·), bcirc(·), unf old(·) denote the fold,block circulant and unfold operation for a tensor respectively and A, B, C is block diagonal matrix of A, B, C. In the original domain, a 3-mode tensor can be regarded as a matrix, with each entry being a tube that lies in the third mode. Thus, the tproduct is analogous to the matrix multiplication, except that the circular convolution replaces the multiplication operation between the entries. In the Fourier domain, the t-product is equivalent to the matrix multiplication. The t-product enjoys many similar properties to the matrix-matrix product.
Then for any 3-mode tensors, A ∈ R n1×n2×n3 the t-SVD is defined in the original and Fourier domain as follows: where U ∈ R n1×n1×n3 , V ∈ R n2×n2×n3 are orthogonal, and S ∈ R n1×n2×n3 is an f-diagonal tensor. See Fig. 2 for an intuitive illustration of the t-SVD operation.

FIGURE 2. An illustration of tensor singular value decomposition
In this paper, we refer to the rank definition and relaxation in [14].
The entries on the diagonal of the first frontal slice S(:, :, 1) have the decreasing property as follow where n ′ = min(n 1 , n 2 ). It holds since the inverse DFT gives and the entries on the diagonal of S(:, :, j) are the singular values of A(:, :, j), so the tensor tubal rank is determined by the first frontal slice S(:, :, 1) and equivalent to the number of non-zero singular values of A. Based on the above properties, tensor average rank defined in [14] is the slice mean of the total rank in Fourier domain as rank a (A) = 1 n3 rank(A) and proved that the low average rank assumption is weaker than the low Tucker rank and low CP rank assumption, so it is more convenient to decompose a low rank tensor. Then the relaxation of tensor average rank can be rigorously deduced as summation of the singular values in the first frontal slice S(:, :, 1) and denoted as The discrimination between the tubal and average rank is the coefficient 1 n3 that is crucial to guarantee the convex envelope of the average tensor rank in a specific scope. Therefore, adopting an adaptive strategy for IIoT traffic denoising is possible based on the above-defined tensor nuclear norm.

III. TENSOR-BASED MODELING AND DENOISING FOR IIOT NETWORK TRAFFIC
To improve the IIoT network management efficiency, we conduct a novel tensor model with ten traffic factors based on the periodical transaction mechanism to enhance the representation capability firstly. We further validate that such a VOLUME 4, 2016 representation model has a low-rank property that underlies the effective denoising. Then an adaptive and generic optimization object is reformulated to improve the denoising performance, and finally a closed-form ADMM algorithm is proposed to solve the object.

A. TENSOR MODELLING AND LOW-RANK ANALYSIS
It is impossible to analyze the per packet due to the massive amount of traffic data in the industrial internet of things. Effective network management relies on multiple statistical factors which could represent the complete data exchange. In IIoT, SCADA systems are the central platform and automatically coordinate and manage the equipment actions to ensure that the infrastructure operates correctly and safely. The primary transaction mechanism [23] is polling field information and sending corresponding control commands periodically, as Fig. 3 shows. The acquisition and control data are transmitted at the determined time of one period. In addition, the same equipment category has stable operation logic and generates similar traffic data. Therefore, the total network traffic data can be characterized by periodic throughput patterns, clear statistics of packet size, predictable flow direction, and expected connection lifetime [24]. However, irregular network fluctuations will be caused by package loss, data delay or retransmission and payload changes as depicted in Fig. 3. These fluctuations finally reflect the variances in traffic volume, packet number, and packet interval arrival time (IAT). To enhance the representation capability, ten statistical factors listed in Table 1 are calculated periodically by the way provided in [25]. These factors are sufficient to represent the data exchange at the flow-based and packet-based levels and can be applied in most network engineering tasks. In all factors, the traffic volume and packet IAT contain four factors: maximum value, minimum value, mean value, and variance.
Therefore, a novel low-rank tensor model can be constructed based on the factors to represent the beneficial characteristics of IIoT network traffic. In this paper, a testbed SCADA traffic data named Electrical Power and Intelligent Control (EPIC) [26] is used as the real IIoT network traffic dataset, and it mimics a real-world power system in small scale smart-grid. This SCADA system interacts with six categories of equipment which are access point(AP), programmable logic controller(PLC), intelligent electronic IAT max Maximum inter-arrival time in a time slot 10 IAT min Minimum inter-arrival time in a time slot equipment(IED), switch(SW), history database(HIST), firewall(FW), and the others. Obviously, the most prominent traffic volumes are periodic as Fig. 4(a) shows, and the least common multiple periods can be set to 30s with the 1s time slot. For each traffic factor, we calculate each statistical value per second as a row vector and continuously repeat 30 times to form a factor matrix as the frontal slice, then stack the ten factor frontal slices along the third mode to construct a traffic tensor model with the size of 30 × 30 × 10 as Fig. 4(b) shows. We decompose the traffic tensor of each category of equipment by t-SVD and illustrate the tensor singular values of the first frontal slice as Fig. 5 shows. If the traffic has the same period, such as AP, IED, SW and FW, most of the singular values of the traffic tensor are relatively small, which means the optimum low rank. However, when the traffic is random as HIST or with various periods as PLC, the low-rank property is relatively weak due to the uniform dispersion in the Fourier domain caused by random. As for sparse traffic such as the others, it has the worst low rank due to the independent and poor intrinsic correlation.
Furthermore, we decompose the sum of all traffic, and the singular values are depicted as Fig. 6. Although the summation consists of different kinds of traffic, it still has the property of low rank, which means the periodical traffics data is overwhelming. Our proposed tensor model is capable to capture the intrinsic correlations in various aspects. In the following paper, the IIoT traffic factor tensor is deemed to contain all traffic by default.

B. ADAPTIVE AND GENERAL OPTIMIZATION OBJECT
The regular optimization object of the low-rank tensor recovery from the noisy measurements is to minimize the sum of tensor rank and reconstruction error as follows: where λ denotes the penalty coefficient. Because the rank operator is non-convex, the nuclear norm || · || * as the tightest relaxation of rank replaces the first term in (5). In the paper, we define the tensor rank as the average rank and   4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Index of Singular Value Two terms of the above optimization object have a deficiency in the optimization process. As for the first term r i=1 S(i, i, 1), the solver involving soft-thresholding shrink would cause some unavoidable biases [27], so the variance of the estimated tensor would be smaller than the original tensor when equally shrinking every singular value. Conspired by the adaptive nuclear norm for low-rank matrix approximation in [28], we can extend it to the tensor model by assigning weights to the singular values of a tensor.
Another term ||N || 2 F assumes that the noise has zero mean and ignores the common non-zero mean situation, which leads to the sub-optimal solution. Assuming the actual mean µ and varianceδ 2 of noise are unknown and µ is a variable representing noise's means distribution, the variance of N − µI µ can be expressed as var(N − I µ ) = var(N ) + var(I µ ) where I µ is the tensor with the same size of N and its all entries are µ. The variance of tensor can be calculated by the Frobenius norm || · || 2 F . If µ ̸ =μ, the unbiased variance will be larger thanδ 2 as the simple proof below where N = n 1 × n 2 × n 3 − 1 denotes the element number of tensor: Then the adaptive and generic optimization object is formulated as follow: where α i denotes the i-th adaptive coefficient. The optimization object (8) can be explained to minimize the sum of the tensor adaptive nuclear norm and the noise variance.
Then an adaptive tensor soft-thresholding (ATSVT) operation as a closed-form solution could solve the optimization object with the adaptive nuclear norm as follows.
Theorem 1. For any λ ≥ 0, Y ∈ R n1×n2×n3 and 0 ≤ α 1 ≤ · · · ≤ α r (r = min(n 1 , n 2 )) , a global optimal solution to VOLUME 4, 2016 the optimization problem is given by the ATSVT asX : where S τ (·) denotes the adaptive soft-thresholding operation. Further, if Y has a unique t-SVD,X is the unique optimal solution. The soft-thresholding operation for tensors shrinks the singular values in the Fourier domain. Following [29], the weights can be set as some power of the singular values of the tensor, i.e., α i = 1 σi(X ) γ , where γ ≥ 0 is a predefined constant. In this way, the order constraint in the Theorem is automatically satisfied. The fact that a closed-form global minimizer for the optimization object (9) would be proved as follows based on von Neumann's trace inequality [30] and the properties of t-SVD.
Proof. We first prove thatX is indeed a global optimal solution to (9). Since the weighted coefficients only depend on the singular values of X , by letting g = {g i } h i=1 = σ(X ) (which implies the entries of g are in non-increasing order), (9) can be written as: For the inner minimization, we have the inequality The last inequality is due to von Neumann's trace inequality. The equality holds when X admits the singular value decomposition X = U * Diag(g) * V * where U and V are the left and right orthogonal tensor in the t-SVD of Y. Then the optimization can be reformulated as follow The optimization object is completely separable and takes minimum when g i = (σ i (Y) − αi λ ) + . This is a feasible solution because {σ i (Y)} is in non-inceasing order, while {α i } is in non-decreasing order. Therefore, equation (14) is a global optimal solution to the objection function (10). The uniqueness follows by the equality condition for von Neumann's trace inequality when Y has a unique t-SVD, and the uniqueness of the strictly convex optimization.

C. CLOSED-FORM ADMM SOLVER ALGORITHM
ADMM algorithm is very efficient for some convex or nonconvex programming problems. The closed-form solution to each optimization procedure guarantees recovery performance. To solve the optimization object (8), the problem can be reformulated by the partial augmented Lagrangian function as follows, and we deduce the closed-form solutions to all formulations.
where P ∈ R n1×n2×n3 is the tensor of Lagrange multipliers and β > 0 is a penalty parameter. So the variables are updated sequentially in each iteration as follows where ρ ∈ (1.0, 1.1] denotes the adjustment coefficient to accelerate the convergence speed and β 0 is a small constant. And there exists a closed-form solution for each component in (14)- (18).
The term < P, M − L − N > + β 2 ||M − L − N || 2 F can be merged as β 2 ||M−L−N + P β || 2 F . For optimization problem (15) and (16), they can be rewritten as follow where the problem (20) has the same format as problem (9) and can be solved by the closed-form solution (10). For 6 VOLUME 4, 2016 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.
Then the problem (17) can be solved in original domain through calculating the mean of all entries as follow Finally, the pseudocode for low-rank tensor recovery from noisy measurement by adaptive nuclear norm is described in Algorithm 1

IV. PERFORMANCE EVALUATION
We conduct numerical experiments to validate the efficiency of our proposed AG-LRTR in zero mean and non-zero mean noise interference scenarios. Three experiments on the synthetic data are about to analyze the noise influence, evaluate the recovery performance and discuss the parameters variation respectively. One more experiment on the EPIC SCADA traffic data is conducted for practical application. The Peak Signal-to-Noise Ratio (PSNR) is used as the metric to measure the quality of the recovery performance and it is defined as equation (24). All the experiments are conducted on a PC with a 2.9 GHz CPU and 8 GB RAM.
where L denotes true ground tensor, and L denotes the recovered low-rank tensor. The larger value of PSNR corresponds to the higher quality of the results. Moreover, we compare the AG-LRTR algorithm with the other four algorithms for tensor recovery. They are abbreviated as LRMR, LRTR, G-LRTR, and TRPCA. LRMR represents the low-rank matrix recovery which needs to flatten the original tensor to a matrix and then recover the low-rank component by RPCA. The LRTR represents the low-rank tensor recovery in [31] to restore the hyperspectral image, and G-LRTR extends the LRTR only by generalizing the noise formulation. The TPRCA represents the tensor robust principal components analysis used in [14] to recover from the sparse noisy tensor. The results illustrate that our AG-LRTR algorithm outperforms the other algorithms in tensor recovery.

A. SYNTHETIC DATA
The synthetic tensor A ∈ R n1×n2×n3 with rank(A) = r can be generated by the tensor-tensor product directly as follows.
A = Q * R (25) where Q ∈ R n1×r×n3 , R ∈ R r×n2×n3 and all entries sampled in independent and identical uniform distribution of U (0, 1). Firstly, we summarize the influence of different kinds of noise on the tensor singular values, then compare the recovery performance of the five algorithms and finally discuss the parameters set of the proposed algorithm.

1) Noise influence
For compliance with the size of our proposed SCADA traffic tensor model, the size of original tensor is 30 × 30 × 10, and the rank is four in the experiment. To exploit noise influence on tensor rank, two major categories are specified: the zeromean noise and the non-zero mean noise. For zero mean noise, the noise with various variances ranging from 1 to 10 is added to the original tensor as Fig. 7 (a) shows. Because the non-zero mean noise can be divided as the sum of a zeromean noise and a constant value, only the influence of the constant value is depicted in the right picture of Fig. 7 (b). As we can see, no matter whether the mean of noise is zero or non-zero, the randomness property of noise increases all the singular values of the tensor in a similar tendency and  the increment is positively correlated with the variance of the noise. As for the non-zero mean noise, the constant mean only affects the largest singular value, as proved as follows. Moreover, the discrepancy between the original tensor and noisy tensor is diverse from each singular value as Fig. 8 shows. The influence of noise with different variances on each singular value of the original tensor differs, so the adaptive nuclear norm could shrink each singular value to a different extent, thus improves recovery performance. Proof. Assume a low rank tensor L ∈ R n1×n2×n3 and a constant tensor I µ ∈ R n1×n2×n3 , then representation of the tensor M = L + I µ in Fourier domain can be calculated as As for the second term in equation (26), all entries of the tensor I 1 are 1 and its block circulant matrix is an onesmatrix in which all entries are equal to 1. Then any frontal slices in Fourier domain only have the same value as follow where F n3 (i, :) denotes the i-th row vector, 1 is an onesvector and I 1 denotes an ones-matrix with size n 1 × n 2 . So each frontal slice is a rank one matrix and has only one singular value and its left and right singular vectors are onesvector with size n 1 and n 2 . Because the first row and column in F n3 is ones-vector, both the L (i) and I 1 (i) have the same singular vectors and I 1 (i) is in the subspace of the L (i) . Then the constant tensor only affects the largest singular value of the original tensor.

2) Recovery comparison
A low-rank tensor with size 30 × 30 × 10 is generated as the original and interfered with different noise as the measurement to comprehensively compare the recovery performance,. The results of recovery performance are listed in Table 2. We can see that our proposed AG-LRTR outperforms other algorithms in all cases. Especially when the noise has low variance, the AG-LRTR could improve recovery performance effectively because low variance on small singular values is much greater than on large singular values, as Fig. 8 shows. The adaptive nuclear norm could retain the information of the original tensor in main singular values and alleviate the noise influence on others. Besides, the G-LRTR and the LRTR have similar performance, which is better than the LRMR that loses the structure information by matricization. The TRPCA has the denoising capability when the mean of noise is zero. However, it fails to recover the original tensor from non-zero noise that dissatisfies sparsity.

3) Parameter discussion
The recovery performance of our proposed AG-LRTR algorithm mainly depends on the parameters of the problem (8) in which the adaptive coefficients α i , i = 1, 2, · · · , r are determined by the parameter γ, so we will discuss the influence of two core parameters λ and γ on recovery performance. Before that, we make the connection from our algorithm to other algorithms. If the parameter γ is set to 0, all adaptive coefficients are equal to 1 and the sum of weighted singular values can be regarded as the standard nuclear norm, so our algorithm degenerates to the G-LRTR. Moreover, if the mean variable of noise µ is fixed to 0, the optimization object would be the same with LRTR , then if all tensors only have one frontal slice, the tensor degenerates to a matrix and the LRTR is the same with LRMR. So based on the connection, we firstly set the parameter γ = 0 to discuss the influence 8 VOLUME 4, 2016 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and  of penalty coefficient λ as the same as the LRTR and then explore the optimal parameters (γ, λ) of AG-LRTR by the grid search strategy.
In the experiments, the size of the original tensor is 30 × 30 × 10 and the rank is 3. When γ = 0, the influence of λ on recovery performance is depicted in Fig. 9 which is consistent with the intuition. The penalty coefficient is to balance the nuclear norm of the tensor and the noise variance, so as the variance increases, the penalty coefficient needs to decrease to keep the nuclear norm and the variance in the same order. As for the optimal parameters (λ, γ) of AG-LRTR, the λ depends on the γ due to the adaptive nuclear norm as Fig. 10 shows where the variance of the noise is 1. We can see that the optimal λ corresponding to the peak PSNR decreases as the γ increases because more information of the larger singular values is retained so that there needs a lower λ to reduce the noise variance and keep the balance. The other variances of the noise have the same regulation.

B. TENSOR RECOVERY FOR THE REAL-WORLD DATA
We have introduced the SCADA systems and constructed a tensor model for the traffic of a testbed called EPIC in Section III-A, so the recovery performances of all algorithms are compared based on the low-rank SCADA traffic tensor as Fig. 11 shows. Obviously, the rank of the SCADA traffic tensor is 3 in Fig. 6, and the largest singular value is 10 that is much smaller than the synthetic data, so the variance of noise needs to be chosen cautiously. The recovery performances for SCADA traffic are depicted when the mean of noise is 0, and the noise variance varies from 0.02 to 0.18. Our proposed AG-LRTR algorithm still outperforms other algorithms and improves much better than the observed PSNR. The performances of the LRTR and LRMR are similar, which means that the correlations between the frontal slices are weaker than the synthetic data. The TRPCA performance is the worst even if it has the ability to denoising. When the mean of noise is non-zero, the recovery performances of all algorithms have the same tendency except the TRPCA, which fails to denoising.

V. CONCLUSION
Based on the recently developed t-SVD, an adaptive and generic low-rank tensor recovery algorithm is proposed to recover the original traffic factor tensor from the irregular network fluctuations in IIoT scenario. We construct a novel tensor model to abstract multiple correlations from the traffic data and retain as many traffic factors as possible by the adaptive and general optimization object. Numerical experiments on the synthetic data and real-world SCADA traffic verify our algorithm is effective in denoising for network management.
There are some interesting future works left. Because the increment of each singular value depends on the noise variance, the adaptive nuclear norm criterion needs to combine the information of noise with the singular values, which is ignored in this paper. Besides, the optimal parameters are short of a theoretical guarantee and found by grid search, so there needs to explore an effective approach for parameter selection and save the computational cost. Moreover, deep learning can be employed if we do not have much prior information about the original tensor and the noise.

PRELIMINARY DEFINITION AND PROPERTIES OF T-SVD
Firstly we briefly introduce the basics of the tensor notion and related operation. Scalars are denoted by lower-case letters such as i, j, k and vectors by bold lower-case letters such as a, b, c. Matrices are denoted by upper-case letters, e.g., X. Tensors are denoted by a calligraphic letter, e.g., X , and its entry are denoted by x i1,··· ,in for a N-mode tensor. Identity matrix with size n × n is denoted as I n . The fields of real numbers and complex numbers are denoted as R and C.
For a 3-mode tensor A ∈ C n1×n2×n3 , its i-th horizontal, lateral and frontal slice are denoted as A(i, :, :), A(:, i, :) and A(:, :, i) respectively and especially for the frontal slice, it can be abbreviated by A (i) . The tube along the third mode is denoted as A(i, j, :). In the field of complex number, the complex conjugate of A is denoted as conj(A) which takes the complex conjugate of each entry of A.
The inner product between A and B in C n1×n2 is defined as < A, B >= T r(A * B) where A * denotes the conjugate transpose of A and T r(·) denotes the matrix trace. The inner product of two same-sized tensor A, B ∈ R I1×I2×···×I N can be represented as the sum of the inner matrix product of each frontal slice and equals to the sum of numerical product of each entry.
For a tensor A, the l 1 , infinity and Frobenius norm are defined as follow where F n is the DFT matrix defined as This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. where ω = e − 2πi n is a primitive n-th root of unity in which i = √ −1. Note that F n / √ n is unitary matrix.
Lemma 1.1 : Given any real vector v ∈ R n , the associated v satisfied v 1 is real and conj(v i ) = v n−i+2 , i = 2, · · · , ⌊ n + 1 2 ⌋ (32) Conversely, for any given complex v ∈ C n satisfying (32), there exists a real block circulant matrix circ(v) holds To extend similar properties for tensor, some operation especially for tensor are needed as follow Operation 1.1(bdiag) : For any A ∈ C n1×n2×n3 , the block diagonal matrix is denoted as A ∈ C n1n3×n2n3 with its i-th block on the diagonal as the i-th frontal slice A (i) of A.
A (2) . . . Then based on the DFT of vector, the Lemma 1.1 can be extended to the tensor as follow Lemma 1.2 : For any tensor A ∈ R n1×n2×n3 , its DFT associated as A ∈ C n1×n2×n3 satisfied (F n3 ⊗ I n1 ) · bric(A) · (F −1 n3 ⊗ I n2 ) = A where ⊗ denotes the Kronecker product and (F n3 ⊗ I n1 )/ √ n 3 is unitary. Then, we have Conversely, for any given A ∈ C n1×n2×n3 satisfying (35), there exists a real tensor A ∈ R n1×n2×n3 such that (34) holds. Moreover, the following properties of tensor DFT are used frequently: Definition 1.2(T − product) :Let A ∈ R n1×l×n3 and B ∈ R l×n2×n3 . Then the t-prodect A * B is defined to be a tensor of size n 1 × n 2 × n 3 A * B = f old(bcirc(A)) · unf old(B)) There are some other concepts on tensor extended from the matrix as follow: Definition 1.3(Conjugatetranspose) : The conjugate transpose of a tensor A ∈ C n1×n2×n3 is the tensor A * ∈ C n2×n1×n3 obtained by conjugate transposing each of the frontal slices and then reversing the order of transposed frontal slices 2 through n 3 . Definition 1.3(Identitytensor) : The identity tensor I ∈ R n×n×n3 is the tensor with its first frontal slice being the n × n identity matrix, and other frontal slices being all zeros. Definition 1.3(Orthogonaltensor) : A tensor Q ∈ R n×n×n3 is orthogonal if it satisfies Q * Q = Q * Q * = I Definition 1.3(F − diagonalTensor) : A tensor is called f-diagonal Tensor if each of its frontal slices is a diagonal matrix.