Data Driven Prognosis of Fracture Dynamics Using Tensor Train and Gaussian Process Regression

Accurate prediction of the mechanical failure of structural components plays an important role in the design of engineering structures. However, the fracture process is challenging to model numerically due to the existence of an elastic singularity at the crack tip. The phase field model (PFM) is one promising approach for modeling brittle fracture using an auxiliary field variable to regularize discontinuities associated with sharp cracks. Unfortunately, it is generally computationally expensive due to the need to solve a fourth-order partial differential equation. In order to reduce the computational burden for parameterized problems, different reduced order modeling approaches such as proper orthogonal decomposition (POD) and discrete empirical interpolation method (DEIM), have been proposed. However, these frameworks are model-based and intrusive approaches and need in-depth efforts in modifying existing complex simulation codes. In this work, a tensor train (TT) approach is proposed in combination with Gaussian process regression (GPR) for modeling and predicting the dynamics of fracture in composite materials. As opposed to POD and DEIM, the TT-GPR approach is a fully data-driven and non-intrusive approach. In this work, we study the fracture of a brittle elastic 2D rectangular slab with a pre-existing crack under Mode I (crack opening) loading conditions. The high dimensional training data for the TT model was generated using PFM with the finite difference method. The predictions by the TT-GPR model are compared with the results from the finite difference method. The TT-GPR is robust enough to predict the catastrophic evolution of the crack growth trend with an accuracy of up to 95% with computational load reduced by two orders of magnitude.


I. INTRODUCTION
While the prediction of mechanical failure due to fracture plays a crucial role in structural and mechanical engineering design, the search for a numerical model that can capture the details of crack propagation can be quite challenging. The fracture process creates new surfaces, which must be dynamically tracked in the solution scheme. Further, the deformation in the material, such as the elastic singularity at the crack tip, is complex and interacts nontrivially with crack path selection [1]. Over recent years, different numerical methods have been developed for predicting crack path The associate editor coordinating the review of this manuscript and approving it for publication was Agustin Leobardo Herrera-May . such as the extended finite element method (XFEM) [2], the phase field approach [3] and higher order gradient damage models [4]. The XFEM approach enriches the traditional FEM through improved handling of discontinuities. In contrast to XFEM, the phase field method introduces an auxiliary field variable that defines the discontinuity implicitly. As for the gradient damage models, domain triangulation is enabled for complex geometries through the use of rational Bezier triangles and Lagrange multipliers are brought in to elevate the smoothness of the solution to the desired order. All of these methods typically require a large computational effort. Intrusive model-based approaches such as proper orthogonal decomposition (POD) and discrete empirical interpolation method (DEIM) are normally used for model reduction of a parameterized problem [5], [6]. Due to their intrusive nature, they required modification of the complex simulation codes that describe the model. Unfortunately, this may in turn reduce the numerical stability and capability of the prediction model due to the vectorization of multidimensional solutions that are required for deriving the POD/ DEIM modes [7]. Moreover, their success is limited to linear and bilinear problems [8]- [11]. Very recent examples of the use of POD /DEIM for fracture dynamic problems can be found in the work of Rocha et al. [12] and Nguyen et al. [13]. Other available approaches for reduction of computational burden for the fracture problem are the use of a combination of simplified continuum damage mechanics models along with the critical plane approach and assumption of a fixed damage evolution rate for multiaxial loading conditions [14] and a data driven method (trained offline with a combination of high fidelity simulation data and condition monitored data) leveraging on traditional machine learning techniques such as the artificial neural network for failure prediction and design for manufacturing / operation [15].
In this work, a purely data-driven and non-intrusive approach is proposed for predicting the fracture dynamics in a rectangular slab with an initial line crack (Fig. 1), which is an extension of our work proposed in [16]. The slab is under the application of a uniaxial load at the bottom and the top. Our approach is based on combining the tensor train (TT) and Gaussian process regression (GPR) methods. The tensor train decomposition works by approximating a d-dimensional array A(i 1 , . . . , i d ) of size d i=1 n i as a product of three-dimensional arrays as [17], [18]: The 3D array, G k (i k ), of size r k−1 × n k × r k with element G k (α k−1 , i k , α k ), is known as the tensor core of the TT. The symbols α l , i l denote the indices of the array. It should be noted that the first and the last core are matrices. If n = max i=1,...,n n i and r = max i=1,...,n r i , the TT representation uses only O(dnr 2 ) elements to approximate an exponential number of elements in the full array ( d i=1 n i ). Thanks to its capability for dimensionality reduction, TT has been used for solving the elliptic stochastic equations with the Galerkin method [19], computational quantum chemistry problems [20] as well as identification of nonlinear systems with Volterra series [21]. It is worth noting here that the computational complexity of the TT scales linearly with the dimensionality of the problem (d).
The structure of this paper is as follows. Section II discusses the phase field model for fracture dynamics and examines how to solve the model using the traditional finite difference technique. The phase field model is solved for different parameter values of threshold energy (E th ) and crack length (l) and the collected solutions are used as the training data for the TT-GPR method. Section III describes the mathematical framework for the TT decomposition of a multidimensional array and applies it to the collected phase field data to enable a ''separation of variables'' representation of the data. The TT analysis significantly reduces the number of elements in the acquired output simulation data and also fully decouples the input variables. The Gaussian process models are then fitted to further reduce the dimensionality of the output from the TT analysis with respect to the parameters of the process. The fundamentals governing GPR are also included in this section. By substituting the value of the GPR model at a new parameter value into the TT representation, the approximation of the solution at any new parameter value can be easily obtained. Section IV presents the results and discussions associated with our approach for the crack growth problem and finally, the conclusions and further recommendations for study are presented in Section V.

II. PHASE FIELD MODEL FOR FRACTURE
The phase-field method (PFM) is a regularization scheme that represents sharp interfaces using a diffuse phase field. Although it has been most successful at modeling phase transformations, it has also been applied to brittle fracture by Karma et al. [22] and validated for complex loading by Pons and Karma [23]. In this approach, a field variable φ is introduced as a measure of damage that varies continuously from 0 (unbroken) to 1 (broken). Following Karma et al. [22], the variational free energy E is given as: where D is a gradient penalty coefficient and g (φ) = 4φ 3 − 3φ 4 is a smooth, monotonically increasing function. E th defines a local energetic failure criterion. The strain energy E el = λ 2 ii /2 + µ ij ij is taken from linear elasticity theory, where λ and µ are the elastic constants and ij is the strain tensor. We solve for the time evolution of φ by taking the variational derivative of Eqn. (1), ∂φ/∂t = M δE/δφ, where M can be interpreted as the mobility of the crack. We solve this equation using the forward-time centralspace finite difference method. At the same time, we enforce VOLUME 8, 2020 mechanical equilibrium using δE/δu i = 0, where u i is the displacement. This is solved using an iterative method.
In this work, we consider the strip geometry shown in Fig. 1, with a block of material having an initial pre-crack of length l, where φ = 0 in the pre-cracked area. We impose Mode-I loading by using displacement controlled boundary conditions at the top and bottom of the coupon. The left and right boundaries are taken to be traction-free.

III. TENSOR TRAIN-GAUSSIAN PROCESS MODEL FOR FRACTURE PROGNOSIS A. TENSOR TRAIN DECOMPOSITION
The phase field model for the problem depicted in Fig. 1 is solved for a given data set of parameters p i 1 where n 1 , n 2 , n 3 , n 4 are the number of training sets (combinations of parameter values of E th and l), {x,y} represent the grid in the spatial domain (we are considering the crack evolution in 2D space) and t refers to the time steps, respectively. While there are several approaches for carrying out TT decomposition, we resort here to the tensor train -singular value decomposition (TT -SVD) approach for the training data, as it is the simplest approach to use.

Algorithm 1
• Input: Initialize a temporary tensor B = A, r 0 = 1 Compute SVD decomposition of: Obtain the TT core: Step 3: The last core, G 4 = B Hence, the 4D data tensor can be approximated by where r = [r 1 , r 2 , r 3 , r 4 ] is a vector of rank, α and α = [α 1 , α 2 , α 3 ] is a vector of indices, numel denotes the number of elements of the array, and reshape is the standard function for reshaping an array in MATLAB . In other words, each entry A (i 1 , i 2 , i 3 , i 4 ) in the 4D dataset array is computed as the product of four matrices. This representation is illustrated pictorially in Fig. 2. The first TT core in Fig. 2 corresponds to the parameter space p = {E th , l} with size, n 1 × r 1 . We propose to use the GPR surrogate model for fitting the r 1 columns of the first core. One can replace the first core with just a single row vector, whose elements are the outputs of the r 1 surrogate models evaluated at any value of the test parameters, to get an approximate solution of the fracture dynamics for any value of in the parameter space. The details on the Gaussian process regression are briefly described in the following subsection.

B. GAUSSIAN PROCESS REGRESSION
Gaussian process regression is a popular Bayesian approach for surrogate modeling. It places a Gaussian distribution over the function of interest, i.e. the columns of the first core in this  work, using a mean function µ(p) and covariance function κ(·, ·, ξ ) with hyperparameter, ξ . The GPR is used to model the latent functions, f(p), whose values at the training point are the columns of the first core of the TT decomposition of the data in the previous subsection. The covariance function is used to characterize the correlation between the values of the latent function. Various covariance functions and their combinations can be found in Refs. [24]- [26]. In this study, the zero where ξ l = (ξ f l , ξ 1l , ξ 2l ) is the vector of hyperparameters.
Given a training set of parameters P ∈ R n 1 ×2 and outputs y (l) = G 1 (:, α l ), the log-likelihood logL(y|X, ξ l ) is computed as: log L(y|X, ξ l ) = log N (y (l) |0, K lyy ) K lyy = K lff + ξ f l I n 1 ×n 1 222260 VOLUME 8, 2020 where I is an identity matrix, N (y (l) |0, K yy ) denotes the conditional Gaussian distribution, K lyy , K lff , are matrices of size n 1 ×n 1 , and (·) ij denotes the element in the i th row and j th column of the matrix. The GPR model is trained by optimizing the log-likelihood with respect to the hyperparameters, ξ . The trained surrogate model can be used to make predictions at a new set of testing parameter collected into a matrix P * ∈ R ntest×2 as: and the reliability of prediction in the form of variance is given by: C * l = K l * * − K * f l K yly −1 K f l * + ξ f l I ntest×ntest (12) With where (·) T denotes the matrix transpose, superscript * denotes the quantity related the testing point, K * f l , K f l * are matrices of size n test × n test , and ntest is the number of testing points. More specific details on the GPR can be found in [24]. Based on the presented TT-GPR framework, the following section presents the results of our application of this framework to the phase field crack propagation training data sets. depicted as filled blue rectangles in Fig. 3. The results of the PFM analysis are stored into a 4D data tensor of size 9 × 41 × 82 × 128. The data tensor is decomposed using Algorithm 1 explained above in the previous section. The first core has nine columns. Here, the use of the TT method helped reduce the number of outputs for surrogate modeling from 41 × 82 × 128 = 430,336 to just r 1 = 9. We simply construct 9 GPR models for these 9 columns in the single row vector shown in the second illustration of Fig. 2. The MATLAB function fitrgp [27] is used for fitting these GPR models with ARD squared exponential covariance function. VOLUME 8, 2020  The dynamic evolution of fracture for one single point in the parameter set p = {E th = 0.9, l i = 0.5} (the center point in the parameter space shown in Fig. 3) from t = 10: 10: 120 time units is illustrated in Fig. 4. After similar simulations are run for all the other points in the parameter space (blue rectangles in Fig, 3), Fig. 5 plots the averaged absolute error with respect (over all the nine training points) to the phase field model (PFM) at t = 10, 20, . . . , 120 units. It can be seen that the proposed method can predict with insignificant error for the training points, as should be expected. Moreover, the proposed method is able to effectively capture the time and duration of the catastrophic nature of the fracture dynamics very well when the phase field is unchanged. Other conventional methods such as POD would fail to achieve this. The trained TT model is also used for making predictions at two testing points depicted by red asterisks in Fig. 3. The absolute error of the proposed method with respect to the PFM at the two testing points are shown in Figs. 6 and 7. The average computational time over the nine runs of PFM was about 7000 sec (∼2 hours). On the other hand, the computational time for one TT-GPR model run is as low as 9 sec on a basic Intel i5-6400T personal computer with 8 GB RAM. The overhead time for TT decomposition and fitting of the nine GPR models is around one minute. The TT decomposition is done by using the TT toolbox Ver 2.2 for MATLAB R [27]. The L 2 error defined below is used to quantify the accuracy of the method: PFM (p, x, y, t) − φ i TT-GPR (p, x, y, t) p, x, y, t)) 2 (14) where i = (i 1 , i 2 , i 3 , i 4 ). The L 2 error for Case 1 is given in Table 1. By analyzing Figures 5-7, it can be observed that for the two testing points (Figures 6 and 7), the proposed method gives a larger error than for average error at the training points ( Figure 5), which is expected. Looking into Table 1, an acceptable accuracy (error less than 5%) is achieved with our proposed methodology with a computational load that is two orders lower than the traditional PFM method. From Figures 6 and 7, it can be seen that the error is largest at the tip of the initial crack which is expected due to the sharp edge of the phase field at this point. Moreover, the errors at the tip are significantly larger than in Figure 5 as these results correspond to the testing set (which is an unseen situation of a different combination of the parameter values, E th and l).

IV. RESULTS AND DISCUSSIONS
In general, the TT-GPR provides acceptable accuracy at most time snapshots except for the time interval of [100, 110] time units, when the crack starts to propagate. It should be noted that the proposed model was obtained purely from the data without any physical knowledge about the governing partial differential equations. In other words, the results reported here are for a purely data-driven model that is able to capture the non-gradual dynamics of crack growth very well even with no physical knowledge embedded inside.   TT-GPR framework and compared with the conventional high fidelity PFM results at these two testing points. These evolutions can be used for the prognosis of the failure time of the slab. It is worth nothing here that the difference in predicted failure times using the two methods are just within two time units, which further proves the effectiveness of our approach. CASE 2 and CASE 3 -Cases 2 and 3 consider the training and testing sets as depicted in Fig. 9 with a lower number of training data (Case 1 had 9 training data sets uniformly distributed in the parameter space while Case 2 has 7 points and Case 3 has only 6 points). The evolution of the average absolute error for Cases 2 and 3 with respect to the PFM method at similar time instants to Case 1 is shown in Fig. 10 and Fig. 11, respectively. The corresponding L 2 prediction errors are again listed in Table 1. From Figs. 10 and 11 and the L2 prediction errors, it can be seen that the resulting error is still acceptable in these cases with a marginal increase in error of around 0.5-1%.
The results from Cases 1-3 provide convincing evidence that the TT-GPR method can provide good approximation  when the interpolation of parameters is considered i.e. the testing parameters lie within the space bounded by the training parameters. As expected, the accuracy of prediction by TT-GPR improves with more training data sets.

B. CASE STUDIES: EXTRAPOLATION IN PARAMETER SPACE
We present two further case studies in this subsection wherein the TT-GPR model will be used for predicting the trends of the phase field variable for values of parameters outside the trained parameter space to test the robustness of the methodology. The testing and training points for this case are depicted in Figs. 12 and 13, respectively. We consider two different cases here -one with extrapolation towards larger crack lengths (l/L), referred to as Case 4 and the other for extrapolation along the threshold energies (E th ), referred to as Case 5.
CASE 4 -The average absolute error over three different test points (represented by red asterisks in Fig. 12) at time moments t = 10, 20, . . . , 120 time units is shown in Fig. 14.
The L 2 prediction error for this case study seems to be as high as ∼12% (Table 1). This high error may be attributed to the fact that the model is very sensitive to the initial crack length. Therefore, unless the physical model of crack growth is embedded into the TT-GPR framework, it is hard to accurately describe the crack growth dynamics here outside the trained regime for the crack length parameter.
CASE 5 -The results of the prediction in terms of the average absolute error are plotted in Fig. 15 for the case of extrapolation along the dimension of threshold energy, E th . The prediction errors are acceptable in this case and the L 2 error, as listed in Table 1 is ∼ 3.45%. The threshold energy does not have the same influence on the damage phase field as the initial crack length. Therefore, extrapolation along this parameter dimension stills yields acceptable results.

V. CONCLUSIONS OF THE STUDY
In this work, we have proposed a fully data-driven approach for predicting crack propagation in a brittle material. The proposed method leveraged on the combined use of tensor train (TT) decomposition and Gaussian process regression (GPR). The tensor train helps in dimensionality reduction and decoupling of the data. In particular, the TT reduced the output dimension of the fracture phase field model from 41 × 82 × 128 to merely 6-9 outputs in our five case studies. These 6-9 outputs from the TT expansion correspond to the values in the parameter design space. We constructed GPR models for these corresponding outputs and used them to recover all of the output over the whole time and space domain. Our proposed method is fully non-intrusive, which is different from physics-based tensor train methods [19], [20] used for a different application.
The results showed that the proposed approach can give L 2 prediction errors lower than 5% for all cases except when there was an extrapolation needed in the dimension of the initial crack length. In all cases, the computational time of the proposed TT-GPR method was approximately two orders of magnitude lower than the PFM. For a small compromise in the error, the reduction in computational time is very significant here. Note however that this work only considers crack propagation in a simple geometry. The dynamics of fracture and the ability of TT-GPR to replicate it for more complex specimens and crack geometries accounting for stochastic inhomogeneity in the material makeup will be our subject of study in the immediate future. Most importantly, it is worth mentioning again here that the computational complexity of the data-driven TT approach here scales linearly with the dimension of the problem and this implies that our proposed method may be very beneficial for solving 3D and higher dimensional mechanics of materials problems of fracture, fatigue and delamination, which will be the focus of our future work. We expect the benefits of the TT framework here to be all the more apparent in higher dimensional cases, which possibly makes TT a useful tool for solving inverse design problems.