Compression of Pulsed Infrared Thermography Data With Unsupervised Learning for Nondestructive Evaluation of Additively Manufactured Metals

Additive manufacturing (AM) of high-strength metals, which is typically based on laser powder bed fusion (LPBF), can introduce microscopic pores in the AM metal. Pulsed Infrared Thermography (PIT) offers several advantages for nondestructive imaging of subsurface defects in AM structures because the method is one-sided, non-contact and scalable to structures of arbitrary size. However, high-resolution PIT imaging results in the generation of a large volume of thermography data (~TB), which creates challenges for the storage and transmission of data. Compression of thermography data requires an approach that achieves high data compression ratio while preserving weak thermal features corresponding to microscopic material defects. We investigate thermography data compression using several unsupervised learning (UL) algorithms, which include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Exploratory Factor Analysis (EFA), Sparse Dictionary Learning (SDL), and a novel lightweight Thermography Compressive Sparse Autoencoder (TCSA). Algorithms are benchmarked using PIT experimental data obtained from imaging of a stainless steel plate with calibrated porosity defects imprinted with AM process. For all algorithms, we obtain compression ratio >30 (highest compression of 46 is achieved with TCSA), and peak signal-to-noise ratio for reconstruction accuracy >73dB. Compared to existing methods, advantages of UL algorithms include achieving high compression ratio while preserving weak features to allow extraction of microscopic material defects from images. UL-based methods have general applicability because they are adaptable to compression of different data types, and allow for memory-efficient training and rapid on-line augmentation of the model.


I. INTRODUCTION
Additive manufacturing (AM) of metals is an emerging method for cost-efficient fabrication of low-volume complex-shape structures. In particular, AM provides the option of manufacturing custom-shape structures from high-strength superalloys, such as stainless steel 316 and Inconel 718, which are difficult to machine with conventional methods.
The associate editor coordinating the review of this manuscript and approving it for publication was Pedro Neto.
AM structures with minimal welds, as compared to conventionally produced ones, potentially offer longer service life in the high-temperature corrosive environment of nuclear reactors [1]. AM of high-strength metals, which have melting temperatures above 1300 °C, is currently based on the Laser Powder-Bed Fusion (LPBF) method [2]. Because of the intrinsic features of LPBF, keyhole and lack-of-fusion microscopic pores can appear in the AM metal [3]. Before deployment in a nuclear reactor, nondestructive evaluation (NDE) of an AM structure needs to be performed to identify possible flaws. Because the long-term behavior of AM metals in a nuclear environment is not known, the condition of AM structures needs to be monitored through in-service NDE inspections.
In principle, X-ray computed tomography (XCT) [4] can provide high resolution imaging of metals. However, XCT requires symmetric body of revolution shapes, and penetration depth is limited to distances on the order of a centimeter. In addition, XCT imaging resolution and structure size are inversely proportional. Neutron tomography allows for longer penetration depth, but has a potential negative side effect of activating the metal. Ultrasonic NDE is scalable to arbitrary structure sizes and shapes, but requires direct contact of an ultrasonic probe with structure surface. Because of the rough surface finish of AM metals, ultrasonic NDE also faces challenges. Traditional ultrasonic testing methods are more sensitive to incomplete penetration infusion defects in AM materials, and less sensitive to pores and subsurface defects [5]. Laser ultrasonic inspection performs better in the detection of pores [6]. However, surface roughness is an impediment for accurate detection of defects. Eddy current imaging is frequently used in nuclear reactor in-service NDE applications because the inductive probes are non-contact and resilient to the harsh environment. Eddy current testing has the potential to detect subsurface pores, but measurements could be affected by temperature and surface irregularities [7]. In addition, Eddy current imaging typically requires time-consuming raster scanning with a single probe. Imaging with inductive probe arrays suffers from lower resolution, as compared to single probe imaging.
The Pulsed Infrared Thermography (PIT) method offers several advantages for NDE of subsurface defects in actual AM metallic structures because the method is non-contact, one-sided, scalable to arbitrary structure size, and uses a megapixel detector array for imaging [8], [9]. PIT involves rapid deposition of a heat pulse on the material surface, followed by recording sequential measurements of the surface temperature distribution with a fast-frame infrared (IR) camera. As heat diffuses into the material bulk, thermal resistance of internal pores results in locally slower heat decay on the material surface. Thus, information about internal material defects is extracted from the time-dependent stack of thermography frames. While PIT is limited to detection of sub-surface pores, these pores are more likely to lead to crack formation compared to the internal ones [10], [11].
Detection of internal microscopic defects with thermal signatures comparable to camera noise level requires imaging of large structures with spatial resolution on the order of tens of microns. At each spatial location, hundreds of frames are acquired with a high-speed IR camera, which produces volumes of imaging data. Generating a massive amount of data (∼TB) can overwhelm IR camera storage capacity, which could limit the amount of imaging during the in-service inspection. Analysis of NDE data may involve sharing data sets between remote users located outside of the nuclear facility. Transferring a large volume of imaging data over a network could be excessively time-consuming and not feasible for practical applications. Compression of data for rapid transmission could remedy this problem. However, the challenge is to achieve a reasonably high compression ratio while preserving weak features of interest with amplitudes close to noise equivalent temperature difference (NETD) detection limit.
In this paper, we investigate data compression of PIT images using several unsupervised learning (UL) algorithms. UL is a subset of machine learning (ML), which aims to self-discover latent patterns in unlabeled data with minimal human supervision [12]. The advantage of using UL includes minimizing the workload to prepare and label the training dataset. Performance of several UL models is benchmarked using PIT data obtained from measurements of AM metallic structure with imprinted calibrated defects. Note that a similar problem of rapid processing of a large volume of imaging data arises during in-situ monitoring of LPBF process with IR thermography. While thermography data compression methods discussed in this paper are demonstrated for PIT measurements, these methods are general and applicable to in-situ monitoring applications as well [3], [13], [14].
This paper is organized as follows. Section II provides an overview of thermography data compression. In Section III, we describe the PIT system used for imaging of an AM stainless steel specimen with calibrated imprinted internal defects. Section IV describes the principles of several UL algorithms for PIT data compression. Section V benchmarks the performance of UL algorithms in compression of PIT images. Section VI contains conclusions of this paper.

II. OVERVIEW OF THERMOGRAPHY DATA COMPRESSION
Data compression approaches for ultrasonic, Eddy current, and thermography NDE imaging data have been investigated recently as part of signal processing strategies. Data compression algorithms considered for ultrasonic NDE applications include the Walsh-Hadamard Transform (WHT), Discrete Cosine Transform (DCT), and Discrete Wavelet Transform (DWT) [15]-[19]. The WHT, a unitary and orthogonal transform composed of rectangular waveforms, can be readily implemented with existing libraries of numerical routines. However, this method suffers from low reconstruction accuracy. The DCT method represents data with a sum of cosine functions. The DWT method correlates the input ultrasonic signals with wavelet kernels for data compression. The DCT and DWT methods can achieve a high compression ratio for ultrasonic data. More recently, machine learning algorithms, such as Wavelet Packet Transformation Convolutional Autoencoders (WPTCAE) and UL, have been used to compress ultrasonic data [20], [21]. The WPTCAE method uses the wavelet packet transformation for signal decomposition and sub-band elimination to compress data, optimized by convolutional autoencoders to find the best wavelet kernel. The UL method accomplishes data compression by learning principal latent patterns and removing redundant information in data. A high compression ratio of ultrasonic data has been demonstrated through the application of machine learning algorithms. In Eddy current imaging, compressive sensing [22] and Principal Component Analysis (PCA) [23] were used to compress data. Compressed data was subsequently reconstructed, and detection of subsurface material defects in images of metal plates was demonstrated [24], [25].
VOLUME 10, 2022
In thermography NDE, a lossless compression algorithm was developed, which combined multiple coding mechanisms to compress thermographic images with a 6.5 compression ratio [26]. For active thermography, Thermographic Signal Reconstruction (TSR) method was proposed, which combined data compression with defect detection [27]. TSR involves curve fitting for temperature transient signals with a fifth to eighth-degree polynomial on a log-log scale. Active thermography data was compressed by storing only the estimated polynomial coefficients for each temperature signal. Thermographic sequences fitting based on genetic and differential evolution algorithms were proposed, which improved compression performance by using few fitting coefficients to replace temperature signals [28]. However, this iterative algorithm required a large amount of computation and a long processing time.
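The TSR idea described above (store polynomial coefficients of the log-log transient instead of the samples) can be illustrated with a minimal sketch. The function names `tsr_compress` and `tsr_reconstruct`, the synthetic t^(-1/2) transient, and the choice of a fifth-degree fit are assumptions made for illustration; this is not the implementation from [27].

```python
import numpy as np

def tsr_compress(t, temp, degree=5):
    """Fit ln(T) against ln(t) with a polynomial and return its coefficients."""
    return np.polyfit(np.log(t), np.log(temp), degree)

def tsr_reconstruct(t, coeffs):
    """Rebuild the temperature transient from the stored coefficients."""
    return np.exp(np.polyval(coeffs, np.log(t)))

# Synthetic single-pixel transient (assumption): 1200 frames at 181 Hz,
# ideal t^(-1/2) surface cooling after a flash pulse, arbitrary units.
t = np.arange(1, 1201) / 181.0
temp = 2.0 / np.sqrt(t)

coeffs = tsr_compress(t, temp, degree=5)
recon = tsr_reconstruct(t, coeffs)
max_rel_err = float(np.max(np.abs(recon - temp) / temp))
samples_per_coeff = temp.size / coeffs.size  # 1200 samples -> 6 coefficients
```

The ratio here counts values only; the byte-level ratio also depends on the precision used to store the coefficients versus the raw camera counts.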
A space/time mapping (STM)-JPEG (Joint Photographic Experts Group) algorithm was developed to compress pulsed thermographic data for the detection of subsurface defects in a polyethylene plate [29]. Data compression was accomplished by reducing the redundancy in temperature values for each thermography image. This was achieved by linearly mapping temperature values in high-bit (14-bit or 16-bit) into the low-bit (8-bit), followed by further image compression with JPEG algorithm. The STM-JPEG method resulted in a high compression ratio ranging from 24 to 55. However, subsurface material defects were not detectable in decompressed thermography images.
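The linear high-bit to 8-bit mapping step can be sketched as follows. This covers only the requantization stage, not the JPEG stage or the full STM scheme of [29], and the per-frame min/max scaling and the helper names `map_to_8bit` and `map_back` are assumptions. The sketch shows that the mapping introduces a quantization error of up to half a step, so any feature smaller than that step cannot be represented after mapping.

```python
import numpy as np

def map_to_8bit(frame16):
    """Linearly map a high-bit frame onto 0..255 using its own min and max."""
    lo, hi = int(frame16.min()), int(frame16.max())
    span = max(hi - lo, 1)  # avoid division by zero for a flat frame
    scaled = (frame16.astype(np.float64) - lo) * 255.0 / span
    return np.round(scaled).astype(np.uint8), lo, span

def map_back(frame8, lo, span):
    """Invert the linear mapping (the rounding loss is not recoverable)."""
    return frame8.astype(np.float64) * span / 255.0 + lo

# Synthetic 16-bit counts (assumption), spanning 0..4998.
frame16 = np.arange(0, 5000, 7, dtype=np.uint16)
frame8, lo, span = map_to_8bit(frame16)
restored = map_back(frame8, lo, span)

step = span / 255.0  # size of one 8-bit level in original counts
max_err = float(np.max(np.abs(restored - frame16)))
```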
An approach based on the virtual wave concept was developed to estimate specimen thickness from pulsed thermographic data [30]. The virtual wave method uses a local transformation kernel to convert "thermal waves" (observed thermographic data) into virtual acoustic waves. Virtual acoustic waves were analyzed with ultrasound reconstruction algorithms, such as the frequency domain synthetic aperture focusing technique, to eliminate the virtual time dimension for data compression. However, the thermal to ultrasonic conversion process increased algorithm runtime and potentially led to the loss of information in thermographic images. This reduced the effectiveness of the virtual wave method for applications to the detection of weak thermal signals with amplitude near the NETD detection limit, and signal-to-noise ratio SNR < 1.
A pulse-compression method was developed to detect defects with step heating thermography using halogen lamps [31]. Material defects were detected by convolving the acquired thermograms with a matched filter to estimate the true impulse response. This resulted in an increase of SNR, which led to higher compression of thermography data without loss of information related to detection of defects. However, the drawback of the pulse-compression method was that fidelity of the impulse response reconstruction was affected by numerical noise. In addition, experimental measurement of background was required, which increased the complexity of method implementation. As an extension of the pulse-compression method, Barker-coded thermal wave imaging was proposed to evaluate defects in steel material [32]. A shorter Barker code (7-bit) was used to process thermograms in the pulse-compression algorithm, which increased the thermography data compression ratio.
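The property that makes a Barker code useful for matched filtering is its sharply peaked autocorrelation. The short sketch below computes it for the standard 7-bit Barker sequence; it illustrates the code itself and is not a reproduction of the thermal processing in [32].

```python
import numpy as np

# Standard 7-bit Barker sequence.
barker7 = np.array([1, 1, 1, -1, -1, 1, -1], dtype=float)

# Matched filtering a signal with its own code is an autocorrelation.
acf = np.correlate(barker7, barker7, mode="full")

peak = float(acf.max())                       # main lobe at zero lag
sidelobes = np.delete(acf, int(acf.argmax())) # every other lag
max_sidelobe = float(np.max(np.abs(sidelobes)))
```

The main lobe equals the code length while every sidelobe has magnitude at most one, so the correlation output approximates a single narrow pulse.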
A data-processing algorithm for stepped thermography was developed to detect subsurface defects of carbon fiber reinforced polymer (CFRP) [33]. This algorithm outperformed TSR in data compression by using fewer fit parameters. Newton's law of cooling was used to compress thermography images with a 98.88% compression ratio by fitting transient temperature signals using Gauss-Newton algorithm. Data compression was implemented by storing the estimated polynomial coefficients for each temperature signal. However, when using the reconstructed temperature matrix, not all material defects were detectable in thermography images due to the loss of information in reconstruction. Additional sixteen matrices with temperature information and coefficients need to be stored. This increased complexity of using this algorithm.
An adaptive algorithm was developed based on the lifting discrete wavelet transform with set partitioning embedded blocks to verify the viability of image compression in Vibrothermography [34], [35]. This algorithm efficiently orders wavelet coefficients by significance and concentrates sets with high energy in the transformed domain. This allows signals with high information to be condensed based on their energy content. Using this method, thermography images were compressed with a compression ratio of 14. High reconstruction accuracy with a mean squared error (MSE) of 13.9 of thermal signals corresponding to material defects was demonstrated. However, the structures imaged in this study contained large (mm-size) defects. Compression of images of structures with microscopic material defects might not be equally viable.
ML has been used in thermography image analysis for in-situ monitoring during 3D printing and for non-destructive evaluation of the structure after manufacturing [36]-[42]. However, to the best of our knowledge, ML, and UL in particular, have not been used for the compression of thermography data. In general, UL methods include clustering analysis [43], latent variable model learning [44], and neural networks [45]. In clustering analysis, such as k-means clustering and hierarchical clustering [43], the approach is to learn latent features in data, and then cluster samples with similar attributes into the same group. In latent variable model learning, such as PCA [46], Independent Component Analysis (ICA) [47], and Blind Source Separation (BSS) [48], the approach is to learn the principal latent patterns in the data by linearly projecting data samples into a new representation space. This results in data dimensionality reduction, where in the new space a few latent patterns can represent most of the information in the data. Therefore, redundant information and random noise are removed, and a few latent patterns can be used to compress and reconstruct data sets. In non-linear latent variable model learning, such as kernel-PCA [49] and manifold learning [50], the approach is to learn the non-linear latent structure in the data and to generate optimized latent patterns to represent data. However, non-linear methods are time-consuming and require a large memory space to train non-linear models.
Recently, neural networks (NN) have emerged which learn the latent representation in datasets by utilizing the multilayer perceptron architecture. One popular approach is the autoencoder [51], which is an unsupervised NN that aims to learn the principal latent representation in data while maintaining the maximal similarity between input data and reconstructed data. The autoencoder consists of an encoder and a decoder block. The encoder learns to encode the input training data into a low-dimensional representation. The decoder learns to reconstruct the data to be as close to the input data as possible by using the compressed representation from the encoder. There exist several types of autoencoders. Among the regularized autoencoders, the sparse autoencoder enforces a sparsity constraint in training the neural network [51]. This constraint allows only a few hidden neurons to be active simultaneously, which enables the autoencoder to learn principal patterns and a richer representation in datasets. The denoising autoencoder is trained to learn a good representation of noisy input data, and then recover the original undistorted data [52]. The variational autoencoder [53] regularizes the training process to ensure the latent space has good generative properties by encoding the input data as a distribution over the latent space.
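The encoder/decoder structure and the sparsity constraint can be sketched as a single forward pass with a penalized loss. This is a generic illustration, not the TCSA architecture: the single hidden layer, the ReLU activation, the L1 penalty on the hidden code (a KL-divergence penalty is another common choice), and all names and sizes are assumptions.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def sparse_autoencoder_loss(x, W_enc, b_enc, W_dec, b_dec, l1_weight=1e-3):
    """Return (loss, code, reconstruction) for one batch of row-vector samples."""
    code = relu(x @ W_enc + b_enc)        # encoder: low-dimensional representation
    x_hat = code @ W_dec + b_dec          # decoder: reconstruction of the input
    mse = np.mean((x - x_hat) ** 2)       # reconstruction term
    sparsity = np.mean(np.abs(code))      # L1 penalty keeps few neurons active
    return mse + l1_weight * sparsity, code, x_hat

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))              # 4 samples, 16 features (hypothetical sizes)
W_enc = rng.normal(scale=0.1, size=(16, 3))
W_dec = rng.normal(scale=0.1, size=(3, 16))
loss, code, x_hat = sparse_autoencoder_loss(x, W_enc, np.zeros(3), W_dec, np.zeros(16))
```

Training would minimize this loss over the weights with a gradient-based optimizer, which is omitted here.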
Autoencoders have been used for classification, data augmentation, and data compression [51]. For lossy image compression, the recurrent neural network (RNN) [54] and the optimized compressive autoencoder [55], which uses 2D convolution and a sub-pixel architecture, have been used to demonstrate data compression benchmarks comparable with those of JPEG 2000. For time series data compression, the Long Short Term Memory (LSTM)-autoencoder [56] and temporal convolutional networks [57] have been demonstrated to achieve compression ratios and efficient encodings that are comparable to those of standard methods. Compared with standard compression methods, autoencoders are more flexible in applications to compression of various data types. However, autoencoders such as the LSTM-autoencoder, convolutional autoencoder, or deep autoencoder are computationally expensive.
In this paper, we present a comprehensive benchmark study of thermography data compression with UL. A novel lightweight thermography compressive sparse autoencoder (TCSA) neural network is proposed, which is demonstrated to result in a high compression ratio of 46. Compared with the state-of-the-art methods in thermography data compression, UL algorithms offer several advantages. These include achieving high compression ratio with UL, while preserving weak features to allow extraction of microscopic material defects from images. UL-based methods have general applicability because they are adaptable to compression of different data types, and allow for memory-efficient training and rapid on-line augmentation of the model.

III. PULSED INFRARED THERMOGRAPHY IMAGING OF CALIBRATED IMPRINTED DEFECTS IN ADDITIVELY MANUFACTURED METALLIC PLATE

A. PULSED INFRARED THERMAL IMAGING SYSTEM
A schematic representation of the PIT imaging system, data compression and analysis of thermography imaging data is shown in Figure 1. In the PIT imaging method, a pulse trigger initiates a high-energy capacitor discharge through a white light flash lamp to deposit a heat pulse on the specimen surface. As heat diffuses into the bulk of the material, the megapixel fast-frame IR camera synchronized to the pulse trigger starts recording specimen blackbody radiation. The photo-counts can be converted into time-resolved images of temperature distribution T(x,y,t) on the specimen surface. The stack of recorded thermography images (thermograms data cube) is further processed with the image analyzer unit for data compression, reconstruction, and detection of material flaws in images. The photograph of the experimental PIT imaging laboratory setup is shown in Figure 2. A white light flash lamp powered by a Balcar ASYM 6400 capacitor source delivers a thermal pulse of 6.4 kJ/2ms to the specimen surface. Flash light is collimated with a flat lens. To increase absorption of heat, the surface of the material is usually painted with washable graphite black paint. Imaging is performed with a high-speed mid-wave IR (3-5µm) cooled detector array camera (FLIR X8501sc) with NETD = 20mK. The X8501sc model provides a maximum spatial resolution of 1280 × 1024 pixels with a frame rate of 181Hz at full window size.

B. IMAGING OF AM METALLIC SPECIMEN WITH IMPRINTED CALIBRATED DEFECTS
A stainless steel 316 (SS316) plate was fabricated with the LPBF method using an EOS metal 3D printer. The dimensions of this plate are 76mm × 76mm × 3mm (length × width × thickness). A set of calibrated defects containing un-sintered metallic powder and consisting of hemispherical porosity regions were imprinted into this SS316 plate. These defects were imprinted during the fabrication using an STL (stereolithography) file with a drawing of the pattern of hemispherical inclusions, which is shown in 3D rendering in Figure 3(a). Photographs of the front and side view of the AM printed SS316 plate are shown in Figure 3(b). The SS316 plate was imaged with a PIT system using camera settings of 181Hz frame rate and 1280 × 1024 pixels full imaging frame. A total of 1200 frames were acquired for a total imaging time of approximately 6.6s. The IR camera was fitted with a 50mm lens and 2.54cm (1in) extender ring to give an imaging spatial resolution of 25µm/pixel. The imaged area of the plate was approximately 32mm × 26mm. Imaging settings were determined through experimental optimization in the imaging parameters space. The size of the recorded data cube of 1280 × 1024 × 1200 pixels was 2.93GB on hard drive storage. To simplify the calculations, a smaller data cube of 720 × 864 × 1200 pixels was selected, which takes up 1.39GB of storage memory. The segmented data cube contains an image of the SS316 plate section with dimensions 18mm × 22mm. Figure 4(a) shows the drawing with labels indicating diameters and depths of the imprinted defects (distance from the top of hemisphere to plate surface). The pattern of defects is a grid with diameters ∅ = 1, 0.75, 0.5, and 0.25mm, and depths d = 1, 0.75, 0.5 and 0.25mm. Distances between nearest defects in both horizontal and vertical directions are 12mm. Distances from the defects in the last row and last column to plate edge are 20mm. The diameters of defects decrease along the horizontal direction from the left to right. 
Defect depths increase along the vertical direction from top to bottom. The red wireframe box with dashed lines indicates the target imaging area (4 defects with diameters ∅ = 1 and 0.75 mm and depths d = 0.5 and 0.25 mm). Reconstruction of these defects is shown in Figure 4(d).
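The storage figures and imaging time quoted above follow directly from the frame size, frame count, and frame rate. The sketch below reproduces the arithmetic, assuming 16-bit pixels (as stated for the data cube in Section IV) and binary gigabytes (2^30 bytes); the helper name `cube_size_gib` is hypothetical.

```python
def cube_size_gib(width, height, frames, bits_per_pixel=16):
    """Raw size of a thermogram data cube in GiB (2**30 bytes)."""
    n_bytes = width * height * frames * bits_per_pixel // 8
    return n_bytes / 2**30

full_gib = cube_size_gib(1280, 1024, 1200)  # full camera frame
crop_gib = cube_size_gib(720, 864, 1200)    # cropped region used for benchmarking
duration_s = 1200 / 181                     # number of frames / frame rate

print(f"{full_gib:.2f} GiB, {crop_gib:.2f} GiB, {duration_s:.2f} s")
```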
Results of imaging the plate with 75keV transmission X-rays are shown in Figure 4(b). Spatial resolution of X-ray detector is 30µm. Defects with diameters ∅ = 1 and 0.75mm, and depths d = 1, 0.75, 0.5, and 0.25mm are visible, while there is insufficient contrast in the image to observe defects with diameters ∅ = 0.5 and 0.25mm and depths d = 1, 0.75, 0.5 and 0.25mm. The red wireframe box in Figure 4(b) indicates location of defects visualized in Figure 4(d). Figure 4(c) shows an example of a recorded raw thermogram with 720 × 864 pixels. None of the material defects are visible in the raw recorded PIT data. Visualization of reconstructed defects in SS316 specimen using previously developed Neural Learning-Based Blind Source Separation (NLBSS) algorithm is shown in Figure 4(d) [48].

IV. UNSUPERVISED LEARNING FOR THERMOGRAPHY DATA COMPRESSION AND RECONSTRUCTION
Compression of thermography data cube (720 × 864 × 1200 3D matrix with 16-bit elements described in Section III) was investigated using several UL models. The UL algorithms include PCA, ICA, Exploratory Factor Analysis (EFA), Sparse Dictionary Learning (SDL), and a novel Thermography Compressive Sparse Autoencoder (TCSA) NN. The flowchart of data compression and reconstruction procedure, which was followed for benchmarking performance of all UL models, is shown in Figure 5. Each UL model was trained with observed thermography data until convergence criteria were met. During training, the thermography data cube was transformed into a condensed 2D data matrix. Efficient data compression is achieved by learning principal features and removing redundant information from data through training and development of principal dictionaries. Following compression, data were reconstructed with the UL model developed during training. Fidelity of data reconstruction was evaluated by recovering images of defects with the NLBSS algorithm. Performance of UL models is benchmarked using data compression ratio, reconstruction accuracy, training time, reconstruction time, and visibility of material defects in images processed with the NLBSS algorithm.
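The bookkeeping behind the benchmark (flattening the cube into a 2D matrix, counting stored values, and scoring reconstruction) can be sketched as below. The row/column layout (one row per pixel, one column per frame), the value-count definition of compression ratio, and the PSNR peak value of a 16-bit range are assumptions, since the paper's exact definitions are not given in this section.

```python
import numpy as np

def cube_to_matrix(cube):
    """(height, width, frames) -> (height*width, frames): one row per pixel transient."""
    height, width, frames = cube.shape
    return cube.reshape(height * width, frames)

def compression_ratio(n_original_values, n_stored_values):
    """Ratio of value counts; ignores differences in numeric precision."""
    return n_original_values / n_stored_values

def psnr_db(original, reconstructed, peak):
    """Peak signal-to-noise ratio in dB (undefined for a perfect reconstruction)."""
    err = np.asarray(original, dtype=np.float64) - np.asarray(reconstructed, dtype=np.float64)
    return 10.0 * np.log10(peak**2 / np.mean(err**2))

# Small stand-in cube (assumption); the real cube is 720 x 864 x 1200.
cube = np.zeros((6, 8, 10), dtype=np.uint16)
X = cube_to_matrix(cube)

# Hypothetical model with k latent patterns: store per-pixel codes and the patterns.
k = 2
ratio = compression_ratio(X.size, X.shape[0] * k + k * X.shape[1])

# A uniform error of one count on a 16-bit scale.
psnr = psnr_db(cube, cube + 1.0, peak=65535.0)
```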

A. PRINCIPAL COMPONENT ANALYSIS (PCA)
The PCA method [23] is used to find latent patterns in high-dimensional data, and to represent the data with fewer orthogonal dimensions, which are called principal components (PCs). These PCs are obtained by maximizing the variance of training data, and minimizing the MSE between the original data and reconstructed data. The largest variance of the data is contained in the first principal component. Each subsequent PC has an incrementally decreasing contribution to total data variance. In this study, the PCA was trained to compress thermography data with fewer PCs by removing redundant information. The equations used to train the PCA are

$$\min_{\theta}\ \mathrm{MSE}(X, \hat{X}) = \min_{\theta}\ \frac{1}{N}\,\lVert X - \hat{X} \rVert_F^2 \tag{1}$$

$$\theta^{T}\theta = I \tag{2}$$

Equation (1) shows the objective function, which is the MSE between the observed thermography data X and the reconstructed thermography data $\hat{X}$, where N is the number of elements in X. This function is minimized to guarantee the reconstruction accuracy and maximize the variance in each principal component θ, which is constrained to be orthonormal according to Equation (2), where I is the identity matrix. During training, the Lagrange multiplier method [58] is used to iteratively update θ to find the minimum of the objective function subject to Equation (2). If the convergence criterion is not satisfied, meaning that the MSE exceeds the desired tolerance and θ does not yet maximize the variance in the data, the current θ is used to initialize another iteration. Otherwise, we apply the learned PCs to compress and reconstruct thermography data.
To further optimize memory efficiency and computation in training, we also trained the Incremental-PCA [59] model using a mini-batch fashion to compress thermography data. The Incremental-PCA is flexible enough to dynamically adapt to new patterns in data for in-service NDE and online AM process monitoring. Incremental-PCA training was implemented using the singular value decomposition (SVD) [48] algorithm to find principal latent features in acquired thermography data. The SVD closely resembles PCA but suffers less from numerical noise because the covariance matrix does not need to be calculated. Also, dividing the massive thermography data into batches for training allows for more memory-efficient training compared to that for regular PCA, since memory complexity is constant.
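A minimal batch sketch of PCA compression and reconstruction through the SVD is given below. It is not the incremental (mini-batch) variant and not the authors' code; the function names, the synthetic data, and the row-per-pixel layout are assumptions. The principal components are the leading right singular vectors of the centered data, so no covariance matrix is formed.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (orthonormal rows)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_compress(X, mean, components):
    """Project centered data onto the components: one short code per row."""
    return (X - mean) @ components.T

def pca_reconstruct(codes, mean, components):
    """Map codes back to the original space."""
    return codes @ components + mean

# Synthetic stand-in (assumption): 200 pixel transients of 50 frames each,
# generated from 3 temporal patterns plus weak noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 50)) + 0.01 * rng.normal(size=(200, 50))

mean, components = pca_fit(X, n_components=3)
codes = pca_compress(X, mean, components)
X_hat = pca_reconstruct(codes, mean, components)
mse = float(np.mean((X - X_hat) ** 2))
```

Only `codes`, `components`, and `mean` need to be stored, which is where the compression comes from when the number of components is much smaller than the number of frames.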

B. INDEPENDENT COMPONENT ANALYSIS (ICA)
The ICA method [47] is used to separate multivariate mixed signals into clusters of independent subcomponents. These independent subcomponents are trained by maximizing the non-Gaussianity of the training data, while ensuring that the subcomponents are uncorrelated. During training, high-dimensional thermography data is clustered into independent subcomponents with low dimensions. Thermography data for a particular measurement can display a non-Gaussian distribution. However, the distribution of the sum of N temperature measurements approaches a Gaussian distribution as N → ∞ regardless of the distribution of each temperature measurement [47]. The ICA is implemented according to the following equations:

$$\hat{S} = W X^{T} \approx S \tag{3}$$

$$E\{\hat{S}\hat{S}^{T}\} = I \tag{4}$$

In Equation (3), X is the matrix of the thermography data, W is the separation matrix, and S and $\hat{S}$ represent the source signals and the estimated low-dimensional independent components, respectively. The separation matrix W is trained to separate data into low-dimensional independent components while maximizing non-Gaussianity. The estimated independent components are constrained to be uncorrelated by Equation (4), where I is the identity matrix. Next, we applied the fast fixed-point optimization algorithm to search for the direction that maximizes non-Gaussianity in training. This involves finding a unit vector $w_i$, where $w_i^{T}$ is the ith row vector of the separation matrix W, so that the projection $w_i^{T} X^{T}$ maximizes non-Gaussianity. Unit vectors $w_i$ are updated using the learning rule to estimate different independent components. Non-Gaussianity is measured with negentropy [47] according to:

$$O(w) = \left[ E\{G(w^{T} X^{T})\} - E\{G(g)\} \right]^{2} \tag{5}$$

In Equation (5), O(w) is the objective function used to estimate the negentropy, which measures non-Gaussianity. G is the contrast function used to optimize the training performance, and g is a Gaussian variable with zero mean and unit variance.
We then apply the Newton-Raphson [60] method to iteratively update the unit vector w to maximize the objective function O(w) under the constraint of Equation (4). Implementation details of the algorithm to estimate and update the unit vector w can be found in [47]. If the convergence is satisfied, which occurs when the vectors w in the current and previous iteration point in the same direction within a specified tolerance, we use vector w to estimate the independent component.
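The one-unit fixed-point update can be sketched as below for whitened data. The contrast function G(u) = log cosh(u) (with derivative tanh), the whitening step, the helper names, and the two synthetic sources are assumptions chosen for illustration; details of the method actually used are in [47]. The stopping rule is the one described above: the new and previous vectors point in the same direction (up to sign) within a tolerance.

```python
import numpy as np

def whiten(X):
    """Center the rows-as-samples data and rescale so its covariance is the identity."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / Xc.shape[0])
    return Xc @ eigvec / np.sqrt(eigval)

def fastica_one_unit(Z, n_iter=200, tol=1e-8, seed=0):
    """Estimate one unit vector w so that Z @ w is maximally non-Gaussian."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=Z.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        u = Z @ w
        g = np.tanh(u)                 # derivative of G(u) = log cosh(u)
        g_prime = 1.0 - g**2
        # Approximate Newton step on the contrast, then renormalize.
        w_new = (Z * g[:, None]).mean(axis=0) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < tol
        w = w_new
        if converged:
            break
    return w

# Two synthetic non-Gaussian sources (assumption), linearly mixed.
rng = np.random.default_rng(1)
S = np.column_stack([rng.uniform(-1, 1, 5000), rng.laplace(size=5000)])
X = S @ np.array([[1.0, 0.5], [0.3, 1.0]]).T

w = fastica_one_unit(whiten(X))
estimate = whiten(X) @ w
best_match = max(abs(np.corrcoef(estimate, S[:, j])[0, 1]) for j in range(2))
```

Further components would be obtained by repeating the iteration while decorrelating each new vector against those already found, which is omitted here.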

C. EXPLORATORY FACTOR ANALYSIS (EFA)
The EFA method [61] is used to find a linear transformation to lower-dimensional latent factors in data by removing redundant information and random noise. The main advantage is that EFA can model variance in every direction of the input space independently, which adds flexibility in training the latent factors. In addition, EFA models perform better in the presence of heteroscedastic noise. In this study, EFA is used to train the latent factors in measured thermography data. The flowchart of the EFA algorithm is shown in Figure 6. The maximum likelihood estimation [47] is used as the objective function, and the expectation-maximization (EM) algorithm [62] is used to search for the optimal solution for the objective function. The EM iteration alternates between conducting an expectation (E) step and a maximization (M) step. In the E-step, a lower bound function is created to calculate the expectation of the log-likelihood using the current estimate for the parameters. In Equations (6) and (7), Λ is the factor loading matrix that contains the latent factors that need to be learned. This loading matrix maps the thermography data x from the high dimension into the low dimension for compression. Here Ψ is a diagonal matrix that consists of the sensor noise variances, µ is the mean of the data, S is the covariance of the data, N is the number of samples used for training, and C is a constant that does not depend on Λ and Ψ. The SUM and DET operations represent the calculation of the sum of diagonal elements and the determinant of the matrix, respectively. The E-step function is given by Equation (8). The M-step updates the parameters by maximizing the expectation of the log-likelihood found in the E-step. The closed-form update equations are given by Equations (9) and (10), which yield the newly updated parameters Λ and Ψ; in these equations, I is the identity matrix, and the operation DIAG sets the off-diagonal elements of the matrix to zero.
Convergence criterion is satisfied when the fractional change between the updated average log-likelihood and the one from the previous iteration is smaller than some specified tolerance. Following convergence, learned parameters for estimating the latent features are used to compress thermography data.
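The EM loop for factor analysis can be sketched with the standard textbook closed-form updates. These are assumed here to correspond to the updates described above, but they are not copied from the paper; the names `fa_em` and `fa_log_likelihood`, the initialization, and the synthetic data are hypothetical. The loop stops on the fractional change of the average log-likelihood.

```python
import numpy as np

def fa_log_likelihood(S, Lam, psi):
    """Average Gaussian log-likelihood for covariance Lam Lam^T + diag(psi)."""
    d = S.shape[0]
    Sigma = Lam @ Lam.T + np.diag(psi)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + np.trace(np.linalg.solve(Sigma, S)))

def fa_em(X, n_factors, n_iter=500, tol=1e-6, seed=0):
    """Fit loading matrix Lam and diagonal noise variances psi with EM."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / Xc.shape[0]                      # sample covariance
    Lam = rng.normal(scale=0.1, size=(S.shape[0], n_factors))
    psi = np.diag(S).copy()
    eye = np.eye(n_factors)
    history = [fa_log_likelihood(S, Lam, psi)]
    for _ in range(n_iter):
        # E-step: posterior statistics of the latent factors.
        Sigma = Lam @ Lam.T + np.diag(psi)
        beta = np.linalg.solve(Sigma, Lam).T         # Lam^T Sigma^{-1}
        Ezz = eye - beta @ Lam + beta @ S @ beta.T
        # M-step: closed-form updates of the parameters.
        Lam = S @ beta.T @ np.linalg.inv(Ezz)
        psi = np.diag(S - Lam @ beta @ S).copy()     # keep diagonal elements only
        history.append(fa_log_likelihood(S, Lam, psi))
        if abs(history[-1] - history[-2]) < tol * abs(history[-2]):
            break
    return Lam, psi, history

# Synthetic data (assumption): 2 latent factors, 6 channels, unequal noise levels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2)) @ rng.normal(size=(6, 2)).T
X += rng.normal(size=(2000, 6)) * np.linspace(0.1, 0.6, 6)
Lam, psi, history = fa_em(X, n_factors=2)
```

A useful sanity check on any EM implementation is that the log-likelihood in `history` never decreases between iterations.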

D. SPARSE DICTIONARY LEARNING (SDL)
The SDL method [63] aims to find a sparse representation of multivariate data by using optimized base vectors, called atoms. The base vectors form a dictionary, which is learned in training. Compared with PCA and ICA, SDL is advantageous for sparse representation of thermography data, since the atoms in the learned dictionary are not required to be orthogonal. The objective function used to train the SDL is given by Equation (11), in which X, D, and R represent the matrix consisting of the observed thermography data, the dictionary matrix, and the representation matrix, respectively. Therefore, DR represents the reconstructed thermography data. The first term in Equation (11), which calculates the difference between X and DR, is the reconstruction loss. The goal is to minimize this loss to guarantee that the learned dictionary matrix D provides a good representation of the data. The first term is calculated using the L2 norm, which gives more stable convergence during training than the L1 norm. In the second term in Equation (11), λ is the regularization parameter, and the L1 norm is used to obtain a sparse solution for R. In addition, to prevent the dictionary matrix from reaching arbitrarily large values, which would result in small values in R, D is normalized so that its L2 norm is less than or equal to unity. This ensures sparsity and convergence in SDL training [63].
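The two-term objective described above can be sketched as follows. The code assumes a common form of the sparse dictionary learning objective, a squared L2 reconstruction term plus λ times the L1 norm of R, with the norm constraint applied to each atom (column of D). The exact scaling of Equation (11) is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def project_atoms(D):
    """Rescale any atom (column of D) whose L2 norm exceeds 1 back to unit norm."""
    norms = np.linalg.norm(D, axis=0)
    return D / np.maximum(norms, 1.0)

def sdl_objective(X, D, R, lam):
    """Assumed objective: ||X - D R||_2^2 + lam * ||R||_1.

    X: (n_features, n_samples) data, D: (n_features, n_atoms) dictionary,
    R: (n_atoms, n_samples) sparse representation.
    """
    residual = X - D @ R
    return np.sum(residual ** 2) + lam * np.sum(np.abs(R))
```

The norm constraint matters because, without it, replacing D by 10·D and R by R/10 leaves the product DR unchanged while shrinking the L1 penalty tenfold, so the optimizer could reduce the objective without making R any sparser.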
Next, we applied the matching pursuit algorithm [64] to optimize the search for the principal latent atoms that represent the thermography data. In practice, if the dictionary matrix D includes many vectors, it is computationally difficult to search for the atoms for sparse representation of data. Thus, we used the matching pursuit algorithm to iteratively update the dictionary matrix D and the representation matrix R to yield a sparse representation of the matrix X. In the first step of the algorithm, the atoms are found by searching for the maximum inner product between the atoms and the observed data X. These atoms are used to form an initial reconstruction of X. In the subsequent steps, the atoms are determined by iteratively searching for the maximum inner product between the updated atoms and the residual, which is the reconstruction loss left after subtracting the results of previous iterations. The convergence criterion is satisfied when the loss is smaller than a specified tolerance. If the convergence criterion is not reached and the learned dictionary does not yet reconstruct the thermography data well, the dictionary from the current iteration is carried over to the next iteration.
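The greedy atom search described above can be sketched for a single data vector as follows. This is a minimal version of classical matching pursuit that covers only the sparse coding step for a fixed dictionary. The dictionary update is omitted, the atoms are assumed to have unit L2 norm, and the function name and default parameters are hypothetical.

```python
import numpy as np

def matching_pursuit(x, D, tol=1e-6, max_iter=50):
    """Greedy sparse coding of vector x with unit-norm atoms in the columns of D.

    Returns the coefficient vector r and the final residual, so that
    x is approximately D @ r + residual.
    """
    r = np.zeros(D.shape[1])
    residual = np.asarray(x, dtype=float).copy()
    for _ in range(max_iter):
        # Stop when the remaining reconstruction loss is below the tolerance
        if np.linalg.norm(residual) <= tol:
            break
        # Inner products between every atom and the current residual
        corr = D.T @ residual
        k = int(np.argmax(np.abs(corr)))      # best-matching atom
        r[k] += corr[k]                       # accumulate its coefficient
        residual -= corr[k] * D[:, k]         # subtract its contribution
    return r, residual
```

Each iteration removes the component of the residual along the selected atom, so the residual norm never increases, and capping `max_iter` bounds the number of nonzero coefficients.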
To achieve performance efficiency when training on large datasets [63], we also trained the SDL model by dividing the acquired thermography data into small batches (mini-batch-SDL). This stochastic training mechanism scales well to large datasets and adopts the incremental learning [65] method. Incremental learning allows the model to periodically acquire new information whenever a new training set becomes available, while preserving the knowledge learned from previous training data sets. Therefore, the mini-batch-SDL training mechanism lowers the memory consumption and computational cost, and is well suited to learning and compressing dynamic thermography data for in-service NDE of AM structures.
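The mini-batch, incremental training pattern can be sketched with scikit-learn's `MiniBatchDictionaryLearning`, whose `partial_fit` method updates an existing dictionary with each new batch. This assumes scikit-learn is available. The text does not state which implementation was used, and the parameter values and the helper name `train_incrementally` are placeholders. Note that scikit-learn stores samples in rows and atoms in the rows of `components_`.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def train_incrementally(batches, n_atoms=8, alpha=1.0, n_nonzero=3, seed=0):
    """Update one dictionary model batch by batch (incremental learning)."""
    model = MiniBatchDictionaryLearning(
        n_components=n_atoms,                 # number of atoms in the dictionary
        alpha=alpha,                          # sparsity regularization weight
        transform_algorithm="omp",            # greedy pursuit for sparse coding
        transform_n_nonzero_coefs=n_nonzero,  # atoms allowed per sample
        random_state=seed,
    )
    for batch in batches:
        # Each call refines the current dictionary without revisiting old batches
        model.partial_fit(batch)
    return model
```

After training, `model.transform(X)` gives the sparse codes and `codes @ model.components_` gives the reconstruction. A new batch acquired later can be passed to `partial_fit` on the same model, so only one batch needs to be held in memory at a time.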

E. LIGHTWEIGHT THERMOGRAPHY COMPRESSIVE SPARSE AUTOENCODER (TCSA) NEURAL NETWORK
The autoencoder [51] is an unsupervised NN that learns how to encode and compress data, and then reconstructs the data from the encoded latent representation. The sparse autoencoder regularizes the autoencoder NN with a sparsity penalty to enhance the compression performance. In this study, we implement a lightweight thermography compressive sparse autoencoder (TCSA). Figure 7 shows the schematic of the TCSA architecture. This NN is fully connected and consists of a 3-layer (1200-64-26) dense encoder followed by a 3-layer (26-64-1200) dense decoder. To further enhance performance in data compression, this NN is optimized by using a residual connection [66] and the LeakyReLU [67] activation. We apply the L1 norm as a sparsity penalty to regularize the outputs from the hidden layer. The residual connection accelerates training while improving the accuracy [66]. For TCSA, we apply the residual connection across the bottleneck to directly transmit information on the encoded latent patterns from the hidden layer of the encoder to the hidden layer of the decoder. This enables the decoder to learn the encoded representation more clearly from the thermography data. In addition, we benchmarked several activation functions that are commonly used in autoencoders, such as ReLU and Sigmoid. Ultimately, we selected LeakyReLU as the activation function for the hidden layers, since this function results in the fastest convergence of TCSA. The activation function adds non-linear features to the NN to allow learning of complex patterns in the data. For training, we apply the Adam [68] algorithm to stochastically optimize the training procedure. The Adam optimization is computationally efficient and requires little memory space.
Compared with the classical stochastic gradient descent optimization, which maintains a single learning rate for all weight updates, the Adam optimization computes individual adaptive learning rates for different parameters. Also, the Adam algorithm combines advantages of the Adaptive Gradient Algorithm and Root Mean Square Propagation [69] by calculating exponential moving averages of the gradient and of the squared gradient to yield faster convergence during training. In this study, we trained the TCSA NN for 500 epochs with a batch size of 256 to minimize the objective function to the desired loss. The loss is measured by using the MSE. The objective function is given by Equation (12), in which the first term represents the MSE between the reconstructed thermography data X̂ and the input thermography data X. In addition, to regularize the autoencoder and enhance the compression performance, we add the L1 norm term scaled by the parameter λ on the activation H, which yields a sparse solution for the model. Next, the Adam optimization algorithm is applied to minimize the objective function so that the reconstructed thermography data are as close to the input data as possible. If the model does not converge to the desired loss, another iteration is initialized to continue training the NN.
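The architecture and objective described above can be sketched in NumPy as a forward pass plus one Adam update rule. Several details are assumptions made for illustration: the L1 penalty is applied to the 26-unit bottleneck output H and averaged over the batch, the residual connection is an addition of the encoder hidden output to the decoder hidden output, the output layer is linear, and the weight initialization, LeakyReLU slope, λ, and Adam hyperparameters (the defaults commonly quoted for Adam) are placeholders. Gradient computation is omitted.

```python
import numpy as np

def leaky_relu(a, slope=0.01):
    """LeakyReLU: identity for positive inputs, small slope for negative inputs."""
    return np.where(a > 0, a, slope * a)

def init_params(sizes=(1200, 64, 26, 64, 1200), seed=0):
    """Random weights and zero biases for the 1200-64-26-64-1200 dense layers."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def tcsa_forward(params, X):
    """Forward pass; X has shape (batch, 1200). Returns reconstruction and code H."""
    (W1, b1), (W2, b2), (W3, b3), (W4, b4) = params
    h_enc = leaky_relu(X @ W1 + b1)           # encoder hidden layer, 64 units
    H = leaky_relu(h_enc @ W2 + b2)           # bottleneck code, 26 units
    # Assumed residual connection across the bottleneck (additive skip)
    h_dec = leaky_relu(H @ W3 + b3) + h_enc   # decoder hidden layer, 64 units
    X_hat = h_dec @ W4 + b4                   # assumed linear output layer
    return X_hat, H

def tcsa_objective(X, X_hat, H, lam=1e-4):
    """MSE reconstruction loss plus lam times the L1 norm of the code H."""
    mse = np.mean((X_hat - X) ** 2)
    sparsity = np.mean(np.sum(np.abs(H), axis=1))  # batch-averaged L1 norm
    return mse + lam * sparsity

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update using moving averages of the gradient and squared gradient."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** t)            # bias correction, t starts at 1
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Because each parameter is divided by the square root of its own squared-gradient average, the effective step size differs per parameter, which is the adaptive learning rate behavior contrasted with plain stochastic gradient descent in the text.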

V. BENCHMARKING OF UNSUPERVISED LEARNING ALGORITHMS PERFORMANCE
A summary of the benchmarking of UL algorithm performance in compression and reconstruction of thermography data is shown in Figure 8. Performance is compared on the basis of compression ratio, memory space saving, reconstruction accuracy, UL model training time, compression time, and thermography data reconstruction time. Next to each UL model, Figure 8 lists the number of principal latent dictionaries (e.g., 30 dictionaries for PCA). A higher number of dictionaries gives better reconstruction but a lower compression ratio. Each UL model was trained to obtain the highest compression ratio using the minimal number of latent dictionaries. Another constraint was to preserve enough features in the data during compression to allow detection of material defects after processing the reconstructed data with the NLBSS algorithm. The compression ratio is calculated as the size of the measured thermography data divided by the size of the compressed data. The memory space saving, which is calculated as the difference between 1 and the reciprocal of the compression ratio, represents the reduction in memory storage size relative to the original thermography data. All models achieve compression ratios >30, with corresponding space saving of >96%.
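The two metrics defined above reduce to simple arithmetic, sketched here with hypothetical function names. The sizes can be in any unit as long as both use the same one.

```python
def compression_ratio(original_size, compressed_size):
    """Size of the measured data divided by the size of the compressed data."""
    return original_size / compressed_size

def space_saving(ratio):
    """Fraction of storage saved: 1 minus the reciprocal of the compression ratio."""
    return 1.0 - 1.0 / ratio

# A ratio of 46.15 corresponds to a space saving of about 97.83%,
# and a ratio of 30 corresponds to about 96.67%.
tcsa_saving_percent = 100.0 * space_saving(46.15)
minimum_saving_percent = 100.0 * space_saving(30.0)
```

As an observation, 1200/26 ≈ 46.15, which matches the reported TCSA ratio if the ratio is read as the input dimension divided by the bottleneck dimension. That reading is an assumption and is not stated in the text.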
Reconstruction accuracy is measured with the Peak Signal-to-Noise Ratio (PSNR), which quantifies the error between the original thermography data and the data reconstructed after compression. The PSNR is calculated as

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right) \qquad (13)$$

In Equation (13), MAX is the maximum value of the original thermography data represented with 16-bit depth, and MSE is the mean square error between the measured and reconstructed thermography data. For all UL models, PSNR > 73 dB. A higher PSNR indicates better reconstruction. Training, compression, and reconstruction times were measured by running the UL models on a computer with an Intel Core i7-8750H CPU @ 2.20 GHz, an NVIDIA GTX 1070 GPU, and 16 GB of RAM. Figure 9 shows eight images with material defects visualized after thermography data processing with the NLBSS algorithm. For reference, Figure 9(a) shows the image obtained by NLBSS processing of the original thermography data, which is the same as the image in Figure 4(d). Figures 9(b) to 9(h) show images processed with NLBSS following compression and reconstruction with PCA, Incremental-PCA, ICA, SDL, Mini-Batch SDL, and TCSA NN. All four imprinted material defects visible in Figure 9(a) are also visible in the images of NLBSS-processed data after compression and reconstruction with the UL models. Different color schemes were chosen for the images in Figure 9 to highlight the material defects.
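The PSNR metric of Equation (13) can be sketched with the standard definition, 10·log10(MAX²/MSE). The default `max_value` of 65535 assumes that MAX is the full-scale value of 16-bit data. If MAX is instead taken as the largest value present in the measured data, pass that value explicitly.

```python
import numpy as np

def psnr(original, reconstructed, max_value=65535.0):
    """Peak signal-to-noise ratio in dB between original and reconstructed data."""
    # Convert to float first so unsigned 16-bit inputs do not wrap on subtraction
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0.0:
        return float("inf")                   # identical data, no error
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Under the 65535 full-scale assumption, a PSNR of 73 dB corresponds to a root-mean-square error of roughly 15 counts.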
According to the results in Figure 8, PCA shows the best overall performance across the compression ratio, memory space saving, training time, compression time, and reconstruction time categories. ICA yields the same data compression ratio, and shorter compression and reconstruction times, but requires a longer training time than PCA. While the incremental-PCA and mini-batch-SDL models have lower performance metrics, they offer the possibility of rapid on-line augmentation of the model with additional training data. The EFA model, which requires the lowest data compression time, offers the advantage of potentially better performance in the presence of heteroscedastic noise. When using the TCSA, we obtained the highest compression ratio of 46.15, with a space saving of 97.83% and a fast compression time. Although TCSA takes the longest time to train in comparison with the other UL models, its training time is shorter than that of most state-of-the-art NNs, such as deep NNs, 1D temporal or 2D spatial convolutional NNs, and the LSTM autoencoder. Training these NNs is very time consuming (it takes days on the computer used in this study) and depends heavily on the availability of computing power (multiple GPUs).

VI. CONCLUSION
In this paper, we have investigated the performance of UL algorithms for compression of thermography data. Data compression is an enabling technology for high-resolution nondestructive imaging of AM structures with PIT, in particular for in-service NDE applications. Performance of UL models was benchmarked with data obtained from PIT imaging of AM stainless steel plate with calibrated imprinted subsurface porosity defects. The benchmarking study included several existing UL methods, which were adapted for compression and reconstruction of thermography data, and a novel lightweight TCSA neural network introduced in this paper. Compared to existing algorithms for thermography data compression, UL algorithms allow for flexibility in model re-training, and are adaptable to the compression of different data types.
Benchmarking of UL algorithms performance included comparisons of data compression ratio, reconstruction accuracy, model training time, data compression time, and data reconstruction time. Microscopic subsurface material defects are not visible in the original thermography images (data without compression). These defects become visible after processing the images with the NLBSS algorithm. One of the main challenges in compression of thermography images is that low S/N intensity features could be lost during compression. Therefore, compression performance with UL algorithms depends on the learned manifolds to preserve weak features of interest. Compression performance can be potentially improved by training UL models to learn non-linear manifolds in the data. However, training nonlinear models is time-consuming and requires large memory resources.
To evaluate the fidelity of data compression with UL models, we have demonstrated detection of microscopic calibrated material defects in the original thermography images and in the thermography images reconstructed after compression. Microscopic material defects were not readily visible in the original recorded thermography images. Processing the data with the NLBSS algorithm was required to visualize these defects. After compression and decompression of the thermography data, the same material defects were visualized after applying NLBSS. All algorithms compress thermography data with a compression ratio >30, with data compression and reconstruction times on the order of 10 s. Overall, the PCA and ICA models show the best performance.
The new TCSA neural network has the highest compression ratio of 46.15, and a compression time comparable to that of ICA. This is achieved through nonlinear dimensionality reduction while preserving the features of interest through the training process. However, TCSA training and reconstruction times are significantly longer than those of the other algorithms evaluated in this study. Nevertheless, the training time of TCSA is substantially shorter than that of other state-of-the-art autoencoders. To reduce the TCSA training time further, memory-efficient incremental training will be investigated in future work.
While this paper investigated compression of ex-situ or NDE thermography data, a similar challenge of big data compression arises in in-situ AM process monitoring with thermography. This problem can be potentially addressed with data compression algorithms investigated in this work, as will be investigated in future studies.
In summary, the main scientific contributions of this research are:
1. Comprehensive benchmark study of UL algorithms to determine the best solutions for thermography data compression.
2. Demonstration of UL-based high compression ratio and high reconstruction accuracy of images, sufficient to detect microscopic subsurface defects in AM metals.
3. Demonstration of novel TCSA NN with a notably high compression ratio of 46.
SASAN BAKHTIARI (Senior Member, IEEE) received the B.S.E.E. degree from the Illinois Institute of Technology, Chicago, IL, USA, in 1983, the M.S.E.E. degree from the University of Kansas, Lawrence, KS, USA, in 1987, and the Ph.D. degree in electrical engineering from Colorado State University, Fort Collins, CO, USA, in 1992. Since 1993, he has been with the Argonne National Laboratory, Argonne, IL, USA, where he is currently a Senior Electrical Engineer and the Section Manager of the Sensors, Instrumentation, and the Nuclear Science and Engineering Division, NDE Group. He has been involved in applied research in the areas of electromagnetic guided wave and radiating structures, induction sensor technology, acoustic sensing, signal processing, and analytical and numerical modeling. He has conducted theoretical and experimental work on electromagnetic and acoustic/ultrasonic NDE techniques and active and passive millimeter-wave sensing techniques for various industrial and scientific applications. He has authored more than 150 journal articles, conference proceedings, technical reports, and chapter contributions. He was a recipient of two R&D 100 Awards, and he is also an inventor of a number of patents related to noncontact electromagnetic and acoustic sensing technologies.
ALEXANDER HEIFETZ (Member, IEEE) received the B.S. degree in applied mathematics and engineering science, the M.S. degree in physics, and Ph.D. degree in electrical engineering from Northwestern University, Evanston, IL, USA, in 1999, 2002, and 2006, respectively. From 2006 to 2008, he was a Canary Foundation Postdoctoral Fellow at Northwestern University, working on optical light scattering. He joined the Argonne National Laboratory, Lemont, IL, USA, as the Director's Postdoctoral Fellow with the Nuclear Engineering Division, in 2008. He became an Electrical Engineer (a Technical Staff Member), in 2011, and a Principal Electrical Engineer with the Nuclear Science and Engineering Division, in 2017. At Argonne, he has been involved in nuclear energy enabling technology research at the interface of electrical engineering (optics, electromagnetics, ultrasonics, machine learning) and nuclear engineering (heat transfer, thermal hydraulics and materials mechanics). His specific research projects focused on thermal tomography nondestructive evaluation for metal additive manufacturing, advanced reactor high temperature fluid sensing, ultrasonic communications on nuclear pipes, and machine learning for thermal hydraulics sensing. He is a co-recipient of the best paper awards at 2019 and 2020 IEEE International Conference on Electro-Information Technology (EIT). He is also an Adjunct Professor in civil, materials and environmental engineering with the University of Illinois at Chicago, and a member of the Northwestern Argonne Institute for Science and Engineering.