3D-CNN and Autoencoder-Based Gas Detection in Hyperspectral Images

The detection of gas emission levels is a crucial problem for ecology and human health. Hyperspectral image analysis offers many advantages over traditional gas detection systems with its detection capability from safe distances. Observing that the existing hyperspectral gas detection methods in the thermal range neglect the fact that the captured radiance in the longwave infrared (LWIR) spectrum is better modeled as a mixture of the radiance of background and target gases, we propose a deep learning-based hyperspectral gas detection method in this article, which combines unmixing and classification. The proposed method first converts the radiance data to luminance-temperature data. Then, a 3-D convolutional neural network (CNN) and autoencoder-based network, which is specially designed for unmixing, is applied to the resulting data to acquire abundances and endmembers for each pixel. Finally, the detection is achieved by a three-layer fully connected network to detect the target gases at each pixel based on the extracted endmember spectra and abundance values. The superior performance of the proposed method with respect to the conventional hyperspectral gas detection methods using spectral angle mapper and adaptive cosine estimator is verified with LWIR hyperspectral images including methane and sulfur dioxide gases. In addition, the ablation study with respect to different combinations of the proposed structure including direct classification and unmixing methods has revealed the contribution of the proposed system.


I. INTRODUCTION
IMAGING spectroscopy has been used by physicists and chemists for more than three decades to identify materials and their compositions. The concept of hyperspectral remote sensing started in the mid-1980s and has been widely used by geologists for mapping minerals to this day [1]. The detectability of a material depends on the spectral range of the spectrometer, its spectral resolution, the abundance of the material, and the strength of the absorption properties in the measured wavelength region [2].
Gas leaks, particularly in developed countries, have been one of the crucial environmental problems of the last decade. Some gases are harmful to the environment and contribute to global warming. They present both short-term risks such as explosions and long-term risks such as cancer to workers or people living close to the leaking facility. To minimize these effects, environmental authorities need to monitor chemical and industrial plants to control gas emission levels. Infrared remote sensing technology, which offers many advantages over traditional gas detection systems, is one of the proposed solutions for this aim, as it allows the scene to be monitored from a safe distance [3].
Okan Bilge Özdemir is with the Department of Computer Engineering, Artvin Çoruh University, 08000 Artvin, Turkey (e-mail: okanozdemir@artvin.edu.tr).
Digital Object Identifier 10.1109/JSTARS.2023.3235781
To this end, forward-looking infrared hyperspectral cameras are placed in potentially dangerous areas for gas detection from safe distances. These cameras, which are designed to capture images at different wavelengths, can operate in two different regions, which involve medium-wave infrared (3-5 µm) and long-wave infrared (7-14 µm) bands. Until now, these cameras have been utilized for the detection of different gases such as carbon dioxide, propane, methane, sulfur, butane, freon, ammonia, difluoroethane, diethyl ether, sulfur hexafluoride, and phosgene [4], [5], [6], [7]. The detection of gases in such studies is mainly achieved by utilizing conventional statistical detection methods along with the basic signal processing operations such as data transformation, background suppression, dimension reduction, linear regression, and matched filtering [4], [6], [7], [8], [9].
As one of the pioneer studies for gas detection, Pogorzala [10] proposed a pixel-based method using linear regression in synthetic images for the detection of ammonia (NH3) and Freon-114. Later, Vallières et al. [4] presented a method that first converts the hyperspectral radiance data to luminance temperature data. After performing background removal on the temperature data, the resulting cube undergoes spectral matched filtering [11] to distinguish gas-containing pixels. Finally, the detection is carried out by applying thresholding to the resulting scores after matched filtering. In another study, Spisz et al. [12] first applied principal component analysis for background removal, and then utilized matched filter and spectral angle mapper to detect various chemical compounds. A different study using hyperspectral imaging [13] focused on the automatic detection of waste gases. The proposed method first filters the possible areas in the scene by means of detecting critical wavelengths and using the correlation coefficient metrics to select pixels with high concentration. The target gases are then detected using a spectral matched filter algorithm on the selected pixels.
While the presented studies mainly utilize background information, Hirsch and Agassi [14] presented an algorithm for gas detection without requiring background information. The method first applies K-Means segmentation and performs a spectral analysis on each segment. The final decision is made by applying thresholding on the correlation between the calculated information and the target gas signature. Another method of performing thresholding in the ultimate stage of gas detection is proposed by Kastek and Piątkowski to reveal the gases in turbulent stack fumes [15]. In particular, a spectral angle mapper is utilized to reveal the correlation between the pixel spectra and the signature of the target gas. Later, Kuflik and Rotman [16] conducted a study aiming to find the minimum number of bands required for gas detection with synthetically generated data. The proposed methods by Sabbah et al. [17] and Safak et al. [18] follow similar strategies. First, they convert the raw data into luminance temperature data. Then, the correlation between the pixel spectra and the signature of the target gas is computed and thresholded for detection. In another study presented by Theiler and Love [19], satellite images are employed for the detection of NO2 plumes and SO2 plumes rather than the captured hyperspectral data from in-scene sensors.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Other than the correlation and thresholding-based conventional methods, the new trend in gas detection is to exploit the accumulated experience in deep learning-based detection. As one of the few examples of these studies, Kim et al. demonstrated the performance of classification-based deep neural networks and convolutional neural networks (CNNs) for different gases [20]. Zhang et al. [21] developed a classification-based method using a CNN for the detection of CO2. Kumar et al. [22] utilized a region-based CNN (RCNN) structure, rather than a plain CNN, to detect methane gas plume emissions. The authors stated that their hyperspectral mask-RCNN (H-MCRNN) method is suitable for the rapid scanning of large areas. Finally, Gu [23] performed a detailed study on the hyperparameter optimization of the H-MCRNN method for the detection of methane gas.
In addition to these studies, there are also studies using hyperspectral unmixing for gas detection. In the study presented by Henrot et al. [24], they demonstrated the performance of the hyperspectral unmixing-based model using the linear mixture model by using synthetic and real-time series of hyperspectral images. In another study, Shi et al. [25] presented a method based on hyperspectral unmixing where low-degree mixed pixels in the hyperspectral image were also used by using the sparse greedy algorithm. In a similar study, Tochon et al. [26] performed gas plume detection and tracking from hyperspectral video sequences using hyperspectral unmixing. A linear mixture model was used in the study to extract information about the concentration of the gas as well as the location information. Finally, Fiscante et al. [27] proposed a method based on unsupervised sparse unmixing for the detection of sulfur dioxide.
When these studies are examined, one of the common processes for gas detection is the conversion of radiance data in the thermal domain to the luminance temperature values. The underlying reason for such a process is that the gases become visible due to the emission and/or absorption when there is a temperature contrast between the background and gas components in the scene. In addition, almost all of the presented methods utilize a thresholding operation in the final stage of the processing chain for the ultimate detection. This thresholding operation on the other hand is dependent on the statistical characteristics of the utilized data, which in turn makes the adaptation of these algorithms very challenging for a generic application. It is thus essential to develop methods which do not require changes in the threshold values with respect to the changes in the data for the ultimate decision.
Another important aspect of the previous hyperspectral gas detection methods is to work directly on the resulting data after the luminance temperature conversion. However, the acquired radiance data in the thermal LWIR range is always a mixture of the thermal radiation of the background and the gas molecules in the air. Therefore, the handling of the hyperspectral gas detection in the LWIR spectrum as a combined problem of unmixing and target detection is crucial for better detection performances. Finally, the existing studies on hyperspectral image analysis mostly handle the detection problem within a single stage. However, a two-stage detector, where the possible locations of the target are found in the first stage and then the target is classified in the second stage, is an alternative promising approach, as in the case of the latest deep learning-based object detection networks working on RGB images.
In this article, inspired by the latest two-stage deep learning-based object detectors, we have proposed a deep learning-based gas detection method combining unmixing with classification. The proposed method first converts the radiance data to luminance temperatures and then performs unmixing on the resulting data to detect endmembers and related abundance values for each pixel. Considering the success of autoencoders for the representation of complex scenes in deep learning, we have utilized 3D-CNN and autoencoder for the unmixing part to model the interactions in the thermal LWIR range, rather than using conventional geometry-based unmixing methods. Then, the proposed method further classifies the extracted endmembers by using a three-layer fully connected network, which examines the presence of the target gas signatures in the endmembers.
In this regard, the first contribution of the proposed 3D-CNN and autoencoder-based gas detection method for LWIR hyperspectral images is to eliminate the need for the selection of optimum threshold values in previous conventional methods. Second, the proposed deep learning-based detection method with hyperspectral unmixing significantly improves the detection performances for gases with respect to the existing gas detection methods using statistical detection methods such as spectral angle mapper (SAM) and adaptive cosine estimator (ACE) in LWIR thermal range. Finally, the ablation study for the possible different combinations of the proposed system reveals its better performances with respect to similar structures using conventional unmixing methods and also classification methods directly working on the data without unmixing.
The rest of this article is organized as follows. Section II gives the details of the proposed deep learning-based model for gas detection. Section III describes the test data and the distance metrics. Section IV presents the experimental results and the comparisons with the baseline gas detection methods. Section V discusses the contributions of the individual parts of the proposed system through an ablation study. Finally, Section VI concludes the article.

II. PROPOSED METHOD
The proposed gas detection method consists of three main stages. The first stage is the preprocessing stage, where the raw data is converted to luminance temperature data. In the second stage, the pure materials in the data are determined by the proposed 3-D convolution and autoencoder-based hyperspectral unmixing network along with the abundance ratios of these materials in each pixel. The proposed 3-D convolution and autoencoder-based model in this stage is a special design which addresses the essential constraints in the unmixing problem, namely the positivity constraint and the sum-to-one constraint for the weights of the abundances. Given the multidimensional characteristics of hyperspectral images and their specific features in the LWIR range, CNN structures without an autoencoder cannot sufficiently address these specific constraints for the unmixing problem. In the last stage, the proposed three-layer fully connected network determines whether there is gas in the endmembers obtained from the data. Ultimately, the class assignment is performed for each pixel according to the abundance ratio. Every part in the proposed framework addresses one essential problem for deep learning-based gas detection in hyperspectral images.

A. Radiance to Luminance Temperature Conversion
In the literature, most of the studies use luminance temperature of hyperspectral data for gas detection applications [4], [6], [7], [8], [9], [13], [14], [16], [18] as such information indicates a more steady characteristic than the radiance data for varying conditions. The conversion of hyperspectral radiance data to luminance temperatures is carried out in three stages. The data obtained in the first step is converted to the brightness temperature value [28]. Then, the blackbody curve is calculated in accordance with the Planck curve [29] with respect to the maximum temperature inherited in the observed data. This curve is then further processed to eliminate the atmospheric effects to obtain the ultimate data for the detection [30].
The first step of luminance-temperature conversion can be expressed as

$$T(\lambda) = \frac{c_2 \lambda}{\ln\left(\frac{c_1 \lambda^3}{L(\lambda)} + 1\right)}$$

where $c_1$ and $c_2$ are constants, $\lambda$ is the wavenumber, and $L$ represents the spectral radiance data [30].
In the second step, the Planck formula, which is used to obtain the blackbody curve, is calculated by using the luminance-temperature data, as

$$B(\lambda, T) = \frac{c_1 \lambda^3}{\exp\left(\frac{c_2 \lambda}{T}\right) - 1}$$

where $c_1$ and $c_2$ are fixed values, and $\lambda$ is the wavenumber. The last step is the elimination of atmospheric effects from the data with the resulting blackbody curve. To this aim, one of the common methods, namely, the blackbody radiation curve compensation algorithm [8], is adopted in this research. The final corrected data are defined in terms of the following quantities: $\lambda$ is the wavenumber, $C_\lambda$ is the corrected brightness value at $\lambda$, $BB_{\max}$ is the maximum brightness value in the blackbody curve, $B(\lambda, T)$ is the blackbody brightness value at $\lambda$, and $S_\lambda$ is the radiance value measured by the hyperspectral sensor at $\lambda$ [30]. The corrected luminance temperature values are utilized for gas detection in the further stages.
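The first two conversion steps can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes the standard Planck function written in wavenumber form, radiance in mW/(m² sr cm⁻¹), and the usual radiation constants for those units. The function names are hypothetical, and the compensation step of [8] is not included.

```python
import numpy as np

# Radiation constants for wavenumbers in cm^-1 and radiance in
# mW / (m^2 sr cm^-1). These units are an assumption of this sketch.
C1 = 1.191042e-5  # 2 * h * c^2
C2 = 1.4387752    # h * c / k_B, in K * cm


def planck_radiance(wavenumber, temperature):
    """Blackbody radiance B(lambda, T) at the given wavenumber(s)."""
    wavenumber = np.asarray(wavenumber, dtype=float)
    return C1 * wavenumber**3 / np.expm1(C2 * wavenumber / temperature)


def brightness_temperature(wavenumber, radiance):
    """Invert the Planck function to get a temperature for each band."""
    wavenumber = np.asarray(wavenumber, dtype=float)
    radiance = np.asarray(radiance, dtype=float)
    return C2 * wavenumber / np.log1p(C1 * wavenumber**3 / radiance)


def blackbody_curve_at_max_temperature(wavenumber, radiance):
    """Blackbody curve at the maximum brightness temperature of a spectrum."""
    t_max = brightness_temperature(wavenumber, radiance).max()
    return planck_radiance(wavenumber, t_max), t_max
```

Applying `brightness_temperature` to the output of `planck_radiance` returns the original temperature, which is a convenient sanity check for the chosen constants and units.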

B. Proposed 3-D Convolution and Autoencoder-Based Hyperspectral Unmixing
After the luminance temperature conversion, the next step is unmixing to find the relevant endmembers and abundance ratios. The proposed hyperspectral unmixing model for gas detection consists of two main parts as 1) 3D-CNNs; and 2) autoencoder. The 3-D convolution and autoencoder-based hyperspectral unmixing network is used both to obtain the signatures of the pure pixels in the data and to calculate the abundance value in each pixel. Fig. 1 illustrates the general scheme of the proposed deep learning-based network for hyperspectral unmixing. Table I illustrates the more detailed structure of the proposed model with the related parameters. In accordance with the table, the main stages of the proposed network layout are as follows.
The convolution layer, which is the first part illustrated on the left part of the figure, consists of a flattened layer followed by three different 3-D convolution layers. The 3-D convolution filters allow both spectral and spatial information to be included. In the figure, P is the input channel of the hyperspectral data corresponding to the number of spectral bands in the hyperspectral image. L1, L2, and L3 are the spectral dimensions of the 3-D convolution filters. The output of this section is a flattened layer of 1-D vectors transformed from 3-D inputs. These 1-D vectors enter the next section as input. The x, y, and z parameters in the definition of the filters correspond to the terrestrial (spatial) coordinates in x and y and to the spectral dimension in z.
The second part of the proposed model, namely the autoencoder part shown on the right side of Fig. 1, consists of two main stages: encoder and decoder. Autoencoders are often used to reduce the dimension of the input to a smaller size and to learn data structures by restoring them to the input size. This structure first reduces the spectral signal received in the encoder part to the number of endmembers and performs the abundance estimation. Then, hyperspectral unmixing is performed by reconstructing the signal in the decoder layer. Constraints such as positivity and sum-to-one in the hyperspectral unmixing process can be applied in the last layer of the encoder part. In the proposed model, the output of the 3-D convolution section is given as input to the autoencoder section. Then, the encoder-decoder structure is established, which provides the abundance estimation and endmember estimation operations. It should be noted that the weights between the encoder network and the decoder network correspond to the endmembers, and the nodes in the last layer of the encoder network give the abundance values for each endmember in the proposed design.
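The role of the encoder output and the decoder weights can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions rather than the proposed network: the 3-D convolution layers and the training loop are omitted, the positivity and sum-to-one constraints are assumed to be realized as a ReLU followed by division by the sum, and all array sizes and names are hypothetical.

```python
import numpy as np


def abundance_layer(encoder_activations, eps=1e-12):
    """ReLU for positivity, then division by the sum for sum-to-one.

    A pixel whose activations are all non-positive yields all-zero
    abundances in this simple sketch.
    """
    positive = np.maximum(encoder_activations, 0.0)
    return positive / (positive.sum(axis=-1, keepdims=True) + eps)


def decode(abundances, endmembers):
    """Linear decoder: each pixel is a weighted sum of endmember spectra."""
    return abundances @ endmembers


rng = np.random.default_rng(0)
n_endmembers, n_bands = 2, 124  # hypothetical sizes

# Hypothetical encoder activations for three pixels (before the constraints).
activations = np.array([[2.0, -1.0], [0.5, 1.5], [3.0, 1.0]])
# The decoder weights play the role of the endmember spectra.
endmembers = rng.uniform(size=(n_endmembers, n_bands))

abundances = abundance_layer(activations)
reconstruction = decode(abundances, endmembers)
```

In this reading, training the autoencoder to reproduce its input makes the rows of `endmembers` converge toward pure spectra, while `abundances` gives the per-pixel mixing ratios.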
The most important factor in the proposed model, which affects the performance in the autoencoder part, is the normalization layer. This layer, which is applied before the encoder output, enforces the two most important constraints in hyperspectral unmixing. These are the constraint of positivity, which is the positive encoder output provided by the rectified linear unit (ReLU), and the constraint of the sum to one. This layer is applied as follows:

$$w_i \leftarrow \frac{w_i}{\sum_{j=1}^{T} w_j}$$

where $i$ is the index of the node, $w$ is the weight matrix, and $T$ is the length of the node. Finally, this layer provides both the detection of pure materials in the data and their abundances to be used in the ultimate decision.

C. Gas Detection Over Endmembers and Abundances
The proposed 3D-CNN and autoencoder-based hyperspectral unmixing algorithm is used to detect pure spectral signatures in the data and the abundance values of these materials in each pixel. The detection of gas spectral signatures within these endmembers is performed in this ultimate layer. For this purpose, a three-layer fully connected neural network is designed for gas detection within the endmembers. The parameters of the three-layer network are given in Table II. The loss function of this network is selected as the typical cross-entropy loss. The proposed network is trained using the spectrum of the background and the spectral signatures of the gases which are determined as targets.
The hyperspectral images in a natural scene can be represented with two components, which are: 1) the background thermal radiation; and 2) the radiation due to the targeted gas. Given the input spectrum to the trained network, the network classifies the test data as background or one of the gas classes. In the proposed model, the endmembers are given as input to this network to determine the class for each endmember. If a gas signature is detected in one of the tested endmembers, the pixels in which the abundance of that endmember exceeds 50% are marked as the detected gas type. Considering that the background signature is very dominant in LWIR images, we experimentally observed that this threshold is a logical choice to decide whether the tested pixel is closer to the background or to the target gas signature.
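The final per-pixel decision can be sketched as follows. The example assumes that the fully connected classifier has already assigned a class label to every endmember and only illustrates the 50% abundance rule; the label convention (`BACKGROUND = 0`) and the function name are hypothetical.

```python
import numpy as np

BACKGROUND = 0  # hypothetical class id for the background


def detection_map(abundances, endmember_labels, threshold=0.5):
    """Mark pixels dominated by an endmember that was classified as a gas.

    abundances:       array of shape (H, W, K), one abundance map per endmember
    endmember_labels: length-K sequence of class ids from the classifier
    Returns an (H, W) integer map holding the gas class id where the
    abundance of a gas endmember exceeds the threshold, else BACKGROUND.
    """
    result = np.full(abundances.shape[:2], BACKGROUND, dtype=int)
    for k, label in enumerate(endmember_labels):
        if label != BACKGROUND:
            result[abundances[..., k] > threshold] = label
    return result
```

With two endmembers whose abundances sum to one, at most one of them can exceed 0.5 in a pixel, so the rule amounts to picking the dominant material.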

III. DATASETS AND DISTANCE METRICS
Two different real hyperspectral LWIR gas images are used in this study. These images are acquired with the Long Wave InfraRed camera manufactured by TELOPS. Images of the scenes are given in Fig. 2. Since gas diffusion is not fully predictable, ground truth data are roughly generated.
The methane image in Fig. 2(a) is acquired in the spectral range between 877 cm⁻¹ and 1285 cm⁻¹. The methane image has 124 bands and a spatial size of 200 × 200 pixels. The ambient temperature was determined as 300 K. Only methane gas is present in this scene, which was captured from 2-3 m away. The sulfur dioxide image in Fig. 2 is acquired in the spectral range between 851 cm⁻¹ and 1288 cm⁻¹. This image has 171 bands, and the ambient temperature is determined as 302 K. This image also has 200 × 200 pixels. Only sulfur dioxide gas is present in this scene, which was captured from a distance of 5 m. Ground truth maps for methane and sulfur dioxide gases are given in Fig. 2. Method performance comparisons are performed using the ground truth information given in the figure. Fig. 3 also shows the spectral signatures of methane, butane, and sulfur dioxide gases with their absorbance characteristics between 851 cm⁻¹ and 1290 cm⁻¹ to cover both images. These signatures are taken from the NIST database [31]. The gas detection process, which is carried out in the last stage of the proposed system, utilizes these signatures together with the background information.
Different distance metrics are used for abundance estimation accuracy, endmember estimation accuracy, and the cost function of the optimization algorithm. These metrics are selected as mean squared error (MSE) and SAM. The distance metric SAM [32] has been used both as a cost function and for endmember estimation accuracy. The formulation of SAM is given as

$$\mathrm{SAM}(x_i, \hat{x}_i) = \arccos\left(\frac{\langle x_i, \hat{x}_i \rangle}{\|x_i\|_2 \, \|\hat{x}_i\|_2}\right)$$

where $x_i$ is the reference spectral signature and $\hat{x}_i$ is the estimated spectral signature.
Fig. 3. Brightness temperature signals for methane, sulfur dioxide, and butane gases.
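Both metrics are straightforward to compute. The snippet below is a minimal NumPy sketch using the standard definitions of the spectral angle and the mean squared error; the function names are illustrative only.

```python
import numpy as np


def spectral_angle(reference, estimate):
    """Angle in radians between a reference and an estimated spectrum."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    cosine = np.dot(reference, estimate) / (
        np.linalg.norm(reference) * np.linalg.norm(estimate)
    )
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))


def mean_squared_error(reference, estimate):
    """Mean of the squared band-wise differences."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((reference - estimate) ** 2))
```

Scaling a spectrum by a positive constant leaves the spectral angle essentially unchanged but changes the MSE, which is the scale invariance that distinguishes the two cost functions.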

IV. EXPERIMENTAL RESULTS
The experimental results and comparisons for the proposed 3-D convolution and autoencoder-based method and the conventional gas detection methods are presented in this section. The first part of the experiments illustrates the performance of conventional gas detection methods, based on SAM and ACE methods. The performances are given for both methane and sulfur dioxide gases. The second part reveals the performance of the proposed method in comparison with the conventional methods. The performance of the endmember estimation and abundance estimation processes are both discussed for different distance metrics such as SAM and MSE for the optimization of the proposed 3-D convolution and autoencoder-based network.
Due to the technical limitations to exactly determine the position of the gas molecules in gas detection studies, it is not possible to have a regular receiver operating characteristics analysis as in the other detection studies for solid targets. Therefore, the overall experimental evaluation goes over the score images or approximate ground truth data extracted with visual inspection. Rather than using uncertain ground truth data, we perform the experimental evaluation over score images in the performed study.

A. Experimental Results for Conventional Methods
The SAM [32] and ACE [33] algorithms are used for methane and sulfur dioxide gas detection. The SAM method measures the angle between two spectral signals in radians. The similarity between the two vectors is high when the angle is small. The SAM metric ensures a more robust evaluation than the MSE metric due to its invariance to scaling. Unlike the SAM algorithm, ACE suppresses the background information by utilizing the covariance matrix of the data during the detection. The conventional gas detection methods based on SAM and ACE generally perform the detection process by comparing the reference target signature with the spectral signature of each pixel. These methods are applied to the data cubes which are obtained after transforming the hyperspectral cubes into luminance temperature data. In this study, the SAM-based gas detection method proposed by Öztürk et al. [18] and the ACE-based gas detection method proposed by Omruuzun and Cetin [8] are selected as the baseline methods for the comparisons. Fig. 4 shows the output scores for the detection of methane and sulfur dioxide gases with the SAM and ACE-based gas detection methods for each pixel before the thresholding. The regions where the sulfur dioxide gas is found seem to be more clearly selected by the SAM method compared to ACE. In addition, the background suppression is more apparent for sulfur dioxide in Fig. 4(a) and (b) compared to methane in Fig. 4(c) and (d). The main problem for both of the output images obtained with the statistical detection methods is to consistently determine the threshold value for the decision. As such a selection depends on the statistical characteristics of the utilized images, which makes the generic application more challenging, there is a need for a different method to determine the desired targets in the data.
One of the main motivations of this work was to eliminate such a thresholding process by means of utilizing a deep neural network-based approach.
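For reference, the per-pixel ACE score that is later thresholded by the conventional pipeline can be sketched as below. This is an illustrative NumPy version, not the baseline implementation of [8]: it assumes the common formulation in which the background mean and covariance are estimated from all pixels of the scene and the mean is subtracted from both the pixels and the target signature.

```python
import numpy as np


def ace_scores(pixels, target):
    """Adaptive cosine estimator score for every pixel.

    pixels: array of shape (N, P) with one spectrum per row
    target: array of shape (P,) holding the target gas signature
    Returns N scores in [0, 1]; higher means closer to the target.
    """
    pixels = np.asarray(pixels, dtype=float)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    s = np.asarray(target, dtype=float) - mean
    # Pseudo-inverse keeps the sketch usable for ill-conditioned covariances.
    cov_inv = np.linalg.pinv(np.cov(centered, rowvar=False))
    numerator = (centered @ cov_inv @ s) ** 2
    pixel_energy = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    denominator = (s @ cov_inv @ s) * pixel_energy
    return numerator / np.maximum(denominator, 1e-12)
```

The scores still have to be converted into a binary decision with a scene-dependent threshold, which is the step the proposed method is designed to avoid.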

B. Experimental Results for the Proposed Method
The experiments for the proposed method involve determining the parameters of the deep learning structure used in the hyperspectral unmixing part. For this purpose, the performance of different cost functions with the proposed system has been examined. The SAM and MSE are selected as cost functions.

1) Experimental Results for Methane Gas:
The results obtained with the proposed method are given in Fig. 5(a) and (b) for the distance metrics SAM and MSE. The pure pixel signatures obtained for these results are given in Fig. 5(c) and (d), respectively. The spectral signature of one of the estimated endmembers, illustrated with orange color in Fig. 5(c) and (d), is similar to the spectral characteristics of methane gas taken from the NIST database [31], which is illustrated in Fig. 3.
Although there is such similarity in both methods, the methane gas characteristics are more apparent in the results obtained with SAM. The SAM distances between the estimated endmembers and the spectral signature of methane are 0.05 and 0.11 radians when the SAM metric is utilized as the distance metric in the proposed method. Accordingly, the SAM distances between the estimated endmembers and the spectral signature of methane are 0.07 and 0.12 radians for the case of the MSE metric. The proposed method successfully extracts the endmembers for the background and the in-scene gas.
The extracted endmembers are classified as described in Section II-C. The results obtained after the classification process are given in Fig. 5(e). The presented method extracts the gas regions more clearly, and its performance is also superior to that of the SAM and ACE-based methods, as can be seen from the comparison of Figs. 4 and 5.

2) Experimental Results for Sulfur Dioxide Gas:
The experimental results for sulfur dioxide are given in Fig. 6 for the SAM and MSE metrics in the proposed method. Similar to the methane gas results, the cost function SAM reveals more successful results for sulfur dioxide gas. This can be explained by the fact that when background and gas signatures are selected, the angle difference when using SAM is greater than the one when using MSE.
The examples marked with red color among the results obtained using the SAM and MSE distance metrics are given in Fig. 6(a) and (b). The pure pixel signatures obtained for the marked results are given in Fig. 6(c) and (d). Although the spectral characteristics of the sulfur dioxide gas appearing in the orange signatures are included in both methods, the sulfur dioxide gas characteristic is more apparent in the results obtained with SAM. The distances obtained are 0.9 and 0.12 when SAM is used as the distance metric, and 0.9 and 0.21 when MSE is used. The result of the automatic detection of sulfur dioxide gas is given in Fig. 6(e).
Similar to methane gas, the proposed method also reveals a successful detection performance for sulfur dioxide gas. The proposed deep learning-based gas detection method eliminates the need for thresholding. While such a thresholding operation requires a delicate selection for different data in traditional methods, the proposed method only utilizes the abundance values for such a selection with its deep learning-based structure without being affected by data changes. This threshold value, which is not affected by data changes, appears as a parameter change in the proposed deep learning-based method. The proposed deep learning-based method gives much better results compared to the conventional gas detection methods. Without loss of generality, the proposed method can also be easily adapted to different gases. In addition, the experiments reveal that the proposed method can successfully work for different gases with the same system parameters such as learning rate, batch size, and cost function.

V. DISCUSSION
The proposed deep learning-based gas detection method combining unmixing with classification gives better performances than the other gas detection methods proposed for hyperspectral LWIR images as illustrated in the previous section. Another important discussion for the performed study is however to understand the contributions of the proposed system with respect to its possible different combinations. To this aim, we have performed an ablation study in this section by considering the main parts of the proposed system.
The proposed system is composed of three main stages namely, 1) luminance temperature conversion; 2) 3D-CNN based autoencoder; and 3) a 3-layer fully connected network for gas type detection. The first stage, luminance temperature conversion, is one of the essential processes in gas detection to suppress the variations in hyperspectral gas radiation in LWIR thermal range. Therefore, the ablation study has skipped this part and focused on the following two main aspects after temperature conversion.
1) The proposed system combining unmixing with classification is compared with direct classification without performing unmixing on the data to reveal the contribution of unmixing to gas detection.
2) The proposed system using 3-D convolution and autoencoder-based unmixing is compared with a similar structure using conventional unmixing methods in order to reveal the contribution of autoencoders for gas detection.

A. Comparison of the Proposed System With Direct Classification Without Unmixing
In order to compare the proposed gas detection using hyperspectral unmixing with direct classification, we have utilized support vector machines (SVM) directly on the data after luminance temperature conversion. For the experiments, the SVM algorithm [34] is trained with the average background signals extracted from the data and predictions are made on the data with the spectral signatures taken from the NIST database [31]. The radial basis function kernel is utilized for the SVM. Implementation is carried out with the Python Sklearn library [35]. Fig. 7(a) and (b) illustrate the detection results of the SVM for sulfur dioxide and methane, respectively. Although the results for sulfur dioxide contain a gas region, the gas pipe in the scene is also detected as the target, which significantly decreases the performance. In the case of the LWIR image for methane, both the part where the gas outlet is located and the entire closed area are marked as methane. In both of the LWIR images, the SVM also marks several small, scattered areas as gas. The detection performances are lower compared to the results in Figs. 5(e) and 6(e) for the proposed method, and the results in Fig. 4(a)-(d) given for the state-of-the-art gas detection methods.

B. Comparison of the Proposed System With the Similar Structure Using Conventional Unmixing
The experimental results in the case of using a traditional endmember estimation and abundance estimation method instead of the utilized 3-D convolution and autoencoder-based unmixing are presented in this section. Vertex component analysis (VCA) [36] is used for endmember estimation as one of the baseline methods in the literature. The fully constrained least squares [37] method is utilized for the abundance estimation for the extracted endmembers. Fig. 8(a) and (b) show the abundance estimation results for the sulfur dioxide and methane data for one of the endmembers. The conclusions for the abundance results of the other endmember are similar. The threshold value used for the abundance estimation is set to 0.5, as in the proposed method, while presenting the gas-dominant regions. The results indicate that the abundances for the gas regions are not found correctly with VCA. The pure pixel signatures corresponding to the extracted endmembers are given in Fig. 8(c) and (d). As observed in the figures, the two endmembers have similar characteristics. The background signal in the LWIR thermal range is selected as an endmember by VCA. However, this background signal also exists with different abundances in the other pixels. As VCA selects endmembers from pure pixels in the data, the other endmember also resembles the background. Fig. 8(e) and (f) show the final decision images for sulfur dioxide and methane, respectively. For both images, the gas exit regions are detected instead of the gas regions. In the visual examinations made for this experiment, it has been observed that when different threshold values are used, results very similar to the ones in the SVM experiments can be obtained as well. However, as in other methods involving manual threshold selection, threshold selection is considered as a separate problem. For this reason, the selection of the most abundant material in the pixel was applied here as in the proposed method.
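For the two-endmember case discussed here, the fully constrained least squares abundances have a simple closed form, which makes the baseline easy to illustrate. The sketch below covers only this special case and is not the general algorithm of [37]; the endmembers are assumed to be given (for example by VCA, which is not reproduced here), and the function name is hypothetical.

```python
import numpy as np


def fcls_two_endmembers(pixels, e1, e2):
    """Fully constrained least squares abundances for exactly two endmembers.

    With a1 + a2 = 1 and a1, a2 >= 0, minimizing ||x - a1*e1 - a2*e2||^2
    reduces to projecting x onto the line segment between e2 and e1.
    pixels: (N, P) spectra; e1, e2: (P,) endmember spectra.
    Returns an (N, 2) array of abundances [a1, a2].
    """
    pixels = np.asarray(pixels, dtype=float)
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    direction = e1 - e2
    a1 = (pixels - e2) @ direction / (direction @ direction)
    a1 = np.clip(a1, 0.0, 1.0)  # enforce non-negativity of both abundances
    return np.stack([a1, 1.0 - a1], axis=-1)
```

The gas-dominant region then follows from comparing the abundance of the endmember identified as gas with the 0.5 threshold. When both endmembers resemble the background, as observed with VCA, `direction` becomes small and the resulting abundances are dominated by noise.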

VI. CONCLUSION
We have proposed a deep learning-based gas detection method which combines 3D-CNN and autoencoder-based hyperspectral unmixing with neural network-based classification. The experiments reveal that a detection approach combining deep learning-based unmixing with classification handles the gas detection problem in the LWIR range better than the existing methods. An ablation study with respect to the possible different combinations for such a system, such as using direct classification methods or using the same structure with other unmixing methods, is also performed. The proposed system combining unmixing with classification has given better performances than direct classification without performing unmixing. In addition, the 3D-CNN and autoencoder-based unmixing has given better results than the conventional unmixing for the proposed gas detection framework. The experiments have revealed that using SAM as a cost function in the proposed method yields more successful results than the MSE metric. The proposed method does not require scene-dependent threshold selection, unlike the conventional gas detection methods. Finally, the proposed gas detection method achieves better results than state-of-the-art gas detection methods in the LWIR range due to its high learning capacity with 3-D convolutional layers. The proposed system can be adapted to different gases by integrating the target gas signature into the classification module in the last stage.