Deep Convolutional Neural Network With Attention Module for Seismic Impedance Inversion

Seismic inversion is an approach to obtain the physical properties of the Earth's layers from seismic data, which aids in reservoir characterization. In seismic inversion, spatially variable physical parameters, such as impedance (<inline-formula><tex-math notation="LaTeX">$Z$</tex-math></inline-formula>), wave velocities <inline-formula><tex-math notation="LaTeX">$(V_{p}$</tex-math></inline-formula>, <inline-formula><tex-math notation="LaTeX">$V_{s})$</tex-math></inline-formula>, and density, can be determined from the seismic data. Among these, impedance is an important parameter used for lithology interpretation. However, the inversion problem is nonlinear and ill-posed due to the unknown seismic wavelet, the band-limited nature of the observed data, and noise. Solving it requires complex wave equation analysis, prior assumptions, human expert effort, and time to analyze the seismic data. To address these issues, deep learning methods have been deployed to solve the seismic inversion problem. In this article, we develop a deep learning framework with an attention module for seismic impedance inversion. The relevant features of the seismic data are emphasized by integrating the attention module into the network. First, we train the attention-based deep convolutional neural network (ADCNN) by supervised learning with predefined acoustic impedance (AI) labels. Next, we train the ADCNN in an unsupervised way using the physics of the forward problem. In the proposed method, the predicted AI is used to calculate the seismic data (calculated seismic), and the error between the input seismic data and the calculated seismic data is minimized. Unsupervised learning is advantageous when the labeled data are inadequate. The proposed network is trained with the Marmousi 2 dataset, and experimental results show that the proposed method outperforms the existing state-of-the-art method.


Vineela Chandra Dodda, Member, IEEE, Lakshmi Kuruguntla, Member, IEEE, Anup Kumar Mandpura, Member, IEEE, Karthikeyan Elumalai, Member, IEEE, and Mrinal K. Sen, Member, IEEE
Index Terms-Attention module, impedance inversion, neural networks, seismic data.

I. INTRODUCTION
The seismic reflection method is an effective technique for obtaining information about the Earth's subsurface layers.

Vineela Chandra Dodda and Karthikeyan Elumalai are with the Department of Electronics and Communication Engineering, SRM University, Amaravathi 522502, India (e-mail: vineelachandra_dodda@srmap.edu.in; imkarthi@gmail.com).
Anup Kumar Mandpura is with the Department of Electrical Engineering, Delhi Technological University, New Delhi 110042, India (e-mail: amandpura@gmail.com).
Mrinal K. Sen is with the Department of Geological Sciences, Jackson School of Geosciences, The University of Texas at Austin, Austin, TX 78712 USA (e-mail: mrinal@utexas.edu).
Digital Object Identifier 10.1109/JSTARS.2023.3308751

In the seismic reflection method, a pulse of short duration (the seismic wavelet) is sent from the Earth's surface. The pulse penetrates the subsurface layers of the Earth to a certain depth. Due to the impedance contrast between adjacent layers, the waves are reflected from layer boundaries, and the reflected waves are recorded at the Earth's surface. The recorded seismic data contain information about the seismic source, the reflection coefficients, and noise. The reflection coefficients carry information about the Earth's layers because they are derived from the layer impedances. To obtain the reflection coefficients from the seismic data, the data undergo several processing steps, such as deconvolution, denoising, and NMO correction [1]. After these processing steps, the seismic data retain only the reflection coefficients.
The processed seismic data have amplitude and time information, which gives only a structural interpretation. Therefore, to obtain a stratigraphic interpretation and reservoir characterization, we need to invert the seismic data/reflection coefficients to obtain the physical parameters of the layers. This is known as inverse modeling (seismic inversion). Seismic inversion retrieves the physical properties of the Earth's layers from the seismic reflection data. In the seismic inversion process, spatially variable physical parameters, such as layer impedance (Z), P-wave (V_p) and S-wave (V_s) velocities, density, porosity, sand/shale formation, and gas saturation, are estimated from the seismic data. These parameters have physical and geological meaning for the Earth's subsurface layers, which helps in reservoir characterization [2], [3]. Seismic impedance inversion, AVO inversion, and full waveform inversion are the commonly used seismic inversion methods for obtaining the Earth's subsurface properties. Among these, seismic impedance inversion is extensively used in the seismic industry and is an important goal in reflection seismology. Seismic impedance inversion is a powerful method for the analysis of the Earth's subsurface layers, reservoir characterization, and fluid prediction. Impedance is a rock property that gives information about lithology, porosity, and other factors [4]. However, the impedance inversion problem is usually ill-posed, nonlinear, and nonunique because of the unknown seismic wavelet, noise, and the band-limited nature of the observed seismic data [5]. All these issues must be taken into consideration while solving an inverse problem.
Since the 1960s, researchers have put forward many seismic impedance inversion methods, which are categorized based on poststack and prestack seismic data [6]. In the poststack inversion methods, the acoustic impedance is estimated from the seismic data by integrating well data and basic stratigraphic interpretation. The prestack inversion methods, however, transform the seismic angle/offset gathers into P-impedance, S-impedance, and layer density by integrating well data and horizon information. Further, the poststack inversion methods are divided into two types: deterministic and stochastic inversion methods [7]. The deterministic inversion methods are based on optimization, which can provide a good fitting model. These optimization methods aim to minimize the error between synthetic and observed seismic data. They produce smooth models, but the uncertainties in the predicted values are not assessed [8]. The most commonly used deterministic inversion methods are band-limited recursive inversion [9], colored inversion [10], and sparse spike inversion [11]. The stochastic seismic inversion methods, on the other hand, retrieve the best-fit inverse model from the seismic data based on the probability density function of the data, which helps to assess the uncertainties [12]. The most commonly used prestack inversion methods are simultaneous inversion [13], elastic impedance inversion [14], and AVO inversion [15], [16], [17].
Nevertheless, the conventional methods used in seismic inversion have some limitations, such as complex wave equation analysis, long simulation times, and substantial human expert effort to analyze the seismic data. Moreover, the conventional methods incur convergence issues and high computational cost. Hence, to overcome these limitations, researchers have applied artificial intelligence techniques to various geophysics problems, such as fault interpretation [18], seismic data denoising [19], seismic horizon estimation [20], and seismic inversion [21], among others [22], [23]. Deep learning (DL) is the subfield of machine learning that has gained prominence and applicability in wide areas of science and engineering [22], [24]. DL has ample scope for exploring seismic data in various applications, such as denoising, seismic inversion, and interpretation. In contrast to conventional seismic inversion methods, DL methods do not necessarily require the forward operator or wavelet matrix explicitly [25]. In [22], the Earth's subsurface elastic model is estimated (seismic inversion) from seismic data using a convolutional neural network (CNN). The robustness of the network in predicting the P-impedance of seismograms was tested, and it showed good accuracy for seismic data generated with source wavelet phases outside the training data. However, the CNN was unable to predict the P-impedance for seismic data generated with various wavelet frequencies. Further, in [26], a temporal convolutional network (TCN) was proposed to estimate the seismic impedance from the seismic data. The TCN combines aspects of both RNNs and CNNs, which overcomes the limitations of overfitting in CNNs and gradient vanishing in RNNs [27], [28]. Moreover, long-term and short-term dependencies are captured by the network without the need for a large number of learnable parameters. Later, in [29], a fully convolutional residual network (FCRN) with a transfer learning approach was used for seismic impedance inversion. Although the FCRN showed good accuracy and robustness against noise and phase differences in the seismic data, the results were not accurate when tested with seismic data of different geological features. Hence, the authors proposed a transfer learning approach, i.e., the parameters of the FCRN trained on Marmousi 2 data were used as the initialization for a new FCRN. In the next step, the FCRN was trained with traces from the overthrust model, and its performance was tested. However, the major obstacle is the availability of labeled data. Hence, researchers worked on alternative approaches to predict impedance with limited usage of labeled seismic data. Therefore, in [30], a physics-constrained seismic impedance inversion method based on DL was proposed, where a 2-D bilateral filtering constraint was introduced to improve the spatial continuity of the inversion results. In addition, it reduces the nonuniqueness of the inversion problem. Later, in [31], a cycle-consistent generative adversarial network (CCGAN) was used for seismic impedance inversion. The CCGAN extracts information contained in the unlabeled data, and, in addition, adversarial learning helps achieve a better prediction rate. Moreover, a neural network visualization method was adopted to visualize the features learned by the trained model and compared with a conventional open-loop CNN model. However, the CCGAN suffers from training instability, like most GAN models. Hence, in [32], a Wasserstein cycle-consistent GAN-based network was proposed. Here, the authors improved the CCGAN by integrating the Wasserstein loss with a gradient penalty as the loss function. The network was tested on the 3-D seismic advanced modeling data.
However, in the field of geophysics, the geological information in seismic data is multiscale in nature. The features extracted by the CNN kernels play different roles for different tasks. Different feature maps (FPs) obtained from various kernels acquire a variety of different features, which together contribute to accurate results. Attention focuses the processing on this information to achieve better accuracy under limited resources. Hence, we propose to integrate an attention module with the CNN to improve the accuracy of the network model for the estimation of acoustic impedance. A block attention module is integrated into the network to extract features along two dimensions: the channel and spatial axes. The channel attention module emphasizes "what" features need to be extracted from the input data, whereas the spatial attention module indicates "where" in the input data the features have to be extracted. Therefore, the attention module enables efficient information flow in the network, leading to better representation power. We applied the attention module in two cases: supervised and unsupervised. In the supervised case, we used true labels of acoustic impedance (AI) and estimated the optimum parameters to predict the impedance. In the unsupervised case, the input to the network is the seismic data, the wavelet, and a low-frequency model of the seismic data; here, we do not use the true AI labels. The unsupervised learning (UL) method finds applications where there are no labeled data. We demonstrate the effectiveness of the proposed method in each of these cases. In our work, we consider P-impedance inversion in all cases. The contributions of this article are as follows.
1) We introduced an attention mechanism, i.e., the convolutional block attention module (CBAM), to improve the CNN performance by allowing the neural network to focus on the regions of an input that are most relevant to the task. The combination of the channel and spatial attention blocks allows CBAM to selectively focus on the most important channels and spatial locations within an FP, allowing the CNN to effectively capture both local and global contextual information.

2) Our study involved an analysis of two different approaches for learning the network parameters: supervised and UL methods. In cases where labeled data are available, supervised learning can be employed. This approach can also be utilized in transfer learning, where a pretrained model (obtained through supervised learning) is used as the initialization before training the network in an unsupervised way. On the other hand, UL is employed when labels are not present. Consequently, this article encompasses both supervised and UL methodologies.

3) We used the scaled exponential linear unit (SELU) activation function during the training process. The advantages of the SELU activation function are improved performance, self-normalization, stability, and efficiency. Moreover, we used a Bayesian optimization (BO) tuner to optimize the hyperparameters, which saved time compared to manual tuning.

The rest of this article is organized as follows. In Section II, we discuss the methodology, with the mathematical model formulation in Section II-A and the proposed method in Section II-B. In Section III, we illustrate and analyze the results of the existing and proposed methods. Finally, Section IV concludes this article.

II. METHODOLOGY

A. Mathematical Model for Impedance Inversion
In this section, we formulate the mathematical model of the impedance inversion problem [1]. According to the convolutional model, the seismic trace is modeled as

S = Wr + η                                                    (1)

where S ∈ R^n is the seismic trace, W ∈ R^{n×n} is the seismic wavelet matrix, r ∈ R^n is the normal-incidence reflection coefficient vector, and η is the additive noise. The reflection coefficient can be represented in terms of the impedance z as

r_t = (z_{t+1} − z_t) / (z_{t+1} + z_t)                       (2)

where z = vρ, in which v is the velocity, ρ is the density, and t is the layer number. The extraction of reflection coefficients from the seismic trace is viewed as an inverse problem, given the seismic trace and the wavelet information. In general, inverse problems are nonunique and ill-posed; for seismic data, this is due to the band-limited nature of the wavelet. Therefore, the recorded seismic trace S is band limited (low and high frequencies are filtered by the wavelet). Hence, we add constraints, such as a sparse reflectivity series, a known wavelet, and a low-frequency model, to obtain a unique solution for the inverse problem. Let s_i denote the ith seismic trace of length M. The group of N seismic traces {s_1, s_2, ..., s_N} is expressed as

S = [s_1, s_2, ..., s_N] ∈ R^{M×N}                            (3)

with z_i denoting the corresponding acoustic impedance traces, Z = [z_1, z_2, ..., z_N]. Here, we consider the system to be noise-free, so (1) reduces to S = Wr.
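As a concrete illustration of the forward model in (1) and (2), the following sketch builds a synthetic trace from a toy two-layer impedance profile. The Ricker wavelet, layer values, and sampling interval are illustrative assumptions, not taken from the article.

```python
import numpy as np

def ricker(f0, dt, length=0.128):
    """Ricker wavelet with central frequency f0 (Hz); an illustrative choice."""
    t = np.arange(-length / 2, length / 2, dt)
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def impedance_to_seismic(z, wavelet):
    """Noise-free forward model: reflectivity (2) convolved with the wavelet (1)."""
    r = (z[1:] - z[:-1]) / (z[1:] + z[:-1])   # normal-incidence reflectivity
    return np.convolve(r, wavelet, mode="same")

# toy two-layer impedance profile (velocity x density, arbitrary units)
z = np.concatenate([np.full(100, 3000.0), np.full(100, 4500.0)])
s = impedance_to_seismic(z, ricker(30.0, 0.002))   # synthetic seismic trace
```

Because the profile has a single boundary, the reflectivity series is a single spike of amplitude (4500 − 3000)/(4500 + 3000) = 0.2, and the synthetic trace is a scaled, shifted copy of the wavelet.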

B. Proposed Method
In this section, we first describe the attention module. Second, we discuss the proposed method with the supervised learning approach for seismic impedance inversion, and third, we describe the UL approach for seismic impedance inversion.
1) Attention Module: This module is inspired by the human visual system, which uses an attention mechanism, i.e., a series of glimpses that focus on the main parts of a scene rather than processing the whole scene at once. Similarly, we add an attention block to the CNN architecture to better capture the features from the input data. The extracted FP is given as input to the attention module, which obtains features through both channelwise and spatial attention [33]. Given an FP from a hidden layer, L ∈ R^{C×H×W}, the attention module outputs a channel attention map A_c ∈ R^{C×1×1} and a spatial attention map A_s ∈ R^{1×H×W}. The attention process is given as

L′ = A_c(L) ⊗ L,   L″ = A_s(L′) ⊗ L′                          (4)

where ⊗ denotes elementwise multiplication; in (4), the channel attention values are copied along the spatial dimensions and vice versa. L″ is the final output of the spatial attention module. In the channel attention module, FPs are created based on the interchannel relationships between the features. The spatial dimension of the input FP is squeezed using both average pooling and max pooling to improve the representation power of the network. The obtained average-pooled and max-pooled features, L^c_avg and L^c_max, respectively, are passed to a shared network with one hidden layer of size R^{C/k×1×1}, where k is the reduction ratio. The output feature vectors are merged using elementwise summation

A_c(L) = σ(Z_1(Z_0(L^c_avg)) + Z_1(Z_0(L^c_max)))             (5)

where σ is the sigmoid function, Z_0 and Z_1 are the weights, and the SELU activation function is used after Z_0. In the spatial attention module, FPs are obtained using the interspatial relationships between the features, which focus on "where" the informative part is.
Here, max pooling and average pooling are applied along the channel axis, and the obtained feature descriptors are concatenated. The spatial attention map A_s(L) ∈ R^{H×W} is created by passing the concatenated descriptors through a convolution layer

A_s(L) = σ(f^{3×3}([L^s_avg; L^s_max]))                       (6)

where f^{3×3} is a convolution operation with kernel size 3 × 3 and σ is the sigmoid activation function. The channel and spatial attention modules can be arranged in series or in parallel. In our case, the sequential arrangement has shown better accuracy than the parallel arrangement. The output of the complete attention module is given as input to the next layer. The detailed architecture of the attention module is shown in Fig. 2.
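To make the sequential channel-then-spatial attention concrete, the following NumPy sketch mirrors (4)-(6) for a 1-D feature map. The weight shapes, random values, and the size-3 spatial kernel are illustrative assumptions, not the trained network's parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def selu(x):
    """Scaled exponential linear unit, used after the first MLP layer."""
    lam, alpha = 1.0507, 1.6733
    return lam * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def cbam_1d(fmap, w0, w1, w_sp):
    """Sequential channel-then-spatial attention on a 1-D feature map (C, W),
    mirroring (4)-(6). Weight shapes and values here are illustrative."""
    # channel attention: squeeze the spatial axis by average and max pooling
    avg_c, max_c = fmap.mean(axis=1), fmap.max(axis=1)            # (C,)
    a_c = sigmoid(w1 @ selu(w0 @ avg_c) + w1 @ selu(w0 @ max_c))  # Eq. (5)
    refined = fmap * a_c[:, None]                                 # Eq. (4)
    # spatial attention: pool along the channel axis, concatenate, convolve
    desc = np.stack([refined.mean(axis=0), refined.max(axis=0)])  # (2, W)
    padded = np.pad(desc, ((0, 0), (1, 1)))
    conv = sum(np.convolve(padded[i], w_sp[i][::-1], mode="valid")
               for i in range(2))                                 # Eq. (6)
    return refined * sigmoid(conv)[None, :]                       # (C, W)

rng = np.random.default_rng(0)
C, W, k = 60, 128, 4                     # 60 FPs as in layer1; k = reduction
out = cbam_1d(rng.standard_normal((C, W)),
              0.1 * rng.standard_normal((C // k, C)),
              0.1 * rng.standard_normal((C, C // k)),
              0.1 * rng.standard_normal((2, 3)))
```

The output keeps the input shape, so the block can be dropped between convolution layers without changing the layer configuration.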
2) Supervised Learning: CNNs are widely used in various research fields and have achieved good results due to their feature extraction capability [34]. We use poststack seismic data to estimate the AI. The proposed network architecture for the supervised case is shown in Fig. 1. We use three convolution layers (layer1, layer2, and layer3); hence, the network is named a deep convolutional neural network. These layers convolve the input vector with the kernel defined in the layer configuration. The size of layer1 is 60 × 1, in which 60 is the number of output FPs, with a stride of 1. The stride is a filter parameter that determines the movement of the filter. The kernel size is chosen in accordance with the central frequency of the source wavelet to capture the maximum number of features.
The attention module is placed sequentially after layer1 in the network, as shown in Fig. 1. The output of the attention module is passed to the next convolution layer (layer2). The size of layer2 is 30 × 1, in which 30 is the number of output FPs. Layer3 has the same kernel size as layer1 and layer2 and has one output channel with stride 1. After each convolution layer, we add nonlinearity to the network with an activation function. Various activation functions, such as tanh, sigmoid, ReLU, ELU, and SELU, exist in the literature. We use SELU because it helps the network converge faster with a good fit compared to the existing activation functions. In addition, it helps prevent the vanishing gradient problem, which usually occurs with the sigmoid function. During the training process, the output of the network is compared with the true impedance log, and the loss is calculated using the mean squared error (MSE) as the cost function

J_sup = (1/N) Σ_{i=1}^{N} ||z_i − ẑ_i||²                      (7)

where z_i is the true AI data and ẑ_i is the predicted AI data.
3) Unsupervised Learning: In the case where true AI data are not available (such as for field data), we use a UL approach to estimate the AI from the input seismic data. The network architecture of the UL approach is shown in Fig. 3. The difference from supervised learning lies in the cost function. Here, we minimize the error between the input seismic data and the calculated seismic data. The output generated from the network (the predicted impedance) is used to generate a seismic trace (forward modeling) as given in (1).

Algorithm 1: An Algorithm for the Attention Module.
Require: input data
Step 1: Input the feature map. Compute max pooling and average pooling over the spatial dimension to obtain the descriptors L^c_avg and L^c_max.
Step 2: Pass L^c_avg and L^c_max through the shared multilayer perceptron to obtain A_c(L).
Step 3: Perform elementwise summation as shown in (5). Then multiply the input feature map by the obtained A_c(L) and use the result as input to the spatial attention module.
Step 4: To obtain the spatial attention map, apply average pooling and max pooling along the channel axis.
Step 5: Concatenate the generated feature descriptors L^s_avg and L^s_max and apply a convolution operation to obtain the spatial attention map A_s(L).
Step 6: Multiply the input by A_s(L) to obtain the final refined feature map.

The low-frequency model is added to the network output; we then calculate the reflectivity and convolve it with the source wavelet to obtain the calculated seismic trace

s_cal = W r(ẑ)                                                (8)

The calculated seismic trace is compared with the input seismic trace S to minimize the loss using the MSE as the cost function

J_unsup = (1/N) Σ_{i=1}^{N} ||s_i − s_i^cal||²                (9)

where s_i is the input seismic trace and s_i^cal is the calculated seismic trace. The optimum weights and biases are obtained by minimizing the cost functions in (7) and (9) using backpropagation. Various optimization algorithms, such as stochastic gradient descent, the adaptive gradient algorithm, root-mean-square propagation, and adaptive moment estimation (ADAM) [35], have been studied in the literature. In our work, ADAM is used as the optimization algorithm for backpropagation. Let θ = {W_k, b_k}; the ADAM update equation for θ_t is given by

θ_{t+1} = θ_t − η k̂_t / (√(l̂_t) + ε)                         (10)

where k̂_t = k_t / (1 − β_1^t) and l̂_t = l_t / (1 − β_2^t) are the bias-corrected first and second moments. The exponentially moving averages k_t and l_t are obtained as

k_t = β_1 k_{t−1} + (1 − β_1) g_t,   l_t = β_2 l_{t−1} + (1 − β_2) g_t²   (11)

where g_t is the gradient at step t. The exponential decay rates β_1 and β_2 for the first and second moments are set to 0.9 and 0.999, respectively [19], and the learning rate η is chosen as 0.001.

Algorithm 2: An Algorithm for Seismic Impedance Inversion Using ADCNN by Supervised and Unsupervised Learning.

Supervised learning
Step 1: Initialize the network parameters, such as W, b, and the batch size. Randomly sample the data for training.
Step 2: for N_epochs steps do
Step 3: Input the seismic data S to the network in Fig. 1 and predict the AI (ẑ_i).
Step 4: Update the weights W and biases b using (7).
Step 5: end for

Unsupervised learning
Step 1: Initialize the network parameters, such as W, b, and the batch size. Randomly sample the data for training.
Step 2: for N_epochs steps do
Step 3: Input the seismic data S to the network in Fig. 3 and predict the AI (ẑ_i).
Step 4: Calculate the reflection coefficients from the predicted AI and convolve with the wavelet to obtain the calculated seismic trace.
Step 5: Update the weights and biases by minimizing the unsupervised cost function using ADAM.
Step 6: end for
Output: Optimized parameters.
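The core of the unsupervised step, comparing the input trace against a trace forward-modeled from the predicted impedance, can be sketched as below. The function name, wavelet, and layer values are illustrative assumptions; in the actual network the gradient of this loss would drive the ADAM update.

```python
import numpy as np

def unsupervised_loss(z_pred, s_obs, wavelet, z_low):
    """Physics-based cost: forward-model the predicted impedance and compare
    with the observed trace (MSE between input and calculated seismic).
    The function name and argument layout are illustrative assumptions."""
    z = z_pred + z_low                        # add the low-frequency model back
    r = (z[1:] - z[:-1]) / (z[1:] + z[:-1])   # reflectivity from impedance
    s_cal = np.convolve(r, wavelet, mode="same")
    return float(np.mean((s_obs[: len(s_cal)] - s_cal) ** 2))

# sanity check: a perfect prediction reproduces the observed trace exactly
t = (np.arange(64) - 32) * 0.002              # assumed 30 Hz Ricker wavelet
a = (np.pi * 30.0 * t) ** 2
wav = (1.0 - 2.0 * a) * np.exp(-a)
z_true = np.concatenate([np.full(80, 3000.0), np.full(80, 4500.0)])
r_true = (z_true[1:] - z_true[:-1]) / (z_true[1:] + z_true[:-1])
s_obs = np.convolve(r_true, wav, mode="same")
loss = unsupervised_loss(z_true, s_obs, wav, np.zeros_like(z_true))
```

A perfect prediction drives the loss to zero, which is exactly the behavior the unsupervised training loop exploits: no true AI labels are needed, only the physics of the forward model.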

III. NUMERICAL RESULTS
The results of seismic impedance inversion are demonstrated in this section, and the proposed method is compared with existing methods. The results are validated on the Marmousi 2 model, which is briefly described below. The efficiency of the proposed method is analyzed and compared with the existing state-of-the-art method [25]. To measure the accuracy of the proposed method, the MSE, Pearson's correlation coefficient (PCC), and the coefficient of determination are computed between the estimated and true impedance traces.

A. Marmousi 2
The Marmousi 2 dataset is an extension of the classical Marmousi model created by the Allied Geophysical Laboratories [36]. The classical Marmousi model consists of a single reservoir and was widely used for AVO analysis and to validate imaging algorithms. The extension to Marmousi 2 is based on the Northern Quenguela Trough in the Quanza Basin of Angola. The Marmousi 2 model extends to a depth of 3.5 km and spans 17 km across. The model consists of 199 horizons; in addition, the water layer was extended to 450 m, leading to complex stratigraphic details.

B. Training the Network
The acoustic impedance logs for Marmousi 2 are obtained by multiplying the P-wave velocity and density logs shown in Fig. 4. For each impedance log, we calculate the corresponding seismic trace using (1), as shown in Fig. 5. Both the impedance logs and the seismic traces are normalized before training the network. We selected 60% of the data for training the network, and the remaining 40% are used for testing. First, we train the network in a supervised way with true AI labels. We trained the network for 2000 epochs with a batch size of 32; an epoch means training the network with the complete training data once. After training (supervised), the network is tested with the test data to estimate the acoustic impedance, as shown in Fig. 6.
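A minimal sketch of the data preparation just described: impedance logs from velocity and density, normalization, and a 60/40 split. The array layout, the min-max normalization choice, and the random values are assumptions for illustration, not the article's exact pipeline.

```python
import numpy as np

def prepare_traces(vp, rho, train_frac=0.6, seed=0):
    """Build acoustic impedance logs (z = vp * rho), normalize them, and
    split the traces 60/40 for training and testing. The column-per-trace
    layout and min-max normalization are illustrative assumptions."""
    z = vp * rho                                   # acoustic impedance logs
    z = (z - z.min()) / (z.max() - z.min())        # normalize to [0, 1]
    n = z.shape[1]                                 # traces are columns: (M, N)
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(train_frac * n)
    return z[:, idx[:n_train]], z[:, idx[n_train:]]

# toy velocity/density logs standing in for the Marmousi 2 logs
vp = np.random.default_rng(1).uniform(1500.0, 4500.0, size=(200, 50))
rho = np.random.default_rng(2).uniform(1.0, 2.8, size=(200, 50))
train_z, test_z = prepare_traces(vp, rho)
```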
We randomly extracted the true and predicted impedance traces and made comparison plots, as shown in Figs. 8 and 9, for the existing and proposed methods, respectively. For instance, for the trace at a depth of 200 m, we can see a better correlation in Fig. 9(a) than in Fig. 8(a). The importance of the attention module is visualized through FPs, which are plotted in Fig. 10. We selected every 8th FP among the configured 60 FPs; Fig. 10(a) shows the input FPs to the attention module, i.e., the channel attention module. Fig. 10(b) and (c) shows the output of the channel attention module and the spatial attention module, respectively. From Fig. 10(c), we observe that prominent features are extracted at the output of the attention module, which clearly indicates its importance. In particular, we notice that the output features from the attention block show the layer boundaries very clearly. Further, we trained the network in an unsupervised way, i.e., without the need for true AI, as shown in Fig. 3. The hyperparameters are chosen as in the supervised case. The trained network is tested with the test data, and the estimated AI is shown in Fig. 7. For comparison, we have taken random impedance traces at various depths, which are shown in Figs. 11 and 12. The training loss curve is shown in Fig. 13. From these plots, we can observe the superiority of the proposed method compared to the existing method. Compared to supervised learning, UL (correlation of 0.9764) has a lower correlation, since supervised learning uses the true AI labels, whereas in the unsupervised case, the true AI labels are not available.
Table I shows the various performance metrics used to evaluate the proposed method in comparison with existing methods on the Marmousi 2 dataset. A brief description of these metrics follows.

Coefficient of determination (r²): r² measures how well the observed outcomes are reproduced by the model, based on the proportion of the total variation of the outcomes explained by the model

r² = 1 − Σ_{i=1}^{N} (z_i − ẑ_i)² / Σ_{i=1}^{N} (z_i − z̄)²

where z̄ is the average of {z_i}_{i=1}^{N}.

Pearson correlation coefficient: The PCC is a statistic used to measure the correlation between two variables. It gives information about both the magnitude and the direction of the correlation

PCC = Σ_{i=1}^{N} (z_i − z̄)(ẑ_i − ẑ̄) / √(Σ_{i=1}^{N} (z_i − z̄)² Σ_{i=1}^{N} (ẑ_i − ẑ̄)²)

The results are produced by performing simulations on an Intel Xeon Silver 4216 CPU @ 2.10 GHz (two processors) with 256 GB RAM and a 64-bit operating system. The software used is the Spyder environment from Anaconda Navigator. The training loss is calculated for both the supervised and unsupervised approaches, as shown in Fig. 13. It took around 2 min to run the Python code and obtain the results for the supervised case, whereas the unsupervised case took 5 min. The reason is that UL has to perform forward modeling to generate the calculated seismic data; as a result, the computation time is increased compared to the supervised case. The hyperparameter tuning is done with the BO tuner. The obtained parameters (shown in Table II) are used in the training process, where the number of layers is chosen as 3 and ADAM is the optimizer with a learning rate of 0.001 to minimize the cost function. In addition, we performed noise resistance tests for the proposed method. We added Gaussian noise to the seismic data and analyzed the accuracy of inversion (supervised learning). The proposed method works efficiently even when the input seismic data are noisy, with a PCC of 0.963. The noisy seismic data and the estimated impedance are shown in Fig. 14.
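The evaluation metrics above can be computed with a few lines of NumPy. This is a generic sketch of the standard definitions, not the authors' evaluation code; the toy traces are assumed values.

```python
import numpy as np

def mse(z, z_hat):
    """Mean squared error between true and estimated impedance traces."""
    return float(np.mean((z - z_hat) ** 2))

def r2(z, z_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    return float(1.0 - np.sum((z - z_hat) ** 2) / np.sum((z - z.mean()) ** 2))

def pcc(z, z_hat):
    """Pearson correlation coefficient (magnitude and direction)."""
    zc, hc = z - z.mean(), z_hat - z_hat.mean()
    return float(np.sum(zc * hc) / np.sqrt(np.sum(zc**2) * np.sum(hc**2)))

# toy true/estimated traces for illustration
z_true = np.array([1.0, 2.0, 3.0, 4.0])
z_est = np.array([1.1, 1.9, 3.2, 3.9])
scores = (mse(z_true, z_est), pcc(z_true, z_est), r2(z_true, z_est))
```

By construction, a perfect estimate gives PCC = 1 and r² = 1 with zero MSE, which is why all three metrics are reported together in Table I.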

TABLE II OPTIMAL HYPERPARAMETERS
However, the proposed method has some limitations, which are common to DL methods. The training data are very important in data-driven methods: in general, these methods perform better when the training and testing data have similar characteristics. By using a large amount of data with diverse distributions in the training process, we can obtain a generalized model that works well on any test data, but at the cost of computational resources. Hence, in the future, we would like to use the concept of federated learning to better optimize the computational resources.

IV. CONCLUSION
In this work, we presented a novel approach to address the impedance inversion problem. Our method incorporates the attention module CBAM into the neural network architecture, enabling the retrieval of salient features from the input data. We explored two different approaches: supervised and unsupervised. Supervised learning is utilized when labeled data are available, while UL is employed in the absence of labels, where the physics of the inverse problem is used. In addition, we leveraged the SELU activation function for its demonstrated stability and efficiency. To automatically optimize the hyperparameters and reduce the network training time, we utilized BO. We used poststack seismic data to demonstrate the results. The results show significant improvements compared to existing methods, as evidenced by metrics such as the MSE, PCC, and coefficient of determination in estimating the AI.

Manuscript received 18 May 2023; revised 12 August 2023; accepted 19 August 2023. Date of publication 29 August 2023; date of current version 12 September 2023. This work was supported by the Department of Science and Technology, Science and Engineering Research Board (DST-SERB), India, through Core Research under Grant CRG/2019/001234. (Corresponding author: Karthikeyan Elumalai.)

Fig. 10. FPs (randomly taken eight FPs from 60 FPs) for the attention module. (a) Input FPs to the channel attention module. (b) Output of the channel attention module. (c) Output of the spatial attention module.

TABLE I: COMPARISON OF VARIOUS METRICS WITH THE EXISTING CNN METHOD FOR MARMOUSI 2 DATA (SUPERVISED)