The Underground Explosion Point Measurement Method Based on High-Precision Location of Energy Focus

Source positioning based on energy time-inverse focus is a hot subject in the sphere of shallow underground source positioning. Due to the grave wave group aliasing and the complex, irregular geological structure typical of the shallow underground explosion, the reconstruction accuracy of the energy focus is low and thus the recognition of the focus is a difficult task, ultimately leading to a low accuracy of source positioning. To address the above problems, this article proposes a method based on deep learning energy focus recognition, whereby the process of recognizing and positioning the energy focus in an energy field is made equivalent to the end-to-end feature extraction of the energy field–energy focus. The time-variant characteristics of explosive vibration signals are put to use in the construction of an adaptive time window. First, within the time window and by combining cross-correlation and autocorrelation operations, a 3D energy field image sequence in the time-space domain is produced by grouped energy synthesis; second, a densely connected 3DCNN network is built and, through multiple layer span layer splicing, a higher repetitive use is made of the focus point features in the energy field images; third, a spatial pyramid pooling network is used to extract multi-scale features from different focus areas, which helps achieve high-precision focus recognition. Finally, numerical simulations and field tests were conducted.The results demonstrated that compared with the quantum particle swarm optimization (QPSO)-based energy focus search method, the proposed one is more effectively in recognizing the coordinates of the focus in the energy field, thus allowing high-precision localization of shallow underground sources. This method is of some engineering application value in the field of underground source positioning.

inrushes and monitoring underground chamber blasting, etc.
Compared with deep-seismic, deep coal mining, and petroleum exploration source positioning or similar extensive, deep, and long-duration source positioning, the shallow source positioning method described in this article offers the following characteristics: (1) The number of sensors required for shallow positioning is small, and the layout can be arbitrary, unlike the high-density layout in natural earthquake source positioning; (2) The depth of underground seismic sources are shallow, typically not more than 100m. The geological structure within shallow depths is complex and unknown, so it is impractical to develop a shallow speed field model from the deep earth crust structure model; (3) The vibration wave group is complex in aliasing and, as the near-field soil, due to its constitutive characteristics, is elastoplastic, the elastic wave is greatly affected as a result of reflection and refraction by the ground, weakening the phase [4], [5]. Therefore, the conventional natural earthquake positioning method is not fit for locating shallow sources.
With the ever-advancing earthquake exploration and computational imaging theory, the positioning technology based on energy field imaging has become a hot subject in the field of underground source positioning. It does not depend on the extraction accuracy of the seismic phase characteristic parameters and works by scanning the focus position of the underground energy field to locate the source. It is one of the best ways to solve the problem of shallow underground space source positioning.
Gajewski et al., who first proposed this method, suggests that the signals obtained by the ground sensors be used as the boundary conditions for the time-inverse wave field simulation and reconstruct the time-inverse energy field of the source through 2D and 3D single pass acoustic wave equation continuation [6]. The feasibility of the method has been verified by many numerical simulation experiments. Sava et al. proposes some improvements on this method: after performing quasi-Wigner distribution function transform on the microseismic time-inverse wave field, the energy field is reconstructed using the interferometric imaging condition, which improves the energy field focusing accuracy [7]. Li et al. propose a weighted elastic wave interferometric imaging method for microseismic positioning, which improves the reconstruction accuracy of the energy field [8].
As can be seen from above, researchers mainly try to improve the accuracy of source positioning by improving the reconstruction accuracy of the energy field. However, this type of method depends to a higher degree on the speed model when performing energy field imaging, thereby poses certain requirements on the number and layout of the sensors.
Due to the complexity of the underground velocity model as well as the limited number and arbitrary layout of the sensors in shallow source positioning, the conditions for high-precision reconstruction of the underground energy field are beyond attainment. Before expanding the applicability of this method, an urgent problem that needs to be solved in the field of underground positioning with low energy field reconstruction accuracy is how to quickly and effectively recognize the energy focus. Kao et al. proposes a positioning method of the grid search class, the main principle of which is to decompose the underground monitoring area into grids of equal size, after which it is determined by analyzing and calculating each grid whether the source location is situated in the grid [9]. Angus et al. has improved on the method by using the speed model correction combined grid search method, which improves the positioning accuracy of the source [10]. Lagos and Velis, drawing on previous research, compare global optimization calculating methods such as very fast simulated annealing (VFSA), QPSO, and grid search class methods and, by analyzing 2D and 3D seismic source positioning results, conclude that the accuracy is the highest when the energy focus is recognized using QPSO [11]. Although the QPSO-based energy focus recognition method improves the accuracy of the energy field source positioning, it involves some blindness and randomness in the search, leading to unstable focus recognition and poor robustness.
To address the above problems, this article proposes a positioning method based on deep learning energy focus recognition, whereby the process of recognizing and locating the energy focus in an energy field is made equivalent to the end-to-end learning of the energy field-energy focus. First, with a small number of sensor nodes, the number of energy field samples along with the generalization of the model are increased by group mutual imaging. Second, with a densely connected 3DCNN and in a multi-layer span layer splicing mode, the number of features of the energy focuses is maximized with a small number of energy field samples. Finally, a 3DSPP network is used to perform high-rise multiscale feature extraction on focus areas of different sizes to find the focus coordinates, hence a method for extracting features of energy focuses.

II. PRINCIPLE OF UNDERGROUND 3D ENERGY FIELD TIME-REVERSE IMAGING
Time-inverse imaging operates on the reciprocity of the direction of vibration wave propagation [12]. From the vibration signal acquired by the sensor array a reverse wave field in a reverse propagation is developed, as shown in Fig. 2 (a). With reasonable imaging conditions, a virtual 3D energy field containing source information is established, of which the energy focus point is the source, as shown in Fig. 2 (b). The time-inverse propagation mode and the imaging conditions of the wavefield are the keys to high-precision time-inverse imaging [13], [14].

A. REVERSE PROPAGATION OF UNDERGROUND WAVEFIELD
Suppose the vibration wave propagates from the source X s = (x s , y s , z s ) to the surface, and the signal recorded by the i-th sensor X ri = (x ri , y ri , z ri ) is D(X ri , t). According to the underground wave theory, the forward propagation process of the wavefield can be expressed as [15]: where, G and S are respectively the Green function and the source function, t and w are respectively the propagation time and the frequency; and D is the signals recorded by the sensors, and F −1 represents inverse Fourier transform.
If the signal D(X ri , t) acquired by the i-th sensor is taken as a virtual source of a reverse propagation in the underground space. Then the signal at any underground point can be expressed as: where, * represents complex conjugation, R i (X , t) is the wavefield formed by the i-th sensor in time-inverse propagation, and X = (x, y, z) represents a location in the underground space. A 3D high-order finite difference algorithm is used to evaluate the following acoustic wave equations, the solution of which leads to the time-inverse back propagation field. The differential formula is: where, v denotes the wave propagation speed in the underground medium, T is the total time length of the vibration signal recorded by the sensor, and t is the propagation time of the reverse propagation wavefield.
With the optimal layer matching method as the boundary condition, assuming the calculation area is D = the expression of the boundary absorption condition in the process of back propagation is as follows:

B. IMAGING CONDITIONS
Proper imaging conditions are critical to suppressing noise and reaching accurate imaging of the source [16]. There are two types of noise in the process of the time-inverse imaging: 1. Noise in the vibration signal acquired by the sensor; 2. Noise generated by the high-order finite difference algorithm in time-inverse back propagation (calculation error). For the two types of noise, the following two imaging conditions are proposed.

1) AUTOCORRELATION IMAGING CONDITIONS
In the process of wavefield reverse-time continuation, every spatial point in the ground is autocorrelated. The instantaneous energy information of each point is obtained by linear amplitude superposition. Energy information is integrated over the time domain to produce 3D time-reverse energy field images, and the specific imaging condition is as follows: where, R i (X , t) is the wavefield formed by the i-th sensor in time-inverse propagation, N is the total number of sensors, and T is the total time length of the vibration signal recorded by the sensors. This method suppresses the noise in sensor signals, but is unable to eliminate the imaging noise generated by the time-inverse back propagation of the wavefield [17]. VOLUME 8, 2020

2) CROSS-CORRELATION IMAGING CONDITIONS
In this method, each detection point is treated as an independent source, and time-inverse back propagation is then performed separately. Cross-correlation operation is performed on the wave field information of each spatial point. The energy is integrated over the time domain, to arrive at the 3D energy field in the spatial domain. The imaging condition is as follows: where, R i (X , t) is the wavefield formed by the i-th sensor in time-inverse propagation, N is the total number of sensors, and T is the total time length of the vibration signal recorded by the sensors.
The noise between the time-inverse wavefields of the sensors can be eliminated through cross-correlation operation, which improves the resolution and SNR of the imaging. However, this method is insufficient in its ability to suppress the noise in the sensor signal. Meanwhile, for all sensors 3D high-order finite difference operation shall be carried out during time-inverse back propagation, which will inevitably involve greater calculation load and lower efficiency [18], [19].
To address the above problems, this article proposes a group mutual imaging condition, as follows: where, Image(X , t) is the instantaneous 3D energy field images of the underground space, R i (X , t) is the wave field formed by the i-th sensor in reverse propagation, L is the number of sensors in the group, M is the number of sensor groups, t 1 and t 2 is the starting time and end time of the time window, and T is the length of the time window. For vibration signal R(t), find its Hilbert transform spectrum R h (t) to get its analytical signal The local signals near time t can be expressed using analytical signal c(t): where, (t) = d dt arg c(t), which represents the instantaneous frequency. The symbol * stands for complex conjugation.
The length of the adaptive time window at time t is expressed as T (t), defined to be: where, k is the weight. In Eq. 7, the sensor arrays are grouped for imaging based on the principle of high-dimensional spatial similarity. First, the efficiency of time-inverse imaging and the diversity of energy field samples are improved by grouping. Second, autocorrelation and linear amplitude superposition operations are performed within the sensor groups. This helps to eliminate the noise of the vibration signal, improve the resolution of the energy field imaging at each moment. Finally, inter-group cross-correlation operation is performed to eliminate the imaging interference caused by the time-inverse back propagation, to arrive at the 3D energy field for each moment.
In the time domain, the time window length is determined having regard to the instantaneous frequency characteristics of the signal. The energy field information within the length of the time window is linearly superimposed. This superimposed energy field is used as the instantaneous energy field at that moment, as shown in Fig. 3.
The energy focusing degree of the instantaneous energy field image is improved in this way, and at the same time, the 3D energy field images in the space domain are converted into a 3D energy field image sequence in the time-space domain.

III. METHOD OF ENERGY FIELD FOCUS RECOGNITION AND POSITIONING
So far, researchers mainly use QPSO and other swarm intelligent optimization algorithms to locate the energy focus by scanning a large area [20]. This method operates on the energy focusing principle, constructs the energy flow objective function using the QPSO algorithm. This objective function is evaluated by iterative optimization to find the energy focus. The energy objective function is as follows [18]: where, A T k is the signal amplitude recorded by this sensor at time T , e T k is the direction of propagation, e k (m, x, y, z) is the direction cosine of the seismic wave propagation at the k-th sensor as estimated by the above finite difference time-inverse numerical simulation, m is the stratum velocity information, (x, y, z) is the source coordinates, |·| represents the vector dot product, and p represents the 2-norm, and N is the number of sensors. Locating the source by QPSO involves random parameters in the particle updating equation, and additionally the initial particles are randomly generated too. Therefore, the source location results are tainted with some randomness, unable to guarantee the robustness of the solution [14], [21].
In respect of the above problem, the advantages of deep learning in the field of image feature recognition is put to full use by equating the focus point positioning process with the building and training process of the end-to-end network model of the energy field image-focus point. The height, length, width and number of channels of the energy field image make up the 4D tensor information for deep learning; Fig.4 shows a network model consists of a densely connected 3DCNN + a 3D pyramid network + an FC. First, a dense convolutional 3DCNN is employed to reuse the feature information in the network through multiple layer span layer splicing, which effectively solves the problem of gradient disappearance caused by limited samples. Second, the 3DSPP model is used to extract multi-scale and high-resolution features of the feature map, which improves the recognition rate of the focus. Finally, the FC layer is connected and the source coordinates are output.

A. 3DCNN BASED ON DENSE STRUCTURE
Dense convolutional neural network (DenseNet) is composed of multiple dense connection modules [22], [23]. In each dense connection module, in order to maintain feedforward resistance, each layer splices up the output of all previous layers, with the splicing result used as the input of this layer, and then the obtained feature map is passed to all subsequent layers. Therefore, dense connection modules make it possible to reuse the features to a greater degree, and feature redundancy can be alleviated by learning fewer parameters, and the problem of gradient disappearance is solvable.
Assuming that the network consists of L layers, the output of layer l is expressed as x l . For each layer, a nonlinear transformation H l (·) is implemented, which includes three consecutive operations: batch normalization (BN), rectified linear unit (ReLU), and convolution. Therefore, x l can be expressed as follows: where, [x 0 , x 1 . . . x l−1 ] means splicing up the features from layer 0 to layer l − 1. Taking advantage of dense connections in DenseNet, this article introduces it into 3DCNN to form a dense structurebased 3DCNN [24], [25]. With this network, the feature information in the 3D energy field image is extracted. Its network structure is shown in Fig. 5.
The proposed densely connected 3DCNN contains 7 layers of networks, 5 convolutional layers, and 2 pooling layers. The 3D energy field images as the input and the feature maps as the output are applied to connecting subsequent 3DSPP networks. The above-described network forms a dense connection module through multiple layer span layer splicing, which enables the reuse of the feature information. Pooling layer P1, convolutional layer C2, and convolutional layer C3 make up dense connection module 1; the feature maps generated by pooling layer P1 and convolutional layer C2 are, after being spliced, used as the input of convolutional layer C3. Pooling layer P1 and convolutional layer C2, convolutional layers C3, and C4 make up dense connection module 2, that is, the feature maps generated by pooling layer P1, convolutional layer C2, and convolutional layer C3 are spliced and then used as the input of convolutional layer C4. Pooling layer P1 and convolutional layers C2, C3, C4 and C5 make up dense connection module 3, that is, pooling layer P1, the feature maps generated by convolutional layers C2, C3 and C4, after splicing up, are used as input to convolutional layer C5.
The specific network parameters of the densely connected 3DCNN are shown in Table 1. The 3D convolution kernel size, step size, and Padding method are the same across all convolution layers, so they are not included in the table. To accelerate the convergence during network training, an extra batch normalization (BN) layer is included [26], [27]. Here are the notes on some parameters in Table 1: (1)The size of the energy field samples is 96 × 192 × 192; (2)The image format of the network input layer and the representation format of the feature map are as ''number of channels @ height × length × width''; (3)Each 3D convolution kernel has a size of 3 × 3 × 3, convolution kernel step sizes being all 1 × 1 × 1. Padding is of 'SAME' mode, which makes the spatial size of the feature map remain unchanged after 3D convolution operation; (4)3D pooling is of a 'max-pooling' type.

B. 3DSPP MODEL
SPP is a new type of network that is capable of multi-scale feature extraction [28], [29]. In this article, such a network is introduced into the recognition of the energy focus, and hence a 3DSPP model designed. By dividing a feature map into different sizes, multi-resolution recognition of the feature of the focus is made possible, thereby improving the recognition accuracy of the focus.
Assuming that the size of the input feature map is a ×a, the SPP layer evenly divides the feature map into n ×n bins, with n being customizable, and then the maximum pooling of each bin generates n×n feature vectors. Even division of different sizes can be made a number of times. Then, the pooling window size is win = [a/n], and the step is str = [a/n]. Among them, · and · represent rounding up and rounding down.
A 3D energy field image, after going through a dense structure-based 3DCNN, outputs a feature map of 32 @ 24 × 48×48 in size. After going through the dense structurebased 3DCNN, connection is made to the 3DSPP model. The actual network structure of the 3DSPP model is shown in Fig.6.
The 3DSPP model structure under study incorporated a total of 5 3DSPP layers, namely 1 × 2×2, 2 × 4×4,  Dimension reduction was performed on the feature maps through the pooling layers. As shown in Fig. 6, the final output was 32D eigenvectors to the number of 4 + 32 + 256 + 2048 + 6912 = 9252. When it was flattened into a one-dimensional vector, we got 1 × 9252×32 = 296064. In this way, the feature maps could be reduced in dimension, such that the network outputs eigenvectors of a fixed size; meanwhile, multi-level 3DSPP could be employed to extract features across the energy focus area at multiple scales, which significantly improved the accuracy of the focus recognition [30].
After being converted into a 1D vector, it was connected to an FC layer [31]. Since the source coordinates were defined by x, y, z, this article used a FC layer containing 3 neurons. For fear of over-fitting in training the FC layer, the Dropout technology p = 0.5 was employed so that half of the connection weights were randomly inactivated. In the course of training, the optimization function was based on the stochastic gradient descent method, with the mean square difference as the loss function.

IV. SIMULATION VERIFICATION
To demonstrate the feasibility of this method, an underground source positioning model was built with MATLAB. The energy focus point was located with the deep learning method and the QPSO separately. The positioning accuracy was evaluated in terms of Root Mean Square Error (RMSE), SEP (spherical error probable), and relative errors in the X, Y, and Z directions. RMSE evaluates the deviation of the positioning results of the two methods from the actual sources, SEP evaluates the uncertainty of the positioning results, and the triaxial relative errors evaluate the positioning accuracy of the two methods in 3D space.
The underground source simulation location model had its center on the origin of the surface and was of X [−50m, 50m], Y [−50m, 50m], and Z [0m, −50m] in size. In this underground model, the P-wave propagation velocity was 800m/s, and there were 9 preset sources, which were Ricker wavelets. The underground space noise was Gaussian white noise, the SNR of which was 20dB.
35 vibration sensors are arranged at equal intervals on the surface of the model to form an array of vibration sensors. The signal sampling rate was set to 100kHz and the sampling time to 0.2s. The deep network model was trained using the first 8 source locations and their corresponding sensor signals; the data acquired from the 9th test were used for evaluating the deep learning and QPSO methods regarding their focus point recognition accuracy. The following paragraphs describe, using as example the sensor data acquired from the first blasting, how a 3D energy field image is generated in the deep learning process. Fig. 7 (b) shows some of the sensor signals corresponding to the first blasting source.
The underground space was divided into grids, and 4 were selected from the 35 sensor signals to form up a group. A total of 50,000 energy field images were generated at a time through autocorrelation imaging. Fig. 8 gives some of the energy field images.
The results of autocorrelation imaging were subjected to cross-correlation and amplitude superposition operations. A part of the energy field image sequence generated by the group cross-correlation method is shown in Fig. 9.
More than 200,000 energy field image samples were generated through grouped cross-correlation. To increase the   generalization of the model, samples with different energy field focusing effects, including: (1) incomplete energy focusing areas, (2) divergent energy focusing areas, (3) low SNR energy focusing areas, and (4) low intensity energy focus points, were taken as training data. Table 2 gives the source coordinates corresponding to 9 simulation experiments; the explosion data of groups 1-8 were used as training samples for deep learning. The 9th group of experiments was used to evaluate the positioning accuracy of the deep learning method and QPSO algorithm.
With the source locations of the first 8 times as the output tags, the energy field images and the source locations were loaded respectively into the deep neural network. The times of training of the network was set to 2,000, to establish the loss value and the accuracy curve of the network in training. At the end of training, 10,000 samples were loaded into the network, to get the network's loss value and accuracy curve during the test phase, as shown in Fig. 10.
As can be seen from the Fig.10, the proposed deep neural network produced curves of the same trend in training  and testing. Stabilization appears after 400 trainings, and the loss value drops to 0.02 when iteration proceeds to 1,000 times, giving a test accuracy stabilizing at 95%. At the end of the network training and testing, the source data of the 9th blasting were loaded into the network, to give the energy focus location To determine the benefit of this method, the energy focus was located with the ninth experiment as an example and using the QPSO algorithm [18], [32], [33].To begin with, the search range of the source was set to x(-50m,50m), y(-50m,50m), z(-50m,0m), the swarm size to 40, the spatial dimensions to 3, the iteration times to 5000. The initial particle swarm was randomly generated. Second, with the energy target function as the particle fitness value, the location with the greatest fitness value among the swarm was found and taken as the optimal source position of the current source swarm. Third, the average optimal location of the 40 swarms was determined, and each source location was updated by the evolution equation. Finally, when the number of iterations was reached, the optimized source location was output. Fig. 11 gives a trajectory graph of particles searching for the energy focus in an energy field.
As can be seen from Fig.11, the large-scale scanning search of the particle swarm in the underground space, it is gradually converges when iterating to 1400 times.  After completing the deep neural network model training and searching the QPSO focus, the two methods were evaluated for their real-time positioning accuracy in terms of RMSE in the course of 2000 iterations, as shown in Fig.12.
As can be seen from Fig.12, during the 2000 iterations, the QPSO method searches the energy focus in steps, and after 1400 times of iteration, the search result begins to stabilize and the ''precocity'' phenomenon appears. The RMSE eventually approaches 0.26m. In contrast, with the deep learning method, in the energy focus search process the RMSE drops rapidly to 0.15m after 400 times of training. Then the energy focus was further recognized using the deep neural network method, to end up with an RMSE of 0.09m. Throughout the 2000 iterations, the method based on deep learning fared better than the one based on QPSO.
To further compare their positioning performance after the focus point search stabilizes down, more comparison was made based on the positioning results reached between the 1400th and 2000th time of iteration [34]. The performance is evaluated in terms of SEP (spherical error probable), X-axis, Y-axis, and Z-axis relative errors [35], [36]. Fig. 13 shows the SEP error curves of the two methods. It can be seen that from 1400 to 2000 times of energy focus point scanning, with the deep learning-based source localization method the SEP radius is significantly smaller and the accuracy of focus positioning is higher. Fig.13 shows the relative error curves, in the X, Y, and Z directions, corresponding to the two methods. As can be seen from the Fig.13, in the X and Y directions, the two methods are close in their positioning errors; in the Z direction, the localization based on deep learning is slightly better than the one based on QPSO.
The positioning accuracy was evaluated in terms of Root Mean Square Error (RMSE), SEP (spherical error probable),  and relative errors in the X, Y, and Z directions. From Fig.12, Fig.13 and Fig.14, the positioning results of the two positioning methods in the 9th test can be found, as shown in Table 3.
As can be seen from Table 3, in point of RMSE, SEP, and relative errors in X, Y, and Z directions, the proposed method performs better in focus positioning accuracy than the method based on QPSO, though the advantage is not much obvious. This is mainly because the simulation dealt with ideal underground medium, which favors a higher energy focus, making focus recognition less of a hard task. Anyway, when it comes to the calculation speed, the proposed method is more competitive. To output a set of valid positioning results, the QPSO has to complete 2000 times of searching in the energy field. Conversely, the deep neural network, once trained using the network parameters, is able to output a set of valid positioning results after one running.

V. EXPERIMENTAL VALIDATION
To further evaluate the feasibility of this method in engineering applications, it was necessary to conduct field tests. The project team conducted a field experiment of source positioning at Shanxi 247 shooting range. A comparison with the QPSO-based method further demonstrates the advantage of this method in engineering applications.  The experiment included a total of 64 sets of sensors and 10 sets of multi-channel acquisition nodes and wireless bridges, developed by North University of China, to build up an acquisition system of distributed source parameters, as shown in Fig. 15.
The actual coordinates of the TNT explosive, as shown in Table 4, was measured with the BeiDou positioning system CC50II-BG-T.
The experiment involved a total of 8 explosions, as shown in Table 4. The 1-5th initiation locations, in different quadrants, were used to train the deep neural network model, and the 6-8th test data were used to evaluate the positioning accuracy of the method. The 64 sensor nodes were arranged in the source monitoring area, as shown in Fig. 16.
During the actual experiment, holes 50m deep were drilled with a drilling rig, to hold TNT explosives. A sample of the excavated underground medium is shown in Fig. 17. The medium in this area, complex in geological structure, is dominated by rock strata.
A multi-channel data acquisition system was set up, which had a sampling rate of 20 kHz and a sampling time of 2s. After the end of each test, the data were transmitted back to VOLUME 8, 2020    the control master station for processing. Fig.18 shows part of the data acquired by the sensor array following the first explosion.
The generation process of the energy field samples will be described here using the first test as an example. First, the underground space was meshed, with x(-50m, 50m), y(-50m, 50m), and z(-50m, 0m) as the energy field reconstruction range. Second, the sensor array was grouped using the high dimension space distance method. Finally, 3D energy field images were obtained using the grouped crosscorrelation method. The size of each 3D energy field image was 96 × 192×192.  The time corresponding to the first and the final phase was evaluated, and in this test the period of the valid signal was found to be 0.4s-1s. With a sampling rate of 20kHz, the number of valid samples was 60,000. Following the method of grouped cross-correlation, 3D energy field samples in the effective time range, that is, 4D tensor information of deep learning, were obtained, and corresponding source coordinates were used as the output tag. Samples with different energy field focusing effects were taken as training data, as shown in Fig. 19. The following: (1) incomplete energy focusing areas, (2) divergent energy focusing areas, (3) low SNR energy focusing areas, and (4) low intensity energy focus points, were among the training data.
Similarly, the training data of the second to fifth tests were used as the training data of the network, with the times of network training set to 2000, to train the deep neural network. After completing network training, 10,000 random samples were loaded into the network to test out its performance, for which the corresponding loss values and accuracy curves are shown in Fig. 20.
As can be seen from Fig.20, after 1300 times of deep network model training, the internal parameters of the network stabilized practically. The loss value is close to 0.8, and  the accuracy is as high as 93%. Compared with the training process during simulation, adjusting network parameters took significantly greater time. The network was tested after training. The loss values and accuracy curves of testing resemble those of training; however, due to the complex geological structure and grave energy field focus blurring, the final test accuracy of the network was 87%, lower than in simulation, which was 95%. After completing the network training and testing, the network was made to output the source coordinates for the 6-8th explosions, which are given in Table 4. To examine the benefits of this method, the source coordinates for the 6-8th explosions were also estimated using the QPSO algorithm. To begin with, the search range of the source was set to x(-50m,50m), y(-50m,50m), z(-50m,0m), the swarm size to 40, the spatial dimensions to 3, the iteration times to 2000. The initial particle swarm was randomly generated. Second, with the energy target function as particle fitness value, the location with the greatest fitness value among the swarm was found and taken as the optimal source location of the current source swarm. Third, the average optimal location of the 40 swarms was determined, and each source location was updated by the evolution equation. Finally, when the number of iterations was reached, the optimized source location was output. Fig.21 gives the searching map of the 6th explosion based on QPSO.
It can be seen from Fig. 21 that, due to the complex geological structure, the problems of focus blurring and focus artifacts are more prominent. The QPSO method develops ''precocity'' in its searching in the energy field, and converged locally to a false focus.
After the deep neural network model training and the QPSO focus searching, the two methods were evaluated for their real-time positioning accuracy in terms of RMSE in the course of 2000 iterations, as shown in Fig. 22.
As can be seen from Fig. 22, in point of RMSE, the positioning accuracy of this method is significantly better than that of the QPSO-based method in the 6th, 7th, and 8th energy focus point positioning process. In the case of the QPSO-based method, the RMSE results, in all the three times, drop stepwise in the focus positioning process. More specifically, for the 6th shot the convergence begins at the 1400th iterations and the RMSE drops to 0.548m in the end; for the 7th shot, the convergence begins at the 1600th iteration and the RMSE ultimately drops to 0.425m; for the 8th shot, the convergence begins at the 1200th iteration and the RMSE ultimately drops to 0.750m. With the proposed deep learning method however, the RMSE keeps falling throughout the 2000 iterations, and the positioning error is less than 0.2m for all the 3 shots.
To further compare their positioning performance after the focus point search stabilizes down, the performance was evaluated in terms of SEP (spherical error probable), and X-axis, Y-axis, and Z-axis relative errors. Table 5 compares From Fig. 22 and Table 5, the actual positioning results of the three methods are found, as given in Table 6.
As can be seen from Table 6, under the actual geological conditions and in terms of positioning accuracy and calculation time, the proposed method works significantly better than the QPSO-based energy focus positioning method. The QPSO-based energy focus positioning method develops certain blindness and randomness in the process of scanning the energy focus. The RMSE varies between 0.425-0.750, and the SEP varies between 0.237-0.745. The greatest of relative errors of the three axes is close to 0.6m.
Its positioning accuracy is unstable and the robustness is poor. With the proposed method however, the RMSE falls between 0.15m-0.20m, the SEP stabilizes between 0.03-0.05, and all the relative errors in the three directions are below 0.040, hence giving higher robustness and having evident positioning advantages. In point of positioning speed, this method shortens the calculation time from 20s to 1.5s. It thus follows that the proposed method has evident advantages in focus recognition and positioning.
However, it is not without demerits when compared with the QPSO method. 1. This method necessitates detonating several seismic sources in advance in order to train the deep neural network. This, as a result, drives up the experiment cost and time. 2. The network training is a long process. In this experiment, the deep learning environment was constructed with Keras as the framework and Tensorflow as the back end. Two 1080Ti GPUs were used along with the parallel computing architecture CUDA to accelerate the image processing process. Iterative training was performed 2000 times, each iteration taking 10s. The total training time took close to 6 hours. Therefore, further research shall focus on how, in actual engineering applications, to reduce the amount of data in deep learning and to improve the training efficiency of the network while ensuring the positioning accuracy.

VI. CONCLUSION
To address the unsatisfactory recognition and positioning of the energy field focus in shallow underground source positioning, this article proposes an energy field focus searching and positioning method, which is based on and takes advantage of deep learning, a technique used in the field of image recognition. This method, by use of the grouped cross-correlation, generates 4D input information for deep learning and then, through the 3DCNN structure and the 3DSPP model, constructs a deep learning framework. The source coordinates are used as the output tags to generate an end-to-end model that converts energy field images to the source coordinates. Simulation and field testing results show that compared with the QPSO-based energy point recognition and positioning method, the proposed method reduced the RMSE from a maximum of 0.75m to 0.18m, to stabilize in the range of 0.15-0.20m, and lowered the SEP from a maximum of 0.745m to 0.035m, indicating that the proposed method is not only more robust but significantly more accurate in positioning. More prominently is the performance in SEP; and particularly in terms of positioning speed, the proposed method shortened the calculation time from 20s to 1.5s. It can be seen that this method significantly improves the recognition accuracy of the energy focus and accelerates the recognition process, affording therefore promising value of engineering application in the field of underground space positioning.