A Novel DAS Signal Recognition Method Based on Spatiotemporal Information Extraction With 1DCNNs-BiLSTM Network

Extracting richer and more accurate information to better understand the detected vibration or acoustic targets has always been an important goal in signal recognition for the Distributed Acoustic Sensor (DAS) with optical fiber. In this paper, we use one-dimensional Convolutional Neural Networks (1D-CNNs) to extract the detailed temporal structure information at each signal node and a bidirectional Long Short-Term Memory (BiLSTM) network to mine the spatial relationship among the different signal nodes. On this basis we propose a novel identification method that treats the spatial and temporal information in different ways, which is denoted as the 1DCNNs-BiLSTM model. Experimental results on field data show that better recognition performance can be achieved in the safety monitoring of buried urban optical communication cables with DAS. The method further improves the recognition rate compared with other deep-learning methods frequently or possibly used for DAS signal recognition, such as the 1D-CNNs with temporal feature extraction only, and the 1DCNNs-CNN and 2D-CNN models with simultaneous spatiotemporal feature learning. To the best of our knowledge, this is the first time that the detailed temporal structure feature and the overall spatial connection have been extracted and utilized simultaneously through a customized deep learning network.


I. INTRODUCTION
Distributed Acoustic Sensor (DAS) based on the phase-sensitive optical time-domain reflectometry (Φ-OTDR) technology [1]-[4] provides a highly sensitive and cost-effective way to sense vibration or sound in the environment over long distances or wide ranges. It has been extensively applied in many safety monitoring fields, such as natural disaster prediction including seismic wave detection [5], third-party damage warning for oil/gas/water/heating pipelines [6], [7], perimeter security [1], [8], and large-structure health monitoring. In particular, it is a promising technology for the maintenance of urban communication cables [9]. In recent years, the laid volume of communication optical cables has grown dramatically with the increasing construction demand of 5G, the Internet of Things, the industrial Internet, and data centers. By utilizing the existing fiber cables, the vibration or sound signals generated by various destructive sources on the ground can be sensed and located by DAS with high sensitivity and precision. However, intelligent recognition and deep understanding of real threats, such as mechanical operations and manual digging, and of other vibration sources, such as traffic interference along the fiber, is still a challenging problem, which results in frequent nuisance alarms in practical applications. To improve the smart sensing ability of DAS, a lot of useful signal processing work has been carried out, including conventional machine learning methods with fixed hand-crafted feature extraction [6], [7], [10], [11], and deep learning methods with one-dimensional (1-D) [12] or two-dimensional (2-D) Convolutional Neural Networks (CNNs) [13]-[15] that extract the hidden distinguishable features automatically.

The associate editor coordinating the review of this manuscript and approving it for publication was Shaohua Wan.
However, most of the identification methods for DAS still rely mainly on temporal feature extraction from the vibration signal at each sampling node along the fiber, while they ignore the spatial relationship among the nodes. Some methods use traditional image processing [11] to detect and identify certain 2-D patterns in the sampled space-time matrix, but fixed hand-crafted feature extraction is time-consuming and laborious, and relies heavily on expert knowledge. Thus in this paper, to improve the recognition ability in the safety monitoring of buried optical communication cables, we explore a novel identification method that uses a deep learning structure with an array of 1-D Convolutional Neural Networks (CNNs) combined with a bidirectional Long Short-Term Memory (BiLSTM) model, abbreviated as 1DCNNs-BiLSTM. In this algorithm, each CNN automatically extracts the temporal structure feature of the signal at one acquisition node on the fiber, and the BiLSTM network, composed of a forward and a backward LSTM, mines the internal spatial relationship among the temporal signals at different nodes, from left to right and from right to left in turn.
The rest of the paper is organized as follows. Section II presents related work on temporal and spatiotemporal information extraction methods and the current challenges in DAS signal processing. Section III, taking the safety monitoring of a long-distance underground communication cable as an example, introduces the proposed 1DCNNs-BiLSTM algorithm, which extracts not only the structural feature of the temporal signal at each fiber location but also the spatial correlation among the signals at different locations, and gives the recognition result for the space-time matrix at the end of the network. Experimental results with real field test data are presented in Section IV. Finally, the conclusion is given in Section V.

II. RELATED WORK
At present, intelligent recognition and deep understanding of real threats and of nonthreatening sources caused by the various noises of production and daily life along the buried fiber is a challenging problem, which still results in inaccurate recognition and frequent nuisance alarms in long-distance monitoring applications. To advance the practical application of DAS, more and more research is devoted to its signal processing methods, such as vibration or acoustic source identification.
At the first stage (before 2015), most work was devoted to obtaining a better SNR for reliable detection and location of events, for example by correlation detection and moving average [16], a 2-D image edge operator [17], the wavelet transform [18], the Hilbert-Huang transform [19], and other signal denoising and separation techniques [20]. These methods help, to some extent, to suppress the stationary background and the noise caused by frequency shift and other unstable system factors. In real applications, however, it is found that the nuisance alarm rate (NAR) mainly comes from poor understanding of the vibration or sound, especially in complicated time-varying noisy environments. Thus at the second stage (2015 to 2017), more researchers focused on exploring various feature extraction and identification methods. The features include the magnitude [7], level crossing rate [21], and periodic gait characteristics [22] in the time domain; the frequency energy distribution of the Fast Fourier Transform (FFT) spectra [23]; the morphological features in the space-time domain [11]; and, in the time-frequency domain, the spectra obtained by the short-time Fourier Transform (STFT) [24], wavelet or wavelet packet energy spectra [25], [26], and Mel-Frequency Cepstral Coefficients (MFCC) [27]. The classification models include artificial neural networks (ANN) [6], [25], [26], the Gaussian mixture model (GMM) [24], [27], the Support Vector Machine (SVM) [7], and the Relevance Vector Machine (RVM) [11]. All these feature engineering methods further enhance the perception ability, but they rely heavily on expert knowledge. Developing an applicable recognition algorithm in this way is time-consuming and laborious, and the resulting model transfers poorly across different application environments.
Then, along with the rapid development of deep learning algorithms and their successful applications in image processing [28], [29], speech recognition [30], and fault diagnosis [31], vibration source identification for DAS entered a new phase at the third stage. In [13], [14], and [15], the hidden features of different types of DAS signals are automatically extracted and identified by using CNNs. To extract more useful information, Tejedor et al. proposed to utilize the temporal contextual connection by integrating the sequential feature vectors in a multi-layer perceptron (MLP) network in [10] and [32]. Furthermore, we proposed a knowledge mining method based on hidden Markov models (HMMs) [33] to extract the dynamic time sequence feature and its evolution information, and then identify the sequential state process of typical events.
However, most of the present identification methods for DAS still rely on temporal feature extraction from the signal obtained at each sampling node on the fiber, while they ignore the spatial information, that is, the relationship among the signals at different nodes. The method in [11] considers the spatiotemporal structure of DAS signals, but it uses fixed hand-crafted feature extraction for the 2-D space-time matrix, which is laborious. At present, the 1-D CNN and the 2-D CNN are still the frequently used networks, but they are proposed for recognizing the one-dimensional DAS signal at each fiber node. Until recently, the spatiotemporal feature of DAS signals has not been fully considered or automatically mined. On the other hand, a deep hybrid learning model of CNN-LSTM has been proposed, and sequence connection and spatiotemporal information are mined with this network for video action recognition in [34] and transportation flow prediction in [35]. In [35], the CNN is used to extract the spatial information while the bidirectional LSTM (BiLSTM) is used to extract the temporal information, and in [34] the BiLSTM is configured in a two-layer structure to extract the hidden connection among the image sequences. In this paper, to fully utilize both the structural information in the time sequences and the spatial distribution mode among the sequences influenced by the vibration source, we borrow the idea in [34] and [35] but design a customized deep learning structure, 1DCNNs-BiLSTM, for DAS according to its spatiotemporal signal structure. A separate 1-D CNN is used to extract the detailed temporal information of the signal at each spatial sampling node, while a one-layer BiLSTM is used to automatically mine the spatial connection among the node signals, so that each network plays to its own strength. The purposes of these two sub-networks are different and customized for our application.

VOLUME 8, 2020

A. SYSTEM STRUCTURE OF DAS AND THE PROPOSED RECOGNITION SCHEME
Taking the safety monitoring of a long-distance underground communication cable as an example, the system structure of DAS and the distributed recognition scheme are shown in Fig. 1. The hardware consists of three parts: the probe fiber, the optical signal demodulator of DAS, and the signal processing unit. A Φ-OTDR linearly demodulated by a 3 × 3 coupler [2] is used in the demodulator. The probe fiber is a spare core of the communication cable laid along the pipelines under the urban ground, and it is used to sense the ambient disturbing events that cause vibration. Each section of the fiber is equivalent to a sensor sampling node in space, and these distributed nodes cooperate to pick up the vibration signals along the whole line. The system originally returns a space-time signal matrix, which is defined as

$$XX = \{x_{ts}\}, \quad t = 1, 2, \ldots, T; \; s = 1, 2, \ldots, S \tag{1}$$

In the matrix in (1), the row index t represents time and T is the time length; the column index s denotes the spatial sampling node and S is the spatial width. The spatial interval between two adjacent nodes is ΔS, and the temporal interval is ΔT = 1/f_s, in which f_s is the sampling frequency. One column of data represents the temporal signal collected at one sampling node, which is the basis of event recognition in most of the related work. Actually, any machinery excavation or manual operation influences not just a single column but several columns within a range. Most of the present work identifies the temporal signal at each node separately and ignores the internal spatial correlation among the signals at different nodes. Thus in this paper the acquisition matrix XX is segmented into small frames of event-centric signal matrices, each of which is taken as a space-time recognition area. The signal matrix in the recognition area is then input into the recognition network shown in Fig. 1.
In Fig. 1, a customized hybrid model of 1DCNNs-BiLSTM is designed according to the specific space-time structure of the DAS sensing signals. It has three modules. First, there is only one signal per node, and the signal at each node is preprocessed and input into a separate 1D-CNN, which corresponds to one cell of the LSTM, to extract the local and global structural features. Then the extracted 1D-CNN feature vectors are fed in parallel into a bidirectional LSTM network to mine the spatial association among the signals at different nodes in the event frame. Finally, the extracted spatiotemporal feature sets are stacked and input to the fully connected layer of the whole network to identify the event type. In this way, both the spatial distribution characteristics and the temporal structure features of the vibration signals are fully utilized, while they are treated differently according to their different contributions.
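The segmentation of the acquisition matrix into recognition areas can be illustrated with a minimal sketch. It assumes fixed, non-overlapping windows of 25 adjacent nodes; the function name `segment_frames` and the toy matrix are hypothetical, and the event-centering step described above is not modeled.

```python
def segment_frames(matrix, width=25):
    """Split a T x S space-time matrix (a list of rows) into frames of
    `width` adjacent spatial nodes. Non-overlapping windows are an
    assumption of this sketch, not a detail stated in the paper."""
    if not matrix:
        return []
    n_nodes = len(matrix[0])
    frames = []
    for start in range(0, n_nodes - width + 1, width):
        # keep every time row, but only the columns of this window
        frames.append([row[start:start + width] for row in matrix])
    return frames

# Toy acquisition matrix: T = 4 time samples, S = 50 spatial nodes.
toy = [[t * 100 + s for s in range(50)] for t in range(4)]
frames = segment_frames(toy, width=25)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Each returned frame keeps the full time axis, so one frame corresponds to one space-time sample fed to the recognition network.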

B. TEMPORAL FEATURE EXTRACTION WITH 1D-CNNs
Various experimental results show that a CNN can effectively extract structural features from complicated 1-D speech or sensory signals [12]. In the application of DAS, considering that the information in the spatial dimension is not as rich as that in the temporal dimension, an array of 1D-CNNs, rather than a direct two-dimensional CNN (2D-CNN), is adopted to extract the structural features of the signal at every spatial node in the recognition area. Besides, the 1D-CNN can extract the temporal feature well with fewer network parameters [12], which improves the detection speed of the model and helps prevent over-fitting. Thus several identical 1D-CNNs are combined in parallel to form a 1D-CNN array, denoted as 1D-CNNs, in which each 1D-CNN is responsible for the temporal feature extraction of the signal at one spatial node. The detailed structure of each 1D-CNN in the model is shown in Fig. 2. It consists of four convolution blocks, and each convolution block is composed of a convolution layer, a pooling layer, and a ReLU activation function, which form a Convolution-Pooling-ReLU structure. To alleviate the internal covariate shift and increase the feature extraction ability, a Batch Normalization (BN) layer [36] is added after the output of each convolution block. Through the network, each column is processed and transformed into a deep feature vector, which is called the 1D-CNN feature in this paper. When all the columns of data in the frame have been processed, the extracted 1D-CNN feature vectors are stitched into a feature sequence in the actual spatial order.
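To make the Convolution-Pooling-ReLU ordering concrete, a single-channel block can be sketched as follows. The kernel, stride, and pooling size used here are illustrative assumptions (the actual values are those of Table 3), and the Batch Normalization layer and the multi-channel structure are omitted for brevity.

```python
def conv1d(signal, kernel, stride=1):
    """Valid 1-D convolution (cross-correlation, as in most CNN libraries)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def max_pool1d(signal, size=2):
    """Non-overlapping max pooling."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

def relu(signal):
    return [max(0.0, v) for v in signal]

def conv_block(signal, kernel, stride=1, pool=2):
    """One block in the order Convolution -> Pooling -> ReLU."""
    return relu(max_pool1d(conv1d(signal, kernel, stride), pool))

# Toy temporal signal at one node and a hypothetical difference kernel.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernel = [1.0, 0.0, -1.0]
features = conv_block(signal, kernel)
print(features)
```

Stacking four such blocks, with many kernels per layer, would turn the temporal signal of one node into the deep feature vector that the text calls the 1D-CNN feature.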

C. SPATIAL CONNECTION MINING WITH BiLSTM
In the second step, a bidirectional Long Short-Term Memory (BiLSTM) network is designed to further extract the spatial connection among the deep structural feature vectors learned by the CNNs. The LSTM is a variant of the Recurrent Neural Network (RNN). It mainly addresses the long-term memory problem of the traditional RNN by retaining only the useful information.
Here we regard the spatial association of the signal nodes as a kind of sequence relationship, and use a bidirectional LSTM to extract the contextual connection of the spatial distribution. The typical structure of the BiLSTM is shown in Fig. 3. A one-layer bidirectional LSTM is used, including a forward and a backward layer, to interpret the spatial relationship from left to right and from right to left, respectively. In the BiLSTM network, each LSTM cell is designed for one spatial node. The structural details of each LSTM cell are also illustrated in the upper right corner of Fig. 3. The calculation in each LSTM cell is given in (2) to (7), and the symbols involved are described in Table 1.
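Equations (2) to (7) and Table 1 are not reproduced in this text. As a reference, the standard LSTM cell formulation, ordered to match the description that follows, can be written as below. The weight matrices W and U, the biases b, the candidate state symbol, and the element-wise product symbol are assumed notation and may differ from the paper's Table 1.

```latex
\begin{align}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{2}\\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{3}\\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{4}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{5}\\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{6}\\
h_t &= o_t \odot \tanh\left(c_t\right) \tag{7}
\end{align}
```

Here the index t runs over the spatial nodes of the frame rather than over time, since each LSTM cell is assigned to one node.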
In the BiLSTM network, with the deep CNN feature at each spatial node as its input, the learning process of each LSTM cell is controlled by three gates: the input gate i_t, the forget gate f_t, and the output gate o_t. In (2) and (3), the input x_t and the output state of the previous cell h_{t-1} are transformed into the input gate and the candidate memory state of the present cell through the sigmoid function σ and the hyperbolic tangent function tanh, respectively. In (4), the forget gate f_t is formed through the sigmoid function from the input x_t and the output state of the previous cell h_{t-1}, which determines how much information of the previous cell is retained or forgotten in the present cell. In (5), the state of the present cell c_t is updated by adding the information of the previous cell retained by the forget gate and the input information passed by the input gate. In (6), the output gate of the present cell is obtained by fusing the output state of the previous cell and the input through a sigmoid function. In (7), the output state of the present cell is finally obtained by passing the updated memory state of the present cell through the output gate. The output state of the cell is taken as the extracted feature in the BiLSTM network. In the forward direction, the 2nd cell contains the information of the 1st cell, the 3rd cell contains the information of the 1st and 2nd cells, and the last cell contains the information of all the previous cells. The backward LSTM works in the same way in the opposite direction. The two output features of the bidirectional LSTM are concatenated to form a new feature vector, which contains the bidirectional information of all the cells in the network. Then the outputs of all the cells are merged by averaging, and the merged feature vector, which finally contains all the spatiotemporal information, is prepared for the following recognition.
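The forward pass, backward pass, concatenation, and average merging described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the weights are random and untrained, the sizes are toy values (the paper uses 25 nodes and 256 hidden states), and all function names are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [sum(w * xi for w, xi in zip(W[k], x)) +
            sum(u * hi for u, hi in zip(U[k], h)) + b[k]
            for k in range(len(b))]

def lstm_cell(p, x, h_prev, c_prev):
    """One LSTM step: input gate, candidate, forget gate, state, output gate."""
    i = [sigmoid(v) for v in affine(*p["i"], x, h_prev)]    # input gate
    g = [math.tanh(v) for v in affine(*p["c"], x, h_prev)]  # candidate memory
    f = [sigmoid(v) for v in affine(*p["f"], x, h_prev)]    # forget gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    o = [sigmoid(v) for v in affine(*p["o"], x, h_prev)]    # output gate
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def run_lstm(p, seq, hidden):
    """Run one direction over the node sequence, collecting every output state."""
    h, c, outputs = [0.0] * hidden, [0.0] * hidden, []
    for x in seq:
        h, c = lstm_cell(p, x, h, c)
        outputs.append(h)
    return outputs

def init_params(n_in, hidden, rng):
    """Random (untrained) weights for the four gates; for illustration only."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {g: (mat(hidden, n_in), mat(hidden, hidden), [0.0] * hidden)
            for g in "icfo"}

def bilstm_average(seq, p_fwd, p_bwd, hidden):
    fwd = run_lstm(p_fwd, seq, hidden)
    bwd = run_lstm(p_bwd, seq[::-1], hidden)[::-1]  # re-align to node order
    merged = [f + b for f, b in zip(fwd, bwd)]      # concatenate per node
    n = len(merged)
    return [sum(col) / n for col in zip(*merged)]   # average over all nodes

rng = random.Random(0)
nodes, n_in, hidden = 5, 4, 3  # toy sizes, not the paper's configuration
seq = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(nodes)]
feature = bilstm_average(seq, init_params(n_in, hidden, rng),
                         init_params(n_in, hidden, rng), hidden)
print(len(feature))
```

The merged vector has twice the hidden size because the forward and backward states are concatenated before averaging, which matches the 1 * 512 input of the FC layer when 256 hidden states are used per direction.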

D. IDENTIFICATION
In the last step, a fully connected (FC) layer following the 1DCNNs-BiLSTM network is used to identify the event type, and a dropout layer follows the FC layer to avoid over-fitting. At each iteration of the training process, the dropout layer randomly discards a certain proportion of the units in the FC layer, which is equivalent to training a new network each time.
Thus the model robustness can be improved and over-fitting can be prevented. Finally, the flowchart of the proposed 1DCNNs-BiLSTM algorithm is shown in Fig. 4, and its structural parameters are detailed in Table 3 in Section IV. The event-centered space-time samples acquired by DAS are prepared to construct a database of typical events. The 1DCNNs-BiLSTM network is first trained offline with the database; when it reaches its optimal state, the well-trained network is used for online monitoring and identification. Cross entropy is used as the loss function to train the whole 1DCNNs-BiLSTM network, which is calculated as in (8). In (8), N is the batch size, that is, the number of samples in the data batch used for training, y is the true label of the sample, and a is the predicted output for the sample. The obtained cross entropy L represents the difference between the true class and the predicted class. In the training process, the value of the loss function decreases iteratively until the difference vanishes or the model converges.
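Equation (8) itself does not appear in this text. A common form that is consistent with the symbols defined above and with a five-class softmax output is the categorical cross entropy below. The class index k, the number of classes K, and the one-hot encoding of y are assumptions of this sketch, and the paper's exact expression may differ.

```latex
L = -\frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K} y_{n,k}\,\ln a_{n,k}
```

With one-hot labels, only the term of the true class contributes for each sample, so L approaches zero as the predicted probability of the true class approaches one.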

IV. EXPERIMENTS

A. DATA PREPARATION AND MODEL CONSTRUCTION
We applied the proposed algorithm in a field test for the safety monitoring of an underground communication cable in Wuhan, Hubei province, in April 2019. The monitored cable is about 40 km long and buried 0.8-1.5 m deep under the urban ground. One spare core of the optical fiber communication cable is taken as the probe fiber. The temporal sampling rate of the system is 500 Hz, and its spatial sampling interval is 5.16 m. In the test, we use the DAS system shown in Fig. 1 to collect five types of typical events: background with no threats, traveling traffic flow, machine excavator operation, road breaker operation, and manual digging, which are labeled as types 1, 2, 3, 4, and 5, respectively. The spatiotemporal signal sample of each event is composed of 25 adjacent spatial points (∼125 m) in space and a collection time of 30 seconds. Fig. 5 and Fig. 6 show the field scenarios and the corresponding samples of the five events of common interest. As shown in Fig. 6, besides the difference in the fluctuation of the temporal signal, the spatial distribution modes of different events are also somewhat different.
As can be seen, the background with no threats has a wide, mild noise distribution in the space-time matrix. The spatial distribution of the traffic interference on main traffic arteries is disorderly, and its long-term evolution over time is consistent with the regular life of the urban population. The signal samples of excavator digging and road breaker construction are similar from the perspective of the space-time matrix: in both, the signal fluctuation amplitude is relatively strong, and the influenced spatial range is more concentrated than that of traffic interference but wider than that of most manual operations as in Fig. 6(e). The influence range of manual digging is the narrowest, and its signal strength is weaker on the whole but occasionally strong locally. The field database prepared for training and testing is detailed in Table 2. The structural parameters of the proposed 1DCNNs-BiLSTM model in the test are specified in Table 3. Four convolution layers are used in the CNN, and a forward and a backward 25 * 256 LSTM are used simultaneously in the BiLSTM network. In the first two convolution layers, a 1 * 25 filter kernel is used to extract the local detailed structure of more or less one second of the signal, while in the last two convolution layers, a 1 * 5 filter kernel is used to extract larger-scale global structure, such as the signal trends. The filter size may differ for signals with different sampling rates and has to be optimized through parameter adjustment. The kernel size, stride, and padding and the input size of each layer are also detailed in Table 3. In the forward and backward 25 * 256 LSTMs, the 25 LSTM cells stand for the spatial nodes in the sample, and the 256 hidden states accept the 256-channel 1D-CNN feature. Finally, a 1 * 512 FC layer is used to stitch the outputs of the two LSTMs, and a 1 * 5 softmax classifier is used to classify the five event types.
The configuration of the CNN and the BiLSTM in Table 3 has been optimized. For example, the one-layer BiLSTM was tested to perform better than the two-layer LSTM structure. The convolution layers, the convolution kernel size, the pooling size, and the ReLU activation function in the CNN were all optimized through experiments. Table 3 therefore contains the optimized configuration and parameters of the 1DCNNs-BiLSTM network for the DAS application in this paper.
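The sizes quoted in this subsection can be checked with a short shape walk-through. The sketch below only does bookkeeping on the numbers given in the text; it assumes that each 1D-CNN reduces its node signal to a single 256-element vector, which is an interpretation of the "25 * 256" description rather than a detail confirmed by Table 3.

```python
FS_HZ = 500             # temporal sampling rate
DURATION_S = 30         # collection time of one sample
NODES = 25              # adjacent spatial points per sample
NODE_SPACING_M = 5.16   # spatial sampling interval
HIDDEN = 256            # hidden states per LSTM direction
CLASSES = 5             # event types

time_points = FS_HZ * DURATION_S         # rows of one space-time sample
sample_shape = (time_points, NODES)      # input matrix of the network
span_m = (NODES - 1) * NODE_SPACING_M    # distance from first to last node

cnn_feature_shape = (NODES, HIDDEN)      # one 256-element feature per node (assumed)
bilstm_shape = (NODES, 2 * HIDDEN)       # forward and backward states concatenated
merged_shape = (2 * HIDDEN,)             # after averaging over the 25 cells
output_shape = (CLASSES,)                # softmax probabilities

print(sample_shape, round(span_m, 2), merged_shape, output_shape)
```

Under these assumptions one sample is a 15000 x 25 matrix covering roughly 124 m between its first and last node, in line with the ∼125 m stated above, and the merged feature has 512 elements, matching the 1 * 512 FC layer.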

B. FEATURE VISUALIZATION
As stated in the related work, the 1-D CNN and the 2-D CNN are the deep learning networks frequently used at present for recognizing the one-dimensional DAS signal at each fiber location [12]-[15], and the 1-D CNN proves better than the others, including the traditional machine learning methods with fixed hand-crafted features. In this paper, we mainly compare the proposed method with networks that can consider both the spatial and the temporal information rather than the temporal information only. We therefore take the 1-D CNN as the first method for comparison, which represents all the methods that consider only the temporal feature of the signal at each sampling node. The other three are all spatiotemporal information extraction and identification networks that could be used in DAS. The 1st method is denoted as 1DCNNs, in which a four-layer 1D-CNN is specified as in Table 3, and all the CNN feature vectors obtained in the spatial order are stitched together and input directly into an FC layer for classification. The 2nd one is denoted as 1DCNNs-CNN, in which the feature extraction process is the same as in the 1st one, but in the classification step another one-layer 1D-CNN with 1 * 3 kernels in 256 channels is used to mine the sequence relationship of the spatially stitched 1D-CNN features. The 3rd one is the 2D-CNN, in which the spatiotemporal sample is taken as a 2-D image, and a 2D-CNN is directly used to extract the contour pattern along the horizontal and vertical axes. The last one is the proposed 1DCNNs-BiLSTM method, in which the four-layer 1D-CNN is the same as that in the first two methods, and the 1D-CNN feature obtained at each node is taken as the input to the corresponding LSTM cell, so that the spatial distribution characteristics among the signals are extracted in the BiLSTM network.
To observe the event distinguishability of the above four methods, the feature vectors learned by each of them from the database in Table 2 are visualized in Fig. 7. The high-dimensional feature obtained by each well-trained network is reduced to a three-dimensional space through the Linear Discriminant Analysis (LDA) algorithm. The four plots in Fig. 7 show that the visualized features of the five events can basically be distinguished for all four methods, which indicates that all four learning methods have reached their optimal states in this test. In more detail, the spatiotemporal features extracted by the 1DCNNs-CNN in Fig. 7(b) and by the proposed 1DCNNs-BiLSTM in Fig. 7(d) have larger classification distances than those of the other two methods. The classification performance can now be compared further.

C. RECOGNITION PERFORMANCE
The classification performance of the four methods is compared in more detail in this section. First, the convergence of the four algorithms in the training process is compared in Fig. 8. It shows that the 1DCNNs has the slowest convergence, at 9 epochs, and the lowest accuracy of ∼93%, while the proposed 1DCNNs-BiLSTM has the fastest convergence, at 4 epochs, and the highest accuracy of ∼97% in this test; the convergence speed and recognition accuracy of the 1DCNNs-CNN and the 2D-CNN both lie in between.
In the testing process, the confusion matrices of the four methods are obtained as in Fig. 9 by using the testing set in Table 2. From Fig. 9, the performance indices of Precision, Recall, F1-score, and Accuracy of each event are computed for all four methods and compared in Table 4. Among the four methods, the proposed 1DCNNs-BiLSTM has the best recognition performance indices for each type of event and the best average F1-score overall. The 1DCNNs-CNN is in second place, ahead of the 2D-CNN method. The 1DCNNs, in which only the temporal structural features are used, is the worst.
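As a reminder of how such indices follow from a confusion matrix, the sketch below computes per-class Precision, Recall, and F1-score together with the overall accuracy. The 3 x 3 matrix is a made-up toy example, not data from Fig. 9 or Table 4, and the row/column convention (rows are true classes) is an assumption.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return metrics, accuracy

# Toy confusion matrix with three classes (illustrative values only).
toy_cm = [[8, 2, 0],
          [1, 9, 0],
          [0, 0, 10]]
metrics, accuracy = per_class_metrics(toy_cm)
print(metrics[0], accuracy)
```

Averaging the per-class F1 values gives the average F1-score used to rank the methods.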
First, this preliminarily shows that the spatiotemporal features carry more distinguishable information than the temporal structural features alone. Second, as stated above, the detailed temporal features and the spatial distribution make clearly different contributions to DAS signal recognition. The spatial information mainly reflects the difference in the distribution trend, while the temporal information contains much more detailed structural differences for different types of operation processes. The spatial information is actually not as rich as the temporal information, so the two cannot be treated in the same way. This is why the 2-D CNN performs worse than the proposed 1DCNNs-BiLSTM and the 1DCNNs-CNN networks. It is well known that the CNN has unique advantages in extracting detailed features, and the LSTM has advantages in mining the relationships in sequences. In this paper, we use the CNN to extract the detailed temporal structure difference and the LSTM to mine the spatial relationship among different signal nodes, and thus treat the information in the two dimensions in different ways. The test results on the field data show that this is reasonable. By taking this difference into account, the proposed 1DCNNs-BiLSTM finally performs better than the 1DCNNs-CNN network in this application.
On the other hand, recognition of the excavator operation is the most challenging, because the action is sometimes strong and sometimes weak, and its signal and affected area change randomly. This is why the accuracy for the label 3 data is significantly lower with the 1DCNNs, which is based only on the temporal feature, and with the 2D-CNN, which is based on the two-dimensional contour pattern. Furthermore, a ten-fold cross validation is carried out on the database in Table 2 to verify the stability of the four models, and the results are compared in Fig. 10. The four lines with different marks stand for the recognition accuracy obtained by the four methods in 10 random tests, and the four dotted lines represent the average accuracy of each method. In the ten-fold cross validation, the dataset in Table 2 is used and the ratio of the training set to the testing set is set to 8:2 in each test. The results in Fig. 10 show that the proposed 1DCNNs-BiLSTM always has better recognition behavior, and its average accuracy over the ten tests reaches 97.2%, which is significantly higher than those of the other three methods. The 1DCNNs-CNN is in second place and behaves slightly better than the 2D-CNN, with average accuracies of 95% and 94.2%, respectively. The 1DCNNs still performs the worst of the four methods, with an average accuracy of only 92.5% over the ten tests. This shows that the proposed 1DCNNs-BiLSTM network consistently performs the best in this field data test for DAS, which suggests that the 1DCNNs-BiLSTM has the best learning performance for spatiotemporal information extraction in this application.
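The validation protocol can be sketched as follows. The text describes ten random tests with an 8:2 ratio of training to testing data, so this sketch reads the procedure as ten repeated random splits; that reading, the function name `repeated_holdout`, and the sample count of 100 are assumptions, not details confirmed by the paper.

```python
import random

def repeated_holdout(n_samples, n_repeats=10, train_ratio=0.8, seed=0):
    """Return n_repeats random (train_indices, test_indices) pairs."""
    rng = random.Random(seed)
    cut = int(round(n_samples * train_ratio))
    splits = []
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)                       # new random order for each test
        splits.append((idx[:cut], idx[cut:]))  # 8:2 split of the shuffled indices
    return splits

splits = repeated_holdout(100)
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Training and evaluating each model on every split, then averaging the ten accuracies, would yield the dotted average lines described for Fig. 10.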

D. RECOGNITION SPEED DISCUSSION
Considering the requirements of practical applications, the computation speeds of the four methods are also compared in Fig. 11. For each test sample, the average test time of the four methods is given in Fig. 11(a); for the whole line of 40 km, the overall time of the four methods is compared in Fig. 11(b). The results show that the 1DCNNs is the fastest because it contains only the CNN; the proposed 1DCNNs-BiLSTM is the slowest even though it has the best recognition performance, and its test time is about four times that of the 1DCNNs; the 1DCNNs-CNN is the second fastest method and the 2D-CNN is the third. The BiLSTM takes more time because it is a more complicated network. Thus in this paper, a basic one-layer BiLSTM, rather than a two-layer structure as in [34], is used to improve the computation efficiency, which also avoids possible over-fitting in the two-layer BiLSTM and improves the recognition performance of the whole model. Besides, the average merging operation for the 256 hidden states is designed in the BiLSTM to improve the computation speed. In fact, the input sample of the 1DCNNs-BiLSTM network is a space-time matrix with a certain width rather than a single temporal signal. In this field application, the monitored cable is about 40 km long and the spatial resolution of DAS is 5 m, which means there are 8000 spatial points in total along the whole line. Every 25 spatial points are taken as one sample, so there are at most 320 space-time samples to be identified in total. The recognition for a line of 40 km takes about 1.59 s with the proposed method, which is much less than the signal collection time of 30 seconds for the whole line in this case.
In this way, the processing speed of the proposed method can still keep up with the data collection rate, and the method can be used online in this application. In other applications, the tradeoff between recognition rate and computation speed needs to be weighed to choose a proper algorithm. For example, the 1DCNNs-CNN achieves better recognition than the 1DCNNs and the 2D-CNN, and it has better computation efficiency than the proposed 1DCNNs-BiLSTM.
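The real-time argument above rests on simple arithmetic, which the following sketch reproduces from the figures quoted in the text. The per-sample time is derived here from the 1.59 s total and is only an implied average; it is not a value read from Fig. 11(a).

```python
CABLE_LENGTH_M = 40_000
SPATIAL_RESOLUTION_M = 5
NODES_PER_SAMPLE = 25
COLLECTION_TIME_S = 30.0
RECOGNITION_TIME_S = 1.59  # reported for the proposed method over the whole line

spatial_points = CABLE_LENGTH_M // SPATIAL_RESOLUTION_M
samples_per_line = spatial_points // NODES_PER_SAMPLE
time_per_sample_ms = 1000 * RECOGNITION_TIME_S / samples_per_line
realtime_margin = COLLECTION_TIME_S / RECOGNITION_TIME_S

print(spatial_points, samples_per_line,
      round(time_per_sample_ms, 2), round(realtime_margin, 1))
```

With these numbers the recognition of one 30-second acquisition finishes in a small fraction of the time needed to collect the next one, which is the basis for the online-use claim.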

V. CONCLUSION
In this paper, a novel deep-learning model based on 1DCNNs-BiLSTM is proposed to automatically and accurately extract both the temporal structure feature and the spatial association characteristics for identifying distributed DAS sensing signals. It further improves the recognition rate compared with the frequently used models based on temporal feature extraction only with the 1D-CNN and on simultaneous space-time feature learning with the 1DCNNs-CNN and the 2D-CNN. Moreover, the real-time computation efficiency of the proposed method is discussed and compared with that of the others, and the method can be used online. The field test results show that the proposed 1DCNNs-BiLSTM is a promising method for DAS signal recognition in complicated practical environments. It not only mines the spatiotemporal information of the DAS signals at a deeper level, but also treats the two dimensions differently according to their different contributions.