An Optimized Multi-source Bilinear Convolutional Neural Network Model for Flame Image Identification of Coal Mine

Underground fire monitoring is an important tool for improving coal mine production safety. In this paper, a multi-source information identification method based on the bilinear convolutional neural network (B-CNN) is proposed, which consists of the construction of a multi-source image acquisition system, the B-CNN itself, and integrated decision making based on multi-source B-CNN. To address the problem that gradient-descent optimization of the Softmax loss function in B-CNN easily falls into a local optimum, an improved Grasshopper Optimization Algorithm (GOA) is proposed to optimally select the two parameters W and θ; it adopts an initial solution generation method based on sine mapping and a strategy of accepting a bad solution with a certain probability. To reduce the high computational complexity of model training and integrated recognition by multi-source B-CNN, an image feature preprocessing method is proposed. Several feature vectors covering the color, shape and texture features of the collected image are extracted and used as input vectors of B-CNN to complete model training and integrated recognition. In the simulation experiments, four Benchmark functions are first used to verify the performance of the improved GOA; then, the image is scaled, expanded and rotated to simulate image acquisition at multiple positions and angles, so that different information sources are formed for integrated recognition by B-CNN. Three performance indexes, Accuracy, Precision and Recall, are used to evaluate the simulation results of the different comparative models, which show that the proposed method achieves better recognition performance.


I. INTRODUCTION
In underground coal mine fire disasters, spontaneous combustion fire is the focus of prevention and control; it is mainly caused by the oxidation and combustion of coal itself or by heat accumulation in other combustibles. In recent years, most coal mines have paid increasing attention to the construction of underground production monitoring systems. By installing various types of data acquisition equipment and building network transmission systems, production management personnel can grasp underground production dynamics and abnormal changes as quickly as possible. At present, underground coal mine safety monitoring mainly focuses on the following aspects: (1) Data acquisition and analysis based on advanced sensors. Laser gas analyzers, infrared thermometers, optical fiber temperature measurement and other devices are used to monitor the underground production environment. The data are transmitted to the ground monitoring center through the cable network, Internet of Things, Zigbee wireless network, etc.; the parameters are entered into a database for comparative analysis, and the results are then fed back to the underground production personnel.
(2) Production disaster prediction based on multiple parameters. For production disasters such as fire, on the basis of mechanism analysis and production technology, several easily measured production parameters are used directly as auxiliary variables, and linear or nonlinear mathematical models are then established to predict different types of production disasters.
(3) Underground monitoring based on image processing. Video monitoring equipment is installed in key areas such as the coal conveyor belt, electrical equipment and personnel operation positions, to record the working status of the production system and the movement of staff. Image processing algorithms are then used to reduce noise and enhance the image, so as to improve its readability and give a better grasp of underground working conditions. At present, the combination of image processing and machine learning for underground coal mine fire monitoring is attracting more and more attention from researchers. Images contain rich information and have good continuity, and image processing methods are low in cost and easy to operate. Automatic recognition of flame images can be realized by training a machine learning model on enough samples. In recent years, support vector machine (SVM) [1]-[7], feed-forward artificial neural network [8], wavelet neural network [9], fuzzy neural network [10]-[11], probabilistic neural network (PNN) [12], RBF neural network [13], BP neural network [14], extreme learning machine (ELM) [15], etc. have been applied to fire image recognition.
However, the generalization performance of SVM is poor and it is prone to over-fitting, and the prediction accuracy obtained with the selected kernel function is greatly affected by the data distribution; in RBF and TWSVM networks, poor generalization ability and over-fitting also occur because of the high-dimensional data mapping; ELM relies on an accurate choice of the number of hidden layer nodes, and an excessive number of hidden layer nodes increases the computational cost of the model and easily leads to over-fitting; the convergence rate of BP is slow and it easily falls into a local extremum; the selection of the wavelet function in the wavelet neural network is subjective and its prediction accuracy is greatly affected by data characteristics; the choice of the membership function in the fuzzy neural network is subjective and has a great influence on the prediction accuracy of the model. With the continuous development of deep learning technology, flame image recognition based on deep learning algorithms has attracted wide attention. Muhammad et al. [16] constructed an early fire detection framework by adding deep features of a convolutional neural network (CNN); Saeed et al. [17] proposed a CNN mixed model for early fire detection; Sharma et al. [15] mixed the two models of CNN and ELM to realize fire image detection; Pereira et al. [18] proposed an active fire detection method for satellite images, introducing a large-scale active fire detection data set and using different CNN architectures; Ayala et al. 
[19] proposed a new deep learning architecture for fire image recognition, which adopts inverted residual blocks, depth-wise convolutions and octave convolutions to reduce the computational cost of the model while maintaining high accuracy; Zhu and Ren [20] proposed a flame image recognition method based on deep learning and a particle algorithm, which introduced the RGB and HSI color systems to realize multi-feature fusion flame recognition; Sun et al. [21] proposed an improved CNN to realize the rapid detection of forest fire smoke; Cao et al. [22] proposed an attention enhanced bidirectional long short-term memory network (ABi-LSTM) for smoke identification of forest fires in video; Pundir and Raman [23] proposed a robust smoke detection method based on a dual deep learning framework, in which the first framework extracted image-based features from smoke patches and the second extracted motion-based features, which were then input into a CNN classifier to complete the classification; Yang et al. [24] proposed a neural network model combining a lightweight CNN and SRU, which could reduce the influence of strong interference, such as bright light flicker or a high-brightness background, on single-frame fire image recognition. The bilinear convolutional neural network (B-CNN) [25] is a deep learning model based on weakly supervised learning. Two convolutional neural networks are adopted, in which one network extracts the location information of images while the other extracts their appearance information. The classification effect of B-CNN is better than that of image classification methods based on image filtering. Zhou et al. [26] trained a deep convolutional neural network to analyze multidimensional emotional facial expressions, and bilinear pooling was used to encode second-order features, which showed that the B-CNN model had better performance; Dong et al. 
[27] proposed combining B-CNN and pair-wise difference pooling (PDP) for texture classification of fine-grained images, which could not only obtain the pairwise difference between two groups of features, but also encode the difference between each pair of features; Tang et al. [28] proposed a new spatial attention bilinear convolutional neural network (SA-BCNN) to detect defects in casting X-ray images, by combining a spatial attention mechanism with bilinear pooling. Although the research on B-CNN has achieved fruitful results, it can be further improved in optimizing the B-CNN model, reducing the complexity of the training process and carrying out multi-source information fusion.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
In this paper, B-CNN is used to establish an underground flame image recognition model, and an improved Grasshopper Optimization Algorithm (GOA) is proposed to optimize the parameters W and θ of the Softmax loss function in B-CNN; the color, shape and texture features of the flame image are used as input variables to train the model, so as to improve the processing speed of B-CNN. In addition, in order to reduce the influence of various interference factors on single-source information processing, this paper adopts a multi-source image acquisition method and a multi-source information integration decision method to realize coal mine flame image recognition. GOA is a novel swarm intelligence optimization algorithm [29]. 
Compared with Particle Swarm Optimization Algorithm (PSO), Pigeon-Inspired Optimization Algorithm (PIO), Grey Wolf Optimization Algorithm (GWO), Bat Algorithm (BA), Fruit Fly Optimization Algorithm (FOA), Brain Storm Optimization Algorithm (BSO), etc., it has the advantages of a simple structure, few parameters, a larger operation space, and so on. However, there is still room for improvement in the convergence accuracy and global search capability of GOA. Therefore, in this paper, its initial solution generation method and its global search capability are improved, and the improved algorithm is applied to optimally select the two parameters W and θ of the Softmax loss function in B-CNN.
The main contributions of this paper are summarized as follows: (1) An improved GOA is proposed to optimally select W and θ of the Softmax loss function, with better performance in initial solution generation and global search capability; (2) A multi-information source image acquisition system for the underground working face of a coal mine is designed, in which several image acquisition devices at different positions acquire multi-angle and multi-directional images; (3) A flame image recognition model based on the multi-information source bilinear convolutional neural network is proposed, which is formed by an image feature extraction unit, a model training unit, and a recognition and detection unit; (4) In order to decrease the high computational complexity in the B-CNN training and integrated recognition stages, an image feature extraction method based on preprocessing is proposed.
The remaining parts of this paper are arranged as follows: Section 2 introduces the research background of this paper, that is, the remote image transmission system in the underground coal mine and the multi-information source image acquisition system at the underground working face; Section 3 briefly introduces the bilinear convolutional neural network and the multi-information source image processing method based on it; Section 4 presents the existing problems and the proposed solutions, first giving the improved GOA and then introducing the extracted color, shape and texture feature vectors; Section 5 introduces the flame image recognition model based on the multi-information source bilinear convolutional neural network, including the image feature extraction unit, the model training unit and the recognition and detection unit; in Section 6, different simulation experiments are designed to verify the effectiveness of the proposed method; Section 7 summarizes the main contributions of this paper.

II. RESEARCH BACKGROUND
The image acquisition instrument is placed in a specific position underground to monitor objects in the target area. The mine remote image transmission system is mainly composed of the image acquisition instrument, the underground control microcomputer and other components, as shown in Fig. 1.
Early fire has weak characteristics, such as a locally high temperature, a small amount of smoke and a small flame. Because of the influence of distance, mine dust, ambient temperature, workers, incandescent lamps, and mechanical and electrical equipment, early fire images are not easy to identify, so existing methods based on single-source information processing have a high misidentification rate and false negative rate. Therefore, based on the structure of the remote image transmission system in the underground coal mine shown in Fig. 1, this paper constructs a multi-information source image acquisition system at the underground working face site, as shown in Fig. 2.
According to Fig. 2, a multi-information source image acquisition method is adopted in this paper to solve the problem that flame image features are easily interfered with by various complex factors, such as underground incandescent light, the activities of workers, and the lighting and heating of mechanical and electrical equipment. By setting up image acquisition devices in different positions, multi-angle and multi-directional image acquisition is carried out for the working face area, and the integrated identification of flame features is then carried out by the multi-source image information processing method. Compared with the single information source processing method, the multi-source method can obtain more useful information and reduce the influence of various interference factors as far as possible, so as to improve the accuracy of flame image recognition.

III. METHODOLOGY
In this paper, B-CNN is used for flame image recognition and processing. Firstly, a certain number of early flame images are collected to establish the training sample library. Then, the B-CNN based deep learning method is selected to design the operation mechanism, and the structure of the model is trained by the training sample library. Finally, the trained model is used to identify the enhanced images to determine whether the fire occurs.

A. BILINEAR CONVOLUTIONAL NEURAL NETWORK
The network architecture of B-CNN is shown in Fig. 3.
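The mathematical representation of B-CNN is as follows:

$$B = (f_A, f_B, P, C) \tag{1}$$

where $f_A$ is the feature function of the first convolutional network and $f_B$ is the feature function of the other convolutional network; $P$ represents the bilinear pooling function; $C$ represents the classification function. At each position $l$ of the input image $I$, the bilinear feature is the confluence of the outputs of the two feature functions:

$$\mathrm{bilinear}(l, I, f_A, f_B) = f_A(l, I)^{T} f_B(l, I) \tag{2}$$

Then, the pooling function $P$ is used to accumulate the bilinear features at all positions into one bilinear feature, and this feature is used to describe the features of the input image:

$$\Phi(I) = \sum_{l} \mathrm{bilinear}(l, I, f_A, f_B) \tag{3}$$

The pooled feature is vectorized and normalized:

$$x = \mathrm{vec}(\Phi(I)) \tag{4}$$

$$y = \mathrm{sign}(x)\sqrt{|x|} \tag{5}$$

$$z = y / \lVert y \rVert_2 \tag{6}$$

The calculation results of (6) are taken as the final representation of image features, and Softmax is used to classify images.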
During the training of the bilinear convolutional neural network, the two convolutional branch networks can be trained at the same time. The gradient of the pooling layer can be calculated by using the chain rule, and the error is then propagated back to obtain the gradient update. Assume that, for each position $l$, the outputs of the feature extraction functions $f_A$ and $f_B$ are $f_1$ and $f_2$ respectively; then the bilinear feature at $l$ is $x = f_1^{T} f_2$. Let $\mathrm{d}\ell/\mathrm{d}x$ denote the gradient of the loss function $\ell$ at $x$. The gradients of the loss function with respect to the outputs of the two networks A and B are obtained by the chain rule:

$$\frac{\mathrm{d}\ell}{\mathrm{d}f_1} = f_2 \left(\frac{\mathrm{d}\ell}{\mathrm{d}x}\right)^{T}, \qquad \frac{\mathrm{d}\ell}{\mathrm{d}f_2} = f_1 \frac{\mathrm{d}\ell}{\mathrm{d}x} \tag{7}$$
To sum up, the function of network A is to locate the object, and network B is used to extract the features of the object detected by network A. The two networks coordinate with each other to accomplish the tasks of region detection and feature extraction.
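The bilinear pooling and normalization described in this subsection can be illustrated with a minimal NumPy sketch. It assumes the feature maps of networks A and B have already been computed and flattened to one row per position; the function name `bilinear_feature` and the array shapes are illustrative choices, not part of the paper.

```python
import numpy as np

def bilinear_feature(feat_a, feat_b):
    """Pool bilinear features over all positions and normalize them.

    feat_a: (L, Ca) outputs of network A at L positions (assumed layout)
    feat_b: (L, Cb) outputs of network B at the same L positions
    """
    # Sum over positions of the outer products f_A(l)^T f_B(l): shape (Ca, Cb)
    phi = feat_a.T @ feat_b
    x = phi.reshape(-1)                    # vectorization
    y = np.sign(x) * np.sqrt(np.abs(x))    # signed square root
    norm = np.linalg.norm(y)
    return y / norm if norm > 0 else y     # L2 normalization

rng = np.random.default_rng(0)
z = bilinear_feature(rng.normal(size=(49, 8)), rng.normal(size=(49, 16)))
print(z.shape)
```

The matrix product `feat_a.T @ feat_b` is equivalent to summing the per-position outer products, which is why no explicit loop over positions is needed.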

B. MULTI-INFORMATION SOURCE IMAGE PROCESSING BASED ON B-CNN
According to the multi-information source image acquisition method in the underground coal mine shown in Fig. 2, combined with the bilinear convolutional neural network shown in Fig. 3, the system structure of the multi-information source bilinear convolutional neural network constructed in this paper is shown in Fig. 4.

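Fig. 4. System structure of the multi-information source B-CNN (several B-CNN branches, followed by integrated decision making and the output).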

A. PROBLEM DESCRIPTION
According to the multi-information source B-CNN system shown in Fig. 4, in the model training stage, a certain number of flame images are input into the B-CNN respectively. Firstly, feature extraction and pooling calculation are completed respectively. Then, the bilinear feature after vectorization is normalized to obtain the image feature vector. Finally, the model is trained by the image feature vector and the defined image classification label.
The trained multi-information source B-CNN system is then used for flame image recognition. Several flame images collected at different positions underground are input into the trained B-CNN respectively, and two-channel feature extraction and pooling calculation are completed for each of them. The image feature vectors are obtained by normalization and are then used for classification by the B-CNN of the corresponding position. Finally, the final classification result is obtained by integrated decision making over the individual classification results, so as to determine whether a fire has occurred.
According to the above training and classification methods of the multi-information source bilinear convolutional neural network system, there are mainly two problems, as follows: (1) Problem 1: Parameter optimization in the B-CNN training. The Softmax loss function is used in B-CNN, and its output represents the probability that the input vector belongs to each category. For n samples, the training set is defined as {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, in which {x_1, x_2, …, x_n} represents the input vectors and {y_1, y_2, …, y_n} represents the training labels. When the input sample is x_i, its m estimated probabilities are expressed by (8), where W_m represents the parameters of the network model. In the process of model training, the gradient descent method is generally used to optimize the Softmax loss function, and the parameters are updated by (10), where t represents the number of iterations and λ represents the momentum factor.
However, because the model parameters W and θ in (10) are learned by gradient descent, the whole solution space cannot be traversed and the search is easily trapped in a local minimum, which reduces the fitting accuracy and generalization ability of the model.
(2) Problem 2: Computational complexity of the B-CNN model training. When a certain number of images are used to train the two-channel CNN in B-CNN, the two-channel feature extraction, pooling calculation and other operations bring high computational complexity. When the trained model is used for flame image recognition, the flame images collected at each position need to be input into the B-CNN respectively, and two-channel feature extraction and pooling calculation are carried out for each of them, which greatly increases the computational complexity of flame recognition. Such high computational complexity limits the application of the multi-information source B-CNN system to real-time identification of underground flame images.
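To make Problem 1 concrete, the following is a minimal sketch of Softmax classification trained by gradient descent with a momentum factor. The exact forms of (8) and (10) are not reproduced here; the sketch assumes the common linear-Softmax form with weights `W` and offsets `theta`, and a standard momentum update with factor `lam`. All function names and hyperparameter values are illustrative.

```python
import numpy as np

def softmax_probs(X, W, theta):
    """Estimated class probabilities for each row of X (assumed linear scores)."""
    scores = X @ W + theta
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grads(X, y, W, theta):
    """Cross-entropy (Softmax) loss and its gradients w.r.t. W and theta."""
    n = X.shape[0]
    P = softmax_probs(X, W, theta)
    loss = -np.log(P[np.arange(n), y]).mean()
    D = P.copy()
    D[np.arange(n), y] -= 1.0
    D /= n
    return loss, X.T @ D, D.sum(axis=0)

def train(X, y, m, lr=0.1, lam=0.9, iters=200):
    """Gradient descent with momentum factor lam (assumed update rule)."""
    W, theta = np.zeros((X.shape[1], m)), np.zeros(m)
    vW, vt = np.zeros_like(W), np.zeros_like(theta)
    losses = []
    for _ in range(iters):
        loss, gW, gt = loss_and_grads(X, y, W, theta)
        losses.append(loss)
        vW = lam * vW - lr * gW
        vt = lam * vt - lr * gt
        W, theta = W + vW, theta + vt
    return W, theta, losses

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
W, theta, losses = train(X, y, m=2)
print(losses[0], losses[-1])
```

Because each step only follows the local gradient from a single starting point, the result depends on the initialization, which is the limitation that motivates a population-based search over W and θ.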

B. SOLVING METHODS
In order to solve the above two problems, this paper intends to adopt the following solutions.

1) OPTIMIZATION OF MODEL PARAMETERS BASED ON AN IMPROVED GOA ALGORITHM
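The bionic principle of GOA is to map the small-range movement behavior of larvae to local exploitation with short step sizes, and the large-range movement behavior of adults to global exploration with long step sizes, so that optimization proceeds in a "stepping collaborative" manner. The mathematical model of GOA is expressed by

$$x_i^{t} = S_i^{t} + G_i^{t} + A_i^{t} \tag{11}$$

where $t$ represents the current number of iterations; $x_i$ represents the location of the ith grasshopper; $S_i$ represents the interaction between grasshoppers; $G_i$ represents the force of gravity; $A_i$ represents the force of wind.
The interaction $S_i^{t}$ between grasshoppers is defined by

$$S_i^{t} = \sum_{j=1,\, j \neq i}^{popsize} s\left(d_{ij}^{t}\right) \frac{x_j^{t} - x_i^{t}}{d_{ij}^{t}} \tag{12}$$

where $popsize$ represents the population size; $d_{ij}^{t}$ represents the distance between grasshoppers $i$ and $j$; $s(d_{ij}^{t})$ represents the influence function of the interaction force between a grasshopper and the other individuals in the population, defined by

$$s(r) = f e^{-r/l} - e^{-r} \tag{13}$$

where $f$ and $l$ represent the attraction intensity and attraction scale, respectively. The force of gravity $G_i$ and the force of wind $A_i$ are respectively defined by

$$G_i = -g\, e_g, \qquad A_i = u\, e_w \tag{14}$$

where $g$ represents the gravitational constant; $u$ represents the constant drift factor; $e_g$ and $e_w$ represent the unit vectors of the force of gravity and the force of wind on the grasshopper. Substituting (12)-(14) into (11), the updating formula for the grasshopper location is:

$$x_i^{t+1} = \sum_{j=1,\, j \neq i}^{popsize} s\left(\left|x_j^{t} - x_i^{t}\right|\right) \frac{x_j^{t} - x_i^{t}}{d_{ij}^{t}} - g\, e_g + u\, e_w \tag{15}$$

According to (15), after several iterations, the grasshopper population gathers in local comfort zones and loses population diversity. In this regard, following the standard GOA [29], the updating formula in (15) is changed to

$$x_i^{d,t+1} = c \left( \sum_{j=1,\, j \neq i}^{popsize} c\, \frac{ub_d - lb_d}{2}\, s\left(\left|x_j^{d,t} - x_i^{d,t}\right|\right) \frac{x_j^{t} - x_i^{t}}{d_{ij}^{t}} \right) + T_d \tag{16}$$

$$c = c_{max} - t\, \frac{c_{max} - c_{min}}{t_{max}} \tag{17}$$

where $ub_d$ and $lb_d$ represent the upper and lower bounds of the dth dimension; $T_d$ represents the dth dimension of the best solution found so far; $c_{max}$ and $c_{min}$ represent the maximum and minimum of $c$; $t_{max}$ represents the maximum number of iterations.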
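As a rough illustration of the GOA search loop described above, the following sketch implements the commonly published form of the algorithm: the influence function s(r) = f·exp(−r/l) − exp(−r), a linearly decreasing coefficient c, and attraction toward the best solution found so far. It is a sketch under these assumptions rather than the authors' implementation; the default values of `f`, `l`, `pop_size` and `t_max`, and the name `goa_minimize`, are illustrative.

```python
import numpy as np

def s(r, f=0.5, l=1.5):
    """Influence function between grasshoppers: f*exp(-r/l) - exp(-r)."""
    return f * np.exp(-r / l) - np.exp(-r)

def goa_minimize(fitness, lb, ub, dim, pop_size=20, t_max=50,
                 c_max=1.0, c_min=0.00004, seed=0):
    rng = np.random.default_rng(seed)
    X = lb + rng.random((pop_size, dim)) * (ub - lb)   # random initial population
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), fit.min()
    first_fit = best_fit
    for t in range(1, t_max + 1):
        c = c_max - t * (c_max - c_min) / t_max        # shrinking comfort zone
        X_new = np.empty_like(X)
        for i in range(pop_size):
            social = np.zeros(dim)
            for j in range(pop_size):
                if i == j:
                    continue
                diff = X[j] - X[i]
                dist = np.linalg.norm(diff) + 1e-12    # avoid division by zero
                social += c * (ub - lb) / 2.0 * s(np.abs(diff)) * diff / dist
            X_new[i] = np.clip(c * social + best, lb, ub)
        X = X_new
        fit = np.array([fitness(x) for x in X])
        if fit.min() < best_fit:                       # keep the best-so-far target
            best, best_fit = X[fit.argmin()].copy(), fit.min()
    return best, best_fit, first_fit

best, best_fit, first_fit = goa_minimize(lambda x: float(np.sum(x ** 2)),
                                         lb=-2.5, ub=2.5, dim=4)
print(first_fit, best_fit)
```

Since the target is only replaced when a better position is found, the returned fitness can never be worse than that of the best initial individual.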
In the traditional GOA, the initial solution of the grasshopper population is randomly generated. Although this improves the diversity of the population, an overly disordered random solution reduces the convergence speed of the algorithm. In this paper, an initial solution generation method based on sine mapping is adopted. The sine mapping is a classical chaotic mapping system [30], whose behavior is governed by a control parameter a.
Thus, the method of generating the initial solution in the improved GOA (IGOA) is as follows:
Step 1: The initial positions $X_i^{d,(0)}$ of all individuals in the grasshopper population are randomly generated.
Step 2: The initial positions $X_i^{d,(0)}$ are normalized.
Step 3: The sine chaotic mapping is applied to the normalized $X_i^{d,(0)}$, giving $X_i^{d,(1)}$, the population position after the sine chaos mapping.
Step 4: Inverse normalization is applied to $X_i^{d,(1)}$ to obtain the new ordered initial locations of the grasshopper population.
In addition, it can be seen from (16) that although the location updating method of the grasshopper population has been modified to improve population diversity, it still pays too much attention to searching around individual local locations, which makes the solution process easily fall into a local optimum. In this regard, in the IGOA algorithm, the location update mode of the grasshopper population is further improved on the basis of (16), as given in (23). According to (23), the worst solution is accepted with a small probability when individual positions in the grasshopper population are updated, which allows the search to jump out of a local optimal region to a certain extent and improves the diversity of the population more comprehensively.
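The four initialization steps and the probabilistic acceptance of the worst solution can be sketched as follows. Two assumptions are made that are not taken from the paper: the sine map is used in its common form z' = (a/4)·sin(πz) with 0 < a ≤ 4, and the worst solution simply replaces the best one as the attraction target with probability `ps`. The names `sine_map_init` and `choose_target` are hypothetical.

```python
import numpy as np

def sine_map_init(pop_size, dim, lb, ub, a=4.0, rng=None):
    """Initial population via random generation, normalization,
    sine chaotic mapping and inverse normalization (Steps 1 to 4)."""
    rng = np.random.default_rng() if rng is None else rng
    x0 = lb + rng.random((pop_size, dim)) * (ub - lb)   # Step 1: random positions
    z0 = (x0 - lb) / (ub - lb)                          # Step 2: normalize to [0, 1]
    z1 = (a / 4.0) * np.sin(np.pi * z0)                 # Step 3: assumed sine map
    return lb + z1 * (ub - lb)                          # Step 4: inverse normalization

def choose_target(best, worst, ps, rng):
    """With probability ps, use the worst solution as target (assumed rule)."""
    return worst if rng.random() < ps else best

rng = np.random.default_rng(1)
pop = sine_map_init(30, 4, lb=-2.5, ub=2.5, rng=rng)
print(pop.min(), pop.max())
```

For z in [0, 1] and a ≤ 4 the mapped value stays in [0, 1], so the inverse normalization keeps every individual inside the search bounds.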

2) INPUT PREPROCESSING BASED ON IMAGE FEATURE EXTRACTION
First, feature extraction is carried out on the collected images, and 19 feature vectors are extracted under the three types of color feature, shape feature and texture feature of the flame image. Then, these feature vectors are input into the B-CNN, and the two-channel feature extraction and pooling calculation are completed respectively. Finally, the model is trained and identified by the normalized image feature vector. In this way, the extracted digital image features are directly input into the B-CNN by prior knowledge, which can greatly reduce the computational complexity brought by image input. At the same time, the digital image features include color feature, shape feature and texture feature, which can basically reflect the flame feature information of the image.
The 19 feature vectors are extracted as follows:
(1) Proportion of RGB components. The R, G and B channels are used to calculate the proportions of the RGB components, which are taken as three feature vectors for flame image recognition. Here, W_R, W_G and W_B represent the proportions of the R, G and B components, respectively; fire_region represents the flame area after image segmentation; R(i,j), G(i,j) and B(i,j) represent the pixel values of the R, G and B channels in RGB space.
(2) Moment characteristics of color [31]. The first moment μ_i and the second moment σ_i of the flame pixels in the flame region are used to represent the color information of the flame, where p_{i,j} represents the probability of occurrence of pixel value j in color image channel i, and N represents the total number of pixels in the image.
(3) Circularity characteristic [32]-[33]. The circularity characteristic is computed from S and L, where S represents the area of the region where the flame pixels are located and L is the perimeter of that region.
(4) Rectangularity characteristic [32]-[33]. The rectangularity characteristic is defined by the ratio $S / S_R$, where $S_R$ represents the area of the smallest rectangle that contains the object.
(5) Barycentric height coefficient [32]-[33]. The barycentric height coefficient of the flame is defined by the ratio $H_C / H$, where $H_C$ represents the height of the center of mass of the flame pixel region and $H$ represents the total height of the flame pixel region.
(6) Texture feature. The texture of a burning flame is an important criterion for distinguishing flame from non-flame objects. The gray co-occurrence matrix of the flame pixel region is calculated first, and the texture features of the image are then calculated from it [34]-[35]. Define f(x,y)=i and f(x+∆x,y+∆y)=j as the gray levels of two related pixels, in which i, j = 0, 1, 2, …, L, and L represents the gray scale; x and y are the pixel coordinates, and ∆x and ∆y are the spatial offsets. Thus, the gray co-occurrence matrix counts the pixel pairs with these gray levels:

$$P(i, j \mid \Delta x, \Delta y) = \#\{(x, y) : f(x, y) = i,\ f(x + \Delta x, y + \Delta y) = j\}$$

The image feature extraction unit is responsible for the preprocessing and feature extraction of the input image, including image segmentation and feature extraction. Image segmentation segments the target region after image preprocessing; feature extraction extracts the feature vector of the image of the target region. The model training unit develops the optimal B-CNN network from the training set. The recognition and detection unit takes the trained B-CNN network as the feature classifier: the test set is input into the multi-information source B-CNN, and an integrated decision is made to obtain the final recognition result. The model training unit and the recognition and detection unit are described as follows.
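The feature extraction step of the image feature extraction unit can be illustrated with a short sketch covering a subset of the color and shape features. The exact formulas are assumptions: the RGB proportion is taken as each channel's sum over the flame region divided by the total of all three channels, the color moments as the per-channel mean and standard deviation, the smallest enclosing rectangle as the axis-aligned bounding box, and the barycentric height as measured from the bottom of the region. The function names are illustrative.

```python
import numpy as np

def color_features(img, mask):
    """RGB proportions and first/second color moments of the flame region.

    img:  (H, W, 3) RGB image; mask: (H, W) boolean flame region.
    """
    px = img[mask].astype(float)                 # (N, 3) flame pixels
    proportions = px.sum(axis=0) / px.sum()      # assumed W_R, W_G, W_B
    return proportions, px.mean(axis=0), px.std(axis=0)

def shape_features(mask):
    """Rectangularity S/S_R and barycentric height coefficient H_C/H."""
    rows, cols = np.nonzero(mask)
    area = rows.size                             # S: number of flame pixels
    height = rows.max() - rows.min() + 1         # H: total height of the region
    width = cols.max() - cols.min() + 1
    rectangularity = area / (height * width)     # bounding box used as S_R
    h_c = rows.max() + 0.5 - rows.mean()         # centroid height above the bottom
    return rectangularity, h_c / height

img = np.zeros((8, 8, 3), dtype=np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 1:7] = True
img[mask] = (200, 100, 100)
print(color_features(img, mask))
print(shape_features(mask))
```

Features of this kind are low-dimensional, which is what makes them cheaper to feed into the network than raw images.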

A. MODEL TRAINING UNIT
After the training set data is determined, B-CNN network is trained. For the training data, the extracted 19 image feature vectors are used as input vectors, and the corresponding classification labels are used as output vectors. Then convolution and pooling operations are carried out. After activation function calculation, the Softmax classifier is used for classification. The calculated output is compared with the actual output to get the prediction error and adjust the weight reversely. When the prediction error meets the constraint conditions, the training ends and the optimal model parameters are determined. The training process of the model training unit is shown in Fig. 6.

Fig. 6. Training process of the model training unit (flowchart: start; input training data; parameter initialization and setting of the initial weights).
According to Fig. 6, the whole training of the B-CNN model involves convolution and pooling operations, calculation of the activation function, classification by the Softmax classifier, etc. The network structure of the B-CNN model is shown in Fig. 7.
Fig. 7. Network structure of the B-CNN model (networks A and B, activation function, Softmax classification).
In Fig. 7, Conv represents the convolution layer, and Maxpool represents the maximum pooling layer. In the convolution operation, the convolution function realizes a feature mapping; in the pooling operation, the pooling function combines the features of all locations into a general feature. After convolution and pooling, all feature maps are connected and mapped to a one-dimensional vector, and the image features are then classified by the Softmax classifier.

B. RECOGNITION AND DETECTION UNIT
The trained B-CNN model is used for flame image recognition at different positions and angles. At the same time, a group of flame images are collected by image acquisition devices in different locations, and they are input into the trained B-CNN models respectively. The integrated decision is made on the recognition results of different models, and the final conclusion is obtained. The structure diagram of the recognition and detection unit based on multi-information source B-CNN model is shown in Fig. 8. According to Fig. 8, the workflow of recognition and detection unit is shown in Fig. 9.
In the multi-information source integrated identification stage, this paper adopts a "voting" strategy and uses an odd number of information sources, which produces an odd number of classification results. If the classification results are inconsistent, the result with more votes is taken as the final result.
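The voting strategy can be sketched in a few lines. The function name `integrated_decision` and the label encoding (1 for flame, 0 for non-flame) are illustrative assumptions.

```python
from collections import Counter

def integrated_decision(votes):
    """Majority vote over the classification results of an odd number of sources."""
    if len(votes) % 2 == 0:
        raise ValueError("an odd number of information sources is expected")
    return Counter(votes).most_common(1)[0][0]

# Three B-CNN branches: two report flame (1), one reports non-flame (0)
print(integrated_decision([1, 0, 1]))
```

With two classes and an odd number of votes a tie cannot occur, which is why the odd number of sources matters.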

A. PERFORMANCE VERIFICATION OF IGOA
Firstly, the optimization performance of the IGOA algorithm proposed in this paper is verified. Four Benchmark functions, f_1 to f_4, are used for the simulation experiments, including the Ackley function (f_1), the Sphere function (f_2), where x ∈ [-2.5, 2.5], and the Griewank function (f_3).
Six algorithms, FBH [37], IFS [38], IHS [39], LVCMFOA [40], GOA and IGOA, are used to optimize f_1-f_4, in which n is 4, the population size is 30 and the number of iterations is 1000. The values of the adjustable parameters in each algorithm are shown in Table 1.
Each algorithm is run 20 times, and the Best, Worst, Average and Std. values are recorded in Table 2.
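For reference, the three benchmark functions named in this section can be written in their commonly used textbook forms, shown below. Whether the paper uses exactly these constants and search ranges is an assumption, and the fourth function is not reproduced because it is not named in the text.

```python
import numpy as np

def ackley(x):
    """Ackley function (standard form); global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.mean(x ** 2)))
            - np.exp(np.mean(np.cos(2.0 * np.pi * x))) + 20.0 + np.e)

def sphere(x):
    """Sphere function; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def griewank(x):
    """Griewank function (standard form); global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return float(1.0 + np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))))

x0 = np.zeros(4)   # n = 4, as in the experiments
print(ackley(x0), sphere(x0), griewank(x0))
```

All three share a known global minimum of zero at the origin, so the Best, Worst and Average values of an optimizer can be read directly as distances from the optimum.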

VOLUME XX, 2017
As can be seen from Table 2, for the Ackley function (f_1), IHS and LVCMFOA have poor optimization performance, while among FBH, IFS, GOA and IGOA, FBH and IGOA perform better, with IGOA performing best. In addition, as the standard deviation (Std.) reflects the stability of the optimization results over multiple runs, it can be seen that IFS and GOA have poor stability, while IGOA is significantly more stable than the other methods. For the Sphere function (f_2), compared with FBH, IFS, IHS and LVCMFOA, both GOA and IGOA achieve clearly higher optimization accuracy; of the two, however, GOA has poor stability, whereas IGOA has advantages in both optimization accuracy and stability. For the Griewank function (f_3), among the four methods of FBH, IFS, IHS and LVCMFOA …

TABLE 1
ADJUSTABLE PARAMETERS IN EACH ALGORITHM
Method: Parameter settings
IFS: The small search step is randomly given in [15,25]; the search radius is randomly given in [0.1,0.5].
IHS: The harmony number is randomly given in [20,40]; the retention probability is randomly given in [0.5,0.95]; the memory disturbance probability is randomly given in [0.05,0.5]; the minimum bandwidth is 0.0005; the maximum bandwidth is 1.
GOA: f is randomly given in [0.1,0.9]; l is randomly given in [1,2]; cmax=1; cmin=0.00004.
IGOA: a is randomly given in [1,5]; f is randomly given in [0.1,0.9]; l is randomly given in [1,2]; cmax=1; cmin=0.00004; ps is randomly given in [0,0.3].

1) FEATURE EXTRACTION
First, the target region of the image is marked and then segmented. Finally, features are extracted from the segmented target region to form the training set and test set. This paper adopts an RGB-HSV-YUV mixed color model for image segmentation, where the RGB color rules [41] are as follows: Rule 1: … Rule 3: … where R, G and B represent the red, green and blue components of each pixel in the image; S represents the color saturation of the image; R_T is a threshold with R_T ∈ [115, 135]; and S_T is a threshold with S_T ∈ [45, 60]. Only pixels that meet all three rules are identified as flame pixels by the RGB color model.
The HSV color rules [42] are as follows: … In the YUV color space, Y represents the brightness of a pixel and Y_mean represents the mean value of the pixels on the Y channel; U and V represent the chroma of a pixel, and U_mean and V_mean represent the mean values of the pixels on the U channel and the V channel, respectively. Thus, the RGB-HSV-YUV mixed color model established for flame area identification is as follows: … where F_color represents a pixel in the mixed color model; F_RGB represents a pixel that meets the RGB color rules; F_HSV represents a pixel that meets the HSV color rules; and F_YUV represents a pixel that meets the YUV color rules. The flame region segmentation effect of a pipeline fire image based on the RGB-HSV-YUV mixed color model [44] is shown in Fig. 20.

2) TRAINING OF THE B-CNN MODEL
In the B-CNN model training, the parameters of each layer are set as listed in Table 3.

| Layer | Kernel size | Channels |
|---|---|---|
| Conv11 | 3×3 | 64 |
| Conv12 | 3×3 | 64 |
| Conv21 | 3×3 | 128 |
| Conv22 | 3×3 | 128 |
| Conv31 | 3×3 | 256 |
| Conv32 | 3×3 | 256 |
| Conv33 | 3×3 | 256 |
| Conv41 | 3×3 | 512 |
| Conv42 | 3×3 | 512 |
| Conv43 | 3×3 | 512 |
| Conv51 | 3×3 | 512 |
| Conv52 | 3×3 | 512 |
| Conv53 | 3×3 | 512 |
| Maxpool1 | 2×2 | 64 |
| Maxpool2 | 2×2 | 128 |
| Maxpool3 | 2×2 | 256 |
| Maxpool4 | 2×2 | 512 |

Following the training process of the B-CNN model (shown in Fig. 6), the model is trained on the feature vectors of 300 flame images. The 300 groups of feature vectors are then input into the trained B-CNN model to obtain the recognition results for the training samples, as shown in Table 4. According to Fig. 6, during model training the weights are updated backward according to the error between the actual values and the calculated values until the calculated error meets the requirement. The number of iterations in the update process is set to 100. The curve of the calculation error against the number of iterations is shown in Fig. 21. As can be seen from Fig. 21, the training error gradually decreases as the number of iterations increases. Combined with the recognition results of the B-CNN model on the training samples in Table 4, this shows that the designed B-CNN network structure has satisfactory recognition performance.

3) TESTING OF MULTI-INFORMATION SOURCE B-CNN MODEL
To simulate flame images acquired by devices at multiple positions and angles, the test images are shrunk, enlarged and rotated. In this paper, flame image processing with three information sources is adopted: the 39 test images are shrunk to 0.5 times their size, enlarged to 1.5 times their size and rotated by 30°, respectively. The effects are shown in Fig. 22.
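The three transformations can be sketched in plain Python as follows. This is only an illustration under stated assumptions: images are represented as 2-D lists of gray values, nearest-neighbour resampling is used, and the rotated image keeps its original size so that its corners are cropped. None of these choices is specified in the paper, and the helper names `scale_nearest`, `rotate_nearest` and `make_sources` are hypothetical.

```python
import math


def scale_nearest(img, factor):
    """Resize a 2-D image (list of rows) by `factor` with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [[img[min(h - 1, int(r / factor))][min(w - 1, int(c / factor))]
             for c in range(new_w)]
            for r in range(new_h)]


def rotate_nearest(img, degrees, fill=0):
    """Rotate counterclockwise about the centre; output keeps the input size."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: locate the source pixel that lands on (r, c).
            dy, dx = r - cy, c - cx
            src_c = round(cos_t * dx - sin_t * dy + cx)
            src_r = round(sin_t * dx + cos_t * dy + cy)
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = img[src_r][src_c]
    return out


def make_sources(img):
    """Build the three simulated views: 1.5x enlarged, 0.5x shrunk, rotated by 30 degrees."""
    return scale_nearest(img, 1.5), scale_nearest(img, 0.5), rotate_nearest(img, 30)
```

Applying `make_sources` to each of the 39 test images gives the 39×3 transformed images from which the feature vectors are subsequently extracted. In practice an image library with interpolated resampling would normally replace these helpers.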
After the 39 groups of test images are shrunk to 0.5 times their size, enlarged to 1.5 times their size and rotated by 30°, respectively, a total of 39×3 groups of test images are obtained. According to (17)-(33), 19 image feature vectors are extracted and input into the three B-CNN information sources. To verify the effectiveness of the proposed method, four models are used for comparison: multi-information source ESN, multi-information source PNN, multi-information source SVM and multi-information source CNN. The results for the test samples are shown in Tables 4-6.

According to the results in Tables 4-6, B-CNN performs better than ESN, PNN, SVM and CNN in classification Accuracy, Precision and Recall, as it comprehensively considers information from different angles and positions of the images. Regarding the three information sources, information source 1 enlarges the image to 1.5 times its size, information source 2 shrinks the image to 0.5 times its size, and information source 3 rotates the image by 30°. According to the comparative analysis of the five models in Tables 4-6, all models classify well on information source 1 and worst on information source 3. This indicates that rotating the image has a negative influence on the classifier, whereas enlarging the image makes the characteristics of the flame area more apparent, which improves the performance of the classifier. From the perspective of integrated decision-making, although the classification effect of a single information source is not ideal in some cases, this deficiency can be compensated for by the information fusion and decision-making of multiple information sources.
As can be seen from the results in Tables 4-6, although the classification accuracy of SVM on information source 3 is only 59.62%, the classification precision of ESN on information source 3 is only 57.14%, and the classification recall of SVM on information source 3 is only 59.62%, the final classification ability is greatly improved by the integrated decision over multiple information sources.
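The effect of the integrated decision can be illustrated with a minimal Python sketch that fuses the binary outputs of an odd number of information sources by majority vote and then computes Accuracy, Precision and Recall. The function names and the encoding of labels (1 for flame, 0 for non-flame) are assumptions made for illustration; the sketch does not reproduce the values in the tables.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most sources (binary labels, odd number of votes)."""
    if len(votes) % 2 == 0:
        raise ValueError("majority voting as sketched here needs an odd number of sources")
    return Counter(votes).most_common(1)[0][0]


def fuse_sources(per_source_predictions):
    """Fuse predictions sample by sample; input is one list of labels per source."""
    return [majority_vote(votes) for votes in zip(*per_source_predictions)]


def evaluate(y_true, y_pred, positive=1):
    """Compute Accuracy, Precision and Recall for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

For example, with the made-up labels `y_true = [1, 1, 0, 0, 1]` and source outputs `[1, 1, 0, 0, 1]`, `[1, 0, 0, 1, 1]` and `[0, 1, 1, 0, 0]`, the third source alone reaches an accuracy of only 0.4, while the fused prediction matches `y_true` on every sample. A weak single source can therefore be outvoted when the other sources agree.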

VII. CONCLUSIONS
In this paper, an underground coal mine fire identification method based on multi-information source B-CNN is proposed. By setting up image acquisition devices at different locations and using a multi-source image information processing method to make an integrated decision, the problem that a single information source is strongly affected by external interference factors can be alleviated. To verify this conclusion, the test images were shrunk, enlarged and rotated in the simulation experiment to simulate flame images collected at different positions and angles, and flame recognition was carried out using three B-CNN information sources. Accuracy, Precision and Recall were used to verify the effectiveness of the proposed method. In addition, an initial solution generation method based on sine mapping and a method of accepting bad solutions with a certain probability were adopted to improve the performance of GOA; using the improved GOA to optimize the two parameters W and θ of the Softmax loss function during B-CNN training can improve the classification performance of the B-CNN model. Moreover, because B-CNN has high computational complexity in the training and integrated recognition stages, image features were extracted in advance: several image feature vectors of different types were extracted and used as the input of B-CNN, which reduces the computational complexity compared with using the input image directly. In the multi-information source integrated recognition stage, a "voting" strategy was adopted: an odd number of information sources produces an odd number of classification results, and the result with more votes is taken as the final result. However, in many cases there may be a large number of information sources, or the number of information sources may be even; how to adopt a more rigorous integrated decision method in these situations is a focus of our future work.