Convolutional Neural Network Based on Complex Networks for Brain Tumor Image Classification With a Modified Activation Function

The diagnosis of brain tumor types generally depends on the clinical experience of doctors, and computer-assisted diagnosis improves the accuracy of diagnosing tumor types. Therefore, a convolutional neural network based on complex networks (CNNBCN) with a modified activation function for the magnetic resonance imaging classification of brain tumors is presented. The network structure is not manually designed and optimized, but is generated by randomly generated graph algorithms. These randomly generated graphs are mapped into a computable neural network by a network generator. The accuracy of the modified CNNBCN model for brain tumor classification reaches 95.49%, which is higher than several models presented by other works. In addition, the test loss of brain tumor classification of the modified CNNBCN model is lower than those of the ResNet, DenseNet and MobileNet models in the experiments. The modified CNNBCN model not only achieves satisfactory results in brain tumor image classification, but also enriches the methodology of neural network design.


I. INTRODUCTION
In recent decades, as computer technology has thrived, an increasing number of hospitals have adopted artificial intelligence methods to assist medical diagnosis [1]-[7], which has also promoted the reform and development of intelligent medical care [8]-[13]. Higher human neural activities such as memory, intelligence, and consciousness are controlled by the central nervous system of the brain, which is the most complicated structure [14]. Once a tumor spreads to any part of the brain, it damages functions of the human body whether it is benign or malignant [15]. In addition, brain tissue is more complex than that of any other part of the body, which makes diagnosis and treatment difficult. Traditionally, in addition to analyzing the symptoms of a patient, doctors usually need physiological test results and a series of images generated by magnetic resonance imaging (MRI), a technology that sends electromagnetic waves into an object and returns images of its internal structure [16], to diagnose the type of brain tumor.
The associate editor coordinating the review of this manuscript and approving it for publication was Mohammad Shorif Uddin.
Medical image analysis has been transformed, in both practicability and innovative concepts, by the rapid development of hardware and the use of complex mathematical tools, which make it possible to obtain clearly visible medical images [17]. Based on these medical images, effective image analysis can help doctors diagnose and treat patients. The application of machine learning methods, such as support vector machines (SVMs) and random forests, to medical image analysis has greatly promoted the development of computer-aided medicine [18], [19]. With the rapid development of deep learning, medical image analysis has made great progress, and many new technologies have emerged, such as convolutional neural networks, 3-dimensional convolutional neural networks, and neural computing [20]-[23]. Traditional machine learning algorithms for image classification, for example SVMs, are hard to apply to large-scale training samples. In addition, they are quite likely to fail on multi-class problems. The performance of a convolutional neural network is determined by the depth, width, and residual connections of the network. At present, models used for image classification are generally composed of convolutional neural networks and have almost regular connection patterns. Models with excellent performance rely on more complex connection modes, and these connection modes are effective. In addition, some models achieve higher accuracy as the depth of the network increases. However, greater depth increases production costs for industrial realization and for deployment on mobile terminals.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
A convolutional neural network based on complex networks, termed the CNNBCN model, is a new type of model for image classification [24]. Unlike traditional regular connections, the connections of the CNNBCN model are not manually designed but are generated by random graphs. Although CNNBCN is based on random graphs, the nodes in these graphs are highly correlated rather than purely random. In addition, the network structure of these random graphs is close to the distribution of neurons in the human brain. The CNNBCN model uses a neural network generator to generate random neural networks, and its performance on classification tasks is even better than that of classic convolutional neural networks. The main task of the network generator is to generate a random graph and convert it into an applicable network. First, the distribution of connections among nodes in the generated graph is applied to a network. Then, a heuristic approach converts the generated random graph into a directed acyclic graph (DAG). Finally, the generated DAG is mapped into the CNNBCN model, in which each neuron performs a series of operations such as convolution, batch normalization, and activation function processing. In this way, a network that can be trained directly is obtained. In the experiments, a brain tumor dataset [25] consisting of glioma, meningioma, and pituitary tumor images is used for training. After testing, the accuracy of the trained model reaches 95.49% for brain tumor classification.
The related work of this paper is as follows:
• The structural design of convolutional neural networks has shifted from simply increasing depth and width to exploring complex connection modes such as ResNet [26] and DenseNet [27]. Currently, in addition to manually designing more sophisticated connections, the automatic generation of neural networks is also an important research direction. Our work applies randomly generated convolutional neural networks to the classification of brain tumors.
• With the enhancement of medical equipment and the promotion of deep learning, computer-aided detection (CAD) has made rapid progress in lesion detection, image segmentation, image registration, and image fusion [17], [19], [20]. Our work applies a new method based on a convolutional neural network for medical image analysis.
• Neural network architecture search (NAS) is an approach for finding the optimal neural network architecture, and it can create network structures that have not been explored before [24]. NASNet only allows skip connections with one downsampling, but other types of connections are also worth trying. The model on which our work relies is presented in [24]; its skip connections are random and organized in multiple groups. This model is worth exploring in more fields, not just brain tumor classification.
• Over the past decade, with the rise of research on complex networks, random graphs, also called random networks, have become an important model of complex networks. Applying random graphs to neural networks is also a valuable research direction. The results of evaluating the mean of mutual information for random graph models are provided in [28]. The approach presented in [29] uses a pre-trained CNN based on the Caffe deep learning framework for the analysis and classification of random graphs and networks. The presented CNNBCN model applies the connection patterns of random graphs and networks to the backbone network for feature extraction.
The rest of this paper is divided into four parts. In Section II, the theoretical analysis of the CNNBCN network generator is presented. Then, the structure of the CNNBCN model is given in Section III. For verification, brain tumor classification experiments comparing the CNNBCN model with other classic image classification models are conducted in Section IV. Finally, Section V concludes the paper.
In addition, the main contributions of this paper are as follows:
• In this paper, a new type of convolutional neural network based on complex networks is applied to the medical image classification of brain tumors. It enriches the practical and effective methods available for medical image analysis.
• Based on the original CNNBCN model, a combination of activation functions is proposed to replace the original activation functions, which improves the performance of the CNNBCN model and raises the classification accuracy by about 0.5%-0.75%.
• The experimental results confirm that the CNNBCN model is practical and effective. In addition, the combination of activation functions improves the performance of the original model. A neural network generated by a random complex network performs as well as or even better than an artificially designed convolutional neural network.
• The CNNBCN model provides a new idea and method for the construction of neural networks. Randomly generated networks reduce the manual intervention, provide greater exploration space for neural network structure design, and provide the potential for better performance.

II. METHODOLOGY
In this section, the theory of randomly generated graphs and the concept of a network generator, which is the foundation of the CNNBCN model, are presented.

A. RANDOMLY GENERATED GRAPH
Constructing a randomly generated graph is the first step of the network generator. Three algorithms for randomly generating graphs, namely the Erdos-Renyi (ER) algorithm, the Watts-Strogatz (WS) algorithm, and the Barabasi-Albert (BA) algorithm, are provided in [30]-[32] and are used to construct the randomly generated graph.
First, the simplest representation of the ER random network model is provided in [30], [33]. The total number of generated graph nodes is denoted as $V_{\mathrm{ER}}$, where the subscript ER denotes the corresponding algorithm. At each step, two nodes are arbitrarily selected and connected with probability $p$. The probability $p$ is determined by the number of edges $E$, and the process stops when the number of edges reaches $E$.
The crucial statistical properties of the generated graph are summarized as follows. The average degree of the network is given in [33] as an asymptotic equality in terms of the average value of the node degree $\kappa$. The degree distribution, the average cluster coefficient, and the average distance between nodes in random networks are obtained in the same way; in these expressions, $\alpha \propto \beta$ denotes that $\alpha$ is proportional to $\beta$.
Second, the WS small-world network model, which lies between the regular network and the random network, is provided in [31], [33]. Starting from a regular network model in which each node has $\mu$ neighbors, each edge is randomly reconnected with probability $p$ while the graph remains a simple graph. When $p = 0$, the generated model is a regular network. When $0 < p < 1$, the expected number of randomly reconnected edges is $p \mu V_{\mathrm{WS}}$ ($V_{\mathrm{WS}} \to \infty$), where $V_{\mathrm{WS}}$ is the number of generated graph nodes and the subscript WS denotes the related algorithm; the generated model then lies between regularity and randomness. When $p = 1$, all edges of the model are randomly reconnected and the model becomes an ER random network model.
Let $\lambda$ denote the constant degree of every node in the regular network corresponding to the small-world model. When $p = 0$, the average cluster coefficient of the WS model is that of the regular network. When $0 < p < 1$, the probability that each of two neighbors of any node remains unchanged is $(1 - p)$, and the probability that they remain adjacent is also $(1 - p)$. Therefore, the expected average cluster coefficient of the small-world network is its value at $p = 0$ multiplied by $(1 - p)^3$. An analytical expression for $\zeta$ is obtained by the mean field method [30], in which $\vartheta$ is the dimension of added edges.
When $p = 0$, the degree distribution of the small-world network is the same as that of the regular network: the degrees of all nodes are $\lambda$.
When $p > 0$, since each edge retains one unchanged endpoint, each node has at least $\lambda/2$ edges after reconnection. The degree distribution for this case is provided in [30], from which it is concluded that the degree distribution follows a Poisson distribution.
Third, the BA scale-free network, which is closer to actual complex networks, is presented in [32], [33]. Growth and preferential attachment are two essential attributes of the BA scale-free network model. The former indicates that a complex network is an open system: new basic units are continuously added, and the total number of nodes keeps increasing. The latter means that the probability of a node acquiring a new edge should depend monotonically on its existing degree. The model built on these two principles is expressed as follows:
• At the initial time step, the network has $\omega_0$ nodes. At each subsequent time step, a newly added node is connected to $\omega$ ($\omega \le \omega_0$) old nodes.
• A new node is connected to an old node $i$ with a probability that is proportional to the degree of that node. The connection probability is therefore expressed in terms of $\tau_i$, the degree of node $i$, and $V_{\mathrm{BA}}$, the number of network nodes, where the subscript BA indicates the concerned algorithm.
• Finally, the BA scale-free network model reaches a stable evolutionary state.
The degree distribution of the BA scale-free model is provided by the solution of the master equation in [34], [35]. In the steady-state distribution, $s$ is the index of a particular node. Consequently, $P(\kappa)_{\mathrm{BA}}$ follows a power law [32], [33]. The dependence of the average distance of the BA scale-free model on $V_{\mathrm{BA}}$ is provided in [30], and the rule by which the average cluster coefficient changes with increasing network size $\varsigma$ was proposed in [36]. By contrast, the average clustering coefficient of the regular network does not change as the network size increases.
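For reference, the standard textbook forms of several quantities discussed in this subsection can be summarized as follows. This is a sketch under stated assumptions rather than a reproduction of the expressions in [30]-[36]: the symbols $\hat{C}$ (average cluster coefficient), $L$ (average distance), and $\Pi_i$ (attachment probability) are introduced here only for illustration, $p$ for the ER model is chosen so that the expected number of edges equals $E$, and the proportionality $\hat{C}_{\mathrm{ER}} \propto V_{\mathrm{ER}}^{-1}$ holds at fixed average degree.

```latex
% ER random graph with V_ER nodes and E target edges
p = \frac{2E}{V_{\mathrm{ER}}\,(V_{\mathrm{ER}} - 1)}, \qquad
\langle \kappa \rangle = p\,(V_{\mathrm{ER}} - 1) \simeq p\,V_{\mathrm{ER}}

P(\kappa) = \binom{V_{\mathrm{ER}} - 1}{\kappa}\, p^{\kappa} (1 - p)^{V_{\mathrm{ER}} - 1 - \kappa}
          \simeq \frac{\langle \kappa \rangle^{\kappa}}{\kappa!}\, e^{-\langle \kappa \rangle}

\hat{C}_{\mathrm{ER}} = p \simeq \frac{\langle \kappa \rangle}{V_{\mathrm{ER}}} \propto V_{\mathrm{ER}}^{-1}, \qquad
L_{\mathrm{ER}} \propto \frac{\ln V_{\mathrm{ER}}}{\ln \langle \kappa \rangle}

% WS small-world model built on a regular ring of degree \lambda
\hat{C}_{\mathrm{WS}}(0) = \frac{3(\lambda - 2)}{4(\lambda - 1)}, \qquad
\hat{C}_{\mathrm{WS}}(p) \simeq \hat{C}_{\mathrm{WS}}(0)\,(1 - p)^{3}

% BA scale-free model: preferential attachment and power-law degree distribution
\Pi_i = \frac{\tau_i}{\sum_{j} \tau_j}, \qquad
P(\kappa)_{\mathrm{BA}} \propto \kappa^{-3}
```

The Poisson limit of the ER degree distribution and the $\kappa^{-3}$ tail of the BA model are the two behaviors that distinguish a purely random network from a scale-free one.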

B. NETWORK GENERATORS
The main process steps of the network generators are described in this section. The first step is to obtain a random undirected graph. Three models of randomly generated graphs were described in detail in Section II-A. To make the network generator structured and complete, a method that converts the generated graph to a trainable network is presented. The complexity of a graph is determined by the number of nodes and edges in a graph. The randomly generated graph applied in the network is defined as follows.
• ER algorithm: the initial state of the random graph contains $V_{\mathrm{ER}}$ nodes and 0 edges. Each pair of nodes is connected with probability $P$. The algorithm is written as ER($P$). When $P > \ln(V_{\mathrm{ER}})/V_{\mathrm{ER}}$, the graph is connected, that is, it forms a single connected component, with high probability.
• WS algorithm: in the initial state of the random graph, a total of $V_{\mathrm{WS}}$ nodes are arranged in a ring. Every node is connected to its $Z/2$ nearest neighbors on each side ($Z$ is an even number). The nodes are then traversed in clockwise order. For each node, the edge to its $i$th clockwise neighbor is reconnected with probability $P$. "Reconnecting" means replacing that edge with an edge to a randomly selected node that is neither the current node itself nor already connected to it, so that no multiple edge is created. The traversal is repeated for $1 \le i \le Z/2$, that is, $Z/2$ times, after which a new random graph is obtained. This algorithm is written as WS($Z$, $P$).
• BA algorithm: the initial state of the random graph contains $Q$ ($1 \le Q < V_{\mathrm{BA}}$) nodes and 0 edges. Nodes are then added one at a time, and each added node brings $Q$ new edges to the random graph. The newly added node is connected to old nodes with a probability that is related to their degrees. The algorithm terminates when the number of nodes reaches $V_{\mathrm{BA}}$. This algorithm is written as BA($Q$).
The second step converts the undirected graph into a directed acyclic graph (DAG). The graphs generated by the ER, BA, and WS algorithms are all random undirected graphs. The generated random undirected graph is converted to a DAG by a simple method: every node in the graph is assigned an index, and the direction of each edge is set to point from the node with the smaller index to the node with the larger index. The resulting directed graph has no cycle, because indices strictly increase along any directed path. For the ER algorithm, the indices are assigned randomly. For the WS algorithm, the indices are assigned in clockwise order. For the BA algorithm, the $Q$ nodes of the initial random graph are assigned indices from 1 to $Q$, and the remaining indices are assigned in the order in which the nodes are added to the graph.
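The first two steps can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation used in [24]: the function names are hypothetical, nodes are numbered from 0, and the BA sketch weights each old node by its degree plus one so that the isolated initial nodes can be selected.

```python
import random
from itertools import combinations


def er_graph(v, p, rng):
    """ER(P): connect every pair of the v nodes independently with probability p."""
    return {(a, b) for a, b in combinations(range(v), 2) if rng.random() < p}


def ws_graph(v, z, p, rng):
    """WS(Z, P): ring with z/2 neighbors per side, then clockwise reconnection."""
    assert z % 2 == 0 and z < v
    edges = {tuple(sorted((n, (n + i) % v)))
             for n in range(v) for i in range(1, z // 2 + 1)}
    for i in range(1, z // 2 + 1):      # one pass per neighbor offset i
        for n in range(v):              # clockwise traversal of the ring
            old = tuple(sorted((n, (n + i) % v)))
            if old not in edges or rng.random() >= p:
                continue
            # Candidates: nodes other than n that are not already adjacent to n.
            free = [m for m in range(v)
                    if m != n and tuple(sorted((n, m))) not in edges]
            if free:
                edges.remove(old)
                edges.add(tuple(sorted((n, rng.choice(free)))))
    return edges


def ba_graph(v, q, rng):
    """BA(Q): start with q isolated nodes; each new node attaches to q old nodes."""
    assert 1 <= q < v
    edges, degree = set(), [0] * v
    for new in range(q, v):
        targets = set()
        while len(targets) < q:
            # Assumption: weight = degree + 1, so zero-degree nodes stay reachable.
            weights = [0 if old in targets else degree[old] + 1
                       for old in range(new)]
            targets.add(rng.choices(range(new), weights=weights)[0])
        for old in targets:
            edges.add((old, new))
            degree[old] += 1
            degree[new] += 1
    return edges


def to_dag(edges, index):
    """Orient each undirected edge from the smaller index to the larger index."""
    return {(a, b) if index[a] < index[b] else (b, a) for a, b in edges}


rng = random.Random(0)
V = 32
er, ws, ba = er_graph(V, 0.2, rng), ws_graph(V, 4, 0.75, rng), ba_graph(V, 5, rng)
er_index = list(range(V))
rng.shuffle(er_index)                   # ER: indices are assigned randomly
dags = {
    "ER": to_dag(er, er_index),
    "WS": to_dag(ws, list(range(V))),   # WS: clockwise order
    "BA": to_dag(ba, list(range(V))),   # BA: order of insertion
}
```

With 32 nodes, ER($P$ = 0.2) lies above the connectivity threshold $\ln(32)/32 \approx 0.108$. Reconnection in the WS sketch replaces one edge with another, so the number of edges stays at $V_{\mathrm{WS}} Z / 2$.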
The third step is to map the generated DAG to a trainable network. The network is composed of data flow lines that transmit data and arithmetic modules that process the data flow. The former are mapped to edges and the latter are mapped to nodes, which is consistent with intuitive understanding. The mapping process is divided into edge operations and node operations. Since the goal is to map a directed graph to a computable neural network, edge operations are defined as passing data from one node to another. The node operations consist of the following three steps:
• Aggregation: the input data are combined in a weighted sum.
• Transformation: the results after aggregation are processed by transformation which contains a series of operations such as activation functions, 3 × 3 convolutions, and batch normalization (BN) [37].
• Distribution: the results after transformation are transmitted to the next node.
Node operations are shown in Fig. 1. The data are transmitted from the previous nodes through four input edges. Then, the learnable weight parameters $W_1$, $W_2$, $W_3$, and $W_4$, which are passed through the sigmoid function to make them positive, are used to calculate the weighted sum of the inputs. Next, the aggregated data are transformed by a series of operations such as activation functions, convolutions, and BN. Finally, the data are output to the next node through five channels. The node operation achieves satisfactory results. Due to aggregation, the number of channels, both input and output, is limited.
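The three node operations can be illustrated with a toy example on plain Python lists. This is only a sketch under simplifying assumptions: the function names are hypothetical, each input is a short vector rather than a feature map, and the transformation keeps only a ReLU activation, with the 3 × 3 convolution and BN omitted.

```python
import math


def sigmoid(w):
    """Map an unconstrained learnable weight to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-w))


def aggregate(inputs, raw_weights):
    """Weighted sum of equally sized inputs; the sigmoid keeps weights positive."""
    gates = [sigmoid(w) for w in raw_weights]
    size = len(inputs[0])
    return [sum(g * x[k] for g, x in zip(gates, inputs)) for k in range(size)]


def transform(values):
    """Stand-in for the transformation: ReLU only (convolution and BN omitted)."""
    return [max(0.0, v) for v in values]


def distribute(values, fan_out):
    """Send an identical copy of the result along every output edge."""
    return [list(values) for _ in range(fan_out)]


# Four input edges, as in the node of Fig. 1, each carrying a 2-element vector.
inputs = [[1.0, -2.0], [0.5, 0.5], [-1.0, 3.0], [2.0, -4.0]]
raw_weights = [0.0, 0.0, 0.0, 0.0]      # sigmoid(0) = 0.5 on every edge
outputs = distribute(transform(aggregate(inputs, raw_weights)), fan_out=5)
```

Because the sigmoid output is always positive, no input edge can be switched to a negative contribution during training; the weights only rescale the inputs before they are summed.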
A generated graph with edge operations and node operations is thus obtained. Since the numbers of input and output nodes are not fixed, the graph cannot yet be called an effective neural network. The CNNBCN model is used for image classification, so the generated graph is transformed into a network with an explicit input and output. A short and effective method, similar to streaming network technology, is used to achieve this goal. An input node and an output node are added to the graph. The original input nodes are all connected to the new input node, from which they receive the input data of the module. Similarly, the original output nodes are all connected to the new output node. These two new nodes perform no convolution, and they are not counted in the total number of nodes $V$. With a single input node and a single output node, an effective neural network is attained.
An effective neural network is often composed of multiple parts, each with a different structure and a different task. Hence, it is necessary to downsample the feature map to partition the network. The graph network generated above is considered as one module, and a complete and effective neural network consists of multiple modules. Each module is connected to other modules through its input node or output node. At this point, a CNNBCN model that includes several modules and can be trained directly is obtained.
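The step that adds a single input node and a single output node can be sketched as follows. The sketch assumes that the original input nodes are the nodes without incoming edges and the original output nodes are the nodes without outgoing edges; the function name and the labels "in" and "out" are hypothetical.

```python
def add_input_output(dag_edges, num_nodes):
    """Attach one extra input node and one extra output node to a DAG.

    dag_edges is a set of (src, dst) pairs over the nodes 0..num_nodes-1.
    The two extra nodes are labeled "in" and "out" and are not counted in V.
    """
    has_pred = {dst for _, dst in dag_edges}
    has_succ = {src for src, _ in dag_edges}
    original_inputs = [n for n in range(num_nodes) if n not in has_pred]
    original_outputs = [n for n in range(num_nodes) if n not in has_succ]
    wired = set(dag_edges)
    wired |= {("in", n) for n in original_inputs}     # new input feeds all inputs
    wired |= {(n, "out") for n in original_outputs}   # all outputs feed new output
    return wired, original_inputs, original_outputs


# A small DAG with two original input nodes (0, 1) and two output nodes (3, 4).
dag = {(0, 2), (1, 2), (2, 3), (2, 4)}
wired, ins, outs = add_input_output(dag, 5)
```

After this step every node of the module lies on a path from "in" to "out", so the module can be treated as a single block when several modules are chained.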

C. MODIFIED MODEL
According to the previous section, a CNNBCN model consisting of several modules is obtained. To optimize the generated model, a new combination of activation functions is applied. After extensive testing, a combination of Gaussian error linear units (GeLUs) [38] and rectified linear units (ReLUs) is applied to the modules of the model. The GeLU activation function is applied to the first and second modules, and the ReLU activation function is used in the remaining modules and the classifier. The GeLU nonlinearity is the expected transformation of a stochastic regularizer that randomly applies the identity or zero map to a neuron's input. The GeLU nonlinearity weights inputs by their magnitude, rather than gating inputs by their sign as ReLUs do [38]. GeLU(x) is defined using the standard normal distribution, and a formula for its approximate calculation is provided in [38]. Finally, a modified CNNBCN model that can be trained directly is generated.
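The difference between the two activation functions can be made concrete with a short sketch. In [38], GeLU is defined as $x \Phi(x)$, where $\Phi$ is the standard normal cumulative distribution function, and one of the approximations given there is $0.5x(1 + \tanh[\sqrt{2/\pi}(x + 0.044715x^3)])$. It is an assumption that this is the approximation referred to above; the function names below are hypothetical.

```python
import math


def gelu_exact(x):
    """GeLU(x) = x * Phi(x), with Phi the standard normal CDF (via erf)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Tanh-based approximation of GeLU given in [38]."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x):
    """ReLU gates the input by its sign."""
    return max(0.0, x)


# Largest gap between the exact form and the approximation on [-5, 5].
max_gap = max(abs(gelu_exact(k / 10) - gelu_tanh(k / 10)) for k in range(-50, 51))

# Unlike ReLU, GeLU lets small negative inputs pass with a reduced magnitude.
negative_example = (gelu_exact(-1.0), relu(-1.0))
```

In this sketch, a module that uses GeLU therefore still propagates a small signal for slightly negative pre-activations, whereas a ReLU module sets it to zero.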

III. STRUCTURE OF THE CNNBCN MODEL
In this section, the structure of the CNNBCN model is presented. A computable network is constructed by the network generator described in Section II-B. The generated random graph is transformed into a DAG and then mapped into a neural network. The random graph used to construct the neural network is generated by the ER, WS, or BA algorithm. The complete generation process of the model is illustrated in Fig. 2. A complete CNNBCN model is composed of several modules. The CNNBCN model has a simple mode and a regular mode, which differ in the complexity of the neural network. The number of channels used in feature extraction is one of their differences: the former has 78 channels and the latter has 109 channels. The structure of the CNNBCN model is shown in Table 1. In the table, $V$ refers to the number of nodes and $C$ denotes the number of channels. The input size is 224 × 224 pixels after resizing.
The structure of the CNNBCN model is provided in Fig. 3. The neural network is mapped from a random graph, and all nodes (except custom nodes) are randomly generated and distributed. To make the structure of the generated network more intuitive, the nodes of the input, output, and hidden layers are manually constrained to be randomly generated within a certain range. The custom input and output nodes are represented by large circular marks, and the input nodes of the random graph are represented by regular triangle marks. The output nodes of the random graph are represented by inverted triangle marks, and the random nodes in the network are indicated by small circles. The random graph of this model is constructed by the WS algorithm. The input first passes through module 1 and then through the four random graph networks in sequence. The final result is obtained after passing through the classifier. In this way, a CNNBCN model composed of several modules and a classifier is obtained.

IV. COMPARATIVE EXPERIMENTS
A brain tumor dataset provided by Nanfang Hospital, Guangzhou, China, and General Hospital, Tianjin Medical University, China, collected from 2005 to 2010, is used for the comparative experiments. This dataset was collected from 233 patients and consists of 708 meningioma images, 1426 glioma images, and 930 pituitary tumor images [39], [40]. The images have an in-plane resolution of 512 × 512 with a pixel size of 0.49 × 0.49 mm². In addition, the dataset is publicly available and can be obtained from [25]. Samples of some brain tumors are provided in Fig. 4.
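The class balance implied by these counts can be checked with a few lines of Python. The sketch below uses only the image counts stated above; the majority-class baseline is computed on the whole dataset as an illustration, since the train/test split is not specified here.

```python
# Image counts per tumor type, as stated for the dataset.
counts = {"meningioma": 708, "glioma": 1426, "pituitary tumor": 930}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the largest class.
majority_baseline = max(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"total images: {total}, majority-class baseline: {majority_baseline:.1%}")
```

The dataset contains 3064 images in total, and glioma is the largest class, so test accuracies should be read against this imbalance rather than against a uniform three-class baseline.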
The experiment consists of three parts. The first part is a comparison of the original model and the modified model under the three generation algorithms. To increase the credibility of the model and the persuasiveness of the experiment, the second part is a comparison with models presented by other researchers for brain tumor classification on the same dataset. The last part is a performance comparison with other effective image classification models. The number of nodes in the randomly generated graph is set to 32. In the comparison experiments, it is reasonable to choose deeper neural network models, because a complete CNNBCN network consists of five modules and each module has 32 nodes. In addition, the parameters of the three graph generation algorithms were set to ER(P = 0.2), WS(Z = 4, P = 0.75), and BA(Q = 5). Due to the small number of classification labels, the model is set to the simple mode. The cross-entropy loss function [41], [42] is selected as the loss function of this experiment.
The experimental results are shown in Table 2, whose parameters are intended for evaluating the models. We separately tested the CNNBCN model and the modified CNNBCN model generated by the ER, WS, and BA algorithms in terms of accuracy and training and test loss for identifying the type of brain tumor. The modified CNNBCN model has higher test accuracy than the original model. The highest test accuracy of the original model is 94.53%, the test accuracy of the modified model increases by 0.5%-0.75% on average, and its highest accuracy reaches 95.49%. The modified CNNBCN model achieves satisfactory experimental results with the combination of GeLU and ReLU activation functions. Specifically, subfigures (a)-(d) in Fig. 5 provide more training and test information about the original model and the modified model. The modified CNNBCN model needs less training time to converge, as can be seen in subfigures (a)-(b).
In addition, the comparison between the modified CNNBCN model and the original CNNBCN model in terms of test accuracy and test loss is given in subfigures (c)-(d), from which it is clear that the modified model is better than the original model.
Some models presented by other researchers for brain tumor image classification are provided in Table 2. Since these models use the same dataset as our models, the comparison is appropriate. The test accuracy and the other performance parameters of the modified CNNBCN model are better than those of the models presented in other works.
The experimental results compared with other classic convolutional neural network models are shown at the bottom of Table 2, where the training and test results are presented side by side. The modified CNNBCN models generated by the random graph algorithms achieve satisfactory results: their training accuracy rates are all 100%, and their test accuracy rates are all above 95%. In comparison with other image classification models, the modified CNNBCN model also achieves a higher ranking, and its test loss is the lowest. In particular, the training time of the modified CNNBCN model is lower than that of other convolutional neural network models of the same depth, such as ResNet-152 and DenseNet-161. In addition, the test accuracy of the modified CNNBCN model is higher than that of EfficientNet-b0, which has the strongest image classification performance among the compared models in Table 2. The results show that the modified CNNBCN model generated by the random graph algorithms achieves satisfactory results in the classification of brain tumor images, which provides a feasible prospect for the construction of neural networks.
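The cross-entropy loss used as the objective in these experiments, and reported as training and test loss, can be illustrated for a single three-class prediction. This is a minimal sketch: the logits and the class ordering are hypothetical and are not taken from the experiments.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


CLASSES = ["meningioma", "glioma", "pituitary tumor"]   # hypothetical ordering

# An uninformative prediction: equal scores for all three classes.
uniform_loss = cross_entropy([0.0, 0.0, 0.0], label=1)

# A confident and correct prediction for class index 1.
confident_loss = cross_entropy([0.0, 5.0, 0.0], label=1)
```

An uninformative three-class prediction has a loss of $\ln 3$, so a test loss well below this value indicates that the model assigns most of the probability mass to the correct tumor type.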

V. CONCLUSION
A convolutional neural network based on complex networks with a modified activation function, abbreviated as CNNBCN and generated by the ER, WS, and BA algorithms, has been presented in this paper for the image classification of brain tumors. Experimental results have shown that the classification accuracy of both the original CNNBCN model and the modified model is higher than that of some manually designed neural networks. In addition, its performance is comparable to that of one of the best current image classification models. The CNNBCN model has not only achieved satisfactory results in brain tumor image classification, but has also provided a reference for the design of network structures. The construction of neural networks by using the connectomes of animals and even humans will be considered in our future work.