Analysis and Optimization of Network Properties for Bionic Topology Hopfield Neural Network Using Gaussian-Distributed Small-World Rewiring Method

The fully connected topology, which coordinates the connection of each neuron with all other neurons, remains the most commonly used structure in Hopfield-type neural networks. However, fully connected neurons may form a highly complex network, resulting in a high training cost and making the network biologically unrealistic. Biologists have observed a small-world topology with sparse connections in the actual brain cortex. The bionic small-world neural network structure has inspired various application scenarios. However, in previous studies, the long-range wirings in the small-world network have been found to cause network instability. In this study, we investigate the influence of neural network training on the small-world topology. The role of the path length and clustering coefficient of neurons in the neural network training process is expounded. We employ Watts and Strogatz’s small-world model as the topology for the Hopfield neural network and conduct computer simulations. We observe that the random existence of neuron connections may cause unstable network energies and generate oscillations during the training process. A new method is proposed to mitigate the instability of small-world networks. The proposed method starts with a neuron as the pattern centroid and arranges its wirings along the radial in compliance with the Gaussian distribution. The new method is tested on the MNIST handwritten digit dataset. The simulation confirms that the new small-world series has higher stability in terms of learning accuracy and a higher convergence speed compared with Watts and Strogatz’s small-world model.

The Hopfield neural network (HNN) became one of the best bionic computational models of its time. The HNN showed various advantageous properties such as object recognition capabilities, categorization, and error correction. Despite the recent dominance of neural networks (e.g., convolutional neural networks) using different learning methods (e.g., gradient descent), classifiers (e.g., softmax), and hardware accelerations (e.g., CUDA), the Hopfield-type neural network is still one of the most effective computational models that can be trained similarly to the real biological brain. The HNN is a spin-dynamics system that coordinates the connection of each neuron with all the other neurons (without self-loops in the discrete HNN). Each neuron pair is connected by a synaptic weight, and each neuron performs a weighted summation over the states of the other neurons. The neuron state is activated according to the preset threshold of the signum function and the influences of the other neurons.

Researchers have investigated the influence of the topology on the network memory function and measured varieties of topologies in terms of storage performance and pattern retrieval [17], [18]. The small-world network with several shortcuts has been shown to have the same efficiency as a random network. Recently, researchers further confirmed the biological reality of small-world networks. Chen et al. [19] reported that the cerebellar functional connectome of an actual human is small-world organized. Rosen et al. [20] estimated the absolute number of axons linking cortical areas from a whole-cortex diffusion MRI connectome and observed that real human cortical areas are small-world connected. Pircher et al. [21] further compared the small-world network between artificial and biologically based neural networks and observed remarkable parallels between these two neural networks. Scientists have started exploring the specifics of the characteristics of small-world networks. Arvin et al. [22] explored the role of short- and long-range connections in small-world networks. Their research revealed that short-range connections dominate the dynamics of the system, e.g., affect the volatility and stability of the network, whereas long-range connections drive the system state. Rüdiger et al. [23] also reported that the long-range connections of small-world networks may make the network unstable, supporting frequent supercritical mutations. Ravasz et al. [24], [25] uncovered a rule that the probability that two neurons are connected declines exponentially as a function of the distance between them. This important principle is termed ''the exponential distance rule''. Takagi [26] studied energy constraints for modeling human brain connections. His results showed that the energy constraints play a crucial role in regulating brain structures. These studies have implied that the random rewiring mechanism of the WS small-world model may form a neuron connection distribution that does not conform to the bio-growth cost rule and the exponential distance rule. In addition, the current WS small-world model is yet to consider the specifics of neural network training, such as the consistency of network energy and the random neuron connections that may cause unstable network energy. Therefore, clarifying the influence of neural network training on the small-world topology and improving its stability are urgent.

The contributions of this study are as follows: 1) The impact of the small-world topology on HNN training is investigated. The role of the path length and clustering coefficient of neurons in the neural network training process is elaborated. 2) The instability shortcoming of the random rewiring mechanism in the WS small-world model is discussed. The random existence of neuron connections that may cause unstable network energies and generate oscillations during the training process is highlighted. 3) The Gaussian-distributed small-world wiring method is proposed to improve the stability of the small-world HNN. The novelty of the new rewiring method is that it organizes neuron connections along the radial from a pattern centroid in compliance with the Gaussian distribution.

In the discrete HNN (DHNN), the state of neural node i is determined by its threshold T_i and the influences received from the states of the other nodes. The weight matrix of the DHNN is symmetric with a zero diagonal, i.e., w_{i,j} = w_{j,i} and w_{i,i} = 0. The state updating rule is S_i(t + 1) = sgn(Σ_j w_{i,j} S_j(t) − T_i), where t represents the time step of the process for neural node i. The Lyapunov energy function is E = −(1/2) Σ_i Σ_j w_{i,j} S_i S_j + Σ_i T_i S_i. In the training process, the weight between neurons i and j can be calculated by equation (1), in which s denotes the pattern number. This weight-updating method is usually called the Hebbian method.
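To make the update and learning rules concrete, the following minimal C++ sketch (ours, not the authors' implementation) stores two bipolar patterns using the standard Hebbian form w_{i,j} = Σ_s ξ_i^(s) ξ_j^(s) assumed for equation (1), and relaxes a probe state with the signum activation; the zero thresholds and all identifiers are illustrative assumptions.

```cpp
// Minimal sketch: Hebbian training and signum updates for a small discrete
// Hopfield network. Pattern values are bipolar (+1/-1); thresholds T_i = 0.
#include <vector>
#include <cstdio>

int main() {
    const int n = 4;                                   // number of neurons
    std::vector<std::vector<int>> patterns = {         // stored patterns xi^(s)
        { 1, -1,  1, -1},
        { 1,  1, -1, -1}
    };

    // Assumed Hebbian rule: w_ij = sum_s xi_i^(s) * xi_j^(s), with w_ii = 0.
    std::vector<std::vector<double>> w(n, std::vector<double>(n, 0.0));
    for (const auto& p : patterns)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (i != j) w[i][j] += p[i] * p[j];

    // Recall from a noisy probe: S_i(t+1) = sgn(sum_j w_ij * S_j(t)).
    std::vector<int> s = {1, -1, -1, -1};
    for (int sweep = 0; sweep < 5; ++sweep)
        for (int i = 0; i < n; ++i) {
            double v = 0.0;
            for (int j = 0; j < n; ++j) v += w[i][j] * s[j];
            s[i] = (v >= 0) ? 1 : -1;                  // signum activation
        }

    for (int i = 0; i < n; ++i) std::printf("%d ", s[i]);
    std::printf("\n");
    return 0;
}
```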
In the Wan Abdullah method, the energy is built upon the satisfiability of clauses composed of neurons. α represents the Boolean relation between neurons i and j, and the CNF is written as (2). To identify inconsistencies between the clauses, (2) is negated by applying De Morgan's law and is written as (3). Subsequently, the cost function can be written as (4).
The synaptic weights can be computed as shown in Table 1. In equation (2), when all the clauses (composed of two literals) are satisfied, α is called 2-satisfied. When extended to the entire network, each weight of the network can be computed by iterating over all pairs of neurons according to equations (3), (4), and Table 1.
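As a hedged illustration of how this mapping works for a single clause (the resulting weight values and sign conventions are those of the paper's Table 1, which is not reproduced here), consider α = A ∨ B with bipolar truth values S_A, S_B ∈ {−1, +1}, where +1 denotes true. Negating the clause by De Morgan's law gives ¬α = ¬A ∧ ¬B, and the corresponding cost is

E_α = (1/2)(1 − S_A) · (1/2)(1 − S_B) = (1/4)(1 − S_A − S_B + S_A S_B),

which vanishes exactly when the clause is satisfied. Matching the coefficients of S_A S_B, S_A, and S_B against the Hopfield energy function yields the synaptic weight and bias entries tabulated in Table 1.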

In this study, we implemented both the Hebbian method and the 2-SAT Wan Abdullah learning method to explore the impact of the small-world topology on neural network training in our computer simulations. We compare these two methods under the small-world neural network regarding learning accuracy and convergence speed. However, higher-order 3-SAT and Max-kSAT are not integrated because the small-world rewiring mechanism does not guarantee connections between any three neurons.

In 1998, WS proposed a method to form the small-world network [16]. The method comprises two stages: initializing and rewiring. With n nodes, in the initializing stage each node is connected with k nodes from the near region to form a high-centralization ring structure. In the rewiring stage, each node N_i is divided into two sides (left and right), with k/2 connections per side, and P is the rewiring probability. Connections are taken from the right side and rewired to a randomly selected node with probability P, and no self-loop is allowed. Adjusting P can yield a small-world network between the regular network (P = 0) and the random network (P = 1). Figure 2 shows three topologies formed by different rewiring probabilities from a 20-node network. The case k = 4, P = 0 forms a regular network, while k = 4, P = 1 generates a random network. The small-world network may be formed when P is in the interval (0, 1); Figure 2 also illustrates a small-world network formed under k = 4, P = 0.5. By adjusting P, the small-world network can be formed between the regular and random networks.
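The two stages can be sketched in a few lines of C++ (an illustrative reimplementation, not the authors' code); the adjacency matrix produced here is what the later weight matrices are restricted to.

```cpp
// Minimal Watts-Strogatz construction: a ring of n nodes, each linked to its
// k nearest neighbours, with right-side edges rewired with probability P.
#include <vector>
#include <random>

std::vector<std::vector<int>> wattsStrogatz(int n, int k, double P, unsigned seed = 42) {
    std::vector<std::vector<int>> adj(n, std::vector<int>(n, 0));
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::uniform_int_distribution<int> pick(0, n - 1);

    // Initializing stage: regular ring lattice, k/2 neighbours on each side.
    for (int i = 0; i < n; ++i)
        for (int d = 1; d <= k / 2; ++d) {
            int j = (i + d) % n;
            adj[i][j] = adj[j][i] = 1;
        }

    // Rewiring stage: each right-side edge (i, i+d) is rewired with probability P.
    for (int i = 0; i < n; ++i)
        for (int d = 1; d <= k / 2; ++d) {
            if (coin(rng) >= P) continue;
            int oldJ = (i + d) % n;
            int newJ = pick(rng);
            while (newJ == i || adj[i][newJ]) newJ = pick(rng);  // no self-loops or duplicates
            adj[i][oldJ] = adj[oldJ][i] = 0;
            adj[i][newJ] = adj[newJ][i] = 1;
        }
    return adj;
}
```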
The WS small-world model elaborates on the small-world phenomenon in the real world, such as the neural network of the worm Caenorhabditis elegans, the power grid, and the collaboration graph of film actors. In complex network theory, the small-world network belongs to a type of random network. However, the small-world network has unique bionic advantages that differ from other complex networks.

In this study, we employed the WS small-world model as the topology for the HNN. The small-world network energy may be divided into two parts: the regular lattice part is formed by the k/2 nodes from the near region, and randomly chosen nodes form the randomly rewired part of the network. A network neuron flips its state driven by these two energy parts. The regular lattice part is the same as that formed by the corresponding region of the fully connected network, but the randomly rewired part is unstable and may cause a neuron to flip to the wrong state. In this section, we compare the training process of the fully connected structure with that of the small-world structure, and then elaborate on the impact on network training in terms of network characteristics and energy.
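Using the Lyapunov energy introduced earlier, this split can be written explicitly (our notation, not one of the paper's numbered equations). With L denoting the set of near-region lattice connections and R the set of randomly rewired connections,

E = −(1/2) Σ_{(i,j)∈L} w_{i,j} S_i S_j − (1/2) Σ_{(i,j)∈R} w_{i,j} S_i S_j + Σ_i T_i S_i,

so the regular lattice part contributes a stable component, while the rewired part fluctuates with the random choice of endpoints.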
Figure 3 illustrates two neural network training processes. The figure on the top is the fully connected structured neural network, and the bottom is the small-world structured neural network. The yellow node represents the neuron in training. The fully connected structure organizes the yellow node to connect with all other nodes, whereas the yellow node connects to only a few blue nodes in the small-world structure. The nodes in the green frame represent the regular lattice nodes, and the rest of the nodes represent the randomly rewired nodes of the small-world network. During the training process, the yellow node traverses from the top-left to the bottom-right of the network. Under the fully connected structure, each weight between the yellow and blue nodes needs to be updated. By comparison, the yellow node connects to fewer blue nodes under the small-world structure, thus requiring fewer weights to be updated. The Hebbian method and the Wan Abdullah method remain applicable as the learning methods for the small-world neural network, and the signum function decides the neuron state. We noted no significant difference in the training mechanism between the fully connected and the small-world neural network. The changes in network characteristics and energy are two essential factors that impact neural network training.

The impact of the network characteristics is mainly reflected in the energy convergence efficiency of neural network training. The network characteristics are described by two factors: the average path length and the clustering coefficient of the network. In Figure 4, we compare the average path length of the fully connected topology with that of the small-world topology HNN. In the fully connected structure, each pair of neurons is directly connected.

To better clarify the small-world DHNN, we constructed the WS small-world model over the DHNN. We divided the small-world HNN algorithm into two stages. In the initializing stage, a user is required to input the parameter k and the rewiring probability P. Then, the weight matrix for the HNN is initialized by the connection information of the topology. In the learning stage, both the Hebbian and 2SAT Wan Abdullah learning methods are integrated, and a variable is defined to switch between the learning methods. Either the Hebbian or 2SAT Wan Abdullah method may update the weight matrix. The signum function is applied to update the neuron states until all neurons remain in a stable state, at which point the stop condition is met. Then, the output pattern can be produced. Based on the assumption that the input is a pattern in binary format, the algorithm of the small-world HNN is as follows.
FIGURE 6. Energy comparison: small-world neural network vs. fully connected neural network.

Initializing stage:

Step 1: Initialize the input pattern, and specify the size of the pattern n and the data of the neuron states.

Step 2: Initialize the small-world parameters, specify parameter k and the rewiring probability P, and allocate the weight matrix with size n × k.

Step 3: Initialize the small-world network, and connect each neural node with its k nearest neighbor nodes.

Step 5: Start iterating over the neural nodes in the network, and read the neuron state from the pattern.

Step 6: If the Hebbian method is specified, compute the weights by equation (1).

Step 7: If the 2SAT Wan Abdullah method is specified, compute the weights according to equations (3), (4), and Table 1.

Step 8: Update the state of each neural node by the signum function, where V_i can be computed by V_i = Σ_{j=1}^{n} w_{i,j} · x_j, and x_j refers to the output of neural node j.

Step 9: Check the stop condition: when all neurons of the network meet S(t) = S(t − 1), stop the training iteration. Otherwise, return to Step 5 and continue the training iterations until the stop condition is met.

Step 10: Obtain the weight matrix. Compute the state for all neurons of the network by the signum function. Yield the pattern for output.

The small-world-based HNN algorithm was implemented using C++ in Microsoft Visual Studio 2017 on a machine with an i5 CPU, 16 GB of memory, and the Windows 10 operating system. In our implementation, the neural network's size (number of neural nodes) is defined as an integer variable. The rewiring probability P and the number of nearest neighbors k are parameterized to form the small-world network. A multi-dimensional array is assigned to store the weight matrix and the connection information. The weight matrix is allocated at contiguous memory addresses to boost the searching and iteration speed.
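The learning-stage steps above can be summarized in a short C++ sketch for the Hebbian branch (illustrative only; the function names and the dense adjacency and weight arrays are our assumptions, not the paper's data structures).

```cpp
// Sketch of Steps 5-9 for the small-world HNN, Hebbian branch only.
// 'adj' restricts weight updates to the existing small-world connections.
#include <vector>

void trainSmallWorldHNN(const std::vector<std::vector<int>>& adj,
                        const std::vector<std::vector<int>>& patterns,
                        std::vector<std::vector<double>>& w) {
    const int n = static_cast<int>(adj.size());
    w.assign(n, std::vector<double>(n, 0.0));
    for (const auto& p : patterns)                       // Steps 5-6: iterate nodes, Hebbian update
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (adj[i][j]) w[i][j] += p[i] * p[j];   // only small-world connections carry weight
}

bool relaxToStableState(const std::vector<std::vector<double>>& w,
                        std::vector<int>& s, int maxSweeps = 100) {
    const int n = static_cast<int>(s.size());
    for (int sweep = 0; sweep < maxSweeps; ++sweep) {
        bool changed = false;                            // Step 9: stop when S(t) == S(t-1)
        for (int i = 0; i < n; ++i) {
            double v = 0.0;                              // Step 8: V_i = sum_j w_ij * x_j
            for (int j = 0; j < n; ++j) v += w[i][j] * s[j];
            int next = (v >= 0) ? 1 : -1;                // signum activation
            if (next != s[i]) { s[i] = next; changed = true; }
        }
        if (!changed) return true;
    }
    return false;
}
```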

This also suggests that the sparse small-world network may achieve a performance comparable to that of the fully connected network.

Its competency as a neural network topology is confirmed regarding network characteristics.

In the second test, we measure the learning accuracy of the small-world neural networks formed by different combinations of the parameter k and the rewiring probability P. We defined learning accuracy as the similarity between the retrieved and original input patterns. We started with k equal to the size of the network of n neurons and the rewiring probability P equal to 0. Then, k is gradually reduced to k = 4, and for each k, P is varied from 0 to 1 at 0.05 intervals. Figure 9 shows the similarity distributions with different combinations of k and P. We observed that in most cases where k is large (k > 20), the similarity equals 1, which means that the retrieved pattern is entirely consistent with the original pattern. Points at which the similarity is less than 1 mostly appear in the area where k is small (k < 16). Figure 9 shows the trend in learning accuracy under different combinations of k and P for the small-world network characteristics. Starting with P = 0, as k tends toward n, the network becomes closer to the fully connected structure and tends toward obtaining the same learning results as the fully connected neural network; the fully connected neural network is obtained when k = n. Conversely, as k is gradually decreased, with only slight rewiring we may still obtain a remarkably high learning accuracy (learning accuracy = 1). When k drops to a small value (k < 20), the learning accuracy also drops. This trend is also observed throughout our tests on the MNIST dataset.
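The paper does not spell out the similarity formula; a plausible reading, used here only as an illustration, is the fraction of matching neuron states between the retrieved and original patterns.

```cpp
// Illustrative similarity measure (assumed, not quoted from the paper):
// the fraction of positions where the retrieved pattern equals the original.
#include <vector>

double patternSimilarity(const std::vector<int>& original,
                         const std::vector<int>& retrieved) {
    if (original.empty() || original.size() != retrieved.size()) return 0.0;
    int matches = 0;
    for (std::size_t i = 0; i < original.size(); ++i)
        if (original[i] == retrieved[i]) ++matches;
    return static_cast<double>(matches) / original.size();  // 1.0 means identical patterns
}
```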

However, we also realize that one combination of k and P may form different structures because the current random rewiring mechanism may wire a neuron to any neuron in the network. Therefore, the random drop in the learning accuracy may be attributed to the energy composition, which is not guaranteed to be consistent with that of the fully connected structure. As mentioned in Section III, there are two energy parts to consider during the training process of the WS small-world neural network: 1) The regular lattice ring, which is entirely consistent with that of the neurons at the corresponding positions of the fully connected network; these neurons form a stable energy. 2) The random rewiring, which is of a highly uncertain state because these neurons are chosen randomly from the network; these rewired neurons form the unstable energy of the network. When the rewiring energy increases to a certain proportion of the network energy composition, it magnifies the adverse effects in terms of convergence efficiency and learning accuracy and even generates oscillations during the training process. However, appropriate rewiring may substantially reduce the average path length of the network and can promote convergence efficiency. In addition, when k increases, the granularity of the regular lattice ring increases, and the learning accuracy is also enhanced.

The current WS small-world model uses the random rewiring mechanism. However, as the topology of a neural network, the assurance of consistent energy is yet to be considered. We propose two improvements to the WS small-world model, described below.

For the first requirement, we take a neuron from the network as the centroid, compose the regular lattice ring, and then examine the neurons along the radial in terms of their impact on the network characteristics. We observed that in the WS small-world model, the neurons near the centroid have more overlapping neighbors with the regular lattice ring. When extrapolating along the radial, the increase in the clustering coefficient shows an exponential decline, and the neurons far from the centroid barely benefit from increasing the clustering coefficient. However, slight wiring with distant neurons may substantially reduce the average path length. Therefore, we assumed that the connection quantities along the radial obey the Gaussian distribution. For the second requirement, we integrate a validation step using the 2SAT Wan Abdullah method into the new small-world rewiring method to ensure consistent energy. Here, x represents the data-layer variable, and f(x) represents the connection quantity on each data layer and can be written as equation (8). C is the size of the regular lattice ring, and µ is initialized to 0. The detailed steps of the Gaussian-distributed small-world rewiring algorithm are as below (a code sketch following the steps illustrates the procedure):

Step 1: Choose parameter k and rewiring probability P for the small-world topology.

Step 2: Take a neuron as the centroid of the pattern, and divide the pattern into x data layers.

Step 3: Connect the neuron with its k nearest neighbor neurons as a cluster. Denote C as the size of the cluster.

Step 4: Initialize the rewiring quantity R = 0 and generate k/2 random numbers. For every random number, if it is smaller than the preset rewiring probability P, increase R by one.

Step 5: If the random number is greater than the rewiring probability P, connect the neuron with the follow-up neuron and increase the cluster size C by one.

Step 6: Taking the current neuron as the cluster center, divide the pattern into data layers along the radial. Initialize σ by equations (8) and (9).

Step 7: Compute the connection quantity for each layer by equation (10) and initialize x = 1.

Step 8: Wire neurons randomly to the related layer.

Step 9: Validate the energy for the neuron by the 2SAT Wan Abdullah method. Restart rewiring the neurons to the related layer if the energy is inconsistent.
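A minimal C++ sketch of Steps 4, 7, and 8 follows. Since equations (8)-(10) are not reproduced above, the per-layer connection quantities are assumed to follow a standard Gaussian profile with µ = 0; the function names and the rounding scheme are illustrative assumptions rather than the paper's exact formulation.

```cpp
// Sketch: draw the rewiring quantity R (Step 4), then spread the R rewired
// connections over radial data layers with an assumed Gaussian profile (mu = 0).
#include <vector>
#include <cmath>
#include <random>
#include <algorithm>

std::vector<int> gaussianLayerQuantities(int R, int layers, double sigma) {
    std::vector<double> f(layers);
    double sum = 0.0;
    for (int x = 0; x < layers; ++x) {                       // layer 0 lies nearest to the centroid
        f[x] = std::exp(-(x * x) / (2.0 * sigma * sigma));   // assumed Gaussian profile, mu = 0
        sum += f[x];
    }
    std::vector<int> q(layers, 0);
    int assigned = 0;
    for (int x = 0; x < layers; ++x) {
        q[x] = static_cast<int>(std::round(R * f[x] / sum)); // connection quantity for this layer
        assigned += q[x];
    }
    q[0] = std::max(0, q[0] + (R - assigned));               // keep the total close to R
    return q;
}

int drawRewiringQuantity(int k, double P, std::mt19937& rng) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    int R = 0;
    for (int t = 0; t < k / 2; ++t)                          // Step 4: k/2 Bernoulli(P) trials
        if (coin(rng) < P) ++R;
    return R;
}
```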

We evaluated the Gaussian-distributed small-world rewiring method using two aspects. 1) The coincidence degree with the fully connected neuron series: the similarity of the new small-world series with the fully connected series was measured to demonstrate that the small-world network may obtain the same accurate result as the fully connected series. 2) The consistency of the energy: by evaluating the consistent energy of the small-world series formed by the new method, we further confirm the stability of the Gaussian-distributed rewiring method.

To evaluate the approximation between the small-world series generated by the new method and the fully connected structure, we employed dynamic time warping (DTW) to measure the coincidence of these two series with different sizes. Figure 11 illustrates three small-world neuron series compared with the fully connected series, formed on different regions of a 100-neuron network. In most places, the neuron series of the small-world network coincides highly with the fully connected series, while differences appear at only a few positions. This explains the new rewiring mechanism's high learning accuracy. The new Gaussian-distributed rewiring mechanism ensures high clustering centrality in the near region of the centroid and slight rewiring in the distant region, therefore maintaining stability. The rewiring mechanism retains randomness through the rewiring probability P to ensure that a disordered small-world topology is still formed, and its convergence trends are further measured to ensure the stability of the output results.
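For reference, a generic textbook DTW distance between two numeric series (not the authors' implementation) that could be used to score this coincidence is sketched below; a distance of zero means the two series coincide exactly.

```cpp
// Generic dynamic time warping distance between two numeric series.
#include <vector>
#include <cmath>
#include <algorithm>
#include <limits>

double dtwDistance(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size(), m = b.size();
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> d(n + 1, std::vector<double>(m + 1, inf));
    d[0][0] = 0.0;
    for (std::size_t i = 1; i <= n; ++i)
        for (std::size_t j = 1; j <= m; ++j) {
            double cost = std::fabs(a[i - 1] - b[j - 1]);    // local mismatch cost
            d[i][j] = cost + std::min({d[i - 1][j],          // insertion
                                       d[i][j - 1],          // deletion
                                       d[i - 1][j - 1]});    // match
        }
    return d[n][m];
}
```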

To verify that the neuron series of the small-world structure converges accurately, we evaluated its logic inconsistency energy using the Wan Abdullah method [7]. We randomly chose a neuron (state is 1) from the network and then set a small value of k (k = 6) as the starting value; the corresponding logical formula includes the term (¬S_4 ∧ S_5 ∧ ¬S_6). We can compute the weights between neurons by applying the 3-SAT Wan Abdullah method in Table 3. Hence, we can verify the logic inconsistency energy by computing the state of the neuron. Then, we increase parameter k and the variables in the CNF, and we verify the logic inconsistency energy until k equals n, which is the network size. On the left-hand side of Figure 12, the curve shows the convergence trends of the standard deviations obtained from twelve rounds of repeated testing. The abscissa denotes the k value, and the data points of the ordinate represent the corresponding standard deviation. When k < 26, some incorrect neuron states (state of 0) remain.

We used the MNIST handwritten digit dataset [27], [28] as the testing dataset. It is a mainstream dataset in digit recognition, which contains 60,000 handwritten digit training samples and 10,000 testing samples. Each sample has been standardized to a 28 × 28 pixel grayscale image. We measured similarity on the MNIST dataset by comparing the retrieved and original patterns. The measurement covered the 60,000 training samples and was conducted using different configurations of learning methods and topologies. However, when measuring the similarity under these configurations, we observed that all the configurations could obtain a result with a similarity equal to one after just a few rounds of training. Meanwhile, we obtained accurate recognition results on almost all of the testing images.

In Figure 13, from the plotted digit testing results, we can observe that the small-world network is significantly ahead of the fully connected topology in terms of learning accuracy.

Many methods have achieved impressive results in handwritten digit image classification. The MNIST benchmark database [28] shows that the convolutional neural network is the most accurate model for handwritten digit classification. Jarrett et al. [29] reported that their CNN model with multistage feature extraction obtained a test error rate of 0.53%. Cireşan et al. [30] employed a plain multi-layer perceptron (MLP) in their DNN (deep neural network) model and yielded a 0.35% test error rate. In this section, we test the performance of the proposed Gaussian-distributed small-world HNN from a practical application view. Because the HNN is a feedback neural network [1], [31], to better test performance, we integrated the Gaussian-distributed small-world HNN with Cireşan's DNN model as a solution for handwritten digit image classification [32]. Figure 15 illustrates the architecture of the Gaussian-distributed SWHNN-DNN in the training and testing stages.

The training architecture of the Gaussian-distributed SWHNN-DNN contains a Hopfield layer [31] and a DNN layer [30]. The Hopfield layer contains ten units of the Gaussian-distributed SWHNN, which are used for training the images of the digits ''0'' to ''9'', respectively. As shown in Figure 15, each Gaussian-distributed SWHNN unit organizes 784 neurons with Gaussian-distributed small-world wiring. The DNN layer is trained in a separate process. Its hidden part is initialized with five fully connected layers in which the numbers of neurons are 2500, 2000, 1500, 1000, and 500. The output layer (SoftMax layer) contains ten neurons that use the ''SoftMax'' function as the activation function to output classification results.
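For completeness, a numerically stable form of the SoftMax activation over the ten output neurons is sketched below (a standard formulation, not code from the cited DNN implementation).

```cpp
// Numerically stable softmax for the ten-neuron output layer described above.
#include <vector>
#include <cmath>
#include <algorithm>

std::vector<double> softmax(const std::vector<double>& logits) {
    const double maxLogit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> out(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - maxLogit);  // subtract max for numerical stability
        sum += out[i];
    }
    for (double& v : out) v /= sum;               // probabilities over the ten digit classes
    return out;
}
```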

Under the testing architecture, each test image is input into the Gaussian-distributed SWHNN units in parallel. We restricted the Gaussian-distributed SWHNN units from memorizing test images, which means that the weight matrix in each SWHNN unit is protected and the test images are processed in retrieval mode only [32], [33], [34]. The pattern from the selected channel is retrieved and submitted to the DNN layer for classification.

During the training stage, we first trained the HNN layer. We manually divided the MNIST training set into ten clusters of digits ''0'' to ''9'', and these digit images were submitted to the Gaussian-distributed SWHNN units for training. [...] the Gaussian-distributed SWHNN-DNN is lower than the other listed CNN methods. From the training-complexity aspect, [...]

In addition, the Gaussian-distributed small-world wiring method is dedicated to improving the instability deficiency of the WS small-world model. However, it differs from the hyper-parameter optimization methods used in deep learning. We summarize the key points. 1) The proposed Gaussian-distributed small-world network does not leverage optimal parameter combinations to improve training performance. The parameter combination of k and P cannot determine the connection between a pair of neurons; one combination of k and P may form different small-world networks. 2) The workflow of the Gaussian-distributed small-world wiring method differs from that of the hyper-parameter optimization method. The hyper-parameter optimization method usually contains an expensive step of finding the optimal parameters; the Gaussian-distributed small-world wiring method does not include such a step. 3) These two methods have different perspectives on optimizing the training of neural networks. The hyper-parameter optimization method focuses more on the pros and cons of combinations of neural network configurations, whereas the Gaussian-distributed small-world method emphasizes the biological reality and the stability of the network energy of the HNN.

In this study, to address the instability issue of the small-world network, we examined the influence of the small-world topology on the HNN learning process and conducted computer simulations. We observed that instabilities due to the random existence of neuron connections cause unstable network energies, which may generate oscillations during the WS small-world neural network training process. Therefore, we proposed the Gaussian-distributed small-world wiring method to improve the stability of WS small-world networks. The proposed method organizes neuron connections in compliance with the Gaussian distribution, which reduces random connections from distant areas and makes the short-range connections dominate the main part of the network energy, thus improving the stability of small-world networks. To evaluate the new small-world rewiring method, we compared the new small-world series with the fully connected series.