Dendritic Neural Network: A Novel Extension of Dendritic Neuron Model

The conventional dendritic neuron model (DNM) is a single-neuron model inspired by biological dendritic neurons that has been applied successfully in various fields. However, an increasing number of input features results in inefficient learning and gradient vanishing problems in the DNM. Thus, the DNM struggles to handle more complex tasks, including multiclass classification and multivariate time-series forecasting problems. In this study, we extended the conventional DNM to overcome these limitations. In the proposed dendritic neural network (DNN), the flexibility of both synapses and dendritic branches is considered and formulated, which can improve the model's nonlinear capabilities on high-dimensional problems. Then, multiple output layers are stacked to accommodate the various loss functions of complex tasks, and a dropout mechanism is implemented to realize a better balance between the underfitting and overfitting problems, which enhances the network's generalizability. The performance and computational efficiency of the proposed DNN compared to state-of-the-art machine learning algorithms were verified on 10 multiclass classification and 2 high-dimensional binary classification datasets. The experimental results demonstrate that the proposed DNN is a promising and practical neural network architecture.

Artificial neural networks (ANNs) can mimic the human brain in processing complex data inputs to produce output predictions [2]. The development of ANNs has a long and rich history that spans several decades of research and innovation in computer science, mathematics, and cognitive neuroscience [3].
In 1943, McCulloch and Pitts pioneered the first computational neuron model inspired by biological neurons and based on binary logic and threshold functions [4]. This model, which is referred to as the McCulloch-Pitts neuron, formed the basis for many early ANNs and laid the foundation for the computational neuroscience field. In 1958, Rosenblatt developed an early ANN, well known as the perceptron, that could perform basic pattern recognition tasks [5]. Despite its early success in binary classification tasks, the perceptron has several limitations that hinder its performance in more complex applications [6]. For example, the perceptron cannot handle linearly inseparable data. In other words, the perceptron can only classify input data that can be separated by linear boundaries or hyperplanes in the input space [7]. However, the perceptron paved the way for more sophisticated ANNs, e.g., multilayer neural networks and deep learning techniques that can handle complex and high-dimensional data [8]. Currently, ANNs are used extensively in many applications, including speech recognition [9], [10], image processing [11], [12], natural language processing [13], [14], and robotics [15], [16].
There are more than 10^4 neurons per cubic millimeter in the human brain. Neurons primarily comprise dendrites, one axon, and one soma body [17]. In total, approximately 10^11 neurons and 10^15 connections are integrated into complex neural networks that perform various brain functions. Note that dendrites are more than simple passive information conduits. They are active computational units that can process and convert signals from other neurons or sensory cells [18], and they produce local spikes that propagate back to the cell body and modulate the output response of the neuron [19]. The dendritic branches of neurons are highly complex and variable, each receiving inputs from various sources and locations. The inputs received by dendrites can be excitatory, which means that they depolarize dendrites and make them more likely to trigger action potentials, or inhibitory, which means that they hyperpolarize dendrites and make them less likely to trigger action potentials [20]. In addition to computational functions, dendrites play a critical role in plasticity, i.e., the nervous system's ability to adapt based on learning. Dendritic plasticity occurs at multiple levels, including changes in dendritic morphology, dendritic excitability, and synaptic strength [21], and these changes can be induced by various forms of synaptic activity and can strengthen or weaken specific synaptic connections [22]. Changes in synaptic strength are believed to be the cellular basis of learning and memory in the brain, and they are critical for forming new neural connections and reorganizing existing ones [23]. Dendritic plasticity is regulated by a complex interplay between intrinsic dendritic properties and extrinsic factors, e.g., neuromodulators and growth factors [24]. With their branching and spines, the unique structure of dendrites allows them to integrate information from multiple synaptic inputs, which enables them to detect and respond to patterns of activity that can induce plasticity [25]. Overall, dendritic plasticity is essential for the brain's ability to adapt to changing environments and to form and modify neural circuits [26].
Inspired by biological dendritic neurons, we have previously proposed a simple neuronal model featuring a dendritic structure [27]. The synaptic layer, dendritic layer, membrane layer, and soma body collectively form the main structure of the dendritic neuron model (DNM). Consistent with neurobiological observations of the brain, the structure of the DNM exhibits sufficient plasticity to discard useless synapses and redundant dendrites after training; thus, it can produce a unique dendritic structure for each task [28]. The DNM has been used to solve several vision problems, e.g., motion recognition [29], orientation detection, and depth rotation [30]. Subsequently, the model was improved by introducing simple and powerful multiplication and summation operations rather than soft-minimum and maximum functions in the dendritic and membrane layers, respectively [31]. The neuronal architecture of the DNM was verified to be fully implementable in hardware using logic circuits consisting of comparators and logic gates (e.g., AND, OR, and NOT gates), which is a breakthrough for the DNM in the pattern recognition field [32]. Compared with other ML methods that require floating-point computation, logic circuits perform computation in binary, which results in extremely fast processing at minimal computational cost [33]. In addition, the DNM supports the hypothesis that logical computation can realize synaptic interactions on dendrites. Consequently, research on the neuronal architecture and learning algorithms of the DNM has gained momentum and has been successfully applied to problems in various fields [34], e.g., medical diagnosis [35], credit assessment [36], and other classification problems [37]. The DNM has also been applied to various time-series forecasting problems, e.g., epidemic transmission propensity [38], wind speed prediction [39], and stock price movement [40].
However, the single-neuron structure limits DNM performance significantly and makes it difficult to handle more complex problems. Combining multiple DNMs has been successful in the object motion direction detection task; however, it is still of limited help in terms of unleashing the capabilities of the DNM [41]. With advances in computational resources, especially graphics processing units (GPUs) and data processing units (DPUs), it is necessary to develop the DNM into a neural network rather than a single-neuron model. Thus, in this paper, we propose a DNM-based dendritic neural network (DNN). Differing from the conventional DNM, the proposed DNN includes multiple membrane layers and soma bodies; thus, it can produce multiple outputs to better handle complex problems. In addition, the flexible synaptic structure enriches the nonlinear capability of the DNN, thereby making it more proficient in handling increasingly complex problems. The inherent properties of the DNM mean that the gradient vanishing problem intensifies as the number of features increases; therefore, a dropout mechanism is designed and implemented as an optional strategy to mitigate gradient vanishing in the proposed DNN. The proposed DNN is applied to 10 multiclass classification and 2 high-dimensional binary classification problems to examine its effectiveness. In summary, the primary contributions of this study are as follows.
1) The DNN model is proposed to extend the DNM from a single-neuron model to a neural network to handle more complex problems effectively.
2) The flexible synaptic structure enhances the ability of the proposed DNN to handle nonlinear tasks. Flexible synapses consider each feature sufficiently and allow the proposed DNN to realize an effective balance between underfitting and overfitting.
3) A dropout mechanism designed specifically for the proposed DNN provides an effective adjustable strategy to mitigate the gradient vanishing problem caused by cumulative multiplication.
4) The dendritic structure with multiple soma bodies enables sufficient adaptation to more diverse loss functions. In addition, it is feasible to concatenate multiple outputs into a fully-connected network or input them into a new DNN to construct a deep neural network.

The remainder of this paper is organized as follows. Section II reviews the structure of the DNM and its unique properties. The proposed DNN is described in detail in Section III. Section IV describes the datasets, experimental setup, and performance metrics used to evaluate the DNN, and summarizes an analysis of the experimental results. Finally, conclusions and future work are discussed in Section V.

A. Dendritic Neuron Model
The DNM is a feedforward neuron model that mimics biological dendritic neurons, primarily comprising synapses, dendrites, a membrane, and a soma body. As shown in the left-hand panel of Fig. 1, multiple synapses are attached to the same dendrite, and multiple dendrites are summarized into a membrane layer and a cell body. To be specific, synapses are responsible for processing the input information as signal-receiving units and can evolve into four synaptic states after training. Dendrites collect and process information from all synapses connected to them and transmit this information to the membrane. The dendritic signals are accumulated in the membrane and are carried into the soma body, which is responsible for producing the final output. The conventional DNM can be expressed mathematically as follows:

1) Synaptic Layer:
$$Y_{i,j} = \frac{1}{1 + e^{-d\,(w_{i,j} x_i - b_{i,j})}} \qquad (1)$$

2) Dendritic Layer:
$$Z_j = \prod_{i=1}^{I} Y_{i,j} \qquad (2)$$

3) Membrane Layer:
$$V = \sum_{j=1}^{J} Z_j \qquad (3)$$

4) Soma Body:
$$O = \frac{1}{1 + e^{-\lambda (V - \theta_{soma})}} \qquad (4)$$

where i ∈ [1, 2, ..., I] and j ∈ [1, 2, ..., J]. Here, I and J represent the number of synapses on each dendrite and the number of dendrites, respectively, x_i denotes the i-th input feature, and w_{i,j} and b_{i,j} refer to the connection weights and biases of each synapse, which are optimized according to the learning algorithm. d is a distance factor, which is set to a constant value in the conventional DNM. λ and θ_soma denote the steepness factor and activation threshold of the soma body, respectively, which are also predefined constants in the conventional DNM. The activation threshold of each synapse can be formulated as follows:
$$\theta_{i,j} = \frac{b_{i,j}}{w_{i,j}} \qquad (5)$$
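To make the layer-by-layer data flow concrete, the following minimal NumPy sketch implements the synaptic, dendritic, membrane, and soma computations described above. The function name dnm_forward, the parameter values, and the random inputs are illustrative assumptions, not the authors' code.

```python
import numpy as np

def dnm_forward(x, w, b, d=1.0, lam=5.0, theta_soma=0.5):
    """Illustrative DNM forward pass (a sketch, not the authors' implementation)."""
    # Synaptic layer: sigmoid of each feature against its own weight and bias
    Y = 1.0 / (1.0 + np.exp(-d * (w * x[:, None] - b)))   # shape (I, J)
    # Dendritic layer: multiply the synaptic outputs on each dendrite
    Z = np.prod(Y, axis=0)                                 # shape (J,)
    # Membrane layer: sum the dendritic signals
    V = np.sum(Z)
    # Soma body: sigmoid with steepness lam and threshold theta_soma
    return 1.0 / (1.0 + np.exp(-lam * (V - theta_soma)))

rng = np.random.default_rng(0)
x = rng.random(4)                                          # I = 4 features in [0, 1]
w = rng.normal(size=(4, 6))                                # J = 6 dendrites
b = rng.normal(size=(4, 6))
print(dnm_forward(x, w, b))
```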

B. Synaptic Evolution
w_{i,j} and b_{i,j} can be modified during training and solidify into four synaptic states after the learning process is completed. The four synaptic states, shown on the right-hand side of Fig. 1, are the direct connection, the inverse connection, the constant 1 connection, and the constant 0 connection. In the direct connection state, the output approaches 1 when the input x_i exceeds the corresponding threshold θ_{i,j} and approaches 0 otherwise. In contrast, in the inverse connection state, when the input x_i exceeds the corresponding threshold θ_{i,j}, the output approaches 0; otherwise, it approaches 1. In the constant 1 and constant 0 connection states, regardless of the input x_i, the outputs are always 1 and 0, respectively. The DNM is consistent with biological dendritic neurons, with outputs 0 and 1 corresponding to biological excitatory and inhibitory signals, respectively.
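The four states can be reproduced numerically using the synaptic sigmoid alone. In the sketch below, the weight/bias pairs and the distance factor d = 5 are hand-picked illustrative values chosen to make each behavior visible on inputs in [0, 1]; they are not trained parameters from the paper.

```python
import numpy as np

def synapse(x, w, b, d=5.0):
    # Synaptic sigmoid; d is the distance (steepness) factor
    return 1.0 / (1.0 + np.exp(-d * (w * x - b)))

xs = np.linspace(0.0, 1.0, 5)
cases = {                                       # illustrative, untrained parameter pairs
    "direct     (w=10,  b=5) ": (10.0, 5.0),    # ~0 below theta = b/w = 0.5, ~1 above
    "inverse    (w=-10, b=-5)": (-10.0, -5.0),  # ~1 below theta = 0.5, ~0 above
    "constant 1 (w=10,  b=-5)": (10.0, -5.0),   # ~1 for every x in [0, 1]
    "constant 0 (w=10,  b=15)": (10.0, 15.0),   # ~0 for every x in [0, 1]
}
for name, (w, b) in cases.items():
    print(name, np.round(synapse(xs, w, b), 3))
```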

C. Dendritic Neuron Model Analysis
The conventional DNM is a feedforward multiple-input single-output neuron model that has been widely utilized to solve classification and prediction problems [42], [43], [44]. When handling binary classification problems, benefiting from cumulative multiplication in dendritic layers, the evolved structure can be simplified according to a neuronal pruning strategy, which corresponds to plasticity in biology [45]. Specifically, synapses with a constant 1 connection can be eliminated, and all dendrites containing synapses with constant 0 connections can be removed completely.
However, cumulative multiplication exacerbates the gradient vanishing problem, which makes it difficult for the DNM to solve tasks with many features. Any value multiplied by 0 is equal to 0; thus, even if there is only a single synapse with a constant 0 connection on a dendrite, all other synapses on that dendrite will become meaningless. The dendrite will degenerate, and its output is constantly 0. As the number of features in the dataset increases, so does the number of synapses on the dendrites. Simultaneously, the risk of dendrites containing synapses that fall into saddle points increases, which makes it difficult for the DNM to solve problems containing too many features [46]. Previous studies have investigated ways to mitigate synapses falling into saddle points; however, the main focus of these studies was learning algorithms [47], [48], [49]. Although the DNM performs well when handling traditional multi-input single-output tasks, it is difficult for a single DNM to handle more complex tasks, e.g., multi-output problems. In addition, the solidified structure prevents DNMs from being developed into deep learning techniques because it is difficult to stack DNMs or connect them to other ANN models.
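The effect of the cumulative multiplication can be seen in a short numerical example: once one synaptic output on a dendrite approaches the constant 0 state, both the dendritic output and the gradients with respect to the remaining synapses collapse. The values below are made up purely for illustration.

```python
import numpy as np

# Synaptic outputs on one dendrite; one synapse is close to the constant 0 state
Y = np.array([0.9, 0.8, 0.95, 1e-6, 0.85])
Z = np.prod(Y)                                            # dendritic output, ~5.8e-7

# dZ/dY_i is the product of the other outputs, so every partial derivative that
# still contains the near-zero factor is also driven toward zero.
grads = np.array([np.prod(np.delete(Y, i)) for i in range(len(Y))])
print(Z)       # the dendrite has effectively degenerated
print(grads)   # all entries except the one for the near-zero synapse are ~1e-6
```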
The proposed DNN is designed to improve the conventional DNM in terms of its structure. The flexibility of the synapses allows the proposed DNN to have a stronger nonlinear ability to handle more features, and the dropout mechanism enhances the ability of synapses to jump out of saddle points during the learning process. In addition, expansion of the output layer (i.e., the soma body) enables the proposed DNN to solve multi-output problems efficiently and provides the ability to stack and connect multiple DNNs to other ANNs.

D. Ensemble Dendritic Neuron Models
The DNM is a neuron model designed for binary classification problems; thus, it is practical to combine multiple DNMs when handling multiclass classification. Generally, one of the basic concepts involved in solving a multiclass classification task is to decompose the problem into multiple binary classifications. There are three classical decomposition schemes, i.e., the One-vs-One, One-vs-Rest, and Many-vs-Many schemes [50].
Here, the One-vs-One scheme involves one-to-one pairing of the N classes in the dataset, resulting in N(N − 1)/2 binary classifications, and the class predicted most frequently is ultimately taken as the decision outcome based on the N(N − 1)/2 classification results [51]. The One-vs-Rest scheme involves training N classifiers by alternately taking one of the N classes as the positive side and the samples of all other classes as the negative side. Here, if only a single classifier predicts positive, the corresponding class label is adopted as the final result, and if multiple classifiers predict positive, the prediction confidence is considered and the class with the highest confidence level is selected as the result [52]. The Many-vs-Many scheme involves taking turns with several classes as positive and several classes as negative according to a specific encoding scheme, e.g., error-correcting output codes, and both the One-vs-One and One-vs-Rest schemes are special cases of the Many-vs-Many scheme [53]. Inevitably, the number of classifiers utilized in the One-vs-One scheme is greater than that of the One-vs-Rest scheme; however, each classifier in the One-vs-One scheme requires samples from only two classes, whereas each classifier in the One-vs-Rest scheme requires all samples. As a result, the storage and computational costs of the One-vs-Rest scheme are higher. The performance depends primarily on the particular data features and distribution, but the two schemes generally exhibit comparable performance in the majority of cases [54].
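For readers unfamiliar with these decomposition schemes, the sketch below shows scikit-learn's generic One-vs-Rest and One-vs-One wrappers around a stand-in base classifier; a DNM is not available in scikit-learn, so logistic regression is used purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = load_iris(return_X_y=True)                                       # N = 3 classes
ovr = OneVsRestClassifier(LogisticRegression(max_iter=500)).fit(X, y)   # N binary models
ovo = OneVsOneClassifier(LogisticRegression(max_iter=500)).fit(X, y)    # N(N-1)/2 models
print(len(ovr.estimators_), len(ovo.estimators_))                       # 3 and 3 when N = 3
```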
Thus, the proposal of an ensemble of dendritic neuron models (EDNMs) based on the One-vs-Rest scheme is natural, and each DNM in the EDNMs is responsible for identifying one of the categories in the dataset [55]. As shown in Fig. 2, the dataset is repeated three times according to the three classes, and one of the classes is selected to be distinguished from the other two classes each time. The final solid black circle represents the loss function. Note that the cross-entropy function is utilized in this study. As discussed previously, significant effort has been dedicated to EDNMs; however, this method suffers from the following limitations and challenges.
1) It is necessary for each individual DNM to handle all samples from all classes, which is a significant waste of data storage and computational resources.
2) The dendrites of each DNM have a significant impact on that DNM's final output, whereas they contribute very little to the other DNMs.
3) A new DNM must be added for each additional class in the dataset. This structure lacks flexibility, which increases the redundancy and inefficiency of dendrites.
4) EDNMs are an extension of the conventional DNM specifically designed to solve multiclass classification problems, which limits the generalizability and evolvability of the model.

A. Dendritic Neural Network
As a neural network-based extension of the conventional DNM, the proposed DNN primarily includes synapses, dendrites, membranes, and soma bodies, as shown in Fig. 3. In addition, the flexibility of the proposed DNN allows it to handle more complex tasks. The mathematical details of the proposed DNN are expressed as follows:

1) Synaptic Layer:
$$Y_{i,j} = \frac{1}{1 + e^{-d_{i,j}(w_{i,j} x_i - b_{i,j})}} \qquad (6)$$

2) Dendritic Layer: refer to (2).

3) Membrane Layer:
$$V_n = \sum_{j=1}^{J} v_{j,n} Z_j \qquad (7)$$

4) Soma Body:
$$O_n = \frac{1}{1 + e^{-\lambda (V_n - \theta_{soma})}} \qquad (8)$$

where n ∈ [1, 2, ..., N] indexes the membranes and soma bodies, and d_{i,j} denotes the distance parameter of the synapses, which enhances the flexibility of the synaptic connections. Here, each synapse has its own distance parameter, and the flexible switch controls whether it participates in the optimization process, where "True" means that d_{i,j} is considered a parameter that requires optimization. In addition, v_{j,n} represents the connection strength between the j-th dendrite and the n-th membrane, which increases the flexibility of the dendrites. Introducing synaptic and dendritic flexibility allows the proposed DNN to acquire stronger nonlinear capabilities than the conventional DNM when handling complex problems.
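The following NumPy sketch illustrates a multi-output forward pass consistent with the definitions above (per-synapse distance parameters d_{i,j} and dendrite-to-membrane strengths v_{j,n}); the shapes, constants, and the function name dnn_forward are assumptions made for illustration, not the authors' released code.

```python
import numpy as np

def dnn_forward(x, w, b, d, v, lam=5.0, theta_soma=0.5):
    """Illustrative multi-output DNN forward pass (a sketch under stated assumptions)."""
    Y = 1.0 / (1.0 + np.exp(-d * (w * x[:, None] - b)))   # synaptic layer, (I, J)
    Z = np.prod(Y, axis=0)                                 # dendritic layer, (J,)
    V = Z @ v                                              # membrane layer, (N,)
    return 1.0 / (1.0 + np.exp(-lam * (V - theta_soma)))   # N soma bodies, (N,)

rng = np.random.default_rng(1)
I, J, N = 5, 8, 3
out = dnn_forward(rng.random(I),
                  rng.normal(size=(I, J)),                 # weights w
                  rng.normal(size=(I, J)),                 # biases b
                  np.ones((I, J)),                         # flexible switch off: fixed d
                  rng.normal(size=(J, N)))                 # dendrite-to-membrane strengths v
print(out.shape)                                           # (3,)
```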

B. Dropout Mechanism
In ML, when a model involves too many parameters and too few samples to learn from, the trained model is prone to overfitting [56]. As a common problem in ML, overfitting is likely to result in unusable models. To address this problem, an ensemble of models is generally employed, which means that multiple models are combined for training. However, training and testing multiple models is time consuming. The dropout mechanism effectively mitigates overfitting by reducing the co-adaptation of the feature detection units, thereby improving the neural network's performance [57], [58]. In the hidden layers of other feedforward neural networks, each neuron essentially computes a weighted summation of the neurons in the preceding hidden layer. The standard dropout mechanism randomly discards different neurons in the hidden layer, which is similar to training many different thinned networks. The dropout operation thus enables the neural network to behave like a weighted combination of many different networks. Each of these networks overfits in a different way, and these opposing errors tend to cancel out, reducing the overall overfitting. In addition, the retained weights must be rescaled during training to reduce the impact of neuron dropout on the input of the next hidden layer.
Differing from the overfitting problem in traditional ML techniques, gradient vanishing is an important factor that limits the performance of the proposed DNN. As discussed in Section II-B, the trained synapses can evolve into four synaptic states, in which direct and inverse connections are considered valid, whereas constant 1 and constant 0 connections are considered invalid. Due to the cumulative multiplication in the dendritic layer, even a single synapse with a constant 0 connection on a dendrite can result in the degeneration of the connected dendrite. In addition, the sigmoid function in the synapse limits the output of the synaptic layer to between 0 and 1, which further exacerbates gradient vanishing as the number of features increases. Thus, the proposed DNN includes a dropout mechanism, which is described as follows:
$$\tilde{Y}_{i,j} = \begin{cases} 1, & \text{with probability } p \\ Y_{i,j}, & \text{otherwise} \end{cases} \qquad (9)$$
where p indicates the dropout rate. As shown in Fig. 4, the dropout mechanism is utilized as an adjustable mechanism to train the proposed DNN. Specifically, the gradient vanishing problem can be reduced significantly by ignoring some of the synapses during training (i.e., setting the outputs of some synapses to 1). Due to the cumulative multiplication in dendrites, the dropout mechanism in the proposed DNN does not require the rescaling of weights, and the preserved synapses directly inherit their original values. The dropout mechanism means that synapses do not always participate in the DNN; thus, parameter updates no longer depend on the interaction of synapses with fixed relationships, which prevents some features from only being effective under certain conditions. In addition, temporarily eliminating synapses in the constant 0 connection state makes it possible for dendrites to avoid degeneration, which allows them to contribute more to the DNN. The dropout mechanism also prevents the proposed DNN from relying on particular cue fragments and enables it to extract information from other cues even if a particular cue is lost. As a result, the proposed DNN is forced to learn more robust features.
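A minimal sketch of this set-to-one dropout is given below. Because the dropped synaptic outputs are replaced by the multiplicative identity, the dendritic product simply ignores them and no rescaling of the surviving synapses is needed; the mask handling and the dropout rate used here are illustrative.

```python
import numpy as np

def dendritic_dropout(Y, p, rng, training=True):
    """Replace each synaptic output by 1 with probability p during training."""
    if not training or p == 0.0:
        return Y
    mask = rng.random(Y.shape) < p          # True -> drop this synapse for this pass
    return np.where(mask, 1.0, Y)           # 1 is the identity of the dendritic product

rng = np.random.default_rng(2)
Y = rng.random((4, 6))                                        # (I, J) synaptic outputs
Z = np.prod(dendritic_dropout(Y, p=0.05, rng=rng), axis=0)    # dendritic outputs
print(Z)
```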

C. Learning Algorithm
In this study, the proposed DNN is employed as a multiclass classifier; thus, the cross-entropy function is adopted as the loss function, which can be formulated as follows:
$$E = -\sum_{n=1}^{N} T_n \ln O_n \qquad (10)$$
where T_n and O_n denote the target and actual outputs of the n-th soma body, respectively. According to the error, the synaptic and dendritic parameters can be optimized according to the Widrow-Hoff delta learning rule, which is expressed as follows:
$$\Delta \xi = -\eta \frac{\partial E}{\partial \xi} \qquad (11)$$
where η indicates the learning rate parameter and ξ stands for a trainable parameter (e.g., w_{i,j}, b_{i,j}, d_{i,j}, or v_{j,n}). Following the chain rule in calculus [59], the partial derivatives in the above equation can be decomposed layer by layer, for example,
$$\frac{\partial E}{\partial w_{i,j}} = \sum_{n=1}^{N} \frac{\partial E}{\partial O_n}\,\frac{\partial O_n}{\partial V_n}\,\frac{\partial V_n}{\partial Z_j}\,\frac{\partial Z_j}{\partial Y_{i,j}}\,\frac{\partial Y_{i,j}}{\partial w_{i,j}}. \qquad (12)$$
In addition, as an enhanced variant of adaptive moment estimation (i.e., the Adam optimizer) [60], the AdamW optimizer is implemented to reduce intrinsic fluctuations in the learning process and accelerate the convergence speed [61], which can be described as follows:
$$\xi(t+1) = \xi(t) - \eta\left(\frac{\hat{m}_{\xi}(t)}{\sqrt{\hat{v}_{\xi}(t)} + \epsilon} + \lambda\,\xi(t)\right) \qquad (13)$$
where ξ is the parameter to be optimized, ε is a small constant that prevents division by zero, and ∇F and λ denote the corresponding partial derivative and the weight decay, respectively. The gradient update formulas of the AdamW optimizer, which depend on the momentum m_ξ and velocity v_ξ, are expressed as follows:
$$m_{\xi}(t) = \beta_1 m_{\xi}(t-1) + (1-\beta_1)\,\nabla F(t), \qquad v_{\xi}(t) = \beta_2 v_{\xi}(t-1) + (1-\beta_2)\,\nabla F(t)^2 \qquad (14)$$
$$\hat{m}_{\xi}(t) = \frac{m_{\xi}(t)}{1-\beta_1^{t}}, \qquad \hat{v}_{\xi}(t) = \frac{v_{\xi}(t)}{1-\beta_2^{t}} \qquad (15)$$
where β_1 and β_2 are two constants that are generally set to 0.9 and 0.999, respectively. Note that the initial values of m_ξ(0) and v_ξ(0) are set to 0.
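For readers who want to reproduce the optimization setup, the sketch below wires a cross-entropy loss to PyTorch's AdamW optimizer with the β_1 and β_2 values stated above; the stand-in model, layer sizes, learning rate, and weight-decay value are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(13, 32), nn.Sigmoid(), nn.Linear(32, 3))  # stand-in model
criterion = nn.CrossEntropyLoss()                        # cross-entropy over N outputs
optimizer = torch.optim.AdamW(model.parameters(),
                              lr=1e-3,                   # learning rate eta (assumed value)
                              betas=(0.9, 0.999),        # beta_1, beta_2 as in the text
                              weight_decay=1e-2)         # decoupled weight decay lambda

X = torch.randn(64, 13)                                  # illustrative mini-batch
t = torch.randint(0, 3, (64,))
for epoch in range(5):                                   # the paper trains for 1000 epochs
    optimizer.zero_grad()
    loss = criterion(model(X), t)
    loss.backward()                                       # gradients via the chain rule
    optimizer.step()                                      # AdamW moment/velocity update
```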

A. Experimental Setup
To evaluate the performance of the proposed DNN comprehensively, 10 multiclass classification and 2 high-dimensional binary classification datasets from the UCI Machine Learning Repository (https://archive.ics.uci.edu/) were used in our experiments, including the Balance Scale (Balance), Cleveland Heart Disease (CHD), Contraceptive Method Choice (CMC), Dry Bean (DB), Internet Firewall (Firewall), Hepatitis C Virus (HCV), Iris, Seed, Thyroid Disease (Thyroid), and Wine datasets [62]. The details of these datasets are given in Table I. In these experiments, 70% of the samples from each dataset were used for training, and the remaining samples were used to test the performance of the compared models.
Note that there are four adjustable user-predefined parameters in the proposed DNN, i.e., the number of dendrites, dropout rate, flexible switch, and batch size. Thus, sixteen DNNs with different parameter settings were employed for parameter sensitivity analysis, and EDNMs with the One-vs-Rest scheme were adopted to measure the improvement realized by the proposed DNN. In addition, the proposed DNN was compared to 10 existing ML methods to demonstrate its effectiveness. Here, each experiment was run 20 times independently, and the results are presented as the mean and standard deviation to emphasize the confidence of the results.
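To make the evaluation protocol concrete, the sketch below repeats a 70/30 split over 20 random seeds and reports the mean and standard deviation of test accuracy; the Iris loader and the stand-in MLP classifier are illustrative choices, not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier   # stand-in classifier for illustration

X, y = load_iris(return_X_y=True)
scores = []
for seed in range(20):                              # 20 independent runs
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=seed)
    clf = MLPClassifier(max_iter=1000, random_state=seed).fit(X_tr, y_tr)
    scores.append(clf.score(X_te, y_te))
print(np.mean(scores), np.std(scores))              # mean and standard deviation
```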

B. Performance Metrics
Four common classification metrics were used to measure the performance of the classifiers: accuracy, precision, F1 score, and Cohen's kappa (κ). The corresponding metrics were calculated for each label and then weighted and summed according to the proportion of that label's sample size in the total sample size. The per-label metrics are calculated as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Here, TP and TN represent true positives and true negatives, and FP and FN denote false positives and false negatives, respectively.
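As an illustration of how these support-weighted metrics can be computed, the sketch below uses scikit-learn's standard metric functions; the label vectors are fabricated placeholders, not results from the paper.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score, precision_score

y_true = [0, 1, 2, 2, 1, 0, 2]                      # placeholder test labels
y_pred = [0, 1, 2, 1, 1, 0, 2]                      # placeholder predictions

acc   = accuracy_score(y_true, y_pred)
prec  = precision_score(y_true, y_pred, average="weighted")  # weighted by class support
f1    = f1_score(y_true, y_pred, average="weighted")
kappa = cohen_kappa_score(y_true, y_pred)
print(acc, prec, f1, kappa)
```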
In addition, nonparametric statistical tests were conducted to verify the differences between the methods. Here, the Friedman test was applied to compare and rank DNNs with different parameters, where the significance level α was set to 0.05 [63]. In addition, the Bonferroni-Dunn procedure was employed as a post hoc test to characterize the statistical results, which can compensate for the lack of control of family-wise error rates for unadjusted p values [64]. The Wilcoxon signed-rank test was utilized to determine whether significant differences could be observed between the proposed DNN and the compared ML methods, where the significance level α was also set to 0.05 [63].
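The omnibus and pairwise tests mentioned above are available in SciPy, as sketched below; the score matrix is fabricated for illustration, and the Bonferroni-Dunn post hoc correction is omitted because SciPy does not provide it directly.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Rows: datasets, columns: methods (made-up accuracies for illustration)
scores = np.array([[0.91, 0.89, 0.86],
                   [0.84, 0.83, 0.80],
                   [0.97, 0.95, 0.96],
                   [0.78, 0.77, 0.74],
                   [0.88, 0.86, 0.85]])

stat, p = friedmanchisquare(*scores.T)              # omnibus ranking test, alpha = 0.05
print(p)
stat, p = wilcoxon(scores[:, 0], scores[:, 1])      # pairwise comparison of two methods
print(p)
```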

C. Comparison of DNNs
Based on previous studies, the number of dendrites in the DNM is typically set close to the number of input features (I) [65], and the number of dendrites in EDNMs is I × N. To measure the performance of DNNs systematically, the number of dendrites in DNNs was set to increase from I to I × N. Considering that the DNNs in our experiments are not deep DNNs with a large number of dendrites, the dropout rate was set to two levels, i.e., 0 and 0.05. In addition, the flexible switch also has two states, i.e., true and false. In these experiments, the batch size was adaptive: the parameters were modified 200 times in each iteration, and the batch size was set to 1 when the number of samples in the dataset was less than 200. The number of epochs was set to 1000 for all methods. Additional details about the parameter settings are given in Table II.
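A small helper expressing one reading of this adaptive batch-size rule is sketched below: aim for roughly 200 parameter updates per iteration and fall back to a batch size of 1 for small datasets. The function name and the integer division are assumptions made for illustration.

```python
def adaptive_batch_size(n_train, updates_per_iteration=200):
    # Roughly 200 updates per iteration; batch size 1 if fewer than 200 samples
    return 1 if n_train < updates_per_iteration else n_train // updates_per_iteration

print(adaptive_batch_size(150))    # 1
print(adaptive_batch_size(9600))   # 48
```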
The accuracy, precision, F1 score, and κ results of the compared DNNs with different parameter settings obtained on the 10 datasets are shown in Tables SI and SII, where the best results for each metric are shown in bold. As can be seen, the DNNs achieved competitive results in most cases, and DNNs with more dendrites obtained better performance. The increased flexibility generally improved the performance of the DNNs, allowing them to outperform the original DNNs on most datasets. The performance of the DNNs incorporating only the dropout mechanism was enhanced with an increasing number of dendrites, which is in line with expectations. The DNNs with both high flexibility and the dropout mechanism have a more complex neural network structure; thus, their performance was slightly unsatisfactory with fewer dendrites. Theoretically, the performance of DNNs can be improved further by adjusting the parameter settings, e.g., increasing the number of dendrites and epochs to realize sufficient learning. From these results, we conclude that the proposed DNN can solve multiclass classification tasks. In addition, introducing flexibility effectively improves the nonlinear capability of the proposed DNN, thereby enabling it to solve complex tasks efficiently.
In addition, the statistical results of the DNNs in terms of each performance metric are summarized in Table III. As can be seen, the FDNN-4 model obtained the best performance in terms of accuracy and precision, and the FDNN-2 model achieved the best results in terms of the F1 score and κ metrics. For the accuracy and precision metrics, the FDNN-4 model was significantly superior to the DDNN-1, DDNN-2, and DFDNN-1 models, and the FDNN-2 model outperformed the DDNN-1, DDNN-2, and DFDNN-1 models significantly in terms of F1 score and κ. There was no significant difference between the DNNs with other parameter settings and those with the best performance. The relatively optimal parameter settings for each dataset are listed in Table IV. The dropout mechanism can lead to the DNN underfitting in some simple tasks; however, it can provide a solution to prevent overfitting when solving more complex tasks. Theoretically, the underfitting problem can be solved by setting the number of epochs appropriately, while increasing the number of dendrites and setting suitable dropout parameters can prevent overfitting. Determining appropriate parameter settings for different tasks can improve the performance of the proposed DNN effectively. In summary, the competitive results obtained on most datasets suggest that the DNNs can solve numerous multiclass classification tasks without relying on hyperparameter settings, and fine hyperparameter design provides the potential to further improve the performance of the proposed DNN.

D. Comparison of ML Methods
To further validate the classification performance of the proposed DNN, it was compared experimentally with the EDNMs and 10 ML methods, including ensemble decision trees (EDTs) using the One-vs-Rest scheme, ensemble epsilon support vector machines (EeSVMs) and ensemble nu support vector machines (EnSVMs) with different kernel functions using the One-vs-Rest scheme, i.e., EeSVMs-L (with a linear kernel), EeSVMs-P (with a polynomial kernel), EeSVMs-R (with a radial basis function kernel), EeSVMs-S (with a sigmoid kernel), EnSVMs-L (with a linear kernel), EnSVMs-P (with a polynomial kernel), EnSVMs-R (with a radial basis function kernel), and EnSVMs-S (with a sigmoid kernel), and the multilayer perceptron (MLP). The hyperparameter settings of all methods are listed in Table V.
Tables SIII and SIV compare the accuracy, precision, F1 score, and κ results of the DNN, EDNMs, and ML methods obtained on the 10 experimental datasets. As can be seen, the proposed DNN achieved the best results in nearly all performance metrics on all datasets, with the exception of the accuracy metric on the CHD dataset and all metrics on the Firewall dataset. On the Firewall dataset, the EDTs achieved the best results for all performance metrics, with the DNN ranking second for all metrics. In addition, the statistical results of the Wilcoxon signed-rank test are shown in Tables SIII and SIV, where "-" indicates that the DNN cannot be compared with itself. As shown, the proposed DNN significantly outperformed the vast majority of the compared ML methods in terms of nearly all metrics on all datasets. Specifically, the proposed DNN significantly outperformed all other ML methods in terms of accuracy, precision, F1 score, and κ on the Balance, CMC, DB, and Seed datasets. On the CHD dataset, the proposed DNN and EeSVMs-S achieved comparable results, and the DNN was significantly better than all other ML methods in terms of the other performance metrics. On the Firewall dataset, the proposed DNN significantly outperformed all other ML methods (except the EDTs) in terms of all performance metrics. In addition, we found that the EDNMs achieved results similar to those of the proposed DNN on the HCV, Iris, Thyroid, and Wine datasets, and EnSVMs-S obtained results that were comparable to those of the proposed DNN on the Wine dataset. No significant difference was observed between the proposed DNN and EnSVMs-L in terms of the precision and κ metrics on the Wine dataset. Thus, we conclude that the proposed DNN has a more appropriate neural network structure that can utilize the dendritic information more effectively than a simple ensemble of DNMs. Introducing flexibility and the dropout mechanism enables the proposed DNN to solve complex tasks more effectively. Compared with many other multiclass classifiers based on ML methods, the excellent multiclass classification capabilities of the proposed DNN demonstrate that it is a promising multiclass classifier.
The statistical results of the Friedman test are summarized in Table VI, and the results show that the proposed DNN ranked first among the compared ML methods. The proposed DNN is significantly superior to the EDTs, EeSVMs-L, EeSVMs-P, EeSVMs-R, EeSVMs-S, EnSVMs-L, EnSVMs-P, EnSVMs-R, EnSVMs-S, and MLP models in terms of the accuracy, precision, F1 score, and κ metrics. Benefiting from having more dendrites, the EDNMs achieved performance comparable to that of the proposed DNN, and no significant difference was observed between the EDNMs and the DNN. The MLP model also obtained competitive performance because it has more hidden layers. Based on these results, we conclude that, compared to existing ML methods, the proposed DNN is an excellent neural network. Ensemble models that combine binary classification models as units significantly limit the performance of numerous ML methods in terms of solving multiclass classification problems. Although EDNMs can provide competitive performance on some datasets, the flexible and concise architectural design of the proposed DNN provides better computational efficiency.

E. Extension
In this section, we investigate the performance of the proposed DNN when applied to binary classification datasets with more features. The Breast dataset contains 569 samples with 30 features, and the Parkinson dataset consists of 195 samples, each of which contains 22 features. A comparison of the accuracy, precision, F1 score, and κ results obtained by the proposed DNN, EDNMs, and nine ML methods on the 2 datasets is shown in Table VII. In this evaluation, the number of output layers was set to two for the MLP. In addition, two individual units were utilized to collaborate in the ensemble models. As shown in Table VII, the proposed DNN obtained better results than the compared ML methods in terms of accuracy, precision, and F1 score.

TABLE VII COMPARISON OF RESULTS ON BREAST AND PARKINSON DATASETS
On the Breast dataset, the EnSVMs-R model achieved the best results, and the proposed DNN was slightly inferior in terms of κ. These statistical results suggest that the proposed DNN significantly outperformed the EDTs, EeSVMs-L, EeSVMs-P, EeSVMs-R, EeSVMs-S, EnSVMs-L, EnSVMs-P, EnSVMs-R, EnSVMs-S, MLP, and EDNMs models on the Breast and Parkinson datasets in terms of accuracy, precision, and F1 score. Thus, we conclude that introducing flexibility and the dropout mechanism enhances the proposed DNN's ability to solve multiclass classification problems and also improves its performance on binary classification datasets with more features.

V. CONCLUSION
In this paper, we have proposed the DNN architecture, which extends the single-neuron model of the conventional DNM to a feedforward neural network structure that can process multiple inputs and produce multiple outputs. The added flexibility enhances the adaptability of the model to different loss functions and enables the construction of deep neural networks. The proposed DNN architecture represents a more appropriate neural network structure than a simple ensemble of DNMs because it enables more effective utilization of dendritic information. In addition, the synaptic flexibility enhances its nonlinear capability, thereby making it more efficient when solving complex tasks. Although introducing the dropout mechanism may result in underfitting on some simpler tasks, it allows the proposed DNN to prevent overfitting in more complex tasks. To address underfitting, it is theoretically possible to increase the amount of training, e.g., the number of epochs. In contrast, overfitting can be prevented by increasing the number of dendrites and setting appropriate dropout parameters. Thus, designing appropriate parameter settings for different tasks is crucial in terms of improving the performance of the proposed DNN. To evaluate the effectiveness of the proposed DNN architecture, we applied it to 10 multiclass classification and 2 high-dimensional binary classification problems. Compared with representative multiclass classification methods, the proposed DNN exhibited competitive classification performance and satisfactory computational efficiency. These results suggest that the proposed DNN is a promising ANN with practical application potential in various classification tasks. In future research, we plan to investigate the application of the proposed DNN to other more complex problems and to design strategies to further optimize its performance. In addition, the effects of various neural network parameter settings on the performance of the proposed DNN in various tasks will be verified to analyze the architectural differences between it and other neural networks.

Fig. 1. Structure of the dendritic neuron model and the four synaptic states. The trained synapse can evolve into the four states shown in the right half of the figure.

Fig. 2. EDNMs with the One-vs-Rest decomposition scheme. The dataset is split into binary subsets multiple times based on the classes to train the individual DNMs.

Fig. 3. Structure of the proposed DNN for multiclass classification. The dataset is fed directly into the DNN, which optimizes the model structure and conserves computational resources.

Fig. 4. Dropout mechanism in the proposed DNN. The introduction of the dropout mechanism enables more dendrites to escape degradation.

TABLE VI STATISTICAL RESULTS OF THE COMPARED ML METHODS