Energy-Efficient Distributed Spiking Neural Network for Wireless Edge Intelligence

The spiking neural network (SNN) is distinguished by its ultra-low power consumption, making it attractive for resource-limited edge intelligence. This paper investigates an energy-efficient (EE) distributed SNN, where multiple edge nodes, each containing a subset of spiking neurons, collaborate to gather and process information through wireless channels. To leverage the benefits of the joint design of neuromorphic computing and wireless communications, we develop quantitative system models and formulate the problem of minimizing the energy consumption of edge devices under constraints of limited bandwidth and spike loss probability. Particularly, a simplified homogeneous SNN is first explored, where the system is proved to have stationary states with a constant firing rate and an alternating optimization based algorithm is proposed for jointly allocating the computation and communication resources. The algorithms are further extended to heterogeneous SNNs by exploiting the statistics of spikes. Extensive simulation results on neuromorphic datasets demonstrate that the developed algorithms can significantly reduce the power consumption of edge systems while ensuring inference accuracy. Moreover, SNNs achieve comparable performance with state-of-the-art recurrent neural networks (RNNs) but are much more bandwidth-efficient and energy-saving.


Yanzhen Liu, Zhijin Qin, Senior Member, IEEE, and Geoffrey Ye Li, Fellow, IEEE
Index Terms-Spiking neural network, energy-efficient, distributed computing, resource allocation, edge intelligence.

I. INTRODUCTION
The convergence of artificial intelligence (AI) and edge computing, referred to as edge intelligence, has gained considerable research attention [1]. The edge intelligence paradigm endeavours to shift the computation and communication load of AI algorithms from centralized servers to the network edges, thereby substantially reducing processing delays and required bandwidth. This approach would benefit a multitude of applications ranging from autonomous driving to video surveillance. However, the implementation of edge intelligence poses non-trivial challenges, primarily due to the limited computational capabilities and constrained battery life of edge devices. Additionally, as prevailing network architectures continue to scale, the deployment of large models onto edge devices becomes impractical. Hence, it is necessary to explore an energy-efficient neural network (NN) model to facilitate the usage of edge intelligence.
Spiking neural networks (SNNs) emerge as a promising solution for achieving low-power edge intelligence [2]. SNNs imitate the dynamics of biological neurons, processing information through binary spike trains. These spike trains consist of sequences of action potentials, or "spikes," generated by neurons, in contrast to the real numbers used in conventional NNs. SNNs feature high energy efficiency, spike-based computing, and always-on operation [3] and are particularly suitable for resource-constrained edge devices. Due to these appealing properties, SNNs have attracted extensive attention in both academia and industry [4]. Numerous specialized hardware platforms have been designed to emulate SNNs, such as IBM's TrueNorth [5], Intel's Loihi [6], Tianjic [7], and Neurogrid [8]. Moreover, SNNs naturally integrate with neuromorphic sensors that directly generate spike-type data [9], [10], [11], enabling SNNs to learn and infer in an end-to-end bio-plausible manner with ultra-low power consumption.

Recently, the distributed SNN has been introduced [12] for the following reasons: 1) Distributed SNNs could accommodate a broad range of applications because numerous scenarios require the integration of information collected from distributed sensors, with sensors and processors located in different places; 2) The distributed computing paradigm could fully utilize idle computation resources and enhance communication efficiency through collaboration [13], potentially enabling the development of large-scale NNs [12]; 3) SNNs are suitable for distributed deployment because spiking neurons are event-driven and generate minimal amounts of data, which could significantly reduce communication costs, especially in wireless environments. Potential applications of distributed SNNs may involve security alarms, environmental monitoring, intelligent robots [14], and healthcare [15]. It is believed that such a distributed neuromorphic computing paradigm will contribute to realizing the full potential of edge AI systems.

A. Related Works
Recently there has been growing interest in applying SNNs to edge applications [2], [12], [16], [17], [18], [19], [20], [21], [22], [24], [25], [26]. In [16] and [17], a distributed wireless SNN has been implemented on a field programmable gate array (FPGA) for an exclusive-or (XOR) computation task, where carrier sense multiple access/collision detection (CSMA/CD) and time division multiple access (TDMA) have been used for spike transmission, respectively. By analysing the performance of distributed SNNs in terms of inference accuracy and neural activity under spike losses [12], the resilience of SNNs in wireless environments has been demonstrated. The novel neuromorphic semantic communication paradigm in [18] combines event-driven sensing, spike-based computation, and impulse radio (IR) for remote inference. A digital semantic communication system based on SNNs has been further developed in [19]. To adapt to the available spectrum, the hybrid automatic repeat request (HARQ) mechanism is combined with SNN-based semantic communications in [20]. In [21], a federated learning based SNN has been developed for cooperative training. In [2], a leader selection mechanism has been incorporated into the federated training of SNNs to accelerate convergence and defend against model attacks. SNNs have also been applied to more specific applications. The notable work in [22] has utilized an SNN for signal detection in a low earth orbit (LEO) satellite network, where a novel hybrid network [23] combines the merits of both deep learning and the conventional matched filter. Moreover, SNNs have been employed for low-power radio frequency fingerprint identification in very-high-frequency data exchange systems in [24]. Other SNN-based applications include joint source-channel coding [25] and integrated sensing and communication [26].
Nevertheless, the area of distributed SNNs is still under investigation, especially in terms of how to efficiently deploy distributed SNNs in resource-constrained edge scenarios. More precisely, the systems in [16] and [17] only consider a simple XOR computation task with two input neurons, which may not be generalizable to real-world scenarios. To the best of the authors' knowledge, several issues remain to be considered, such as practical communication protocols for the collaboration of multiple edge devices, quantitative analysis of the spike capacity and energy consumption, and efficient algorithms for allocating the limited bandwidth resources and minimizing the system power consumption. In particular, if edge devices are connected wirelessly, there exists a fundamental trade-off between the transmit power and the spike capacity due to the limited spectrum.

B. Main Contributions
In this paper, we develop an EE distributed SNN (EE-DSNN) through the joint design of neuromorphic computing and communication systems. As in Fig. 1, the system consists of several distributed edge nodes and an access point (AP), with each node containing a subset of spiking neurons. The edge nodes are connected wirelessly to provide more flexibility and scalability [27]. They collaborate to gather and process information in the form of spike trains, and the AP reads out the computational result upon receiving these spike trains. To maximize the synergy between SNNs and communication systems, we develop novel algorithms for jointly optimizing bandwidth allocation, transmit power, and neuron assignment. The goal is to minimize system energy consumption while ensuring robust spike transmission to uphold SNN performance. The developed algorithms leverage the intrinsic sparse nature of SNNs and can effectively manage the limited bandwidth resources. Simulations on neuromorphic datasets demonstrate that the proposed methods significantly reduce system power consumption while ensuring SNN performance. Moreover, SNNs exhibit only a slight performance gap with the state-of-the-art recurrent NN (RNN) but are much more bandwidth-efficient and energy-saving. The major contributions of this paper are outlined as follows:
• We develop a new EE-DSNN for edge intelligence to highlight the advantages of the joint design of neuromorphic computing and wireless communication through quantitative models.
• We analyse a simplified homogeneous SNN. We first show that the system has absorbing states with a constant percentage of firing neurons and then propose an alternating optimization algorithm to jointly allocate bandwidth, optimize transmit power, and assign spiking neurons.
• For more general heterogeneous SNNs, we simplify the problem to a form similar to that of homogeneous SNNs by exploiting the spike statistics and reasonable approximations. Additionally, we introduce a novel neuron allocation algorithm inspired by power laws.

C. Organizations
The remaining sections are organized as follows. Section II introduces the system model and formulates the EE resource allocation problem for the investigated EE-DSNN. The homogeneous case is introduced in Section III and an alternating optimization based resource allocation algorithm is developed. Section IV presents our solution for the general heterogeneous case, Section V provides simulation results, and Section VI concludes the paper. The major notations are summarized in Table I.

II. PROBLEM FORMULATION
This section introduces the system model of EE-DSNN and formulates the EE resource allocation problem.

A. System Overview
We consider a distributed SNN, illustrated in Fig. 1. The system consists of multiple edge nodes and an AP. The edge nodes are further divided into input nodes and hidden nodes. Each input node contains a subset of input neurons responsible for sensing the physical world, while each hidden node accommodates a portion of hidden neurons for processing information. All neurons generate spike-type data. The AP collects the spikes transmitted from hidden nodes and reads out the computational result. Fig. 2 provides a more detailed computing timeline of the distributed SNN during an inference stage. The time domain is divided into T time slots with equal durations of ∆T. In time slot t, each input node broadcasts the spikes generated by its contained input neurons. Simultaneously, each hidden node decodes the received spikes and updates the dynamics of its hidden neurons, the details of which are introduced in the next subsection. The generated hidden spikes are broadcast through wireless channels in the next time slot. The above process is repeated over the T time slots. Finally, the AP obtains the computation result based on the spikes transmitted by the hidden nodes in the T time slots. We define the notations for nodes and neurons as follows.
The set of all nodes, K ≜ {1, 2, 3, . . . , K}, is partitioned into the input node set I and the hidden node set J. Similarly, the set of all neurons, L ≜ {1, 2, 3, . . . , L}, is divided into the input neuron set M and the hidden neuron set N. Additionally, Π_k represents the set of neurons contained in node k.

B. Neuron Model
There have been several models for describing the dynamics of biological neurons. In this work, we adopt the widely used leaky integrate-and-fire (LIF) model [28], defined as

$U_n[t] = \tau \left( U_n[t-1] - U_{\mathrm{rest}} \right) + U_{\mathrm{rest}} + R I_n[t],$

where $U_n[t]$ denotes the membrane potential of neuron n at time slot t, τ denotes the leaky factor of the membrane potential, U_rest denotes the resting potential, R denotes the resistance, and $I_n[t]$ denotes the input synaptic current of neuron n at time slot t, given by

$I_n[t] = \sum_{l \in \mathcal{M}} W_{n,l} S_l[t] + \sum_{l \in \mathcal{N}} V_{n,l} S_l[t-1],$

where $S_l[t] \in \{0, 1\}$ denotes the spike generated by neuron l at time slot t, $W_{n,l}$ denotes the (n, l)th element of the feed-forward weight matrix W, and $V_{n,l}$ denotes the (n, l)th element of the recurrent weight matrix V.
When the membrane potential $U_n[t]$ reaches a threshold U_th, the neuron generates a spike, i.e.,

$S_n[t] = \Theta\left( U_n[t] - U_{\mathrm{th}} \right),$

where Θ is the unit step function. Afterwards, the membrane potential is reset to U_rest. Without loss of generality, we let R = 1, U_rest = 0, and U_th = 1. The above neural dynamics can then be compactly written as

$U_n[t] = \tau U_n[t-1] \left( 1 - S_n[t-1] \right) + I_n[t], \quad S_n[t] = \Theta\left( U_n[t] - 1 \right).$
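To make the discrete-time dynamics above concrete, the following is a minimal sketch of one LIF update step, assuming R = 1, U_rest = 0, and U_th = 1 as in the text; the function name and array shapes are illustrative, not from the paper.

```python
import numpy as np

def lif_step(U, W, V, S_in, S_hid, tau=0.5, U_th=1.0):
    """One discrete-time update of the LIF dynamics described above.

    U     : (N,) membrane potentials of the hidden neurons
    W, V  : feed-forward (N x M) and recurrent (N x N) weight matrices
    S_in  : (M,) binary input spikes at this slot
    S_hid : (N,) binary hidden spikes from the previous slot
    """
    I = W @ S_in + V @ S_hid           # synaptic current (R = 1)
    U = tau * U + I                    # leaky integration (U_rest = 0)
    S = (U >= U_th).astype(float)      # fire when the threshold is reached
    U = np.where(S > 0, 0.0, U)        # reset fired neurons to U_rest = 0
    return U, S
```

Applied for T slots per inference, this is exactly the per-slot update each hidden node performs on its subset of neurons.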

C. Delay Model
The delays for input nodes consist of sensing and communication. The sensing delay is not our focus, and we concentrate only on the delay related to transmitting spikes. The delays for hidden nodes include computation and communication. As the update of neural dynamics is extremely efficient, taking only several nanoseconds [6], we can overlook the computation delays. Since both input nodes and hidden nodes are primarily influenced by communication delays, we represent them succinctly as

$d_k = \frac{L_k D_k}{r_k},$

where $r_k$ is the transmission rate of node k, $D_k \triangleq \sum_{l \in \Pi_k} S_l$ denotes the instantaneous number of spikes generated by node k,³ and $L_k$ is the number of bits for denoting a spike. As in [12], we transmit the indexes of neurons for the sake of efficiency. Therefore, $L_k = \lceil \log_2 |\Pi_k| \rceil$.

D. Communication Model
We adopt narrow-band wireless communications for transmitting spike signals [29]. Moreover, frequency division multiple access is assumed since it is easy to implement. We also assume that the wireless channel experiences flat fading and that the channel information is known. We further assume that each node k is allocated bandwidth $w_k$ and equipped with a single antenna. Then, the maximum achievable rate between input node i and hidden node j can be denoted as

$r_{i,j} = w_i \log_2 \left( 1 + \frac{p_i |h_{i,j}|^2}{w_i N_0} \right),$

where $h_{i,j}$ is the channel gain between input node i and hidden node j, $p_i$ is the transmit power of input node i, and $N_0$ is the power spectral density of the additive white Gaussian noise (AWGN). Since input node i needs to broadcast its spike signals to all nodes j ∈ J, its broadcast rate $r_i$ is limited by the lowest transmission rate in $\{r_{i,j}, j \in \mathcal{J}\}$, that is,

$r_i = \min_{j \in \mathcal{J}} r_{i,j}.$

Similarly, the broadcast rate of hidden node j is given by

$r_j = \min \left\{ \min_{j' \in \mathcal{J} \setminus \{j\}} r_{j,j'},\; r_{j,A} \right\},$

where $p_j$ is the transmit power of hidden node j. Here, $h_{jj'}$ denotes the channel gain between hidden nodes j and j′, and $h_{jA}$ denotes the channel gain between hidden node j and the AP.
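The broadcast-rate model above (a Shannon rate per link, limited by the weakest intended receiver) can be sketched as follows; the function name `broadcast_rate` and the default noise density are illustrative assumptions.

```python
import numpy as np

def broadcast_rate(w, p, gains, N0=1e-9):
    """Broadcast rate of a node allocated bandwidth w (Hz) and power p (W).

    gains: channel power gains |h|^2 towards every intended receiver;
    the broadcast rate is limited by the weakest link, as in the text.
    """
    rates = w * np.log2(1.0 + p * np.asarray(gains) / (w * N0))
    return rates.min()
```

The per-slot transmission delay of a node then follows the delay model as d = L_k * D_k / broadcast_rate(w, p, gains).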

E. Energy Consumption Model
The energy consumed by input node i, denoted as $E_i$, can be expressed as

$E_i = E_i^S + E_i^T,$

where $E_i^S$ and $E_i^T$ denote the energies consumed on sensing and transmitting spikes, respectively. In practice, $E_i^S$ is determined by the sensor and can be viewed as a constant in our formulation. The communication energy can be written as

$E_i^T = T p_i \frac{L_i \sum_{l \in \Pi_i} q_l}{r_i},$

where $q_l \triangleq \mathbb{E}\{S_l\}$ denotes the average firing rate of neuron l.

³For notational convenience, we drop the time index t. Note that $S_l$ is a random variable here.
For hidden node j, the consumed energy $E_j$ can be expressed as

$E_j = E_j^C + E_j^T,$

where $E_j^C$ is the energy for updating the neurons and $E_j^T$ is the energy for transmitting spikes. $E_j^T$ takes the same form as the communication energy of input nodes, and $E_j^C$ can be written as [6]

$E_j^C = T \left( C_j^U |\Pi_j| + C_j^F \sum_{l \in \Pi_j} q_l \right),$

where $C_j^U$ denotes the energy for updating the dynamics of a single spiking neuron, | · | indicates the number of elements in a set, and $C_j^F$ denotes the energy for generating a spike.
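As a hedged illustration of the energy bookkeeping above, the sketch below sums the expected per-slot transmission time (L bits per spike at rate r) over T slots, plus the per-neuron update and per-spike generation costs for hidden nodes; the exact closed forms lost in extraction may differ, and the function signature is an assumption.

```python
def node_energy(p, r, L_bits, q_rates, C_U=0.0, C_F=0.0, T=15):
    """Per-inference energy of a node under the model above (a sketch).

    p, r    : transmit power (W) and broadcast rate (bit/s)
    L_bits  : bits needed to denote one spike (neuron index)
    q_rates : average firing rates q_l of the neurons held by this node
    C_U, C_F: per-neuron update energy and per-spike generation energy
              (zero for input nodes, which only sense and transmit)
    """
    E_T = T * p * L_bits * sum(q_rates) / r              # communication
    E_C = T * (C_U * len(q_rates) + C_F * sum(q_rates))  # computation
    return E_T + E_C
```

For an input node, the constant sensing energy E_i^S would simply be added on top.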

F. Problem Formulation
In practical edge systems, energy consumption is a major concern due to the limited battery life of edge devices. Therefore, we aim to minimize the weighted-sum energy (WSE) of the EE-DSNN system. The problem is formulated as

$\min_{\Omega_0} \sum_{k \in \mathcal{K}} \alpha_k E_k$ (16a)
$\text{s.t.} \sum_{k \in \mathcal{K}} w_k \le W,$ (16b)
$0 \le p_k \le p_k^{\max}, \forall k \in \mathcal{K},$ (16c)
$\Pr\{L_k D_k / r_k > \Delta T\} \le \gamma_k, \forall k \in \mathcal{K},$ (16d)
$\bigcup_{j \in \mathcal{J}} \Pi_j = \mathcal{N},$ (16e)
$\Pi_j \cap \Pi_{j'} = \emptyset, \forall j \ne j',$ (16f)

where $\Omega_0 \triangleq \{p_k, w_k, \forall k \in \mathcal{K}\} \cup \{\Pi_j, \forall j \in \mathcal{J}\}$ is the set of optimization variables. Note that $\{\Pi_i, \forall i \in \mathcal{I}\}$ is not optimized here because the input neurons are determined by the sensors, which are fixed in practice. The objective function (16a) denotes the WSE of the system, with $\alpha_k$ being a weight that represents the priority of node k. The constraint (16d) ensures the reliability of spike transmission, where $\gamma_k$ denotes the outage probability. The constraint (16e) guarantees that all hidden neurons have been allocated to hidden nodes and the constraint (16f) ensures that the hidden neurons are not repeatedly assigned. Problem (16) is very challenging because of the combinatorial variable $\Pi_j$, the non-convex fractional objective function, and the coupled spike outage constraints. To address the problem, we first investigate homogeneous SNNs, where the SNN is proved to have stationary states with a constant firing rate. Consequently, the problem is significantly simplified because the neurons are not coupled, and an alternating optimization based algorithm is developed. Then, we extend the algorithm to more general heterogeneous SNNs by resorting to the statistics of neurons.

III. HOMOGENEOUS SNNS
In this section, we investigate simplified homogeneous SNNs. We show that the system admits stationary states where the percentage of firing neurons is constant. An alternating optimization based algorithm is then developed for problem (16).

A. Homogeneous Model
We consider a homogeneous LIF model [30], [31], [32]. Specifically, we assume that the firing of a spiking neuron is a random event related to $U_n[t]$, with the probability given by $\Phi(U_n[t])$, where Φ is a non-decreasing function with Φ(x) = 0 for x ≤ 0 and Φ(x) = 1 for x ≥ 1. Note that the deterministic firing rule above is a special case of this probabilistic model when Φ degenerates to the unit step function $\Theta(U - U_{\mathrm{th}})$. We formally write this model as

$\Pr\{S_n[t] = 1\} = \Phi(U_n[t]).$

Moreover, we assume that the weights of W and V are given by $H_f / M$ and $H_r / N$, respectively, as in [30], [31], and [32]. We also assume the input spikes follow a Bernoulli distribution, given by $S_l \sim B(1, q_l), \forall l \in \mathcal{M}$. Defining $\rho[t] \triangleq \int \Phi(U) p(U)[t] \, dU$ as the percentage of firing neurons at time slot t, with $p(U)[t]$ denoting the fraction of neurons whose membrane potential is U, the following theorem, proved in Appendix A, characterizes the behaviour of the analysed homogeneous system.

Theorem 1: If $S_l \sim B(1, q_l), \forall l \in \mathcal{M}$, and $q_l$ follows a limit distribution q(x), there exist absorbing states in the system where the membrane potentials of the neurons only take discrete values $\Lambda_k$. Denoting the fraction of neurons whose membrane potential is $\Lambda_k$ as $\eta_k$, the percentage of firing neurons is $\rho = \sum_k \eta_k \Phi(\Lambda_k)$.

From the above theorem, homogeneous SNNs will eventually evolve to a stationary state, where spiking neurons transit among discrete phases with distinct membrane potentials $\Lambda_k$ and the percentage of firing neurons is constant. To provide a clearer understanding of these stationary states, Fig. 3 illustrates an example with parameters $H_f = 0$, $H_r = 14/9$, $\tau = 0.5$ [30] and a firing probability function $\Phi(U) = U$. Fig. 3(a) depicts the evolution process of the system, where ρ[t] quickly stabilizes to a constant. Fig. 3(b) shows the stationary distribution of p(U): the spiking neurons only transit among three membrane potential values. The impact of radio loss on homogeneous SNNs is also studied in Fig. 3(c)-(f). As the spike loss grows larger, the percentage of firing neurons ρ decreases and the number of stationary membrane potentials $\Lambda_k$ increases. It is seen that the communication capacity of spikes has a significant impact on homogeneous SNNs, which emphasizes the need to jointly design the SNN and the communication system.
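The stationary behaviour described above can be illustrated with a toy mean-field Monte Carlo simulation: with homogeneous weights $H_f/M$ and $H_r/N$, the summed input of every neuron is approximately $H_f \bar{q} + H_r \rho[t]$, and a spike-loss probability `eps` thins the arriving recurrent spikes. The update order and loss model here are simplifying assumptions, so the sketch reproduces only the qualitative behaviour (ρ[t] settling to a constant), not the exact values in Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 10_000, 200
Hf, Hr, tau, eps = 0.0, 14 / 9, 0.5, 0.5    # eps: spike loss probability

def Phi(U):                                  # probabilistic firing function
    return np.clip(U, 0.0, 1.0)

U = rng.random(N)        # random initial membrane potentials
rho = 0.5                # initial fraction of firing neurons
history = []
for t in range(T):
    # mean-field recurrent drive: with homogeneous weights Hr/N the summed
    # input of each neuron is ~ Hr * rho; radio losses thin arriving spikes
    I = Hf * 0.5 + Hr * (1.0 - eps) * rho
    U = tau * U + I                          # leaky integration
    S = rng.random(N) < Phi(U)               # probabilistic firing
    U[S] = 0.0                               # reset to U_rest = 0
    rho = S.mean()
    history.append(rho)
# history shows rho settling to a roughly constant value, as Theorem 1 predicts
```

Varying `eps` mimics the spike-loss study of Fig. 3(c)-(f): larger losses push the stationary ρ down.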
In homogeneous SNNs, the percentage of firing neurons ρ is a constant. Therefore, the number of spikes generated by hidden node j, i.e., $D_j$, can be denoted as $Q_j \triangleq |\Pi_j| \rho$, which is proportional to its number of contained neurons. Moreover, since $D_j$ is asymptotically deterministic, it is not necessary to impose the outage probability constraint (16d) on hidden nodes. Instead, we directly enforce the transmission capacity to be greater than the number of transmitted spikes. Hence, the constraint (16d) can be reformulated as

$r_k \Delta T \ge L_k Q_k, \quad \forall k \in \mathcal{K},$

where for input nodes we also define $Q_i \triangleq \sum_{l \in \Pi_i} q_l, \forall i \in \mathcal{I}$ for notational convenience. The original problem (16) can then be rewritten as

$\min_{\Omega} \sum_{k \in \mathcal{K}} \alpha_k T p_k \frac{L_k Q_k}{r_k} + \sum_{j \in \mathcal{J}} T C_j Q_j \quad \text{s.t. (16b), (16c)}, \; r_k \Delta T \ge L_k Q_k, \; \sum_{j \in \mathcal{J}} Q_j = \rho N,$

where $\Omega \triangleq \{Q_j, \forall j \in \mathcal{J}\} \cup \{p_k, w_k, \forall k \in \mathcal{K}\}$ is the set of optimization variables and $C_j$ is defined as $C_j \triangleq \alpha_j \left( C_j^U / \rho + C_j^F \right)$.

B. Alternating Optimization
Problem (20) is significantly simplified compared with problem (16). However, the optimization variables in Ω are still coupled and the objective function (20a) is non-convex. To the best of the authors' knowledge, current optimization techniques cannot find the optimal solution to problem (20) in polynomial time. Consequently, we develop an alternating optimization based algorithm for addressing problem (20). Specifically, the variables in Ω are divided into two blocks. The first block is $\{Q_j, \forall j \in \mathcal{J}\}$ and the second block is $\{p_k, w_k, \forall k \in \mathcal{K}\}$. The details for updating these two blocks of variables are given as follows.
1) Problem w.r.t. $\{Q_j, \forall j \in \mathcal{J}\}$: The subproblem w.r.t. $Q_j$ is given by

$\min_{\{Q_j\}} \sum_{j \in \mathcal{J}} \left( \alpha_j T p_j \frac{L_j}{r_j} + T C_j \right) Q_j \quad \text{s.t.} \; r_j \Delta T \ge L_j Q_j, \; \sum_{j \in \mathcal{J}} Q_j = \rho N.$

The problem w.r.t. $Q_j$ is a linear programming problem. Hence, its optimal solution can be efficiently found by methods such as the simplex algorithm.
2) Problem w.r.t. $\{p_k, w_k, \forall k \in \mathcal{K}\}$: The subproblem w.r.t. $p_k$ and $w_k$ minimizes the weighted communication energy $\sum_{k \in \mathcal{K}} \alpha_k T p_k L_k Q_k / r_k$ subject to the bandwidth, power, and spike capacity constraints. This is a fractional programming problem with a sum-of-ratios objective function. Based on the theory established in [33], problem (23) can be equivalently converted to a parametrized problem (25) with introduced parameters $\mu_k$ and $\beta_k$. With fixed $\mu_k$ and $\beta_k$, it can be readily seen that problem (25) is a convex optimization problem and its optimal solution can be obtained via the celebrated dual ascent algorithm. Note that, similar to [34], the bandwidth constraint (25c) is not taken into the Lagrangian function for the convenience of deriving the optimal solution to $w_k$. By checking the first-order optimality condition, the optimal $p_k$ is obtained in closed form. Substituting it back into the Lagrangian, the problem w.r.t. $w_k$ becomes a linear programming problem and its optimal solution $w_k^*$ can be easily found via the simplex algorithm. After obtaining the optimal solutions to $p_k$ and $w_k$, the dual variables are updated based on their sub-gradients; more precisely, the ellipsoid algorithm is employed for updating the Lagrange multipliers [35]. Finally, after solving the parametrized problem (25), the introduced parameters $\mu_k$ and $\beta_k$ are updated via a damped Newton method, where the step size $\tau_\zeta$ at iteration ζ can be chosen based on the procedure developed in [33].
The algorithm developed for homogeneous SNNs is summarized in Algorithm 1. It is worth noting that, while finding the global optimal solution to problem (20) is challenging, Algorithm 1 ensures optimality for each subproblem. Consequently, as the number of iterations increases, the value of the objective function monotonically decreases, eventually reaching a solution that is significantly better than the initial point. The complexity of Algorithm 1 is polynomial in the number of nodes K.
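The overall structure of Algorithm 1 is a standard alternating (block) optimization loop; the generic skeleton below, with hypothetical block-update callbacks, shows why the objective decreases monotonically when each subproblem is solved optimally.

```python
def alternating_optimization(f, update_blocks, x0, tol=1e-9, max_iter=200):
    """Generic alternating-optimization loop (a sketch of the outer
    structure of Algorithm 1, not the paper's exact pseudocode).

    update_blocks: callbacks, each solving one subproblem optimally with
    the other block fixed, so f never increases across an iteration.
    """
    x, prev = x0, float("inf")
    for _ in range(max_iter):
        for update in update_blocks:
            x = update(x)          # block update cannot increase f
        val = f(x)
        if prev - val < tol:       # stop once the descent stalls
            break
        prev = val
    return x, val
```

For instance, on the convex quadratic f(x, y) = (x − 1)² + (y − 2)² + xy, the coordinate updates x ← 1 − y/2 and y ← 2 − x/2 drive the loop to the minimizer (0, 2).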

IV. HETEROGENEOUS SNNS
In this section, we extend our analysis to more general heterogeneous SNNs. In practice, the firing patterns of spiking neurons can be highly complex and may not exhibit stationarity. There is no universal method to derive analytic expressions that precisely model the distribution of spikes generated by SNNs. To tackle this challenge, we obtain the spike distribution using Monte Carlo methods. Subsequently, the problem is transformed into a form similar to that of homogeneous SNNs through reasonable approximations. We then introduce a power law inspired neuron assignment algorithm.

A. Problem Transformation
As illustrated above, in practice, the distribution of the spikes $D_k$ is highly complex, making closed-form expressions challenging to obtain. However, noting that the training dataset of SNNs is typically accessible and shares the same distribution as the test dataset, it is feasible to utilize the spike distribution on the training dataset to predict the number of spikes generated in the inference stage. Note that this method is general and applicable to various tasks.
1) Outage Constraints: We employ Monte Carlo-based methods to characterize the distribution of $D_k$. Specifically, for input node i, the set of contained neurons $\Pi_i$ is fixed. Thus, the cumulative distribution function (CDF) of $D_i$, denoted as $F_i(\cdot)$, can be obtained by plotting the histogram of $\sum_{l \in \Pi_i} S_l$ on the training dataset. As a result, we can express the outage constraint for input nodes as

$r_i \Delta T \ge L_i F_i^{-1}(1 - \gamma_i), \quad \forall i \in \mathcal{I},$

where $F_i^{-1}(\cdot)$ denotes the inverse CDF. For hidden nodes j ∈ J, the problem is more challenging because $\Pi_j$ is not fixed. Note that $\Pi_j$ is a combinatorial variable and calculating the CDF of $D_j$ for every possible combination of $\Pi_j$ would be prohibitive. Therefore, we seek a simplified method to model the distribution of $D_j$. It is observed that $\Pi_j$ is a subset of neurons in N. Hence, $D_j$ should have a distribution similar to that of $D_{\mathcal{N}} \triangleq \sum_{l \in \mathcal{N}} S_l$. Indeed, the experimental results show that when the neurons in $\Pi_j$ are randomly selected from N, $D_j$ has a distribution very similar to that of $D_{\mathcal{N}}$. This implies that $D_j$ can be viewed as a scaled version of $D_{\mathcal{N}}$, with the scale factor given by the ratio of their average firing probabilities, $\varrho \triangleq \sum_{l \in \Pi_j} q_l / \sum_{l \in \mathcal{N}} q_l$. This relationship is verified by the quantile-quantile plot of $D_j$ versus $\varrho D_{\mathcal{N}}$ in Fig. 4.
From this figure, it is evident that the simple linear model approximates the distribution of $D_j$ well. Then, adopting this approximation model, the outage constraint for the hidden nodes can be rewritten as

$r_j \Delta T \ge \frac{L_j \sum_{l \in \Pi_j} q_l}{\eta} F_{\mathcal{N}}^{-1}(1 - \gamma_j), \quad \forall j \in \mathcal{J},$

where $\eta \triangleq \sum_{l \in \mathcal{N}} q_l$ and $F_{\mathcal{N}}(\cdot)$ denotes the CDF of $D_{\mathcal{N}}$.

2) Variable Approximation: From (34), the spike capacity of hidden node j should be greater than a value that is proportional to $\sum_{l \in \Pi_j} q_l$. This is akin to the homogeneous case, and we similarly introduce a continuous variable $Q_k$ to denote $\sum_{l \in \Pi_k} q_l$. Additionally, inspired by the homogeneous case, we approximate the number of neurons in node j as

$|\Pi_j| \approx \frac{Q_j}{\bar{q}_j},$

where $\bar{q}_j \triangleq \sum_{l \in \Pi_j} q_l / |\Pi_j|$ is the average firing probability of neurons in $\Pi_j$ and is initialized with $\bar{q}_j = \bar{q} \triangleq \sum_{l \in \mathcal{N}} q_l / N$. Such an approximation method, though heuristic, helps remove the combinatorial variable $\Pi_j$ and significantly facilitates the algorithm design.
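The Monte Carlo treatment of the outage constraint reduces to an empirical quantile: a node must be able to deliver the (1 − γ)-quantile of its per-slot spike count within ∆T. A small sketch (the function name is assumed):

```python
import numpy as np

def spike_capacity(spike_counts, gamma):
    """Minimum per-slot spike capacity so that the outage probability
    stays below gamma.

    spike_counts: per-slot spike counts D_k collected on the training set
    (Monte Carlo samples of the spike distribution, as in the text).
    Returns F^{-1}(1 - gamma), the (1 - gamma)-quantile of the empirical CDF.
    """
    return float(np.quantile(np.asarray(spike_counts), 1.0 - gamma))
```

The resulting capacity feeds directly into the constraint r_k ∆T ≥ L_k · spike_capacity(counts, γ_k).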
3) Problem Reformulation: With the above analysis, the EE resource allocation problem for heterogeneous SNNs can be rewritten as problem (36), which takes the same form as the homogeneous problem (20), with the Monte Carlo based outage constraints above and with $\tilde{C}_j \triangleq \alpha_j \left( C_j^U / \bar{q}_j + C_j^F \right)$ in place of $C_j$.

B. Alternating Optimization Based Algorithm
It is observed that problem (36) has a form very similar to that of the homogeneous problem (20), but their physical meanings are different. First, in homogeneous SNNs, the outage constraint is deterministic because the percentage of firing neurons is constant, whereas in heterogeneous SNNs, the outage constraint is related to the statistics obtained by the Monte Carlo method. Second, in homogeneous SNNs, neurons fire with asymptotically equal probability, but in heterogeneous SNNs, the average firing probabilities of spiking neurons are distinct and follow a power law, as demonstrated in Fig. 5. In the following, we develop an alternating optimization based algorithm for solving problem (36) and propose a power law inspired algorithm for assigning neurons.
1) Problem w.r.t. $\{Q_j, \forall j \in \mathcal{J}\}$: The subproblem w.r.t. $Q_j$ takes the same linear programming form as in the homogeneous case, with the capacity constraints replaced by the Monte Carlo based outage constraints; its optimal solution can be readily obtained via popular convex optimization tools.
2) Problem w.r.t. $\{p_k, w_k, \forall k \in \mathcal{K}\}$: The subproblem w.r.t. $p_k$ and $w_k$ is the same as in homogeneous SNNs and can be solved using the fractional programming method. The details are not repeated here.
3) Neuron Assignment Algorithm: Note that $Q_j$ is introduced to approximate $\sum_{l \in \Pi_j} q_l$. Therefore, the hidden neurons should be allocated so that $\sum_{l \in \Pi_j} q_l$ is as close to $Q_j^*$ as possible, where $Q_j^*$ is the solution to sub-problem (38). Moreover, noting that the computational energy is related to the number of hidden neurons in the form of $\sum_{j \in \mathcal{J}} \alpha_j C_j^U |\Pi_j|$, more hidden neurons should be allocated to nodes with smaller values of $\alpha_j C_j^U$ in order to minimize the WSE. Thus, it makes sense to sort the hidden neurons in order of $q_l$ and allocate them to the node with the smallest value of $\alpha_j C_j^U$ until $\sum_{l \in \Pi_j} q_l = Q_j^*$. However, it is observed that a group of sorted neurons tends to fire together, leading to a higher outage probability. To address this issue, the power-law distribution of $q_l, l \in \mathcal{N}$, is further exploited.
Specifically, the power law is a mathematical relationship between two variables in which one variable's value is proportional to a power of the other. This universal rule has been found in many scientific domains, including SNNs [30]. For the investigated heterogeneous SNNs, it is found that the distribution of $q_l, l \in \mathcal{N}$, also approximately follows the power law, as depicted in Fig. 5. The long-tail distribution of $q_l$ indicates that a small portion of hidden neurons fires intensively while the rest fire with low probability. Therefore, the set of hidden neurons N is divided into two sets, $\mathcal{N}_L$ and $\mathcal{N}_S$. Set $\mathcal{N}_L$ contains neurons that fire intensively. These active neurons are randomly assigned to different hidden nodes, aiming to reduce their correlations and lower the outage probability. Set $\mathcal{N}_S$ contains neurons with smaller values of $q_l$. These inactive neurons are directly allocated to nodes with smaller values of $\alpha_j C_j^U$ to yield a lower WSE. Such a power-law inspired algorithm trades off the outage probability against the energy consumption. The developed neuron assignment algorithm is summarized in Algorithm 2.
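A possible Python sketch of the power-law inspired assignment (Algorithm 2), following the description above: inactive neurons greedily fill the cheapest nodes' spike budgets $Q_j^*$, while active (heavy-tail) neurons are scattered randomly to decorrelate bursts. Names, the greedy tie-breaking, and the handling of leftover neurons are our assumptions, not the paper's exact pseudocode.

```python
import numpy as np

def assign_neurons(q, Q_star, cost, delta=0.2, rng=None):
    """Power-law inspired neuron assignment (a sketch of Algorithm 2).

    q      : average firing probabilities q_l of the hidden neurons
    Q_star : optimized spike budget Q_j* for each hidden node
    cost   : alpha_j * C_j^U for each hidden node (update-energy weight)
    delta  : fraction of the total firing mass placed in the inactive set
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    q = np.asarray(q, dtype=float)
    J = len(Q_star)
    order = np.argsort(q)                       # ascending firing probability
    cum = np.cumsum(q[order])
    split = int(np.searchsorted(cum, delta * q.sum()))
    inactive, active = list(order[:split]), list(order[split:])
    assignment = {j: [] for j in range(J)}
    budget = np.array(Q_star, dtype=float)
    # inactive neurons fill the cheapest nodes first, up to the budget Q_j*
    for j in np.argsort(cost):
        while inactive and budget[j] >= q[inactive[0]]:
            n = inactive.pop(0)
            assignment[int(j)].append(int(n))
            budget[j] -= q[n]
    # leftover inactive and all active neurons are scattered randomly,
    # decorrelating simultaneous bursts to lower the outage probability
    for n in inactive + list(rng.permutation(active)):
        assignment[int(rng.integers(J))].append(int(n))
    return assignment
```

Every hidden neuron is assigned to exactly one node, matching constraints (16e)-(16f).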
In addition, after assigning the neurons, the CDF of $D_j$, i.e., $F_j(\cdot)$, can be obtained by plotting the histogram of $\sum_{l \in \Pi_j} S_l$. Hence, the outage constraint is updated as

$r_j \Delta T \ge L_j F_j^{-1}(1 - \gamma_j), \quad \forall j \in \mathcal{J},$

and the communication resources are reallocated by solving problem (39) to guarantee reliable spike transmission. The overall algorithm for resource allocation under the heterogeneous case is summarized in Algorithm 3 and its complexity is O(N log N + K²). It can be easily seen that the complexity order of Algorithm 3 is polynomial w.r.t. N.
In practice, the number of hidden neurons N may be very large. Therefore, the developed algorithm is suitable for real-time implementation.

V. SIMULATION
In this section, we provide numerical results to validate the performance of the developed EE-DSNN. The default simulation settings are I = 2, J = 4, T = 15, and ∆T = 1 ms.

Algorithm 2 Power Law Inspired Neuron Allocation Algorithm
Input: The optimized Q*_j, the average firing probabilities of the hidden neurons q_l, and the partition ratio δ.
1 Sort the hidden neurons in N in ascending order of q_l and partition N into sets N_S and N_L so that Σ_{l∈N_S} q_l = δ and Σ_{l∈N_L} q_l = 1 − δ. Sort the hidden nodes in J in ascending order of α_j C_j^U.

We adopt a campus large-scale fading model as in [36] and the Rayleigh model for small-scale fading. The sensing energy E_i^S is not considered since sensing is not the focus of this paper.

A. Homogeneous SNNs
We first investigate the performance of the proposed algorithms for homogeneous SNNs. The system configurations are M = N = 1 × 10^5, W = 30 MHz, ΔT = 10 ms, H_f = H_r = 1, τ = 0.5, q_l = q = 0.5, ∀l ∈ M, and Φ(U) = U. The following benchmarks are considered:
• Proposed: This scheme adopts the proposed Algorithm 1 for homogeneous SNNs and Algorithm 3 for heterogeneous SNNs.
• Equal neuron allocation: This benchmark equally allocates the hidden neurons to the hidden nodes. The bandwidth and transmit power are optimized using the fractional programming method developed in our proposed algorithms.
• Equal bandwidth allocation: This algorithm equally allocates the bandwidth to all nodes. The other variables are optimized following the same procedures as in our proposed algorithm.
• Max power: This scheme transmits the spikes with maximum power. The other variables are optimized following the same procedures as in our proposed algorithm.

Algorithm 3 Proposed Alternating Optimization Based Resource Allocation Algorithm for Heterogeneous SNNs
Input: The system configurations, the channels, and the statistics of the spikes.
1 Initialize the optimization variables Π_j, p_k, and w_k with a feasible point. Set q̄_j = q̄ = (1/N) Σ_{l∈N} q_l.
2 repeat
3 Update Q_j by solving the linear programming problem (38).

Fig. 6 illustrates the relationship between the WSE and the system bandwidth W. From the figure, the WSE of all analysed schemes decreases with the system bandwidth. Notably, the proposed algorithm achieves significantly better performance than the other benchmarks, especially compared to the max power transmission scheme. When there is sufficient bandwidth, the proposed algorithm can efficiently back off the transmit power, resulting in significant energy savings. In contrast, the max power scheme continues to transmit at full power, leading to unnecessary energy consumption and potentially reducing the battery life of edge devices.
Fig. 7 depicts the WSE versus the firing intensity of the input neurons, q. As q increases, the hidden neurons receive a more intense stimulus, resulting in a higher percentage of firing neurons ρ. Consequently, the WSE of all compared schemes increases as q rises due to the heavier communication burden. The developed algorithm still achieves the best performance. Additionally, as q increases, the gap between the proposed algorithm and the equal neuron allocation scheme becomes larger. This is because the proposed algorithm allocates more neurons to the nodes with superior computational capacity and communication efficiency, whereas the naive scheme simply spreads the hidden neurons evenly over the hidden nodes.

B. Heterogeneous SNNs
Fig. 7. WSE performance versus the firing intensity q of the input neurons on homogeneous SNNs.

In this subsection, we present numerical results for heterogeneous SNNs. Three representative neuromorphic datasets are chosen to evaluate the performance of the developed algorithms, i.e., N-MNIST [37], DVS-Gesture [38], and Spiking Heidelberg Digits (SHD) [39], which are briefly introduced below:
• N-MNIST: N-MNIST is the neuromorphic version of the traditional MNIST dataset. Its samples are collected by a DVS that records the samples of the MNIST dataset displayed on a screen. N-MNIST consists of 60,000 training samples and 10,000 testing samples.
• DVS-Gesture: DVS-Gesture is a gesture recognition dataset that uses DVS cameras to record actual human gestures. It comprises 11 categories of hand and arm gestures, with a total of 1,464 samples.
• SHD: SHD is a spike-based speech dataset transformed from audio recordings using an artificial ear model. It consists of 10 English and 10 German spoken digits, with a total of 10,420 samples.
Following [18], the input samples are equally partitioned into I strides and allocated to the input nodes. The default number of hidden neurons is N = 800 and the default system bandwidth is W = 0.4 MHz. In addition, to avoid overfitting, a convolutional layer [40] and an attention layer [41] are prepended for the DVS-Gesture and SHD datasets, respectively. We train the SNNs using back-propagation through time (BPTT) with the arctangent function as the surrogate gradient [28]. Moreover, an L1 norm regularization term with a weight of 2 × 10^−8 is added at the training stage to enforce sparsity. Networks are trained and evaluated on an NVIDIA RTX A5000 GPU. The statistics of the spikes are obtained on the training datasets and the performance of the analysed algorithms is validated on the testing datasets.
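The arctangent surrogate gradient used with BPTT can be sketched in NumPy as follows: the forward pass emits a Heaviside spike, while the backward pass substitutes the smooth derivative of a scaled arctangent for the non-differentiable step. The threshold `v_th` and sharpness `alpha` are assumed hyperparameters, not values given in the paper.

```python
import numpy as np

def spike_forward(v, v_th=1.0):
    """Forward pass: Heaviside spike generation at membrane potential v."""
    return (v >= v_th).astype(float)

def spike_surrogate_grad(v, v_th=1.0, alpha=2.0):
    """Backward pass: derivative of the arctangent surrogate
    sigma(v) = (1/pi) * arctan(pi * alpha * (v - v_th) / 2) + 1/2,
    used by BPTT in place of the Dirac derivative of the step function."""
    x = np.pi * alpha * (v - v_th) / 2.0
    return alpha / (2.0 * (1.0 + x ** 2))
```

The surrogate derivative peaks at the threshold (value alpha/2) and decays away from it, which is what lets gradients flow through neurons that are near firing.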
Table II presents the estimated outage probability and the inference accuracy versus γ on the N-MNIST dataset. From the second row, the estimated outage probability is close to or less than γ, indicating the effectiveness of the proposed Algorithm 3 in capturing the spike statistics and guaranteeing robust transmission. The last row shows the inference accuracy versus γ. As γ decreases, fewer spike losses occur, leading to increased inference accuracy. Furthermore, the variance of the inference accuracy decreases as γ approaches zero. When γ is less than or equal to 1 × 10^−3, the inference accuracy matches the lossless accuracy almost surely. Table III and Table IV display similar results for the DVS-Gesture and SHD datasets, respectively. Note that the DVS-Gesture dataset contains only around 300 test samples. Hence, the outage probability for γ = 10^−4 is not shown due to insufficient granularity. These results demonstrate that the developed algorithms also ensure robust spike transmission on these two datasets.
Fig. 8(a) illustrates the WSE performance versus W on the N-MNIST dataset. It is worth noting that the max power scheme exhibits very poor performance and is excluded from the figure. Instead, we consider the full offloading scheme, which is widely used as a benchmark in edge computing [42] and in which the input nodes directly transmit all the spikes to the AP. The proposed algorithm consistently achieves the lowest WSE among the compared schemes. Moreover, in scenarios with limited bandwidth, there is a significant performance gap between the proposed algorithm and the equal bandwidth scheme. This discrepancy arises because, when W is constrained, resources should be allocated to the nodes requiring higher transmission capacity. The naive equal bandwidth allocation scheme fails to achieve this, resulting in a dramatic increase in transmit power consumption. In contrast, the developed algorithm efficiently manages the limited wireless resources, thereby reducing the energy consumption. Additionally, the proposed algorithm outperforms the full offloading scheme. The full offloading scheme directly transmits all input spikes to the more distant AP, which requires a high transmit power. The developed EE-DSNN addresses this issue through collaborative computation and relaying, significantly reducing the energy consumption of edge systems.
Fig. 8(b) shows the WSE performance versus the outage probability of the hidden neurons, γ, on the N-MNIST dataset. The proposed algorithm still achieves the best performance. Moreover, when γ is greater than 10^−5, the WSE increases as γ decreases. This phenomenon arises because a stricter outage constraint requires a larger transmission capacity, resulting in increased transmit power. When γ is less than 10^−5, the energy consumption remains stable since the outage probability is already sufficiently close to zero. As indicated in Table II, the inference accuracy reaches the lossless value when γ is less than 10^−3. Hence, there is no need to set γ very close to zero, allowing for approximately a 10% energy saving without affecting the performance of the SNN.
Fig. 9 and Fig. 10 compare the WSE of analysed algorithms on the DVS-Gesture dataset and the SHD dataset, respectively.
From the figures, the proposed algorithm outperforms other benchmarks by large margins, which further demonstrates the efficacy and universality of the developed algorithm.

C. Comparison With ANN
The performance of the SNN is compared with its conventional ANN counterpart and the bidirectional long short-term memory (Bi-LSTM) [43], [44]. Note that the Bi-LSTM is the state-of-the-art RNN architecture on most spike-based datasets [45]. Both the ANN and the Bi-LSTM employ the same network structure and the same number of hidden neurons as the SNN. Moreover, they are trained using the Adam optimizer with fine-tuned learning rates. The input to the ANN is averaged over the time axis since the ANN is not designed for sequences. Table V compares the accuracy of the SNN, ANN, and Bi-LSTM. The Bi-LSTM achieves the highest accuracy across the three datasets. The performance of the ANN is the worst, particularly on the DVS-Gesture dataset. DVS-Gesture involves dynamic hand gestures with rich temporal structure, such as clockwise and anticlockwise arm rotations. The ANN struggles to handle such temporal information and suffers from severe performance degradation. The performance of the SNN is notably better than that of the ANN, with only about a 1% gap compared to the Bi-LSTM.
We also compare the performance of the different NNs under the considered edge systems. Specifically, we adopt Algorithm 1 to efficiently deploy the ANN and the Bi-LSTM. The intermediate data generated by the ANN and Bi-LSTM are transmitted using 16-bit quantization for the sake of efficiency. The computing energies of the ANN and Bi-LSTM are calculated based on the energy of floating-point operations on 15-nm CMOS processes [46], [47]. Fig. 11 compares the accuracy of the SNN, ANN, and Bi-LSTM versus the system bandwidth W on the SHD dataset. The results on N-MNIST and DVS-Gesture are similar and therefore omitted here. Notably, the SNN requires the minimum bandwidth. When W exceeds 0.04 MHz, the SNN achieves its lossless accuracy. This is in sharp contrast with the Bi-LSTM, which requires more than 3 MHz of bandwidth for transmitting the intermediate data. Furthermore, the ANN requires significantly less bandwidth than the Bi-LSTM because its data rate is reduced by a factor of 1/T. However, the bandwidth required by the ANN is still considerably greater than that of the SNN. The SNN only produces binary data and the firing rate of its neurons can be very sparse (even sparser than 1/T). Thus, the SNN stands out as a highly bandwidth-efficient computational framework, capable of achieving satisfactory inference accuracy at the cost of minimal communication data.
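The bandwidth gap discussed above can be illustrated with a back-of-the-envelope traffic model. The AER-style address encoding (⌈log2 N⌉ bits per spike) and the 16-bit dense activations are assumptions for illustration, not the paper's exact accounting.

```python
import math

def traffic_bits(n_hidden, T, firing_rate):
    """Bits transmitted per inference window under an assumed encoding:
    - Bi-LSTM: 16-bit activations for every hidden neuron at every time step;
    - ANN: time-averaged input, so dense activations are sent only once (rate 1/T);
    - SNN: one neuron address (ceil(log2 N) bits) per spike event."""
    bilstm = T * n_hidden * 16
    ann = n_hidden * 16
    snn = T * firing_rate * n_hidden * math.ceil(math.log2(n_hidden))
    return {"SNN": snn, "ANN": ann, "Bi-LSTM": bilstm}
```

With the default N = 800, T = 15, and a sparse firing rate of, say, 5%, the ordering SNN < ANN < Bi-LSTM already emerges, consistent with the trend in Fig. 11.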
Fig. 12 compares the energy consumption of the SNN, ANN, and Bi-LSTM versus the number of hidden neurons N on the SHD dataset, where the weights of the nodes are set to α_k = 1, ∀k. Note that the solid lines represent the power consumption under W = 1 MHz and the dashed lines represent the power consumption when W is infinite, i.e., the energy for computation only. The Bi-LSTM exhibits high power consumption even when W = ∞, notably exceeding that of the ANN and SNN. Moreover, under W = 1 MHz, the power consumption of the Bi-LSTM increases by approximately two orders of magnitude. This heightened consumption is attributed to the transmission of a substantial amount of intermediate data.
In fact, when W = 1 MHz, the Bi-LSTM has to transmit with maximum power, remarkably increasing the power consumption. The ANN consumes significantly less energy than the Bi-LSTM due to its simpler computational architecture and reduced communication data. However, when W = 1 MHz, the power consumption of the ANN increases rapidly with N. This escalation is a consequence of the size of the intermediate data produced by the ANN being proportional to N. According to Shannon's formula, the transmit power needs to scale exponentially with N to meet the capacity requirement. In contrast, the SNN is more energy-efficient because it only produces data when neurons fire, and the number of firing neurons does not scale linearly with N. Consequently, the SNN is a highly energy-efficient computational framework, holding promise to empower a broader range of applications for edge intelligence.
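The exponential scaling follows from inverting Shannon's formula R = W log2(1 + g p / (N0 W)) for the transmit power p. A hedged sketch, with a unit channel gain assumed for simplicity:

```python
def required_power(rate_bps, bandwidth_hz,
                   n0_w_per_hz=10 ** (-174 / 10) * 1e-3,  # -174 dBm/Hz in W/Hz
                   gain=1.0):
    """Transmit power (W) needed to support rate_bps over an AWGN link of the
    given bandwidth, obtained by inverting R = W * log2(1 + g*p / (N0*W)).
    With the rate proportional to N at fixed W, p grows exponentially in N."""
    return bandwidth_hz * n0_w_per_hz / gain * (2 ** (rate_bps / bandwidth_hz) - 1)
```

Doubling the required rate at fixed bandwidth more than doubles the required power, which is exactly why the ANN's linearly growing intermediate data translates into rapidly escalating transmit power.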

VI. CONCLUSION
This paper developed an energy-efficient distributed SNN for resource-limited wireless edge networks. We analysed the communication, computation, and energy consumption of the system and formulated a weighted-sum energy minimization problem. Efficient resource allocation algorithms were then developed for homogeneous and heterogeneous SNNs, respectively. Extensive simulations on neuromorphic datasets show that the proposed algorithms significantly reduce the system energy consumption while ensuring inference accuracy. Furthermore, SNNs achieve comparable performance with state-of-the-art RNNs while being potentially an order of magnitude more bandwidth- and energy-efficient, rendering them a scalable architecture for edge intelligence. Possible research directions include the integration of advanced wireless communication techniques, such as multicast and non-orthogonal multiple access (NOMA), and the implementation of distributed SNNs on neuromorphic hardware.

APPENDIX A PROOF OF THEOREM 1
To prove Theorem 1, we first prove the following lemma, which shows that the input stimuli to the hidden neurons are asymptotically constant.
Lemma 1: When $W_{m,n} = \frac{H_f}{M}$, $\forall m, n$, $S_l \sim B(1, q_l)$, $\forall l \in \mathcal{M}$, and $q_l$ follows a limit distribution $q(x)$, then we have
$$\lim_{M \to \infty} \sum_{m \in \mathcal{M}} W_{m,n} S_m = H_f \bar{S}, \qquad (42)$$
where $\bar{S} \triangleq \lim_{M \to \infty} \frac{1}{M} \sum_{l \in \mathcal{M}} q_l = \int x\, q(x)\, \mathrm{d}x$.
Proof: Based on Chebyshev's inequality, we have
$$\Pr\left( \left| \frac{1}{M} \sum_{l \in \mathcal{M}} S_l - \frac{1}{M} \sum_{l \in \mathcal{M}} q_l \right| \geq \epsilon \right) \leq \frac{\sum_{l \in \mathcal{M}} q_l (1 - q_l)}{M^2 \epsilon^2}, \quad \forall \epsilon > 0.$$
Taking the limit on both sides of the above inequality, we have $\lim_{M \to \infty} \frac{1}{M} \sum_{l \in \mathcal{M}} S_l = \lim_{M \to \infty} \frac{1}{M} \sum_{l \in \mathcal{M}} q_l = \bar{S}$. Substituting it into $\sum_{m \in \mathcal{M}} W_{m,n} S_m$, we obtain (42), which completes the proof. □
Lemma 1 shows that the input current to each hidden neuron is asymptotically constant. Under such a case, the investigated homogeneous system is equivalent to a group of Galves-Löcherbach (GL) neurons [30]. Specifically, at time t, the potential of the fired neurons is reset to zero, which produces a Dirac impulse in the membrane potential density function p(U = 0)[t]. The membrane potentials of these fired neurons then evolve according to (17). Hence, we can divide the neurons based on their firing ages k. Denote the percentage of neurons that fired at time t − k and have not fired since as η_k, and their corresponding membrane potential as Φ_k. The percentage of firing neurons can then be written in terms of η_k and, based on (17), the relationship between these neuron dynamics can be written as (48).
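Lemma 1 can also be checked numerically. The sketch below draws Bernoulli spikes with a uniform limit distribution q(x) (an arbitrary choice for illustration) and compares the aggregate stimulus Σ_m W_{m,n} S_m, with W_{m,n} = H_f/M, against its predicted limit H_f · E[q].

```python
import numpy as np

def mean_input_current(M, h_f=1.0, seed=0):
    """Simulate the aggregate stimulus sum_m W_{m,n} S_m with W_{m,n} = H_f / M
    and S_m ~ Bernoulli(q_m); Lemma 1 predicts it concentrates at H_f * mean(q).
    Returns (simulated stimulus, predicted limit)."""
    rng = np.random.default_rng(seed)
    q = rng.uniform(0.0, 1.0, size=M)   # assumed limit distribution q(x)
    spikes = rng.random(M) < q          # Bernoulli firing indicators
    return (h_f / M) * spikes.sum(), h_f * q.mean()
```

For large M, the two values agree to within the O(1/√M) fluctuation bounded by Chebyshev's inequality.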

Fig. 3. The stationary states of a group of homogeneous neurons under different spike losses.

Algorithm 1 Proposed Alternating Optimization Based Resource Allocation Algorithm for Homogeneous SNNs
Input: The system configuration parameters and the channels.
1 Initialize the optimization variables in Ω with a feasible point.
2 repeat
3 Update Q_j by solving the linear programming problem (22).
…
7 Update w_k by solving the linear programming problem (28).
8 Update the Lagrange multipliers λ_k and θ_k using the ellipsoid algorithm based on the sub-gradients (29) and (30).
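Algorithm 1 follows the standard alternating (block-coordinate) optimization pattern: each block of variables is updated by its own convex subproblem until the objective stops decreasing. A generic sketch on a toy biconvex objective, where `argmin_x` and `argmin_y` stand in for the LP subproblems such as (22) and (28):

```python
def alternating_minimize(f, x0, y0, argmin_x, argmin_y, tol=1e-8, max_iter=100):
    """Generic alternating-optimization loop mirroring Algorithm 1's structure.

    f        : objective f(x, y), assumed convex in each block separately.
    argmin_x : solver returning argmin_x f(x, y) for fixed y (e.g. an LP).
    argmin_y : solver returning argmin_y f(x, y) for fixed x (e.g. an LP).
    Stops when the objective decrease falls below tol."""
    x, y = x0, y0
    prev = f(x, y)
    cur = prev
    for _ in range(max_iter):
        x = argmin_x(y)          # block update 1
        y = argmin_y(x)          # block update 2
        cur = f(x, y)
        if prev - cur < tol:     # monotone decrease => convergence test
            break
        prev = cur
    return x, y, cur
```

Because each block update cannot increase the objective, the sequence of objective values is non-increasing and bounded below, which is the usual convergence argument for such schemes.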

Fig. 4. The quantile-quantile plot of the spikes generated by a subset of neurons, D_j, versus the scaled number of spikes generated by all neurons, ϱD_N. The SNN is trained on the N-MNIST dataset with N = 3200.

11 k ← randomly choose an element from J .
12 N_L = ∅.

The simulation settings further include N_0 = −174 dBm/Hz, p_max = 23 dBm, C_j^F = 30.7 pJ, and C_j^U = 52 + 23.6 Σ_{l∈L} q_l pJ, ∀j ∈ J [6]. The weights are set as α_i = 1, ∀i ∈ I, and α_j is randomly chosen from the range [1, 10]. The input nodes are randomly located inside a circle centred at [−400 m, 0 m] with a radius of 100 m and the hidden nodes are randomly located inside a circle centred at [−200 m, 0 m] with a radius of 100 m. The AP is located at [0 m, 0 m].

4 Update w_k and p_k by solving the fractional programming problem (39).
5 Update Π_j based on Algorithm 2.
7 Update c_k based on (41) and execute step 4.

Fig. 11. Inference accuracy of SNN, ANN, and Bi-LSTM versus bandwidth W on the SHD dataset.

Fig. 12. Power consumption of SNN, ANN, and Bi-LSTM versus the number of hidden neurons N on the SHD dataset.

TABLE II. Outage probability and inference accuracy versus γ on the N-MNIST dataset.
TABLE III. Outage probability and inference accuracy versus γ on the DVS-Gesture dataset.
Fig. 6. WSE performance versus the system bandwidth W on homogeneous SNNs.

TABLE IV. Outage probability and inference accuracy versus γ on the SHD dataset.