Power-Bandwidth Tradeoff for Ultra-Low Power MFSK and G-MFSK Systems

In one of our previous papers, we designed an ultra-low power non-coherent MFSK system that uses 2-pole bandpass filters in place of matched filters for detection, and showed that the performance loss relative to the optimal MFSK system with matched-filter detection was no greater than 1.2 dB for all alphabet sizes, phase continuity conditions, and channel conditions we analyzed. In this paper, we improve our previous design by considering the power-bandwidth tradeoff, and we show that a large percentage of system bandwidth can be saved by sacrificing a small amount of power when the demodulator and coding parameters are optimized. For example, we can save 50% of system bandwidth at the cost of a 1 dB loss in performance compared to our previous system design. We further extend the results to include Gaussian filtering: we quantify the performance loss as a function of both the system bandwidth saved and the time-bandwidth product of the Gaussian filter, and we compare the performance of the $M$-ary GFSK system with the corresponding MFSK system.


I. INTRODUCTION
Recent low-data-rate wireless communication applications, such as Bluetooth, wireless personal area networks, wireless sensor networks, and wireless medical implants, require low power consumption and low cost. Other examples of the need for low-power systems range from reducing the size and weight of batteries used by foot soldiers, who carry tens of pounds of equipment in their backpacks, to low-power internet-of-things (IoT) applications spanning wearable fitness trackers, transportation, healthcare, consumer electronics, and many others [1], [2], [3]. $M$-ary frequency shift keying (MFSK) modulation is used in low-power communication systems for its power efficiency [4], [5], [6], [7], [8]. A novel noncoherent 4FSK high-data-rate, low-power transceiver was introduced in [9].
In Gaussian FSK (GFSK) modulation, a Gaussian filter is used to reshape the transmitted signal and smooth the transition between symbols. GFSK has the advantage of reducing sideband power and reducing interference with neighboring channels, at the cost of increasing intersymbol interference. GFSK is frequently used in applications such as Bluetooth receivers [10], [11], [12], [13]. Attempts have been made to study and implement GFSK receivers for the purpose of reducing power consumption and improving system performance. A mixed-signal GFSK demodulator was proposed in [13], with a power consumption of 6 mW, which can tolerate up to 200 kHz frequency offset at a 2 MHz intermediate frequency. A GFSK receiver with ultra-low power consumption based on injection-locking was presented in [12], achieving a power consumption of 1.8 mW and a Bluetooth GFSK data rate of 1 Mb/s. GFSK demodulators with large frequency offset tolerance between the transmitter and receiver were proposed in [11]. An optimized differential GFSK demodulator that outperforms conventional differential demodulators was developed in [10]. The use of GFSK with frequency hopping spread spectrum (FHSS-GFSK) to detect drone communication signals in a non-cooperative scenario was proposed in [14]. The theoretical performance
The associate editor coordinating the review of this manuscript and approving it for publication was Di Zhang.

II. DEMODULATOR PERFORMANCE IN AN AWGN CHANNEL
In this section, we analyze the performance of a GFSK non-coherent demodulator with 2-pole BPF detection in an AWGN channel. The demodulator consists of a parallel bank of $M$ branches, each with a BPF whose center frequency is the frequency of the corresponding tone, followed by an envelope detector and a sampler; the decision is made by choosing the largest of the $M$ test statistics from the samplers, as shown in Fig. 1. In the received waveform (1), $n_w(t)$ is additive white Gaussian noise (AWGN) with single-sided power spectral density $\eta_0$, and $s(t)$ is a train of rectangular pulses of duration $T$, filtered by Gaussian filters. The lowpass equivalent signal of one pulse in $s(t)$ is expressed in terms of a rectangular pulse $P_a(x)$ and the transfer function and impulse response of the lowpass equivalent Gaussian filter. The detection filters $H(\omega)$ are 2-pole BPFs. In the impulse response of the lowpass equivalent filter of the 2-pole BPF, $W$ is the filter bandwidth, $\Delta f = f_2 - f_1$ is the tone spacing (the difference between the center frequencies of adjacent detection filters), and $u(\cdot)$ is the unit step function. In (5), we assumed that the center frequency of each filter was much greater than the filter bandwidth when we derived the impulse response from the transfer function. In (7), we define the parameters $z_g$, $z = WT$, $h = \Delta f\,T$, and $r = T_s/T$, where $z_g$ is the time-bandwidth product of the lowpass equivalent Gaussian filter, $z$ is the time-bandwidth product of the lowpass equivalent filter of the 2-pole BPF, $h$ is the modulation index, $T_s$ is the sampling time, and $r$ is the normalized sampling time. Note that all the parameters in (7) are dimensionless, meaning our results are independent of the symbol duration (or data rate). For each pair $(z_g, h)$, we optimize $z$ and $r$ to minimize the $E_b/\eta_0$ required at a bit error rate (BER) of $10^{-3}$.
VOLUME 11, 2023
The total system bandwidth of an $M$-ary FSK system is given by
$$B = (M-1)\,\Delta f + B_{\mathrm{sig}},$$
where $B_{\mathrm{sig}}$ is the bandwidth of one FSK signal. There are multiple ways of defining the bandwidth of a signal, each with its own advantages and disadvantages. Examples include the null-to-null bandwidth (the bandwidth of the main spectral lobe, possibly the most popular definition for digital communications, although it lacks complete generality because some modulation formats lack well-defined lobes) and the equivalent noise bandwidth (ENBW, the bandwidth of a brickwall filter that produces the same integrated power as the signal power, which is critical for predicting receiver sensitivity). Under any of these definitions, $B_{\mathrm{sig}}$ is roughly on the same order as the tone spacing $\Delta f$. Therefore, for a large $M$ (e.g., for our baseline design of $M = 16$), $B \approx M\,\Delta f$. Since $h = \Delta f\,T = 1$ was shown in [5], [6], [7], and [8] to yield optimal performance, the fraction of bandwidth saved by the system proposed in this paper, relative to that baseline, can be represented as $1 - h$.
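As a worked illustration of the last step, the following short derivation assumes $B_{\mathrm{sig}} \approx \Delta f$ (as argued above), a fixed symbol duration $T$, and writes the tone spacing as $\Delta f$. The numerical values in the comment are illustrative only.

```latex
B(h) \approx M\,\Delta f = \frac{M h}{T}, \qquad
\frac{B(1) - B(h)}{B(1)} = 1 - h .
% Illustrative example: M = 16 and h = 0.7 give B(0.7)/B(1) = 0.7,
% i.e., a 30% saving relative to the orthogonal baseline h = 1.
```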
It is easy to see that our system experiences both inter-carrier interference (ICI) and inter-symbol interference (ISI) due to Gaussian filtering, non-orthogonal tone spacing, and 2-pole BPF detection. Because of the non-causality of the Gaussian filter, detecting the current transmitted symbol requires considering ISI from both the previous symbols and the future symbols. Based on the analysis in [5], the adjacent symbols contribute the most ISI, so we consider only one previous symbol and one future symbol as the source of ISI in this paper. Suppose the transmitted symbol has frequency $f_i$, the ICI branch is the branch with the 2-pole BPF centered at frequency $f_n$, the previous symbol has frequency $f_{m_1}$, and the future symbol has frequency $f_{m_2}$. The union bound on the symbol error rate (SER), $\bar{P}_s$, is then expressed in terms of $\theta_1, \theta_2 \sim U[0, 2\pi]$ and the conditional SER $P_{s,in}^{m_1 m_2}$, which is conditioned on $i$, $n$, $m_1$, and $m_2$ and is a function of $\theta_1$ and $\theta_2$. The conditional SER is given in (11) [16] in terms of the Marcum-Q function $Q(\cdot, \cdot)$, defined as
$$Q(a, b) = \int_b^{\infty} x \exp\!\left(-\frac{x^2 + a^2}{2}\right) I_0(ax)\,dx, \tag{12}$$
where $I_0(\cdot)$ is the modified Bessel function of the first kind and zeroth order, defined as
$$I_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{x\cos\theta}\,d\theta. \tag{13}$$
In the parameters of (11), $X = 2\pi(i - n)hr$ and we use the four-quadrant definition of $\tan^{-1}(\cdot)$. In (15), $m_{in}(r)$ and $\mu_{in}(r)$ are the real and imaginary parts of $y_{in}(r)$, respectively, i.e.,
$$m_{in}(r) = \Re\{y_{in}(r)\}, \qquad \mu_{in}(r) = \Im\{y_{in}(r)\}, \tag{16}$$
and the lowpass-equivalent waveform of $y_{in}(r)$, $y_{lp}(r)$ (which is, in general, complex), is given in (17), where $j$ is the imaginary unit, $\Phi(\cdot)$ is the cumulative distribution function (CDF) of the standard normal distribution, $\mathrm{erf}(\cdot)$ is the error function, and $\lambda$ is defined as
$$\lambda \triangleq \frac{\delta}{T} = \frac{\sqrt{\ln 2}}{2\pi z_g}.$$
The parameters $A_1$ and $\theta_1$ can be found by letting $n = i$ in (15)-(17). The detailed analysis to find $y_{in}(r)$ is given in the Appendix.
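For numerical work, the Marcum-Q function that appears in the conditional SER can be evaluated without special-function libraries. The following is a minimal sketch, not the authors' implementation: it uses the standard Poisson-mixture series for the first-order Marcum-Q function, and the function name `marcum_q1` and the truncation parameter `terms` are illustrative assumptions.

```python
import math

def marcum_q1(a, b, terms=200):
    """First-order Marcum-Q function Q(a, b) via the Poisson-mixture series.

    Q(a, b) = sum_n Pois(n; a^2/2) * P[Pois(b^2/2) <= n].
    Assumptions of this sketch: a^2/2 is well below `terms`, and a^2/2 and
    b^2/2 are small enough that exp(-a^2/2) and exp(-b^2/2) do not underflow.
    """
    x = a * a / 2.0
    y = b * b / 2.0
    weight = math.exp(-x)      # Pois(0; x)
    term_y = math.exp(-y)      # Pois(0; y)
    cdf_y = term_y             # P[Pois(y) <= 0]
    total = weight * cdf_y
    for n in range(1, terms):
        weight *= x / n        # Pois(n; x)
        term_y *= y / n        # Pois(n; y)
        cdf_y += term_y        # P[Pois(y) <= n]
        total += weight * cdf_y
    return total

# Sanity checks against known special cases:
# Q(0, b) = exp(-b^2 / 2) and Q(a, 0) = 1.
print(marcum_q1(0.0, 2.0), math.exp(-2.0))
print(marcum_q1(1.5, 0.0))
```

For large arguments, a scaled or library-based evaluation would be a more robust choice than this plain series.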
For the noise parameters in (11), the filtered noise power $\sigma^2$, the magnitude of the normalized complex cross-covariance $|\rho|$, and the corresponding phase $\phi = \angle\rho$ were found in [5], where we again use the four-quadrant definition of the $\tan^{-1}(\cdot)$ function.
We perform an exhaustive search to optimize the parameter pair $(z, r)$ of the system with 2-pole BPF detection, minimizing the $E_b/\eta_0$ required to achieve a BER of $10^{-3}$ as a function of $z_g$ and $h$. The optimal parameters $z_{\mathrm{opt}}$ and $r_{\mathrm{opt}}$, and the corresponding performance of the optimal 2-pole system, for $z_g = \infty$, 1, and 0.5, in conjunction with $h = 1, 0.95, 0.9, \ldots, 0.5$, are summarized in Table 1 and in Fig. 2 in Section V. The search resolution is 0.04 for $z$ (i.e., the values of $z$ we search are all integer multiples of 0.04) and 0.02 for $r$. Note that $z_g = \infty$ is equivalent to removing the Gaussian filter.
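The exhaustive search described above can be sketched as a plain grid search. This is an illustrative outline under stated assumptions, not the authors' code: the search ranges, the helper names `grid` and `exhaustive_search`, and especially `toy_objective` (a placeholder standing in for the required $E_b/\eta_0$ at a BER of $10^{-3}$) are hypothetical. Only the step sizes 0.04 and 0.02 come from the text.

```python
def grid(start, stop, step):
    """Inclusive grid of integer multiples of `step`, built from integer
    indices so that floating-point drift does not accumulate."""
    first = round(start / step)
    last = round(stop / step)
    return [round(k * step, 10) for k in range(first, last + 1)]

def exhaustive_search(required_ebn0_db, z_values, r_values):
    """Return (best_cost, z_opt, r_opt) minimizing the objective on the grid."""
    best = None
    for z in z_values:
        for r in r_values:
            cost = required_ebn0_db(z, r)
            if best is None or cost < best[0]:
                best = (cost, z, r)
    return best

def toy_objective(z, r):
    """Placeholder objective (hypothetical): a smooth bowl with a known
    minimum, used only to exercise the search loop."""
    return 10.0 + (z - 1.2) ** 2 + (r - 0.9) ** 2

# Assumed search ranges; the paper specifies only the step sizes.
z_values = grid(0.04, 3.0, 0.04)
r_values = grid(0.5, 1.5, 0.02)
best_cost, z_opt, r_opt = exhaustive_search(toy_objective, z_values, r_values)
print(best_cost, z_opt, r_opt)
```

In an actual evaluation, `toy_objective` would be replaced by a routine that solves for the $E_b/\eta_0$ at which the BER bound equals the target, for the given $(z_g, h)$.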

III. DEMODULATOR PERFORMANCE IN FLAT, SLOW RICIAN FADING CHANNELS
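In a flat, slow Rician fading channel, the signal amplitude $R$ is Rician distributed:
$$f_R(x) = \frac{x}{\sigma^2}\exp\!\left(-\frac{x^2 + A^2}{2\sigma^2}\right) I_0\!\left(\frac{A x}{\sigma^2}\right), \qquad x \ge 0,$$
where $I_0(\cdot)$ was defined in (13). If we let $\gamma$ denote the $E_s/\eta_0$ without fading, and $\bar{\gamma}$ denote the average (over the fade) $E_s/\eta_0$ under Rician fading, then the probability density function (PDF) of $\gamma$ is given by [17]
$$f(\gamma) = \frac{1 + K}{\bar{\gamma}}\exp\!\left(-K - \frac{(1 + K)\gamma}{\bar{\gamma}}\right) I_0\!\left(2\sqrt{\frac{K(1 + K)\gamma}{\bar{\gamma}}}\right), \qquad \gamma \ge 0, \tag{21}$$
where $K$ is the Rician $K$-factor defined as $K \triangleq A^2/(2\sigma^2)$. As is well known, the Rician fading channel reduces to an AWGN channel when $K \to \infty$ and reduces to a Rayleigh fading channel when $K = 0$. For all cases analyzed in the previous section, let $\bar{P}_{b,\mathrm{AWGN}}$ denote the union bound on the BER in an AWGN channel. Then the union bound on the average BER in a Rician fading channel can be found by integrating the product of $\bar{P}_{b,\mathrm{AWGN}}$ and (21) from 0 to $\infty$.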
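The averaging step can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names, the Simpson-rule integration with a finite upper limit, and the stand-in AWGN error probability $\tfrac{1}{2}e^{-\gamma/2}$ (the textbook result for noncoherent binary orthogonal FSK, used here only because its fading average has a known closed form) are all choices made for this example. The paper's own union bound would take the place of `pb_awgn`.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I0(x) via its power series."""
    q = x * x / 4.0
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total

def rician_snr_pdf(g, g_avg, K):
    """PDF of the instantaneous SNR g under Rician fading with mean g_avg."""
    a = (1.0 + K) / g_avg
    return a * math.exp(-K - a * g) * bessel_i0(2.0 * math.sqrt(K * a * g))

def average_over_fading(pb_awgn, g_avg, K, upper_mult=40.0, steps=4000):
    """Integrate pb_awgn(g) * pdf(g) over [0, upper_mult * g_avg] using the
    composite Simpson rule (the finite upper limit is an approximation)."""
    n = steps if steps % 2 == 0 else steps + 1
    h = upper_mult * g_avg / n
    total = 0.0
    for i in range(n + 1):
        g = i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * pb_awgn(g) * rician_snr_pdf(g, g_avg, K)
    return total * h / 3.0

def pb_awgn(g):
    """Stand-in AWGN error probability (assumption of this sketch)."""
    return 0.5 * math.exp(-g / 2.0)

g_avg = 10.0
for K in (0.0, 10.0):
    closed_form = (1 + K) / (2 + 2 * K + g_avg) * math.exp(-K * g_avg / (2 + 2 * K + g_avg))
    print(K, average_over_fading(pb_awgn, g_avg, K), closed_form)
```

Setting $K = 0$ recovers the Rayleigh case, where the stand-in average reduces to $1/(2 + \bar{\gamma})$.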
Note that while the union bound is a tight bound in an AWGN channel, it is not necessarily a tight bound in a Rician fading channel, and as a result, the performance analyzed by the union bound can be somewhat pessimistic in a Rician fading channel, especially in a Rayleigh fading channel, as we will see in Section V.
Similar to the AWGN case, we perform an exhaustive search to optimize the parameter pair $(z, r)$ of the 2-pole system to minimize the $E_b/\eta_0$ required to reach a BER of $10^{-3}$, as a function of $z_g$ and $h$. We use the examples of a typical Rician fading channel ($K = 10$) and a Rayleigh fading channel ($K = 0$) to present the numerical results. The optimal parameters $z_{\mathrm{opt}}$ and $r_{\mathrm{opt}}$, and the corresponding performance of the optimal 2-pole system, for $z_g = \infty, 1, 0.5$, $h = 1, 0.95, 0.9, \ldots, 0.5$, and $K = 10, 0$, are summarized in Table 1 and Figs. 3 and 4, where the search resolution is 0.04 for $z$ and 0.02 for $r$.

IV. DEMODULATOR PERFORMANCE WITH RS CODING
The choice of RS codes with hard-decision decoding is particularly appropriate here because of the lower complexity compared to soft-decision and/or iterative decoding, and because of the straightforward manner in which the RS-encoded codewords can be mapped to the MFSK/GFSK signal set. We optimize the code dimension $k$ in conjunction with the time-bandwidth parameter $z$ and the normalized sampling time $r$ for the 2-pole system with $(n, k)$ RS codes. We fix the code length and exhaustively search all possible values of the code dimension $k$ to find the $k$ that minimizes the $E_b/\eta_0$ required to reach a BER of $10^{-3}$.
Application of this transmission scheme to a fading channel generally requires that coded data be interleaved after encoding in order to randomize symbol errors due to burst errors caused by deep fades, thus improving decoder performance [18]. Here we assume that perfect interleavers (i.e., infinitely long) are used. The relationship between uncoded and coded error rate can be found in [8] and [19].
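To make the link between uncoded and coded error rates concrete, the sketch below uses a widely quoted textbook approximation for bounded-distance, hard-decision RS decoding. It is an assumption of this example that symbol errors are independent (consistent with the perfect-interleaver assumption above) and that a decoding failure leaves the channel errors in place; the exact expressions in [8] and [19] may differ. The function name `rs_coded_symbol_error` is hypothetical.

```python
from math import comb

def rs_coded_symbol_error(p, n, k):
    """Approximate post-decoding symbol error rate of an (n, k) RS code.

    p is the uncoded (channel) symbol error probability. The decoder is
    assumed to correct up to t = floor((n - k) / 2) symbol errors per
    codeword; with more than t errors, the received errors are assumed
    to remain in the output.
    """
    t = (n - k) // 2
    q = 1.0 - p
    total = sum(i * comb(n, i) * p**i * q**(n - i) for i in range(t + 1, n + 1))
    return total / n

# Example: length-15 code over a 16-ary alphabet, a few code dimensions.
for k in (15, 11, 9):
    print(k, rs_coded_symbol_error(0.01, 15, k))
```

With $k = n$ (no redundancy) the expression reduces to the uncoded symbol error probability $p$, which is a convenient sanity check. When comparing coded and uncoded systems at equal $E_b/\eta_0$, the energy per channel symbol must also be scaled by the code rate $k/n$ before evaluating $p$.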
For a code length of $n = 15$ and an alphabet size of $M = 16$ (i.e., one RS symbol corresponds to one FSK symbol), we repeat the same optimization by exhaustive search as in the previous sections, while adding one more parameter to optimize: the code dimension $k$. We optimize the triple $(z, r, k)$ to find the optimal coded performance for a given $z_g$ and $h$, in an AWGN, a Rician ($K = 10$), or a Rayleigh fading channel, and measure the coding gain at a BER of $10^{-3}$ for all cases evaluated in the previous sections. The numerical results are presented in Table 2 and Figs. 2-5 in Section V.
If the target BER is smaller, we can use a stronger code. For example, the performance curves using a length $n = 255$ RS code are shown in Figs. 7-9 in Section V. We attempted to optimize the triple $(z_{\mathrm{opt}}, r_{\mathrm{opt}}, k_{\mathrm{opt}})$ using the length $n = 255$ RS code, but with the code dimension ranging over $k \in \{1, 3, \ldots, 253\}$, the joint optimization took too long. Thus, we used the $(z_{\mathrm{opt}}, r_{\mathrm{opt}})$ of the corresponding cases for the $n = 15$ RS code and optimized only the code dimension $k$ for the larger code.

V. NUMERICAL RESULTS
In this section, we present the numerical results of the previous sections. Our baseline design of $M = 16$ applies to all the figures in this section. Furthermore, the Rician $K$-factor is $K = 10$ and the RS code length is $n = 15$ or 255 for the coded cases. The metric to evaluate performance is the $E_b/\eta_0$ measured at a BER of $10^{-3}$, and coding gain is also measured at this BER. Table 1 lists the optimal parameter pair $(z_{\mathrm{opt}}, r_{\mathrm{opt}})$ for the uncoded GFSK system with $z_g \in \{\infty, 1, 0.5\}$ in an AWGN, a Rician, or a Rayleigh fading channel. Table 2 lists the optimal parameter triple $(z_{\mathrm{opt}}, r_{\mathrm{opt}}, k_{\mathrm{opt}})$ for the coded GFSK system with $z_g \in \{\infty, 1, 0.5\}$ in an AWGN, a Rician, or a Rayleigh fading channel. Figs. 2-4 show the optimal coded ($n = 15$) and uncoded performance as a function of the modulation index $h$ and the time-bandwidth product of the lowpass equivalent Gaussian filter $z_g$, for $h$ between 0.5 and 1 (i.e., up to 50% saving in bandwidth) and $z_g \in \{0.5, 1, \infty\}$, in an AWGN, a Rician, and a Rayleigh fading channel, respectively. Performance curves, including those of the uncoded system, are plotted in terms of $P_b$ vs. $E_b/\eta_0$. We use Monte Carlo simulation to support the analysis in this paper, where the number of errors collected is 1000 for each data point.
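The stopping rule of collecting a fixed number of errors per data point can be sketched as below. This is an illustrative outline, not the authors' simulator: `estimate_error_rate` and the Bernoulli stand-in for a single symbol trial are hypothetical, and a real simulation would run the modulator, channel, and 2-pole demodulator inside `trial_has_error`. For small error probabilities, counting $N_e$ errors gives a relative standard error of roughly $1/\sqrt{N_e}$, i.e., about 3% for $N_e = 1000$.

```python
import random

def estimate_error_rate(trial_has_error, target_errors=1000, max_trials=10**7):
    """Run independent trials until `target_errors` errors are observed
    (or `max_trials` is reached) and return (estimate, number_of_trials)."""
    errors = 0
    trials = 0
    while errors < target_errors and trials < max_trials:
        trials += 1
        if trial_has_error():
            errors += 1
    return errors / trials, trials

# Stand-in for one simulated symbol decision (assumption of this sketch):
# an error occurs with a fixed, known probability.
rng = random.Random(1)
p_true = 0.05
p_hat, n_trials = estimate_error_rate(lambda: rng.random() < p_true)
print(p_hat, n_trials)
```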
Some observations from the figures and tables are as follows:
By the nature of the union bound, it is an upper bound that is not always tight, and for our case, in a fading channel, it provides a pessimistic estimate. An accurate simulation (one that collects enough errors) is needed for a good approximation of performance. Furthermore, as can be seen in Figs. 7-9, the gap between the actual performance and the union bound decreases as $K$ increases: in a Rayleigh fading channel, it can be over 6 dB; in a Rician fading channel, it is roughly 1 dB; and in an AWGN channel, since the union bound is tight, the gap is negligible. Using the union bound to compare the performance in AWGN vs. fading channel conditions would therefore give inaccurate results.
For any $z_g$ and any channel condition, the performance degrades with decreasing $h$, and it degrades faster when $K$ is larger and when $z_g$ is smaller.
For any $h$ and any channel condition, a smaller $z_g$ makes the transition between FSK symbols smoother and thus decreases the out-of-band spectrum, at the cost of more performance degradation.
For any $z_g$ and any channel condition, the performance degrades faster with decreasing $h$ for an uncoded system than for a coded system, meaning the power-bandwidth tradeoff leans towards bandwidth for a coded system. For example, for $z_g = 0.5$ in a Rician fading channel, the performance degradation between $h = 0.5$ and $h = 1$ (i.e., the amount of performance loss to save 50% of bandwidth) is 2.4 dB for a coded system and 6.8 dB for an uncoded system. If a more powerful code is used (e.g., an $n = 255$ RS code) and optimized, we could save more bandwidth for the same amount of performance degradation, as shown in Figs. 7-9.
For any $z_g$ and any channel condition, the coding gain increases with decreasing $h$. Furthermore, the coding gain is smaller when $K$ is larger, but it also increases faster with decreasing $h$. This is intuitive since coding is more effective when the uncoded performance is worse.
The performance degradation increases faster when $h$ is smaller, i.e., when the bandwidth is already small, we need to sacrifice more power in order to save further bandwidth. For example, for the coded 2-pole GFSK system with $z_g = 1$ in an AWGN channel, we sacrifice 0.8 dB in performance for a 30% saving in bandwidth, but we need to sacrifice another 2.5 dB (i.e., 3.3 dB in total) for another 20% (i.e., 50% in total) saving in bandwidth.
As can be seen in Tables 1 and 2, the optimal $z$ increases with decreasing $h$, regardless of whether the system is coded and regardless of whether the channel is fading. This is because decreasing $h$ results in more severe ICI and ISI, and we have to use a larger $z$ to reduce ISI, at the cost of increasing the noise power going into the system.
It can be seen from the tables that for all cases we evaluated, the optimal $r$ is 1 without Gaussian filtering (i.e., $z_g = \infty$), which is consistent with what we found in [5], and for most cases we evaluated, the optimal $r$ is in the vicinity of 0.9 with Gaussian filtering. This is due to the pulse-shaping effect of the Gaussian filter, which makes the filtered signal waveform peak at roughly 90% of the symbol duration.
It can be seen from Fig. 6 that for all cases we analyzed, i.e., any $z_g \in \{\infty, 1, 0.5\}$ and any channel condition, we can achieve at least a 30% saving in total system bandwidth at the cost of no more than 0.9 dB loss in $E_b/\eta_0$ when all the parameters of the coded 2-pole system are optimized.
As shown in Fig. 6, the performance loss of a GFSK system compared to the corresponding MFSK system is a function of $z_g$ and $h$. Specifically, it is a decreasing function of both $z_g$ and $h$: the smaller the bandwidth of the Gaussian filter, the greater the performance penalty, and the loss also increases when we decrease the tone spacing, i.e., when we want to save more system bandwidth.

VI. CONCLUSION
In previous work [5], [6], the authors addressed the design and performance analysis of an ultra-low power 16-FSK system, where they showed that the performance degradation between the proposed system and the optimal system was no greater than 1.2 dB in an AWGN or a Rician/Rayleigh fading channel. In this paper, we further showed that a large fraction of total system bandwidth can potentially be saved at the cost of a small extra performance loss by decreasing the tone spacing and optimizing key parameters, including filter bandwidth, sampling time, and code dimension. For a fixed loss in performance, the saving in system bandwidth is greater when the channel condition is worse, i.e., when the Rician $K$-factor is smaller. For example, in an AWGN channel, we could save 30% of system bandwidth at the cost of 0.56 dB in performance, or 50% of system bandwidth at the cost of 2.4 dB, where the metric to compare performance is $E_b/\eta_0$ measured at a BER of $10^{-3}$ for the optimal coded system; in a Rayleigh fading channel, we could save 30% of system bandwidth at the cost of 0.25 dB, or 50% of system bandwidth at the cost of 1 dB.
Furthermore, we extended the results to a GFSK system where a Gaussian filter is used to smooth the transition between pulses and decrease the out-of-band spectrum. We quantified the performance degradation of the GFSK system compared to the corresponding MFSK system as a function of the time-bandwidth product ($z_g$) of the Gaussian filter and the fraction of bandwidth saved. For example, if we want to save 25% of system bandwidth and we use a Gaussian filter with time-bandwidth product $z_g = 1$, the optimal coded GFSK system is worse than the optimal coded MFSK system by 1.4 dB.
Note that for the same system bandwidth, GFSK always loses to MFSK in terms of power by a small amount. However, this is compensated by the advantage of reducing sideband power and reducing interference with neighboring channels, though the advantage is not shown explicitly in this paper.

APPENDIX
We start by filtering an isolated rectangular-pulse cosine wave with frequency $f_i$ and phase $\theta$ through a 2-pole BPF centered at frequency $f_n$ and a Gaussian filter centered at frequency $f_i$. We use lowpass equivalent filtering for simplicity. The input signal is expressed as a function of the normalized time $r = t/T$, where $T$ is the symbol duration. From the lowpass equivalent input signal and the lowpass equivalent impulse response of the $n$th 2-pole BPF, we obtain the lowpass equivalent output of the rectangular pulse through the 2-pole BPF and the corresponding bandpass output. We now include the Gaussian filter and use a result from [20] that involves the error function $\mathrm{erf}(\cdot)$ and the parameter $\lambda$, defined as
$$\lambda \triangleq \frac{\delta}{T} = \frac{\sqrt{\ln 2}}{2\pi z_g}.$$
The lowpass-equivalent output of the Gaussian filter, $y_{lp}(r)$, then follows by letting $t = (\tau - r)/\lambda$ and $C = \pi z - j2\pi(i - n)h$ and carrying out the integrals using [20]. In the corresponding bandpass output signal, $\theta_1$ is the phase associated with the transmitted symbol. Now that we know the output of the signal with frequency $f_i$ through branch $n$, $y_{in}(r)$, we can find the output of the previous pulse with frequency $f_{m_1}$ through branch $n$, $y_{m_1 n}(r)$, in a similar manner, to be
$$\begin{aligned}
y_{m_1 n}(r+1) &= \Re\{y_{lp}(r+1)\,e^{-j(\omega_{m_1}(r+1)T + \theta_0)}\} \\
&= \Re\{y_{lp}(r+1)\}\cos(\omega_{m_1}(r+1)T + \theta_0) + \Im\{y_{lp}(r+1)\}\sin(\omega_{m_1}(r+1)T + \theta_0) \\
&= m_{m_1 n}(r+1)\cos(\omega_i rT + \omega_{m_1 i} rT + \omega_{m_1} T + \theta_0) + \mu_{m_1 n}(r+1)\sin(\omega_i rT + \omega_{m_1 i} rT + \omega_{m_1} T + \theta_0) \\
&\triangleq m_{m_1 n}(r+1)\cos(\omega_i rT + X) + \mu_{m_1 n}(r+1)\sin(\omega_i rT + X),
\end{aligned} \tag{36}$$
where $\theta_0$ is the phase associated with the previous transmitted symbol, and we assume $\theta_0 = -\omega_{m_1 i} rT - \omega_{m_1} T \Rightarrow X = 0$ without loss of generality. Letting $X = 0$ in (36) yields $y_{m_1 n}(r+1) = m_{m_1 n}(r+1)\cos(\omega_i rT) + \mu_{m_1 n}(r+1)\sin(\omega_i rT)$.
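The step from the complex-envelope form to the cosine and sine terms in the chain of equalities for $y_{m_1 n}(r+1)$ relies on a standard identity. It is spelled out here for a generic complex number $m + j\mu$ and angle $\varphi$; these symbols are chosen for illustration only.

```latex
\Re\{(m + j\mu)\,e^{-j\varphi}\}
  = \Re\{(m + j\mu)(\cos\varphi - j\sin\varphi)\}
  = m\cos\varphi + \mu\sin\varphi .
```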