Software-Defined DDPM Modulators for D/A Conversion by General-Purpose Microcontrollers

The software implementation of Dyadic Digital Pulse Modulators (DDPMs) for Digital to Analog (D/A) conversion is addressed in this paper. In particular, an enhanced software DDPM implementation is proposed and compared with a plain, iterative software transposition of the basic DDPM hardware architecture. Experimental results on an 8-bit software-defined DDPM D/A converter implemented on a Texas Instruments c2000 microcontroller platform validate the approach, revealing for the novel optimized software DDPM a 6X higher maximum sample rate than the simple iterative implementation on the same microcontroller and at the same system clock frequency. Based on measurements, an 8-bit DDPM DAC featuring the proposed optimized implementation operates at 7.8 kS/s with a maximum INL of 1.64 LSB, a maximum DNL of 1.79 LSB, an SFDR of 47.02 dB and an SNDR of 45.27 dB, corresponding to 7.23 ENOB, demonstrating the effectiveness and the applicability of the proposed approach for implementing low-cost, software-defined D/A converters in microcontroller-based embedded systems.


I. INTRODUCTION
Dyadic digital pulse modulation (DDPM) was introduced in [1] to generate deterministic bitstreams with a pulse density proportional to a binary input code, in which the most relevant spurious energy content is pushed to high frequency. The spectral characteristics of DDPM streams have been exploited to relax the constraints on the reconstruction filter in baseband digital-to-analog (D/A) conversion [1]-[3] and to perform dynamic offset calibration in digital-based operational transconductance amplifiers [4], [5]. They have also been applied in analog-to-digital (A/D) conversion [6], [7], in digital RF amplitude modulation [8] and in digitally-controlled power converters [9], [10], where they increase the effective resolution of digital pulse width modulators (DPWM) as required to avoid the onset of limit cycle oscillations.
Compared to other modulations and dithering techniques, DDPM is amenable to an extremely area- and power-efficient all-digital hardware (HW) implementation, which makes it attractive in tightly power- and cost-constrained Internet of Things (IoT) sensor nodes [2], [3], [11]-[18]. Along with the theoretical assessment of the modulation and its spectral properties, the efficient HW implementation of a DDPM modulator has been addressed starting from [1], in which two different DDPM modulator architectures have been proposed targeting an FPGA DAC implementation alternative to DPWM [19] and Sigma-Delta [20]. One of the DDPM modulator HW architectures proposed in [1] was also adopted in the standard-cell based synthesized DDPM modulators integrated in 40 nm CMOS presented in [6], [7] and in [4], and is also the basis of the DDPM modulator included in the Dyadic Digital Pulse Width Modulator (DDPWM) proposed in [9], [10] and of the DDPM-based RF modulator implementation in [8]. Finally, a different DDPM modulator HW implementation has been proposed in [3], to achieve graceful performance degradation under frequency and supply voltage overscaling.
While significant attention has been paid so far to the digital HW implementation of a DDPM modulator, the software (SW) implementation of a DDPM modulator on traditional microprocessor/microcontroller HW has not been specifically addressed. In many practical cases, this limits the applicability and the advantages of the DDPM modulation to application specific integrated circuit (ASIC) or FPGA implementations.
In this paper, different DDPM modulator architectures are discussed and compared with regard to their HW and SW implementation. In particular, a new DDPM modulator architecture, which is amenable to an efficient SW implementation, is proposed and compared to a SW architecture resulting from the straightforward translation of the HW implementation. The effectiveness and the efficiency of the proposed SW DDPM modulator are finally demonstrated by the implementation of a software-defined 8-bit DDPM DAC on an off-the-shelf Texas Instruments c2000 microcontroller platform [21].
The rest of the manuscript is organized as follows. In Sect.II, the DDPM modulation and its spectral properties are briefly reviewed. In Sect.III, the DDPM modulator HW implementations proposed so far are reviewed and compared in terms of HW complexity and performance, also in view of their possible translation into SW. Moreover, a new optimized DDPM modulator architecture specifically intended for SW implementation is proposed and compared to the other solutions. The effectiveness of the proposed solution is then verified in Sect.IV by the SW implementation of an 8-bit DDPM DAC in a general purpose microcontroller unit, whose operation is experimentally verified in the same Section. Some concluding remarks are finally drawn in Sect.V.

II. THE DDPM MODULATION
Before discussing the HW and SW implementations of a DDPM modulator, the definition and the basic spectral properties of a DDPM stream, as introduced in [1], are reviewed in this Section.

A. DDPM STREAM DEFINITION
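The DDPM modulation, as originally proposed in [1], is a digital modulation technique which associates to an integer $n$ with a binary representation on $N$ bits

$$n = \sum_{i=0}^{N-1} b_i\, 2^i \qquad (1)$$

the periodic digital stream obtained by superposition of orthogonal dyadic basis functions $S_i(t)$ ($i = 0, \ldots, N-1$) defined on the fundamental period $(0, 2^N T_{clk})$ as:

$$S_i(t) = \sum_{h=0}^{2^i-1} \Pi\left(\frac{t}{T_{clk}} - h\,2^{N-i} - 2^{N-i-1} + 1\right) \qquad (2)$$

$$\Sigma_n(t) = \sum_{i=0}^{N-1} b_i\, S_i(t) \qquad (3)$$

where $T_{clk}$ is the clock cycle and $\Pi(x)$ is the unit pulse ($\Pi(x) = 1$ for $0 \le x \le 1$ and $\Pi(x) = 0$ elsewhere).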
Orthogonal dyadic basis functions $S_i(t)$ are $N$ non-overlapping, periodically repeated digital streams of $2^N$ clock cycles, organized so that $S_{N-1}$ is high (i.e. at $V_{DD}$) every other clock cycle (i.e., in $2^{N-1}$ cycles per period), $S_{N-2}$ is high every other cycle in which $S_{N-1}$ is low (i.e., in $2^{N-2}$ cycles per period), $S_{N-3}$ is high every other cycle in which both $S_{N-1}$ and $S_{N-2}$ are low (i.e., in $2^{N-3}$ cycles per period) and so on, down to $S_0$, which is high in just one cycle per period, as shown in Fig.1.
Since the basis functions $S_i(t)$ are non-overlapping, i.e.

$$S_i(t)\, S_j(t) = 0 \quad \forall\, i \neq j \qquad (4)$$

and high in exactly $2^i$ clock cycles per period, DDPM streams $\Sigma_n$ defined in (3) are high for exactly $n$ clock cycles per period and their pulse density is therefore $n/2^N$, as observed in Fig.1, where the construction of a DDPM stream by superposition of dyadic basis functions $S_i(t)$ is illustrated for $n = 10$.
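The construction by superposition can be checked numerically with a minimal C sketch. The function names and the slot numbering are assumptions made here for illustration: basis function $S_i$ is taken to be high in slot $j$ when $j + 1$ is an odd multiple of $2^{N-1-i}$, which is one slot assignment consistent with the description above ($S_{N-1}$ high every other cycle, $S_0$ high in a single cycle).

```c
#include <stdint.h>

/* Basis function S_i sampled at clock slot j (0 <= j < 2^N).
 * Assumed slot assignment: S_i is high when j + 1 is an odd multiple of
 * 2^(N-1-i), so S_{N-1} is high in slots 0, 2, 4, ... and S_0 in one slot. */
static unsigned ddpm_basis(unsigned i, uint32_t j, unsigned nbits)
{
    uint32_t step = 1u << (nbits - 1u - i);   /* 2^(N-1-i) */
    uint32_t k = j + 1u;
    return (k % step == 0u) && ((k / step) % 2u == 1u);
}

/* DDPM stream value in slot j: superposition of the basis functions
 * weighted by the bits b_i of n. Since the basis functions do not overlap,
 * at most one term is non-zero in each slot. */
static unsigned ddpm_stream(uint32_t n, uint32_t j, unsigned nbits)
{
    unsigned out = 0u;
    for (unsigned i = 0u; i < nbits; ++i) {
        unsigned b_i = (unsigned)((n >> i) & 1u);
        out |= b_i & ddpm_basis(i, j, nbits);
    }
    return out;
}
```

Summing `ddpm_stream(n, j, N)` over the $2^N$ slots of one period gives $n$, i.e. a pulse density of $n/2^N$.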
The definition of the DDPM modulation given in (3) can be shown to be equivalent to the expression provided in [3], according to which the DDPM bitstream Σ n associated to the integer n represented in binary as in (1) can be expressed as in which [·, ·] is the binary string concatenation operator.

B. SPECTRAL PROPERTIES OF DDPM STREAMS AND APPLICATIONS
The spectra of a DDPM stream have been evaluated in [1] by the Fourier transform of (3) as:

[Fig. 2: Illustration of the relation between the harmonic spectral coefficients $c_{k,n}$ and the binary representation $B_n$ of the digital input code $n$, as presented in [8], for the digital input $n = 10$ (i.e. $B_n = (1, 0, 1, 0)$).]
Moreover, in [8] it has been shown that the spectral coefficients $c_{k,n}$ in (6) can be equivalently expressed in terms of $\nu_2(k)$, which, from number theory [22], is the dyadic order of the integer $k$, i.e., the largest exponent $\nu$ such that $2^\nu$ divides $k$.
In other words, as illustrated in Fig.2, the DC component ($k = 0$) of the DDPM stream is, as expected, equal to the value of the binary input $n$, whereas the higher-order spectral coefficients, which give the amplitude of the $k$-th harmonic, are determined by the word consisting of the $\nu_2(k) + 1$ least significant bits (LSBs) of the binary representation of the input code $n$, interpreted as a signed integer in two's complement notation.
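The two integer quantities involved in this relation can be sketched in C as follows. The helper names are hypothetical, and the sketch only computes the dyadic order and the signed LSB word; any scaling or phase factor appearing in the expression of $c_{k,n}$ in [8] is not reproduced here.

```c
#include <stdint.h>

/* Dyadic order nu_2(k): largest nu such that 2^nu divides k.
 * Only defined for k > 0 (the loop would not terminate for k = 0). */
static unsigned dyadic_order(uint32_t k)
{
    unsigned nu = 0u;
    while ((k & 1u) == 0u) {
        k >>= 1;
        ++nu;
    }
    return nu;
}

/* The `width` least significant bits of n, read as a two's complement
 * signed integer. Valid for width between 1 and 31. */
static int32_t signed_lsbs(uint32_t n, unsigned width)
{
    uint32_t mask = (1u << width) - 1u;
    uint32_t word = n & mask;
    uint32_t sign = 1u << (width - 1u);
    /* Flip the sign bit, then subtract its weight: standard sign extension. */
    return (int32_t)(word ^ sign) - (int32_t)sign;
}
```

For the input $n = 10$ of Fig.2 and $k = 2$, `dyadic_order(2)` is 1, so the relevant word is made of the two LSBs `10`, which `signed_lsbs(10, 2)` reads as $-2$.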

C. APPLICATIONS OF DDPM
The interest of DDPM in D/A conversion [1]- [3] is related to the envelope of the spectra of DDPM signals evaluated as in (6) over different input codes n, i.e.
which is reported in Fig.3 (top) for $N = 16$ bit. Indeed, it can be observed that, while the DC component of DDPM streams is proportional to the digital input $n$, their AC spectral components are concentrated at high frequencies and can be easily filtered out. In detail, based on (8), the energy of the spectral components is related to $2^{\nu_2(k)}$, and increases with $k$ with a slope of 6 dB/oct or 20 dB/dec. It follows that in a DAC based on DDPM, a first-order filter with a cutoff frequency of $f_c = f_{clk}/2^N\sqrt{3}$ is sufficient to keep all the spurious DDPM spectral components at $-6(N + 1)$ dB with respect to the DC [1], as illustrated in Fig.3 (bottom). The same properties have also been exploited to design DDPM-based voltage-mode and current A/D converters, as described in [6], [7].
In view of the low-frequency characteristics of DDPM streams highlighted above, DDPM has also been adopted in power electronics [9], [10], as a dithering technique to increase the effective resolution of DPWM modulators in digitally controlled switching mode power converters, so as to suppress quantization-induced limit cycle oscillations with minimum output voltage ripple degradation and without impairing the dynamic performance, and in dynamic offset calibration of analog integrated circuits [4].
Finally, leveraging (8), in [8] it has been observed that the same DDPM can be conveniently adopted for mixerless RF digital amplitude modulation in SW-defined radio transmitters.
Although previously proposed FPGA and ASIC DDPM modulators are very compact and energy efficient, the implementation of a DDPM modulator in low-cost electronic systems based on off-the-shelf components is often inconvenient, since commercial microcontroller units do not include at present a DDPM modulator peripheral, and the implementation of the DDPM modulator by a separate FPGA, ASIC or spare logic is impractical and possibly expensive.
Addressing the above limitation, the SW implementation of a DDPM modulator is tackled in this paper. For this purpose, previously proposed HW architectures are reviewed, and their possible SW implementation is discussed. Moreover, new DDPM modulator architectures, suitable for an efficient SW implementation on standard digital HW (microprocessor/microcontroller unit), are introduced and compared.

A. PARALLEL DDPM MODULATOR
The first DDPM modulator implementation proposed in [1] is based on a single parallel-input, serial-output (PISO) digital register. In this architecture, the bits of the register are loaded in parallel with the bits of the input code according to the DDPM pattern, as illustrated in Fig.4, and are then streamed out serially to the output.
Such a solution does not require any combinational logic and is therefore suitable to operate at a very high clock frequency, but its complexity (number of D flip-flops) increases exponentially with the number of bits $N$ of the modulator, so that its area and power efficiency are rather low for $N$ exceeding 4-5. Due to this limitation, the parallel architecture in Fig.4 has not been adopted so far in FPGA and ASIC implementations. Since the DDPM function is hardwired in the interconnections of the different memory elements of the circuit in Fig.4, such a solution is clearly not suitable for SW implementation.

B. PRIORITY MUX-BASED DDPM MODULATOR
A different DDPM modulator implementation, which was also proposed in [1], is shown in Fig.5 and its behavior is described in synthesizable behavioral VHDL in Fig.6. Its operation is based on a priority MUX, i.e. a combinational network with $2N$ inputs, namely $N$ data inputs $D_{N-1} \cdots D_0$ and $N$ selection inputs $S_{N-1} \cdots S_0$, and one output $X$, described by the Boolean function

$$X = \sum_{k=0}^{N-1} D_{N-1-k}\, S_k \prod_{i=0}^{k-1} \overline{S_i}$$

where sums and products are intended as Boolean OR and AND operators, respectively. The output $X$ takes the value of $D_{N-1-k}$, i.e. the $(k+1)$-th bit of the data input counting from the MSB, $k$ being the index of the first "one" in the selection inputs starting from the LSB, i.e. the index for which $S_k = 1$ and $S_i = 0$ for all $i < k$.
The data inputs of the priority MUX are fed by the DAC input data register, while the selection inputs are connected to a free-running $N$-bit binary counter, so that every other clock period $S_0 = 1$ and the output $X$ takes the value $D_{N-1}$ of the MSB of the input data. Moreover, in the remaining counting periods in which $S_0 = 0$, in one half of the cases (i.e. $2^{N-2}$ times in a full count), $S_1 = 1$ and the output $X$ takes the value of the second MSB $D_{N-2}$ of the input, and so on, so that the output $X$ is driven in a full counting period to the logical value of the bit $D_i$ of the input exactly $2^i$ times, arranged according to the DDPM pattern in Fig.1.
The priority MUX-based DDPM modulator HW architecture achieves a good complexity-performance tradeoff and has therefore been widely adopted so far both in ASIC and in FPGA implementations [1]-[3], [7]-[9]. A modified version of the same DDPM modulator architecture, which achieves graceful performance degradation under clock frequency and supply voltage overscaling thanks to the special custom implementation of the priority MUX shown in Fig.7, has also been proposed in [2].

C. ITERATIVE DDPM MODULATOR
Since the DDPM modulator presented in Sect.III-B requires a priority MUX combinational network, which is normally neither included in arithmetic logic units (ALUs) nor implemented by a dedicated opcode in general purpose microcontrollers, it is not immediately suitable for SW implementation. The same logical behavior described by the VHDL code in Fig.6 can, however, be translated almost straightforwardly into a procedural form, and implemented iteratively by the architecture shown in Fig.8.
In this architecture, the LSB of the binary counter is checked first: if $COUNT_0 = 1$, the DDPM output is assigned the value of the MSB of the input data register (i.e., $OUT = IN_{N-1}$) and the procedure is terminated. Otherwise, the second LSB of the binary counter is checked: if $COUNT_1 = 1$, the DDPM output is assigned the value of the second MSB of the input data register (i.e., $OUT = IN_{N-2}$) and the procedure is terminated, and so on for the remaining bits.
A HW implementation of an $N = 4$ iterative DDPM modulator is shown in Fig.8. It includes two shift registers used as "one hot" counters: a right shift register (RSR) and a left shift register (LSR), which are initialized with the binary values 1, 0, 0, ..., 0 and 0, 0, ..., 0, 1, respectively. For each iteration, the LSR output is ANDed with the binary counter value. If the result of the AND operation is true, the DDPM output bit is updated according to the result of the AND of the RSR and the input data register, and the iteration is terminated. Otherwise, the iteration continues until the "hot one" is shifted out of the RSR and the RSR register content becomes zero.
Unlike the architectures discussed in Sect.III-A and Sect.III-B, the iterative architecture in Fig.8 is suitable for SW implementation in a general purpose microcontroller and can be translated into the C-code in Fig.9. Here, the LSR (RSR) is mapped into the mask (cmask) variable, which is left- (right-) shifted by one position at each iteration of the while loop. The content of mask is put in AND with the binary counter value COUNT, and the result is checked by an if-else construct. If the condition is true, the output variable bit returned by the function is either 1 or 0, depending on the result of the AND operation between cmask and the input data value, and the while loop iteration is terminated. Otherwise, the while loop proceeds with one more iteration, and the same operations are performed until cmask becomes zero.
The iterative SW implementation presented above is effective but not efficient in terms of execution time, since $N$ iterations of the while loop are needed in the worst case, i.e. when just the MSB of COUNT is one and all the other bits are zero, to get one bit of the DDPM output stream. Since the sample rate needs to be the same for any input code, the worst-case execution time limits in practice the maximum sample rate at which the DDPM can be operated, resulting in modest performance.
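The iterative procedure can be sketched in C as shown next. This is not the listing of Fig.9, which is not reproduced here: the function name and signature are assumptions, and only the variable names mask and cmask are taken from the description above.

```c
#include <stdint.h>

/* One bit of the DDPM stream for input code `in` at counter value `count`,
 * obtained by scanning the counter from its LSB (iterative approach).
 * Hypothetical signature; nbits is the modulator resolution N. */
static unsigned ddpm_bit_iterative(uint32_t in, uint32_t count, unsigned nbits)
{
    uint32_t mask  = 1u;                    /* LSR role: walks the counter, LSB first */
    uint32_t cmask = 1u << (nbits - 1u);    /* RSR role: walks the input, MSB first   */

    while (cmask != 0u) {
        if ((count & mask) != 0u) {
            /* First "one" of the counter found: output the paired input bit. */
            return (in & cmask) != 0u;
        }
        mask  <<= 1;
        cmask >>= 1;
    }
    return 0u;  /* count == 0: no input bit is selected in this slot */
}
```

The loop exits after one iteration when the counter is odd and after $N$ iterations when only the MSB of the counter is set, which is the worst case that bounds the usable interrupt period.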

D. PROPOSED OPTIMIZED DDPM MODULATOR
In order to address the limitations of the iterative DDPM modulator presented above, a new DDPM architecture suitable for SW implementation is proposed in Fig.10. In such an architecture, the position of the first "one" starting from the LSB in the content of the binary counter COUNT is obtained by bitwise XOR of the present value of COUNT and of the previous value of the counter, i.e. COUNT-1. The result of the XOR is a thermometric-encoded binary word in which all the bits starting from the LSB and up to the first "one" of COUNT are high, whereas the remaining bits, up to the MSB, are low. Such a word is then right-shifted by one position and incremented by one unit, so as to obtain a word with a single bit at "one", which turns out to be exactly in the position of the first "one" of the binary counter starting from the LSB. This word, when put in AND with the bit-reversed DDPM input, gives the bit of the DDPM stream corresponding to COUNT and is the output of the DDPM procedure.
This architecture is amenable to SW implementation and can be translated into the C-code shown in Fig.11. The bitwise XOR of COUNT (before incrementing) and COUNT++ (after incrementing, which gives the present counter value) provides information on the position of the first "one" starting from the LSB, in a thermometric-encoded fashion. In the example illustrated in Fig.10, since the first "one" lies at the second LSB position, i.e., COUNT=1010, the result contains two ones, i.e., 0011. In order to retrieve the position of the first "one", the result is right-shifted by one position and once more incremented by one unit. The result (i.e., 0010 in the example) is then bit-wise ANDed with the bit-reversed input data. The input data must be bit-reversed to align the respective bit positions. It is worth observing that the input bit-reversal operation does not need to be performed at each DDPM clock period, but only once in a DDPM pattern consisting of $2^N$ clock periods and DDPM function evaluations, so it has a negligible impact on the execution time. The DDPM output is then updated based on the result of the AND (if-condition in the code). For IN=1001 in the example, the DDPM output takes the value "zero", since the second MSB of the input data, corresponding to the second LSB of the counter value, is "zero".
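The XOR, shift and increment steps can be illustrated by the following C sketch, which is not the listing of Fig.11. Function names and signatures are assumptions. Two choices are made here for the sketch: the previous counter value is formed explicitly as `count - 1u`, since reading and post-incrementing the same variable within one C expression is undefined behavior, and the arithmetic is carried out on 32 bits, wider than the $N$-bit input.

```c
#include <stdint.h>

/* Reverse the nbits least significant bits of `in`. Needed only once per
 * DDPM period (2^N slots), not at every slot. */
static uint32_t ddpm_bit_reverse(uint32_t in, unsigned nbits)
{
    uint32_t rev = 0u;
    for (unsigned i = 0u; i < nbits; ++i) {
        rev = (rev << 1) | ((in >> i) & 1u);
    }
    return rev;
}

/* One bit of the DDPM stream, evaluated without any loop over counter bits.
 * `in_reversed` is the bit-reversed input code, `count` the present counter
 * value in 0 .. 2^N - 1 (N < 32 assumed). */
static unsigned ddpm_bit_optimized(uint32_t in_reversed, uint32_t count)
{
    /* Thermometric word: ones from the LSB up to the first "one" of count. */
    uint32_t thermo  = count ^ (count - 1u);
    /* Shift and increment: a single "one" at the first-"one" position. */
    uint32_t one_hot = (thermo >> 1) + 1u;
    /* For count == 0 the single "one" lands on bit 31, outside the N-bit
     * input, so that slot is always low. */
    return (in_reversed & one_hot) != 0u;
}
```

With the values of Fig.10, `ddpm_bit_optimized(ddpm_bit_reverse(0x9, 4), 0xA)` evaluates the intermediate words 0011 and 0010 and returns 0, as in the example above.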
In this architecture, the whole N -bits of binary counter are evaluated in parallel to decide the DDPM output code, with a significant reduction of the execution time, which is now

IV. HARDWARE TEST SETUP AND EXPERIMENTAL RESULTS
To validate the proposed DDPM modulators, two 8-bit software-defined DDPM DAC prototypes have been implemented on a c2000 microcontroller [21] using the simple iterative approach presented in Sect.III-C and the optimized SW implementation proposed in Sect.III-D, and their static and dynamic performance have been experimentally tested and compared.

A. MICROCONTROLLER-BASED DDPM DAC AND EXPERIMENTAL TEST SETUP
To highlight the effectiveness of the proposed DDPM technique, both the proposed and the iterative 8-bit software-defined DDPM modulators, based on the architectures in Fig.10 and Fig.8, respectively, have been implemented on a Texas Instruments c2000 microcontroller in C language, so as to drive a 3.3 V general-purpose digital output with a new DDPM pulse, evaluated as in Fig.9 for the iterative implementation and as in Fig.11 for the proposed optimized implementation, in correspondence of a periodically-generated interrupt signal from the c2000 microcontroller timer peripheral [21]. The interrupt period has been tuned, both for the iterative and for the optimized DDPM modulator, to the minimum value compatible with correct operation of the DDPM DACs. The software-defined DDPM DACs operate at a system clock frequency of 150 MHz and include an on-board output RC filter with R = 100 kΩ and C = 1 nF.
The DAC output voltage $V_{DDPM}(n)$ has been measured under static conditions and under sine-wave input with the test setup in Fig.12, applying the double-slope error digital compensation technique described in [1] and also adopted in [2], [3], [6], [7], which consists in applying to the DDPM DAC a digitally-predistorted input code evaluated from the integer $n$ to be converted as: where · is the rounding operator to the closest integer and $\alpha$ is a compensation factor evaluated based on one-time calibration so as to compensate the error due to the unbalance in the rise/fall times of the digital pulses generated by the microcontroller output drivers [1].

B. EXPERIMENTAL RESULTS AND DISCUSSION
For both the iterative and the proposed optimized DDPM modulators, the digital bitstream outputs have been observed for an input code whose output bitstream sequence is 0,1,0,1,..., in which $2^N/2$ pulses are ones and the remaining $2^N/2$ pulses are zeros. Considering the sequence and the timing of a single pulse, the minimum DDPM unit pulse duration $T_{DDPM} = 1/f_{DDPM}$ set by the internal timer interrupts in our microcontroller platform is found to be 500 ns for the optimized DDPM implementation and 3 µs for the iterative implementation, which correspond, for an $N = 8$-bit DDPM DAC, to a maximum sample rate $f_S = f_{DDPM}/2^N$ of 7.81 kS/s and of 1.30 kS/s for the optimized and the iterative implementation, respectively, as reported in Tab.1.
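As a worked check of these figures, the sample rates follow directly from the measured unit pulse durations and from $f_S = f_{DDPM}/2^N$ with $N = 8$. The superscripts opt and iter are labels introduced here only to distinguish the two implementations.

```latex
\begin{align*}
f_{DDPM}^{opt} &= \frac{1}{500\,\mathrm{ns}} = 2\,\mathrm{MHz}, &
f_S^{opt} &= \frac{2\,\mathrm{MHz}}{2^8} = 7.8125\,\mathrm{kS/s},\\
f_{DDPM}^{iter} &= \frac{1}{3\,\mu\mathrm{s}} \approx 333.3\,\mathrm{kHz}, &
f_S^{iter} &\approx \frac{333.3\,\mathrm{kHz}}{2^8} \approx 1.302\,\mathrm{kS/s},\\
\frac{f_S^{opt}}{f_S^{iter}} &= \frac{3\,\mu\mathrm{s}}{500\,\mathrm{ns}} = 6.
\end{align*}
```

At the 150 MHz system clock, these pulse durations correspond to a budget of 75 clock cycles per DDPM bit for the optimized implementation and of 450 clock cycles per bit for the iterative one, interrupt overhead included.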
Based on the measured static input-output transcharacteristic of the two DDPM DACs, their integral nonlinearity (INL) error and their differential nonlinearity (DNL) error, both expressed in terms of the least significant bit at 8-bit resolution $LSB_8 = V_{DD}/2^8 = 12.9$ mV, are reported in Fig.14 and Fig.15.
For dynamic testing, input codes corresponding to a 25 Hz sine wave with 90% full-swing amplitude are generated using the sin library function in the C-code, where $f_o = 25$ Hz and $n$ is the discrete-time index. The time-domain and the frequency-domain DAC outputs under sine wave input are shown in Fig.16 and in Fig.17 for the iterative and for the proposed optimized DDPM DAC. The power spectra (PS) of the measured DAC waveforms are obtained using the pwelch MATLAB function. Based on the measured PS, the signal to noise and distortion ratio (SNDR), the spurious free dynamic range (SFDR) and the effective number of bits (ENOB) are determined by (16), (17) and (18) in what follows. The SNDR, in particular, is the ratio of the power of the fundamental to the total power of noise and distortion. The SFDR is defined as

$$SFDR = PS_{dBFS}(f_0) - PS_{dBFS}(f_n) \qquad (17)$$

where $PS_{dBFS}(f_0)$ is the power of the fundamental and $PS_{dBFS}(f_n)$ is the power of the largest spurious component, expressed in decibel with respect to the full swing of the converter (dBFS). Based on the SNDR, the effective number of bits (ENOB) of the converter is finally evaluated as

$$ENOB = \frac{SNDR - 1.76\,\mathrm{dB}}{6.02\,\mathrm{dB}} \qquad (18)$$

Based on measurements, the SNDR, SFDR and ENOB evaluated by the above equations for both the proposed and the iterative DDPM DACs are reported in Tab.1.
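The static and dynamic figures of merit can be computed from measured data along the lines of the following C sketch. The INL and DNL definitions used here (deviation from the ideal line $n \cdot LSB$ and deviation of each step from one LSB) are common conventions assumed for illustration and may differ from the exact expressions adopted in the measurements; the function names are hypothetical. The ENOB relation is the standard $(SNDR - 1.76)/6.02$.

```c
#include <stddef.h>

/* INL at code n, in LSB: deviation of the measured output v[n] from the
 * ideal value n * lsb (assumed definition). */
static double inl_lsb(const double *v, size_t n, double lsb)
{
    return (v[n] - (double)n * lsb) / lsb;
}

/* DNL at code n, in LSB: deviation of the measured step v[n+1] - v[n]
 * from one ideal LSB (assumed definition). Requires n + 1 to be a valid index. */
static double dnl_lsb(const double *v, size_t n, double lsb)
{
    return (v[n + 1] - v[n] - lsb) / lsb;
}

/* Effective number of bits from the SNDR expressed in dB. */
static double enob_from_sndr(double sndr_db)
{
    return (sndr_db - 1.76) / 6.02;
}
```

With the measured SNDR of 45.27 dB, `enob_from_sndr(45.27)` gives approximately 7.23, in agreement with the ENOB reported for the optimized implementation.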

C. DISCUSSION AND COMPARISON
The results of the measurements performed on the microcontroller reported so far are summarized in Tab.1, where they are also compared to DDPM and non-DDPM FPGA DAC implementations proposed in recent years. Looking at the table, the enhanced SW DDPM DAC proposed in this paper is found to be better in all aspects than the simple iterative SW implementation. The maximum achievable sample rate, in particular, is 7.812 kS/s, i.e. 6X higher than in the iterative SW DDPM DAC implementation, which reaches just 1.302 kS/s, thus significantly widening its application domain. Considering the static characteristics, the INL peak error of the proposed converter is 1.64 LSBs compared to 2.8 LSBs for the iterative converter. The better behavior of the proposed DAC is further confirmed by the measured DNL, which is 1.79 LSBs compared to 3.53 LSBs for the iterative implementation, and by the dynamic characterization (i.e., SNDR, SFDR, ENOB), which gives a clear picture of the improvement. Moreover, the proposed SW DDPM takes less memory and is more efficient than the iterative one.
Compared to FPGA implementations of a DDPM DAC [1] and non-DDPM DACs, like the Digital PWM FPGA DAC in [20] and the FPGA Σ∆ DAC in [19], whose main parameters are reported in Tab.1 for comparison, the proposed DDPM DAC achieves comparable performance in terms of sample rate (5X more than [1], 2.5X less than [19], [20]) and slightly worse effective resolution (2.0-4.8 effective bits less than [19]- [1]). Unlike the other solutions, however, it does not require expensive programmable logic devices and can be easily implemented in software in any general purpose microcontroller unit at a very low cost.

V. CONCLUSION
The hardware and software implementation of Dyadic Digital Pulse Modulators (DDPMs) for Digital to Analog (D/A) conversion has been addressed, and an enhanced software DDPM implementation has been proposed and compared with a plain, iterative software transposition of the DDPM hardware architecture. The effectiveness of the proposed technique has been demonstrated by software-defined 8-bit DDPM DACs described and programmed in C and implemented on a c2000 microcontroller platform.
Based on experimental results, the software-defined DDPM DAC featuring the proposed enhanced DDPM modulator achieves 6X higher maximum sample rate and better static and dynamic performance (+0.61 ENOB) compared to the plain iterative software implementation.
The reported performance is comparable with that of previously proposed DDPM and non-DDPM DACs implemented on FPGA at a significantly higher cost and system complexity, and is compatible with the requirements of many practical applications [23].