On the Design and Modeling of a Full-Range Piezoelectric MEMS Loudspeaker for In-Ear Applications

MEMS loudspeakers are emerging as very promising solutions to meet the ever-increasing requirements for modern audio devices to become smaller, lighter and potentially more power efficient. The piezoelectric actuation principle, thanks to the relatively large driving force achievable at low voltages, represents the most promising implementation of loudspeakers at the microscale. Although a significant number of new structures have been proposed in recent years, research work is still needed both at the design level, in order to obtain full-range microspeakers with good sound quality, and at the simulation level, to accurately capture the linear and nonlinear responses of this type of device. We here propose the design, modeling and characterization of a high-performance piezoelectric MEMS speaker for in-ear applications, based on a piston-like movement of the microspeaker central component, connected to the actuators through a set of folded springs. The device features a Sound Pressure Level (SPL) greater than <inline-formula> <tex-math notation="LaTeX">$\mathrm {107 \text {dB}}$ </tex-math></inline-formula> from <inline-formula> <tex-math notation="LaTeX">$\mathrm {500 \text {Hz}}$ </tex-math></inline-formula> onwards for actuation voltages of 30 <inline-formula> <tex-math notation="LaTeX">$\text{V}_{\text {pp}}$ </tex-math></inline-formula> and a compact footprint of <inline-formula> <tex-math notation="LaTeX">$4.5 \times 4.5\ \text {mm}^{2}$ </tex-math></inline-formula>. A Total Harmonic Distortion (THD) smaller than <inline-formula> <tex-math notation="LaTeX">$\mathrm {1 \%}$ </tex-math></inline-formula> has also been observed at <inline-formula> <tex-math notation="LaTeX">$\mathrm {1 \text {k} \text {Hz} }$ </tex-math></inline-formula> at <inline-formula> <tex-math notation="LaTeX">$\mathrm {94 dBSPL}$ </tex-math></inline-formula>. Therefore, even if at the prototype stage, the proposed device represents a promising solution towards a new set of high-performance piezo-MEMS speakers that do not require additional closing membranes to minimize acoustic losses. An excellent numerical-experimental matching in terms of SPL was also demonstrated, thus opening the path to a new systematic design procedure for this class of MEMS structures. [2023-0113]

Chiara Gazzola, Valentina Zega, Member, IEEE, Fabrizio Cerini, Silvia Adorno, and Alberto Corigliano, Member, IEEE

I. INTRODUCTION
Loudspeakers are electroacoustic transducers able to convert an electrical signal into a corresponding sound. Thanks to the increasing demand for their use in smartphones and laptops and to in-ear applications such as earphones and hearing aids, loudspeakers are fundamental components for the consumer electronics market. Traditional loudspeakers, like moving coil loudspeakers and balanced armature loudspeakers [1], [2], offer limited incremental improvements with respect to the ever-increasing requirements for modern devices to become smaller, lighter and more power efficient. On the other hand, MEMS loudspeakers, thanks to their intrinsic low power consumption, small dimensions, integrability with on-chip circuits and high cost-efficiency in mass production, are emerging as very promising solutions.
Electrodynamic MEMS loudspeakers exploit electromagnetic actuation, as most macroscale loudspeakers do. They show high power density, low driving voltage and linear responses, but the need for permanent magnets results in a large footprint and makes full integration with standard micro-fabrication processes challenging. Thermoacoustic transduction has also been exploited for MEMS speakers. However, current thermoacoustic speakers achieving SPLs comparable with the other transduction mechanisms have much larger sizes, e.g. 1-4 cm, which are not compatible with in-ear applications. Electrostatically driven speakers have also been widely studied in recent years because of their promising performance and full compatibility with MEMS fabrication processes. The simplest implementation of an electrostatically driven loudspeaker relies on a parallel-plate capacitor. The electrostatic force is however nonlinearly dependent on, i.e. inversely proportional to, the gap between the two electrodes, which strongly limits the speaker displacement range. Moreover, capacitive loudspeakers require high DC voltages to guarantee linearity, and thus suffer from pull-in instability. Only very recently, a new electrode configuration that is immune to pull-in instability has been proposed by Kaiser et al. as a promising solution for high-performance MEMS capacitive loudspeakers [15]. The proposed structure is an evolution of the device proposed by the same group in 2019 [10] and consists of in-plane balanced bending actuators based on the push-pull principle.
Finally, piezoelectric MEMS loudspeakers [1], thanks to the relatively large driving force achievable at low voltages, represent the most promising implementation of loudspeakers at the microscale. They can indeed play a crucial role in in-ear applications, where the pressure chamber effect due to the closed volume defined by the ear canal allows the generation of high SPL without excessive deflections of the mechanical diaphragm.
Several examples of piezoelectric MEMS loudspeakers are available in the literature and only recently some of them entered the market [32], [33], [34].
In 2018, Stoppel et al. [26] first proposed the Mechanically-Open and Acoustically-Closed (MOAC) design principle as a promising solution for high-performance MEMS loudspeakers. The mechanically-open feature comes from the presence of narrow air-gaps between the different mechanical components of the loudspeaker, properly sized to allow larger deflections with respect to a closed membrane design. The acoustically-closed feature comes instead from the viscous boundary layers induced by the air-gaps, which prevent the acoustic short-circuit between the front and rear sides of the loudspeaker. However, since the four actuators are completely decoupled, the acoustically-closed feature is guaranteed only for low actuation voltages and in case of small pre-stresses. In 2020, Cheng et al. [35] proposed two innovative designs for high-performance piezoelectric MEMS loudspeakers. In the same year, Tseng et al. [25] proposed a piezoelectric MEMS loudspeaker with SPL improved by a dual-electrode driving scheme. With the purpose of enlarging the bandwidth of MEMS microspeakers, in 2021 Wang et al. [24] proposed a multi-way device made of four cantilevers of different dimensions and consequently different natural frequencies. To prevent sound pressure cancellation above the cantilever resonances, a hybrid driving scheme combining in-phase (before resonance) and out-of-phase (after resonance) signals was also proposed. In the same year, Wang et al. [28] proposed a rigid-flexible vibration coupling mechanism, obtained by depositing a Parylene film on a pre-etched diaphragm, to maintain the large displacements of the unsealed diaphragms without acoustic losses.
In this work, we conceive a piezoelectric MEMS loudspeaker for in-ear applications which maintains the acoustically-closed feature over the full range of actuation voltages and in the presence of pre-stresses induced by the fabrication process. The proposed microspeaker, without the addition of closing membranes to minimize acoustic losses, exhibits an SPL competitive with current state-of-the-art solutions, thus demonstrating the potential of the proposed design strategy. To further support this statement, we mention that in the contemporary work [31] a similar design idea resulted in a cantilever-plate actuator connected to a central circular diaphragm through meandering springs. The two different topologies, i.e. the one studied in [31] and the present one, achieve promising performance in terms of SPL and THD, thus supporting the effectiveness of the design principle proposed simultaneously by the two research groups.
Moreover, in this work, differently from [31], a clear design guideline based on an a priori fully coupled Electro-Mechano-Acoustic finite element model is presented, and excellent numerical-experimental matching in terms of SPL is demonstrated.
The paper is organized as follows: the design concept, along with a study on the linearity of the proposed speaker, is presented in Section II. Mechanical and acoustic performances of the device evaluated through an Electro-Mechano-Acoustic FEM model are reported in Section III. The fabrication process is described in Section IV, while Section V illustrates the experimental characterization of the microspeaker. The performance of the proposed device is compared with the state of the art in Section VI. Finally, in Section VII, conclusions are drawn together with future perspectives.

II. DESIGN CONCEPT
A single-degree-of-freedom loudspeaker radiating in a cavity acts as a low-pass filter, and the generated sound pressure $p$ is proportional to the volume displacement $\Delta V$ [36], given by the product of the speaker maximum displacement $d_{\max}$ and the speaker effective area $S_{\text{eff}}$:
$$p = \gamma P_0 \frac{\Delta V}{V_0} = \gamma P_0 \frac{d_{\max} S_{\text{eff}}}{V_0} \tag{1}$$
where $\gamma$ is the adiabatic index (equal to 1.4 for dry air at 20 °C), $P_0$ the ambient pressure and $V_0$ the volume of the cavity. The effective area of the speaker can be expressed as:
$$S_{\text{eff}} = \int_A \phi \, \mathrm{d}A \tag{2}$$
where $\phi$ is the modal shape function, normalized so as to exhibit a unitary displacement in the piston area, and $A$ is the microspeaker footprint. The parameters to address in the design phase to reach a certain SPL are then the speaker's maximum displacement and effective area. The speaker maximum displacement is mainly related to the speaker compliance, while the effective area depends on the type of movement. The speaker compliance is also related to the speaker resonance frequencies.
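As a rough numerical illustration of Equation 1, the following sketch converts a piston displacement into an in-ear SPL through the adiabatic pressure-chamber relation p = γ P0 d_max S_eff / V0. The cavity volume, the displacement amplitude and the interpretation of the effective area as a fraction of the footprint are illustrative assumptions, not values taken from the measurements of this work.

```python
import math

GAMMA = 1.4       # adiabatic index of dry air at 20 degC
P0 = 101_325.0    # ambient pressure [Pa] (standard atmosphere, assumed)
P_REF = 20e-6     # reference sound pressure [Pa]


def chamber_pressure(d_max, s_eff, v0):
    """Peak pressure [Pa] in a closed cavity: gamma * P0 * d_max * S_eff / V0."""
    return GAMMA * P0 * d_max * s_eff / v0


def spl_from_peak(p_peak):
    """SPL [dB re 20 uPa] of a sinusoidal pressure with peak amplitude p_peak."""
    p_rms = p_peak / math.sqrt(2.0)
    return 20.0 * math.log10(p_rms / P_REF)


# Hypothetical inputs (assumptions for illustration only)
footprint = 4.5e-3 * 4.5e-3   # device footprint [m^2]
s_eff = 0.30 * footprint      # assumes an effective area of 30 % of the footprint
v0 = 1.26e-6                  # assumed occluded-ear cavity volume [m^3]
d_max = 10e-6                 # assumed displacement amplitude [m]

spl = spl_from_peak(chamber_pressure(d_max, s_eff, v0))
print(f"SPL ~ {spl:.1f} dB")
```

Since the pressure is linear in both d_max and S_eff, doubling either quantity raises the SPL by about 6 dB, which is why the two parameters are traded against each other in the design phase.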
Due to the limited total footprint of the devices under consideration, a closed membrane design is not an effective solution, since it is too rigid and the associated displacements are hence too low. Piston structures [35], [37], [38], [39] guarantee the maximum effective area for a given footprint, but become acoustically open if the maximum out-of-plane displacement exceeds the out-of-plane thickness of the microspeaker.
The proposed microspeaker provides a solution to this issue by combining the MOAC principle with the near-optimal sound emission of piston-based structures. The schematic view of the proposed loudspeaker is shown in Fig. 1a. The moving mechanical structure consists of four trapezoidal actuators (orange in Fig. 1a) connected to a central square piston (green in Fig. 1a) through a set of properly sized folded elastic springs (yellow in Fig. 1a). The mechanically-open feature of the proposed design relies on the presence of different mechanical components separated by air-gaps (black in Fig. 1a). The four trapezoidal actuators are fixed to the substrate through an external silicon frame (gray in Fig. 1a) that also delimits the back cavity located under the moving structure. The mechanical structure is made of a 13 µm poly-Si layer and has a footprint of 4.5 × 4.5 mm², including the 350 µm width of the external frame.
The mechanical structure of the microspeaker is designed to have a single actuated vibration mode in the audible regime, characterized by a synchronous motion of the four trapezoidal actuators and the central piston. The linear electro-mechanical vibration mode occurs at 10.9 kHz and its modal shape function is depicted in Fig. 1b. The central piston movement serves the twofold purpose of enhancing the effective area of the loudspeaker and synchronizing the movement of the four actuators, thus guaranteeing the acoustically-closed feature independently of the level of pre-stresses induced by the fabrication process and in case of high actuation voltages. Strictly speaking, the acoustically-closed feature is completely guaranteed with air-gaps smaller than 5 µm [27], if a sufficiently large back chamber is considered. Below this dimension, the air leakage between the front and rear sides of the loudspeaker becomes negligible due to the high viscous losses along the gap sidewalls. To comply with fabrication process constraints, in the present implementation the air-gap width is set to 10 µm. A partial acoustic short circuit at low frequencies is then expected, as detailed in Section III.
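To give a feeling for why the gap width matters so much, the sketch below uses two textbook lumped estimates: the viscous boundary-layer thickness sqrt(2μ/(ρω)) and the acoustic resistance of a thin slit under plane Poiseuille flow, 12 μ t / (g³ L). This is only an order-of-magnitude illustration that neglects end effects; the air properties are approximate room-temperature values and the total air-gap path length is a hypothetical number, not a dimension of the fabricated device.

```python
import math

MU_AIR = 1.81e-5   # dynamic viscosity of air [Pa s] (approximate, room temperature)
RHO_AIR = 1.2      # density of air [kg/m^3] (approximate, room temperature)


def viscous_boundary_layer(freq_hz):
    """Viscous penetration depth sqrt(2*mu / (rho*omega)) in metres."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(2.0 * MU_AIR / (RHO_AIR * omega))


def slit_resistance(gap, depth, length):
    """Acoustic resistance [Pa s/m^3] of a thin slit (plane Poiseuille flow).

    gap    : slit width [m]
    depth  : slit extent along the flow direction, i.e. the layer thickness [m]
    length : total slit length measured along the air-gap path [m]
    """
    return 12.0 * MU_AIR * depth / (gap ** 3 * length)


depth = 13e-6         # poly-Si thickness quoted in the text [m]
path_length = 20e-3   # hypothetical total air-gap path length [m]

r_5um = slit_resistance(5e-6, depth, path_length)
r_10um = slit_resistance(10e-6, depth, path_length)
print(f"boundary layer at 500 Hz: {viscous_boundary_layer(500.0) * 1e6:.0f} um")
print(f"R(5 um) / R(10 um) = {r_5um / r_10um:.1f}")
```

Under these assumptions the boundary layer is thicker than a 10 µm gap over the whole audio band, and the cubic dependence on the gap means that halving the gap width raises the leakage resistance eightfold.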
Piezoelectric actuation is provided through a 2 µm thick sol-gel PZT layer embedded between two driving electrodes deposited on the entire top surface of the four trapezoidal plates, as schematically shown in Fig. 1c.
In operation, the piezoelectric d31 mode is activated: when an out-of-plane electric field is applied between the top and bottom electrodes, an in-plane strain is induced in the PZT film. The in-plane strain state in the PZT thin film bends the trapezoidal actuators, which in turn trigger the piston movement through the connecting springs. The d31 coefficient of the PZT employed in the fabrication of the proposed device is equal to −156 pm/V at 30 Vpp.
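As an order-of-magnitude illustration of the d31 mode, the free in-plane strain of an unconstrained film is the product of the d31 coefficient and the through-thickness electric field. Taking, purely as an assumption, a voltage of 30 V applied across the 2 µm thick film:

$$E_3 = \frac{V}{t_{\mathrm{PZT}}} = \frac{30\ \mathrm{V}}{2\ \mu\mathrm{m}} = 15\ \mathrm{V}/\mu\mathrm{m}, \qquad \varepsilon_1 = d_{31} E_3 = \left(-156 \times 10^{-12}\ \mathrm{m/V}\right)\left(15 \times 10^{6}\ \mathrm{V/m}\right) \approx -2.3 \times 10^{-3}$$

In the real device the film is bonded to the much thicker poly-Si layer, so the actual strain is smaller than this free value; it is the resulting strain mismatch through the thickness that produces the bending moment on the actuators.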
As expected, thanks to the piston structure in the middle of the microspeaker proposed here, the numerically computed effective area is 30 %, more than 10 percentage points greater than that of the device presented in [26], which instead shows an effective area of 19 %. It is worth noting that a higher effective area does not guarantee a priori a higher SPL, especially if the loudspeaker maximum displacements at the same forcing level are significantly different, as demonstrated by Equation 1. A trade-off between high effective area and high loudspeaker compliance must then be considered in the design phase.
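The effective area can be estimated numerically by integrating a normalized mode shape over the moving area. The sketch below does this with a midpoint rule for a deliberately simple, hypothetical mode shape (unit displacement on a square piston, linear decay to zero at the frame); it is not the mode shape of Fig. 1b, and the piston side is an invented value.

```python
def mode_shape(x, y, half_side, half_piston):
    """Hypothetical mode shape: 1 on the piston, linear decay to 0 at the frame."""
    r = max(abs(x), abs(y))   # square-shaped distance from the centre
    if r <= half_piston:
        return 1.0
    return max(0.0, (half_side - r) / (half_side - half_piston))


def effective_area(side, piston_side, n=400):
    """Midpoint-rule estimate of the integral of the mode shape over a square."""
    h = side / n
    total = 0.0
    for i in range(n):
        x = -side / 2 + (i + 0.5) * h
        for j in range(n):
            y = -side / 2 + (j + 0.5) * h
            total += mode_shape(x, y, side / 2, piston_side / 2)
    return total * h * h


side = 3.8e-3     # assumed moving-area side [m]: 4.5 mm minus two 350 um frame widths
piston = 1.0e-3   # hypothetical piston side [m]
s_eff = effective_area(side, piston)
print(f"S_eff / moving area = {s_eff / side ** 2:.2f}")
```

For this particular shape the integral has the closed form (a² + ab + b²)/3, with a the moving-area side and b the piston side, which makes the benefit of a larger piston explicit and offers a quick check of the numerical integration.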

A. Mechanical Linearity
The THD is defined as the ratio between the root-sum-square of the effective values of the harmonics ($k_2, k_3, \ldots, k_n$) and the effective value of the fundamental harmonic $k_1$ [40]:
$$\mathrm{THD} = \frac{\sqrt{k_2^2 + k_3^2 + \cdots + k_n^2}}{k_1} \tag{3}$$
A low THD ensures a high sound reproduction fidelity and is therefore desired to enhance the performance of the microspeaker. The main sources of THD are the hysteretic behaviour of the piezoelectric material employed for actuation [41], [42], [43] and the nonlinear mechanical behaviour, i.e. geometric nonlinearities, of the diaphragm [44].
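A minimal sketch of the THD computation follows, assuming the common convention in which the harmonic contributions are combined as a root-sum-square before being divided by the fundamental. The harmonic amplitudes are invented for illustration and are not measured data.

```python
import math


def thd(harmonic_rms):
    """THD as root-sum-square of harmonics k2..kn divided by the fundamental k1.

    harmonic_rms: sequence [k1, k2, ..., kn] of effective (RMS) values.
    """
    k1, *higher = harmonic_rms
    return math.sqrt(sum(k * k for k in higher)) / k1


# Hypothetical harmonic amplitudes in pascal (illustrative only)
example = [1.0, 0.003, 0.004]
print(f"THD = {100 * thd(example):.2f} %")
```

With this convention a single dominant harmonic sets the THD almost on its own, since the weaker harmonics enter only through their squares.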
The first one can be considered as a known input to the system, once the actuation voltage is selected. Lower driving voltages ensure a smaller contribution of this nonlinear term to the overall THD.
Geometric nonlinearities are instead related to the dynamic behaviour of the mechanical structure, which is strongly dependent on the geometry and on the boundary conditions. Geometric nonlinearities can therefore be addressed in the design phase by a proper choice of the connections among the different mechanical components composing the speaker. Aware of this, in the present work we identify an efficient strategy to minimize them by design.
In the microspeaker proposed here, the suspension springs connecting the actuators to the central piston play a crucial role in ensuring the linearity of the device response over a wide range of out-of-plane displacements. The more compliant the suspension springs are, the more linear the microspeaker response is. However, to make the suspension springs more compliant, e.g. longer, thinner or with more folds, the overall dimensions of the microspeaker and/or the total air-gap path must be increased. A compromise between spring design simplicity and compliance must then be achieved. To prove this statement, we report in Fig. 2 three alternative microspeaker designs exhibiting the same effective area as the device under study, but different suspension springs: a structure with rigid links between the actuators and the central piston, a structure with the central piston connected to the four actuators through four elongated springs, and a structure with the central piston connected through four Y-shaped springs. The three designs reported in Fig. 2 are chosen since they represent the simplest implementations of the design concept proposed here: no springs at all, elongated straight springs and Y-shaped springs between the central piston and the four trapezoidal actuators.
To assess the range of linearity of the proposed speaker, we compare the voltage-displacement curves numerically estimated through an electro-mechanical nonlinear static analysis, as detailed in Section III. From Fig. 3a, it is evident that the proposed structure remains linear up to a displacement of 55 µm, whereas the other three lose their linearity after a few microns of displacement. In Fig. 3b, we quantify the mechanical nonlinearity of the proposed design for an actuation voltage in the range 0-30 V as 3.7 %, which is sufficiently low to expect a small THD due to geometric nonlinearities in the overall response of the microspeaker.
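The text does not state how the nonlinearity percentage is computed. One plausible definition, sketched below as an assumption, is the maximum deviation of the voltage-displacement curve from its small-signal linear extrapolation, normalized by the full-scale linear displacement. The stiffening curve used as input is hypothetical and its coefficients are illustrative only.

```python
def nonlinearity_percent(voltages, displacements):
    """Max deviation from the small-signal linear response, in % of full scale.

    Assumes voltages[0] == 0 and displacements[0] == 0, and estimates the
    small-signal slope from the first non-zero sample.
    """
    slope = displacements[1] / voltages[1]
    full_scale = slope * voltages[-1]
    worst = max(abs(d - slope * v) for v, d in zip(voltages, displacements))
    return 100.0 * worst / full_scale


# Hypothetical stiffening response d(V) = a*V - b*V**3 (illustrative coefficients)
a, b = 1.8e-6, 7.0e-11
volts = [0.3 * i for i in range(101)]   # 0 to 30 V
disp = [a * v - b * v ** 3 for v in volts]
print(f"nonlinearity = {nonlinearity_percent(volts, disp):.1f} %")
```

A perfectly linear curve returns zero with this metric, so it isolates the contribution of the geometric stiffening from the overall actuation gain.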

III. NUMERICAL RESULTS
The mechanical and acoustical performances of the proposed loudspeaker are simulated through an Electro-Mechano-Acoustic finite element model implemented in COMSOL Multiphysics® v6.1. The numerical model takes into account the different physics involved in the real functioning of the device to provide a precise estimation of its dynamic behaviour. In particular, the following features are taken into account:
• the boundary layers induced by the air viscous properties in the narrow gaps, through the Narrow Region Acoustics formulation [45];
• the elasto-dynamic response of the mechanical structure (Solid Mechanics module);
• the linear stress-charge constitutive law of the piezoelectric material (Electrostatics module);
• the acoustic-structural coupling, through the continuity of the normal stress and acceleration between the solid and the acoustic domain on the front and rear sides of the speaker;
• the in-ear condition, through the occluded ear simulator (Fig. 4a) available in COMSOL Multiphysics® [46]; and
• the Ear Canal Extension (ECE) (Figs. 4b-c) that serves to couple the Device Under Test (DUT) with the ear simulator in experiments.
The mesh is made of quadratic prisms in the electromechanical microspeaker domain and in the air-gaps of the coupler expansions, while quadratic tetrahedral elements are employed to discretize the remaining air domain. The model has a total of 588223 degrees of freedom.
Static deformations induced by residual stresses coming from the fabrication process and by the Direct Current (DC) voltage applied between the top and bottom electrodes are carefully taken into account in the simulation procedure. They indeed determine the pre-deflected configuration around which the dynamic response occurs. Mechanical pre-stresses are simulated through nonlinear static analyses, while the DC voltage contribution to the pre-deflection of the MEMS loudspeaker is estimated through a nonlinear electrostatic analysis.
The numerical SPL frequency spectrum computed at the ear surface under an Alternating Current (AC) voltage of 5 Vpp is reported in Fig. 5a. To underline the fundamental importance of the air-gap width in terms of acoustic short-circuit between the front and rear sides of the speaker, we report in Fig. 5a the SPL curves numerically obtained by considering air-gaps of 5 µm and 10 µm. In the case of 10 µm air-gaps, the partial acoustic short circuit at low frequencies, i.e. below 500 Hz, is evident.
The two peaks in Fig. 5a correspond to the loudspeaker nonlinear resonance frequency at 12 V (10.6 kHz) and to the half-wavelength resonance of the main cylinder of the coupler (13.8 kHz) [46], respectively. The blue curve represents instead the SPL frequency spectrum computed at the ear surface under an AC voltage of 5 Vpp when the ECE is considered. The introduction of an ear-canal extension in front of the ear simulator shifts the mode of the main cylinder to lower frequencies, decreasing the SPL response as a consequence.
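The half-wavelength peak can be sanity-checked with the ideal-tube relation, assuming a uniform cylinder with identical terminations at both ends and a speed of sound of about 343 m/s (both are simplifying assumptions, not model inputs quoted in the text):

$$f_{\lambda/2} = \frac{c}{2L} \quad\Rightarrow\quad L = \frac{c}{2 f_{\lambda/2}} \approx \frac{343\ \mathrm{m/s}}{2 \times 13.8\ \mathrm{kHz}} \approx 12.4\ \mathrm{mm}$$

Any element that lengthens the effective acoustic path in front of the simulator increases $L$ and therefore lowers $f_{\lambda/2}$, which is consistent with the downward shift observed when the ECE is included.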
In Fig. 5b, the mesh of the model without ECE is shown, coloured from poor quality (red) to good quality (green), along with the total deflection (static contribution plus harmonic contribution) of the mechanical structure and the Sound Pressure Level in the air domain, evaluated at 5 kHz. The back chamber considered in the model has a volume of 1 cm³ and mimics that of the experimental package used for the mechanical and acoustic tests (see Section V).

IV. FABRICATION
The piezoelectric MEMS loudspeaker shown in Fig. 1 is fabricated by STMicroelectronics through the standard process flow schematically shown in Fig. 6. An oxide layer is first deposited on the thick, i.e. 400 µm, Si Mono layer to control the back chamber definition. The Epi-Poly layer, with a thickness of 13 µm, is then deposited on top of it, followed by the oxide layer and the PZT stack including top and bottom electrodes (Fig. 6a), which are properly patterned (Fig. 6b). Passivation oxide layers are then deposited and patterned in order to expose the bonding pads of the top and bottom electrodes (Fig. 6c). Subsequently, metal connections are deposited and patterned for the definition of the bonding pads of the electrodes (Fig. 6d), and the Epi-Poly is etched according to the loudspeaker mechanical design (Fig. 6e). Finally, to define the back cavity, the Si Mono layer is etched from the back side of the wafer by Deep Reactive Ion Etching (DRIE) (Fig. 6f).
In Fig. 7a an optical microscope image of the device is reported together with a close-up view of the air-gaps forming the folded suspension springs. For experimental tests, the device is mounted on a custom Printed Circuit Board (PCB) and coupled with an ABS (Acrylonitrile Butadiene Styrene) thermoplastic package composed of a back chamber of 1 cm³ (Fig. 7c) and a front adapter able to connect the loudspeaker with the ear simulator (Fig. 7b).

V. EXPERIMENTAL RESULTS
Both mechanical and acoustical tests are carried out to assess the performance of the microspeaker and to validate the FEM model. A Polytec MSA-500 3D optical profilometer is used to perform a surface topography analysis and detect the static deformation of the structure due to fabrication pre-stresses plus an applied DC voltage (Fig. 8). Figure 9a reports the experimental static deformation of the loudspeaker under the effect of the pre-stresses induced by the fabrication process, i.e. at a DC voltage of 0 V. Figures 9b-c refer instead to DC voltages of 12 V and 30 V, respectively. The corresponding numerical static deformations, evaluated through the FEM model described in Section III, are reported in Figs. 9g-h-i. The out-of-plane displacements along the middle cross-section indicated as A-A' are also reported for the experimental (Figs. 9d-e-f) and numerical (Figs. 9j-k-l) static deformed shapes. The difference between the experimental and numerical maximum deflections is 2 % at 0 V, 10 % at 12 V and 5 % at 30 V, demonstrating a very good agreement between numerical and experimental results.
The acoustic performance of the loudspeaker is validated through an experimental campaign with the measurement set-up reported in Fig. 10. The latter includes the anechoic [...] The measured SPL curves are reported in Figs. 11a-c-e-g for actuation voltages of 5 Vpp, 7 Vpp, 20 Vpp and 30 Vpp, respectively. For AC voltages greater than 5 Vpp, the acquisition is stopped at 10 kHz to avoid breaking the prototype, since at this stage no equalization filter is applied to limit the loudspeaker maximum displacement. For an actuating voltage of 5 Vpp, the SPL is above 82 dB in the whole frequency range and reaches a maximum of 135 dB at the loudspeaker natural frequency. By increasing the voltage up to 30 Vpp, the SPL reaches values above 107 dB from 500 Hz onwards. A flat frequency response could then be achieved using electronic equalization. Therefore, even at the prototype stage, the device promises to meet typical SPL demands for in-ear consumer electronics applications.
The numerical-experimental matching is excellent in the frequency range 100 Hz-10 kHz, in which the ear simulator is suitable to reproduce the human ear response. The discrepancy in the quality factor of the loudspeaker and in the coupler resonance frequencies can be ascribed to the differences between the ear simulator implemented in COMSOL Multiphysics® [46] and the one used in the experiments [47].
Thanks to the microspeaker linearity, the sound pressure at 30 V pp is still not saturated as evident from Fig. 12, where the sound pressure at 100 Hz, 1000 Hz and 3000 Hz is reported for increasing actuating voltages.
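For reference, the sketch below converts between SPL and RMS pressure and computes the SPL increase one would expect if the sound pressure scaled perfectly linearly with the drive voltage. It is a generic decibel calculation under an ideal-linearity assumption and does not reproduce the measured curves of Fig. 12.

```python
import math

P_REF = 20e-6  # reference sound pressure [Pa]


def spl_to_pressure(spl_db):
    """RMS sound pressure [Pa] corresponding to an SPL in dB re 20 uPa."""
    return P_REF * 10.0 ** (spl_db / 20.0)


def linear_spl_gain(v_from, v_to):
    """SPL change [dB] expected if pressure scales linearly with drive voltage."""
    return 20.0 * math.log10(v_to / v_from)


print(f"94 dB SPL -> {spl_to_pressure(94.0):.2f} Pa RMS")
print(f"5 -> 30 Vpp: +{linear_spl_gain(5.0, 30.0):.1f} dB if fully linear")
```

Comparing the measured SPL increase between two drive levels with this ideal figure is a quick way to spot the onset of saturation.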
To assess the sound reproduction quality of the in-ear speaker, THD measurements have also been carried out. The measurement results for an actuating voltage of 5 Vpp are shown in Fig. 11b. The THD is overall very low, under 2 % in the whole frequency range, except in the neighbourhood of the subharmonics of the resonance frequency, where the speaker exhibits harmonic distortion up to 11 %. The unwanted excitation by subharmonics is directly reduced if the speaker resonance is damped. This can be achieved through electronic equalization, as already implemented in [26]. A THD increase is expected at higher voltages because of the hysteretic behaviour of the piezoelectric layer [41], [42], [43], as demonstrated by Figs. 11d-f-h. However, a real-time compensation of the loudspeaker nonlinearities could drastically diminish the sound distortion, making 30 Vpp an acceptable working point for the proposed speaker. This could be implemented through a digital signal processing technique based on loudspeaker virtualization, as for instance proposed in [48] and [49] for the case of macroscale loudspeakers.
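The location of these THD peaks follows from a simple coincidence argument: when the drive frequency is an integer fraction of the resonance frequency $f_0$, the corresponding harmonic of the drive falls exactly on the resonance and is amplified. Taking, as an assumption, the simulated resonance value of about 10.6 kHz:

$$f_{\text{drive}} = \frac{f_0}{n} \;\Rightarrow\; n f_{\text{drive}} = f_0, \qquad \frac{f_0}{2} \approx 5.3\ \mathrm{kHz}, \quad \frac{f_0}{3} \approx 3.5\ \mathrm{kHz}$$

The measured peak positions may differ slightly, since the actual resonance depends on the bias voltage and on the drive level.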

VI. DISCUSSION
In Table I, different piezoelectric MEMS microspeakers available in the literature [24], [25], [26], [28], [31], [35] and on the market [32], [33], [34] are compared in terms of piezoelectric layer thickness, active area dimension, i.e. diaphragm size, first natural frequency f0, SPL and THD evaluated at 1 kHz. For the sake of clarity, we indicate as "This work" the performance of the loudspeaker designed (Fig. 1), fabricated (Fig. 7) and tested (Fig. 10) here. In Table I, the SPL and THD of the proposed device are reported for several actuation voltages to allow a fair comparison with the state of the art, since experimental data of known solutions are available only for specific actuation voltages.
To compare the design proposed here with the microspeakers available in the literature, we consider for example the SPL at 1 kHz, i.e. 83 dB, and the THD at the same frequency, i.e. 0.25 %, measured for an actuating voltage of 2 Vpp. In [25] the higher SPL is justified by the larger footprint and by the proposed dual-electrode driving. In [24] a higher SPL is achieved with a smaller footprint thanks to the low first natural frequency, i.e. 1.54 kHz. However, the presence of four resonance frequencies in the audible regime makes the dynamics of the speaker more complex; thus, a higher THD is expected in the whole audible spectrum, due to the presence of multiple higher-order harmonics. In [28], the higher SPL is again justified by the low resonance frequency, i.e. 6.7 kHz. The addition of a Parylene membrane that acoustically decouples the front and rear sides of the speaker allows the design of compliant structures made of long air-gap paths without suffering from acoustic losses. The higher SPL of the speaker proposed in [26] is determined by the larger footprint and, also in this case, by the lower resonance frequency. Finally, by comparing the present solution with the contemporary design based on the same MOAC working principle [31], we can say that the difference in SPL per unit area is due to the significant differences in the employed fabrication process. In [31], smaller gaps, i.e. 5 µm versus 10 µm in our device, and a smaller out-of-plane silicon thickness, i.e. 5 µm versus 13 µm in our device, are exploited. Thanks to the smaller gap, it is possible to design much more compliant suspension springs and consequently achieve bigger displacements of the inner diaphragm without facing acoustic short-circuit. Thanks to the smaller out-of-plane thickness, it is again possible to achieve bigger displacements of the inner diaphragm because of the improved compliance of the mechanical structure. To date, we cannot achieve such features through the employed industrial fabrication process, which has to guarantee high reliability for mass production.
Fig. 11. Comparison between numerical (blue curves) and experimental (red curves) SPL frequency spectra evaluated at the ear surface for a bias voltage of 12 V plus (a) 5 Vpp, (c) 7 Vpp, (e) 20 Vpp and (g) 30 Vpp. For actuating voltages greater than 5 Vpp the acquisition stops at 10 kHz, since no equalization filters are implemented at this stage to limit displacements. The Total Harmonic Distortion (THD) for the above-mentioned voltages is reported in [...]
The proposed microspeaker is then well positioned in the literature panorama, despite being at its first characterization stage and being realized through a non-optimized fabrication process.
Fig. 12. Measured sound pressure at 100 Hz, 1000 Hz and 3000 Hz at increasing actuation voltage levels. Due to the geometric linearity of the proposed speaker, at 30 Vpp the sound pressure of the device is still not at saturation.
It also well compare with the performances of the commercially available MEMS microspeakers produced by USound [32] and xMEMS [33], [34].The speaker proposed by USound features indeed a SPL at 1 kHz of 117 dB and a THD at 1 kHz equal to 0.3 % for an actuating voltage of 30 V pp and 2.8 V pp , respectively.The most performing speaker proposed by xMEMS, i.e. the Montara series, features instead a SPL at 1 kHz of 115 dB for an actuating voltage of 30 V pp and a THD at 1 kHz equal to 0.5 % for an AC voltage corresponding to 94 dBSPL at 1 kHz.By looking at the SPL curve measured on the device here proposed for an actuating voltage of 30 V pp (Fig. 11g), we can clearly state that our solution is only 6 % and 5 % less performing in terms of SPL, than the commercial devices of USound and xMEMS, respectively.To compare the THD performance of the proposed design with the one reported by USound, we measured the THD at 2.8 V pp , thus obtaining 0.33 %.
To compare instead with the THD of the device fabricated by xMEMS, we consider the experimental data available for an actuating voltage of 7 Vpp (Fig. 11d). At this voltage level we achieve an SPL of 93.8 dB at 1 kHz, which aligns well with the 94 dBSPL used by xMEMS. At this SPL level, the present design shows a THD lower than 1 %, thus in line with the xMEMS performance.
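To make the 93.8 dB versus 94 dBSPL comparison concrete, the two levels can be converted to RMS sound pressure. The following is a minimal sketch, assuming the standard reference pressure of 20 µPa for SPL in air; the function names are illustrative and not part of the original work.

```python
import math

P_REF = 20e-6  # Pa, standard reference RMS pressure for SPL in air (assumption)


def spl_to_pressure(spl_db):
    """Convert a sound pressure level in dB SPL to RMS pressure in Pa."""
    return P_REF * 10 ** (spl_db / 20.0)


def pressure_to_spl(p_rms):
    """Convert an RMS pressure in Pa to a sound pressure level in dB SPL."""
    return 20.0 * math.log10(p_rms / P_REF)


# Levels quoted in the text at 1 kHz
p_device = spl_to_pressure(93.8)   # proposed device at 7 Vpp
p_target = spl_to_pressure(94.0)   # reference level used for the xMEMS THD figure

# Pressure ratio corresponding to the 0.2 dB gap
ratio = p_device / p_target

print(f"device: {p_device:.4f} Pa, target: {p_target:.4f} Pa, ratio: {ratio:.4f}")
```

A 0.2 dB gap corresponds to a pressure difference of roughly 2 %, which supports treating the two operating points as equivalent for the THD comparison.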
Finally, it is worth mentioning that, although only package dimensions are available for the two commercial microspeakers, the design proposed here has a significantly smaller footprint, which accounts for the lower SPL values achieved both numerically and experimentally. Moreover, the present design does not employ polymeric membranes to improve the acoustically-closed feature of the microspeaker, as done for instance in the USound device, thus resulting in a much simpler geometry that is easy to fabricate and therefore cheaper than some commercially available solutions. It is also worth mentioning that the performance of commercial devices is evaluated in a custom package, which can contribute to increasing the SPL. Our device, being at a prototype stage, is instead tested in a manually assembled package.
Although a direct, fair comparison is sometimes difficult due to the different working conditions of the microspeakers available in the literature and on the market, we can conclude that the proposed design is competitive in terms of dimensions, SPL and THD with current state-of-the-art solutions and can play a role in future applications of MEMS speakers.
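The THD values compared in this section are ratios of harmonic content to the fundamental. As a minimal sketch, assuming the common definition in which THD is the root-sum-square of the harmonic amplitudes divided by the fundamental amplitude (the text does not state which definition or how many harmonics the audio analyzer uses), the computation reads as follows. The amplitudes are hypothetical values chosen for illustration, not measurements.

```python
import math


def thd_percent(fundamental, harmonics):
    """THD in percent: root-sum-square of harmonic amplitudes over the fundamental."""
    if fundamental <= 0:
        raise ValueError("fundamental amplitude must be positive")
    return 100.0 * math.sqrt(sum(h * h for h in harmonics)) / fundamental


# Hypothetical RMS pressure amplitudes in Pa (illustrative only, not measured data):
# fundamental at 1 kHz, then the 2nd and 3rd harmonics.
example_thd = thd_percent(1.0, [0.003, 0.004])

print(f"THD = {example_thd:.2f} %")
```

With this definition, harmonics a few thousandths of the fundamental amplitude already give a THD of a few tenths of a percent, the order of magnitude of the values quoted above.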

VII. CONCLUSION
A full-range piezoelectric MEMS speaker for in-ear applications is designed, simulated, fabricated and experimentally characterized. The main novelty of the proposed design concept lies in the piston-like movement of the central component, achieved thanks to a set of folded suspension springs. This maximizes the effective area of the speaker and preserves the acoustically-closed feature over the full range of actuation voltages and in the presence of pre-stresses and geometrical imperfections. The chosen spring design determines the geometric linearity of the microspeaker, allowing for a maximization of the SPL and a minimization of the THD at a reduced footprint. Hence, the proposed device represents the first step towards a new class of high-performance piezo-MEMS speakers that do not require additional closing membranes to minimize acoustic losses.
The developed fully coupled Electro-Mechano-Acoustic finite element model showed excellent numerical-experimental matching, thus opening the path to a new systematic a-priori design tool that was not available so far in the literature and that can guide the design process of future high-performance microspeakers.
Future work will address a systematic maximization of the volume displacement and of the linearity of the speaker through an optimization procedure. Research work will also be devoted to enhancing the FEM multiphysics model with a hysteretic piezoelectric constitutive law, in order to obtain a quantitative estimate of the sound distortion.

Fig. 1. (a) Schematic view of the proposed piezoelectric MEMS speaker. A close-up view of the elastic springs (in yellow) connecting the trapezoidal plates (in orange) with the central piston (in green) is reported for the sake of clarity. (b) Modal shape function of the first resonant mode of the proposed microspeaker. The contour of the displacement field is shown in color. (c) Cross-sectional view of the proposed MEMS loudspeaker structure.

Fig. 2. Schematic view of a rectangular plate with air-gaps separating the four trapezoidal actuators with three different types of connections with the central piston: no springs, rigid straight springs and Y-shaped springs.

Fig. 3. (a) Comparison of the voltage-deflection curves numerically estimated through an Electro-Mechano nonlinear static analysis for the design shown in Fig. 1a and the three devices reported in Fig. 2. (b) Voltage-deflection curve of the proposed microspeaker.

Fig. 4. (a) FEM model of the proposed loudspeaker with a back chamber of 1 cm³ and the occluded ear-canal simulator available in COMSOL Multiphysics®. (b) FEM model including the Ear Canal Extension (in green) used in the acoustic tests to couple the Device Under Test (DUT) with the ear simulator. (c) Detail of the experimental set-up including the ear canal extension.

Fig. 5. (a) Numerical SPL frequency spectrum computed at the ear surface at 12 V plus 5 Vpp. The black curve refers to the condition with the speaker directly connected to the coupler (without ear extension, Fig. 4a): the solid line is for an air-gap width of 10 µm and the dashed line for an air-gap width of 5 µm. The blue curve refers to the condition with ECE (Fig. 4b). (b) FEM model of the proposed microspeaker coupled with the ear simulator available in COMSOL Multiphysics® without ECE. The mesh quality, from poor (red) to good (green), is reported along with the total deflection of the mechanical structure (in microns) and the SPL in the air domain (in dB), evaluated at 5 kHz.

Fig. 6. Fabrication process of the piezoelectric MEMS speaker produced by STMicroelectronics.

Fig. 7. (a) Optical microscope image of the fabricated loudspeaker with a close-up view of the air-gaps forming the folded suspension springs. (b) Front and (c) rear sides of the fabricated microspeaker mounted on a custom PCB and coupled with the package for in-ear acoustic tests.

Fig. 8. Schematic view of the experimental set-up employed for the microspeaker static characterization.

Fig. 9. Topography of the microspeaker at different constant bias voltages: (a) 0 V, (b) 12 V and (c) 30 V. Numerical static deformations evaluated through the proposed FEM model are reported in (g), (h) and (i) for bias voltages of 0 V, 12 V and 30 V, respectively. Out-of-plane displacement profiles in the middle cross-section A-A' are shown in (d), (e), (f) and (j), (k), (l) for the experimental and numerical static deformed shapes, respectively.

Fig. 10. Sketch of the acoustic measurement set-up composed of the anechoic chamber G.R.A.S. AL0030-S2 and the ear simulator G.R.A.S. RA0402 together with the microphone G.R.A.S. 46BD 1/4″. The Audio Analyzer (APx525) generates the DC and AC signals for the MEMS actuation and converts the signal from the microphone into SPL data.

TABLE I
COMPARISON AMONG DIFFERENT PIEZOELECTRIC MEMS MICROSPEAKERS AVAILABLE IN THE LITERATURE AND ON THE MARKET

Fig. 12.