Integrated Circuits for Medical Ultrasound Applications: Imaging and Beyond

Abstract: Medical ultrasound has become a crucial part of modern society and continues to play a vital role in the diagnosis and treatment of illnesses. Over the past decades, the development of medical ultrasound has seen extraordinary progress as a result of the tremendous research advances in microelectronics, transducer technology and signal processing algorithms. However, medical ultrasound still faces many challenges including power-efficient driving of transducers, low-noise recording of ultrasound echoes, effective beamforming in a non-linear, high-attenuation medium (human tissues) and reduced overall form factor. This paper provides a comprehensive review of the design of integrated circuits for medical ultrasound applications. The most important and ubiquitous modules in a medical ultrasound system are addressed: i) transducer driving circuit, ii) low-noise amplifier, iii) beamforming circuit and iv) analog-digital converter. Within each ultrasound module, some representative research highlights are described, followed by a comparison of the state of the art. This paper concludes with a discussion and recommendations for future research directions.


I. INTRODUCTION
ULTRASOUND is defined as sound with frequencies greater than or equal to 20 kHz, and is consequently beyond the upper limit of the human hearing range [1]. Ultrasound has many useful properties and medical ultrasound technology has become an indispensable feature of modern society. In order to appreciate the importance of medical ultrasound and gain an understanding of its current and emerging research directions, it is appropriate to start by reviewing its history.
The history of ultrasound can be traced back to the late 19th century, when major discoveries, both theoretical and experimental, were made. Notable pioneers of that age include John William Strutt (also known as Lord Rayleigh), who laid down the theoretical foundations of the study of ultrasound with his book "The Theory of Sound" [2], and the Curie brothers (Pierre and Jacques), who discovered the piezoelectric effect in 1880. The scope of this review paper is limited to ultrasound advances from the 1950s onward since earlier developments were concentrated on sonar instead of medical applications. The interested reader can refer to [3], [4] for a detailed review of the history of ultrasound.
The 1950s was a high point and had a far-reaching influence on the development of medical ultrasound. Two of the greatest milestones in medical ultrasound were achieved in this decade. In 1953, Inge Edler and Carl Hellmuth Hertz performed the first successful echocardiogram in an attempt to diagnose mitral stenosis. Ian Donald, John Macvicar and Tom Brown published their seminal paper "Investigation of Abdominal Masses by Pulsed Ultrasound" in 1958 [5] and thereby revolutionised the field of obstetrics and gynaecology with ultrasound diagnostics. These breakthroughs demonstrated the immense value of ultrasound imaging and established imaging as the dominant research direction in medical ultrasound.
Although the breakthroughs in ultrasound imaging in the 1950s were very impressive, early systems suffered from slow image acquisition, poor image quality, bulky equipment and operator dependence. Therefore, in the subsequent decades, research efforts were directed toward three interdependent tracks: i) smaller and better performing ultrasound transducers, ii) ultrasound imaging integrated circuits (ICs) to improve portability and performance, and iii) advanced signal processing algorithms to increase the visual clarity of ultrasound images. For the sake of brevity, only some important, pioneering works in tracks i) and ii) that are of particular interest to the microelectronics design community are highlighted. Firstly, the work in [6], [7] pioneered the development of medical ultrasound imaging ICs to process real-time images from multiple phased arrays. The development of ultrasound ICs is a key step toward the miniaturisation and integration of ultrasound systems and leverages the exponential progress in the CMOS industry (Moore's law). Secondly, outstanding contributions to the development of a new class of ultrasound transducers, the capacitive micromachined ultrasound transducers (CMUTs), can be seen in [8]-[10].
CMUT technology is a game changer and presents many advantages over traditional piezoelectric transducers including greater bandwidth, ease of fabrication of large arrays and better integration with CMOS circuits [11]. Looking back at the history of medical ultrasound development, it is evident that medical ultrasound has been and continues to be an active area of research. The reason for this is twofold. Firstly, ultrasound is relatively safe and does not induce ionisation in human cells, unlike computed tomography and other methods that exploit the electromagnetic spectrum [12]. Secondly, the advent of two enabling technologies, CMOS and CMUT, has paved the way for the miniaturisation of ultrasound systems. Miniaturisation brings many benefits, including improved reliability, better portability and reduced cost. The value of miniaturising medical ultrasound has long been recognised by industry. One prominent example is Butterfly Network, Inc., which aims to revolutionise ultrasound imaging by producing hand-held, smartphone-connected ultrasound probes in contrast to conventional cart-based systems [13].
A general hardware architecture for ultrasound systems is shown in Fig. 1. It is helpful to have a system-level understanding in order to better appreciate the relationship between individual modules. On the transmit (TX) side, the TX beamformer circuit generates the delay pattern (time domain) and complex weights (amplitude domain) based on the desired transmitted ultrasound beam characteristics. The outputs of the TX beamformer are amplified to several tens of volts by the transducer driving circuit. The signal waveform that drives the transducer elements can have different shapes, for example a square pulse, a sine wave or a Gaussian pulse.
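As a rough illustration of the delay pattern a TX beamformer produces, the sketch below computes per-element firing delays that focus a linear array at a single point. The function name, element count, pitch, focal depth and the 1540 m/s speed of sound are illustrative assumptions and are not taken from any design reviewed here.

```python
import math

def focus_delays(num_elements, pitch_m, focus_x_m, focus_z_m, c_m_per_s=1540.0):
    """Return per-element transmit delays (seconds) for a linear array.

    Elements lie on the x-axis, centred on x = 0 (assumed geometry).
    The element farthest from the focus fires first (delay 0); nearer
    elements fire later so all wavefronts reach the focus together.
    """
    centre = (num_elements - 1) / 2.0
    distances = []
    for i in range(num_elements):
        x_i = (i - centre) * pitch_m
        distances.append(math.hypot(focus_x_m - x_i, focus_z_m))
    d_max = max(distances)
    return [(d_max - d) / c_m_per_s for d in distances]

# Hypothetical 8-element array, 0.3 mm pitch, focused 20 mm straight ahead.
delays = focus_delays(num_elements=8, pitch_m=0.3e-3, focus_x_m=0.0, focus_z_m=20e-3)
```

For an on-axis focus the delays are symmetric about the array centre, with the outer elements firing first. The RX beamformer applies the complementary set of delays to the received echoes.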
Note that when targeting implantable (non-portable) operation, for instance intravascular imaging, the transducer driving circuit can sometimes be replaced by high-voltage switches that route high-voltage transmit pulses generated by an external imaging system to the transducer elements [14]. This helps to reduce the power dissipation of the ultrasound IC significantly. On the other hand, for non-implantable and portable operation as in [13], the power dissipation requirement of the IC is more relaxed compared to implantable operation. Nevertheless, the IC in portable applications should still be power-efficient because the available power is limited (by battery life).
On the receive (RX) side, there is a transmit/receive (T/R) switch to protect the low-voltage RX circuitry from the high TX voltage pulses. It is desirable for the low-noise amplifier (LNA) to provide some form of time-gain compensation (TGC) when receiving the ultrasound echoes. The RX beamformer generates the required delays and complex weights for the received echoes, a complementary operation to the TX beamformer. Finally, the analog-digital converter (ADC) performs the necessary signal conversion to allow for post-processing. This paper provides a comprehensive review of integrated circuit designs for medical ultrasound systems with emphasis on the core modules, i) transducer driving circuit, ii) LNA, iii) beamformer and iv) ADC. Although there are many excellent review papers published on the transducer [15]- [17] and signal processing [18], [19] aspects of medical ultrasound, there are not many review papers published on the hardware aspect. Therefore, this paper aims to fill this gap in the literature. Section II presents a brief overview of the basics of medical ultrasound technology. Section III introduces the main classes of ultrasound transducers and elaborates on their respective equivalent circuit models. Sections IV to VII are dedicated to the analysis of the core ultrasound modules (transducer driver, receiver, beamformer, ADC). The T/R switch, LNA and TGC that constitute the ultrasound receiver are discussed in Section V. Recommendations for future directions and challenges are provided in Section VIII and concluding remarks are drawn in Section IX.

A. A Brief Description of Waves
Ultrasound, or more generally an acoustic wave, is a type of mechanical wave, which is associated with the transfer of energy from one point to another but not with the transfer of mass [20]. In the context of medical ultrasound, ultrasound waves are normally assumed to be longitudinal. This is because in most cases soft tissues can be approximated as a fluidic material which does not support the propagation of shear (transverse) waves. However, it is still possible for low-frequency shear waves to exist in soft tissues [1]. This property is exploited in a special ultrasonic imaging technique, elastography (see [21], [22]). For the remainder of this paper, ultrasound waves will be considered as longitudinal.
Ultrasound waves can also be classified as plane or circular waves. Plane waves have uniform amplitude and planes of constant phase perpendicular to the propagation direction. Circular waves propagate symmetrically around a reference point or around a reference line [1]. The shape of ultrasound waves is largely determined by the transducer's properties. For instance, if the ratio of a disk-shaped transducer's diameter to the ultrasound wavelength is decreased, the ultrasound wave will tend to exhibit more spherical wave characteristics [1].
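The wave equation describes the ultrasound wave phenomenon succinctly and is given by (1) for the 3-D case, where $u$ is the wave function, $t$ is time, $x$, $y$, $z$ are the spatial coordinates and $c$ is the wave velocity.
$$\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} \right) \tag{1}$$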

B. Transmission and Reflection
Assume an acoustic wave is travelling through a material medium. A pressure gradient is formed in this medium and induces motion and strain on the particles of that medium [1]. In this case, the pressure gradient ($P$) and the corresponding particle velocity ($U$) are analogous to voltage and current respectively. The acoustic impedance is defined in (2).
$$Z = \frac{P}{U} \tag{2}$$
Note that the acoustic impedance is a function of pressure, which is related to the amplitude, power and intensity of the acoustic wave. Therefore, it is convenient to use the acoustic impedance to construct the acoustic counterparts to the well-known Fresnel coefficients. The pressure reflection, pressure transmission, intensity reflection and intensity transmission coefficients are defined in (3)-(6) respectively, where the angles of incidence ($\theta_I$) and transmission ($\theta_T$) and the acoustic impedances ($Z_1$, $Z_2$) of the two media have their usual meaning [1]. The acoustic wave travels from medium 1 to medium 2.
There are two extreme scenarios that are of interest. Firstly, if $Z_2 \gg Z_1$, then $R, \Gamma \to 1$, where $R$ and $\Gamma$ are the pressure and intensity reflection coefficients. This means that the reflected wave has a negligible decrease in amplitude. Secondly, if $Z_1 \gg Z_2$, then $R \to -1$. In this case, the reflected wave has a phase shift of $\pi$ radians relative to the incident wave but again a negligible decrease in amplitude. A common example of the second scenario is the presence of air bubbles between the skin and the ultrasound transducer (no gel applied), which results in strong reflections and poor imaging quality. Poor imaging quality due to largely dissimilar acoustic impedances can be avoided by including an acoustic impedance matching layer. Its purpose is to facilitate the transmission of ultrasound waves into the target medium. This layer can be thought of as an intermediary layer between the source (ultrasound transducer) and the target (human tissue).
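The two extreme scenarios can be checked numerically. The sketch below uses the standard textbook expressions for normal incidence ($\theta_I = \theta_T = 0$), which are assumed here to correspond to (3)-(6). The function name, the symbol `t` for pressure transmission and the approximate impedance values for soft tissue and air are illustrative assumptions.

```python
def interface_coefficients(z1, z2):
    """Normal-incidence coefficients for a wave going from medium 1 to medium 2.

    Returns (r, t, gamma, upsilon):
      r       pressure reflection coefficient
      t       pressure transmission coefficient
      gamma   intensity reflection coefficient
      upsilon intensity transmission coefficient
    """
    r = (z2 - z1) / (z2 + z1)
    t = 2.0 * z2 / (z2 + z1)
    gamma = r * r
    upsilon = 4.0 * z1 * z2 / (z1 + z2) ** 2
    return r, t, gamma, upsilon

# Approximate, assumed impedances in rayl (Pa.s/m).
Z_TISSUE = 1.6e6
Z_AIR = 400.0

# Tissue to air: Z1 >> Z2, so r approaches -1 (near-total reflection, phase flip).
r, t, gamma, upsilon = interface_coefficients(Z_TISSUE, Z_AIR)

# Matched media: no reflection, full transmission.
r_m, t_m, gamma_m, upsilon_m = interface_coefficients(Z_TISSUE, Z_TISSUE)
```

At normal incidence the intensity coefficients sum to one, which is a convenient sanity check on any implementation.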

C. Attenuation
An acoustic wave travelling through a medium inevitably suffers from attenuation. This attenuation follows an exponential relation and is described by (7), where $A(x)$ is the wave amplitude as a function of distance $x$, $A_0$ is the initial reference amplitude and $\alpha$ is the attenuation coefficient.
$$A(x) = A_0 e^{-\alpha x} \tag{7}$$
In the context of medical ultrasound, $\alpha$ is a function of frequency and the attenuation increases with increasing frequency [1]. Ultrasound waves with higher frequency (smaller wavelength) offer greater sensing resolution but suffer from greater attenuation, which limits their penetration depth in target tissues, and vice versa. Some common attenuation values are 0.5 dB/cm/MHz for soft tissues and 10-20 dB/cm/MHz for bones [1]. Penetration depth versus resolution is a fundamental trade-off in ultrasound systems.
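To make the trade-off concrete, the sketch below estimates the round-trip echo loss from the quoted soft-tissue figure. It assumes that attenuation in dB scales linearly with frequency and depth (as the dB/cm/MHz unit implies) and that the dB value refers to amplitude. The function names and the 5 MHz, 10 cm example are illustrative only.

```python
def round_trip_loss_db(alpha_db_per_cm_mhz, freq_mhz, depth_cm):
    """Attenuation in dB for an echo that travels to depth_cm and back."""
    return 2.0 * alpha_db_per_cm_mhz * freq_mhz * depth_cm

def amplitude_ratio(loss_db):
    """Convert an amplitude loss in dB to a linear ratio A/A0."""
    return 10.0 ** (-loss_db / 20.0)

# Soft tissue at 0.5 dB/cm/MHz, target 10 cm deep.
loss_5mhz = round_trip_loss_db(0.5, 5.0, 10.0)    # 50 dB
loss_2p5mhz = round_trip_loss_db(0.5, 2.5, 10.0)  # 25 dB
ratio_5mhz = amplitude_ratio(loss_5mhz)
```

Halving the frequency halves the loss in dB, at the cost of a longer wavelength and hence coarser resolution. This depth-dependent loss is what time-gain compensation in the receiver is meant to offset.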

III. ULTRASOUND TRANSDUCERS
There are three main classes of ultrasound transducers, i) piezoelectric materials, ii) CMUTs, and iii) piezoelectric micromachined ultrasonic transducers (PMUTs). PMUTs offer several advantages over CMUTs, for example, PMUTs do not need a large voltage bias, making integration with low voltage CMOS electronics easier [16]. Some examples of work that make use of PMUTs can be seen in [23]- [25]. Nevertheless, compared to piezoelectric materials and CMUTs, PMUTs have not been widely adopted yet due to fabrication difficulties, performance issues and the lack of accurate design/modeling tools [16]. Therefore, PMUTs will not be discussed in this paper. The reader is referred to [16] for more details. A comparison of ultrasound transducers can be found in Table I.

A. Piezoelectric Ultrasonic Transducer
Piezoelectric transducers (PZTs) are the conventional type of ultrasonic transducers, with a long history that dates back to the late 19th century. The working principle of a PZT is based on the piezoelectric effect, in which an applied mechanical stress to a piezoelectric material generates an electric field [1]. The inverse is also true: applying an electric field to a piezoelectric material generates a mechanical strain. Common piezoelectric materials include quartz crystals, Rochelle salt, polyvinylidene difluoride as well as lead zirconate titanate, which was first formulated by Jaffe in the 1950s [26] and is the most popular choice today.
The sandwich structure of a typical PZT is shown in Fig. 2. The impedance matching layer [27], [28] is necessary for efficient energy transmission while the backing layer provides damping to shorten the pulse duration in ringing-prone, high quality factor PZTs [29]. In some applications, it may also be necessary to design external impedance matching networks [30], [31].
There has been a substantial body of research dedicated to the modeling and characterisation of PZTs. The models range from equivalent circuit representations [34] to more software-based models [35], [36] that employ finite element methods (FEM).
Early models aimed to represent the piezoelectric effect, an electromechanical phenomenon, in a compact form familiar to electrical engineers. By drawing on the close analogies between electrical and mechanical systems, equivalent circuits were constructed and greatly aided the understanding of PZTs. For instance, it is intuitive to see the analogies between voltage-current and force-velocity, whereas the analogies between resistance-capacitance-inductance and friction-spring-mass can be seen from their respective governing differential equations.
PZT characteristics depend on the type of vibration (compression and shear) it is subjected to [37]. For simplicity, assume the transducer is a thin plate and is vibrating in a compressional thickness mode. In this case, a popular equivalent circuit is the KLM model proposed in 1970 and depicted in Fig. 3. The KLM model is an improvement on the Mason model, which involved a negative capacitance (unphysical) element. Mason introduced the transformer to model the electromechanical coupling in a PZT. The transformer is also used in the KLM model [32].
The parameters of the KLM model in Fig. 3 are given by (8), where $\rho$ is the density, $\omega$ is the angular frequency, $\beta^S_{33}$ is the dielectric impermeability, $c^D_{33}$ is the elastic stiffness, $v^D_t$ is the acoustic wave velocity and $h_{33}$ is the piezoelectric constant [32]. $l$, $w$ and $t$ are the transducer dimensions as shown in Fig. 3. On the electrical port, $C_0$ represents the clamped capacitance between the two electrodes on the transducer. The electrical port is coupled to the acoustic port by a transformer. The acoustic port is represented by a section of a transmission line with characteristic impedance $Z_0$ and velocity $v$. The transmission line is a neat representation of the inevitable time delay incurred when acoustic wave signals travel from one side of the transducer to the other [37].
The equivalent circuit can be further simplified as in the BVD model (Fig. 4), which is a band-pass filter highlighting the resonant nature of PZTs. In the BVD model, the electrical part is represented by $C_0$, the capacitance of the transducer. The acoustic/mechanical part is represented by $R_1$, $L_1$ and $C_1$, where $L_1$ and $C_1$ model the resonant behaviour and $R_1$ models the dissipative loss. The values of $R_1$, $L_1$ and $C_1$ are selected so that the resonant frequency and Q factor of this RLC circuit are numerically equal to those of the mechanical resonance of the PZT [37].
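The component values at resonance in the BVD model can be deduced from its admittance, $Y(\omega)$, as in (9).
$$Y(\omega) = j\omega C_0 + \frac{1}{R_1 + j\left(\omega L_1 - \frac{1}{\omega C_1}\right)} = \frac{1 - \omega C_0\left(\omega L_1 - \frac{1}{\omega C_1}\right) + j\omega C_0 R_1}{R_1 + j\left(\omega L_1 - \frac{1}{\omega C_1}\right)} \tag{9}$$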
The magnitude of $Y(\omega)$ is greatest at series resonance and smallest at parallel resonance. At series resonance, the imaginary part of the denominator in (9) is zero, so the series resonance frequency $\omega_s$ is given by $\omega_s = 1/\sqrt{L_1 C_1}$. At $\omega_s$, (9) reduces to
$$Y(\omega_s) = \frac{1}{R_1} + j\omega_s C_0 \tag{12}$$
and $R_1$ and $C_0$ can be deduced from the real and imaginary parts of (12) respectively. Similarly, at parallel resonance ($\omega = \omega_p$), the magnitude of the admittance is at a minimum, and setting the real part of the numerator of (9) to zero yields
$$\omega_p = \sqrt{\frac{C_0 + C_1}{L_1 C_0 C_1}} = \omega_s \sqrt{1 + \frac{C_1}{C_0}} \tag{13}$$
Note that the component values given in (10)-(13) are frequency dependent and are only valid near resonance. Therefore, the valid range of the BVD model is limited and the BVD model is best used in initial approximations or iteratively.
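The resonance conditions can be verified with a few lines of code. The sketch below evaluates the admittance of a BVD network ($C_0$ in parallel with a series $R_1$-$L_1$-$C_1$ branch) and its two resonance frequencies. The function names and component values are hypothetical and chosen only to give round numbers; they do not describe a real transducer.

```python
import math

def bvd_admittance(freq_hz, c0, r1, l1, c1):
    """Admittance of C0 in parallel with a series R1-L1-C1 branch."""
    w = 2.0 * math.pi * freq_hz
    z_motional = complex(r1, w * l1 - 1.0 / (w * c1))
    return 1j * w * c0 + 1.0 / z_motional

def bvd_resonances(c0, l1, c1):
    """Return (series, parallel) resonance frequencies in Hz."""
    f_s = 1.0 / (2.0 * math.pi * math.sqrt(l1 * c1))
    f_p = f_s * math.sqrt(1.0 + c1 / c0)
    return f_s, f_p

# Hypothetical component values.
C0, R1, L1, C1 = 1e-9, 50.0, 100e-6, 100e-12

f_s, f_p = bvd_resonances(C0, L1, C1)
y_s = bvd_admittance(f_s, C0, R1, L1, C1)
y_p = bvd_admittance(f_p, C0, R1, L1, C1)
```

At `f_s` the real part of the admittance equals 1/R1 and the imaginary part equals the susceptance of C0 alone, which is how $R_1$ and $C_0$ would be extracted from a measured admittance. In practice the extraction runs the other way: $f_s$, $f_p$ and $Y(\omega_s)$ are measured and the four component values are solved for.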

B. Capacitive Micromachined Ultrasonic Transducer
CMUT technology was developed to address some of the drawbacks of PZTs. Compared to conventional PZTs, CMUT technology offers the major advantages of increased bandwidth of operation, ease of fabrication of large arrays, reduced temperature sensitivity and better integration with CMOS electronics using through-wafer interconnect vias [38]-[40] or monolithic CMUT-CMOS integration [41], [42]. CMUT technology has been in development for more than three decades. Several pioneering works on the application of micromachining techniques to the fabrication of capacitive ultrasonic transducers were reported in the late 20th century [43]-[45], whereas the concept of the capacitive acoustic transducer can be traced back to the 1940s [33].
The basic operating principle of a CMUT is rather intuitive and can be inferred from its structure as shown in Fig. 5. A CMUT comprises a capacitor cell that has a movable membrane positioned over a vacuum gap. A metal layer on top of this membrane serves as the top electrode, whereas the silicon substrate serves as the bottom electrode. The insulating layer prevents the shorting of the two electrodes and the passivation layer provides protection. The CMUT is dc-biased which results in the top electrode being attracted toward the bottom electrode by electrostatic force. The stiffness of the top plate results in a mechanical restoring force. By applying an ac-voltage to the CMUT, ultrasound waves can be generated from the movement of the membrane. The vacuum gap is necessary to prevent mechanical loading of the bottom side of the moving membrane [8]. On the other hand, if the top plate is subjected to impinging ultrasound waves, the incoming pressure will cause a displacement on the top plate and change the capacitance. This change in capacitance under a constant dc-bias voltage in turn generates an electrical current signal that can be recorded and amplified [15]. The amplitude of the electrical current signal  depends on the frequency of the wave, the bias voltage and the capacitance of the CMUT device [15].
In early work, the CMUT model [8], [45] was derived theoretically from first principles and was largely based on Mason's work on electromechanical transducers [33]. This type of CMUT model is a two port network with the electrical domain on one port and the acoustic domain on the other. It was necessary to make some simplifying assumptions (to be explained later) to construct such a model; otherwise, the mathematical equations would be too involved. The two port network CMUT model is shown in Fig. 6 and its complete derivation can be found in [8].
As shown in Fig. 6, the electromechanical coupling that is at the crux of the CMUT is represented succinctly by the transformer with a transformer ratio n. The equation for the current I is the sum of an electrical component caused by the electric source and a mechanical component that arises from the motion of the membrane. The mechanical component in I is weighted by n which ensures dimensional consistency. The mechanical load impedance is represented by the membrane impedance Z m and the mechanical impedance in the target medium is represented by Z a . This is a small signal model that is valid for a receiving CMUT and even for a transmitting CMUT as long as the membrane displacement does not reach the collapse point and the bias voltage does not result in severe spring softening [8]. This model assumes the absence of any parasitic electrical elements in the device and air bubbles beneath the membrane. As computational capabilities increased, however, CMUT models became more accurate and were able to account for non-ideal effects with the help of FEM [46], [47].
In most cases where the CMUT is not air-loaded, i.e. immersion contexts, $Z_a$ is usually much larger than $Z_m$. In this context, the equivalent circuit can be further simplified to the commonly used RC-parallel circuit shown in Fig. 7. The equivalent resistance and capacitance are given by (14), where $l_t$ is the membrane thickness, $l_a$ is the separation between the bottom electrode and the membrane, $\epsilon$ is the dielectric constant of the membrane material, $V_{DC}$ is the dc voltage applied between the top and bottom electrodes, and $S$ is the area of the membrane [8].

IV. TRANSDUCER DRIVER CIRCUIT
Transducer driver circuit designs can be classified into continuous-wave and pulsed-wave systems. Continuous-wave ultrasound systems are normally reserved for specific medical ultrasound applications such as continuous-wave Doppler imaging and certain therapies. There are several commercial continuous-wave ICs [48], [49]. However, in the microelectronics research community, considerably more effort is dedicated to the design of pulsed-wave systems. Therefore, this paper focuses on transducer driver circuits for pulsed-wave applications, also known as pulsers.
A pulser delivers short bursts of electrical energy to the transducer elements. In order to increase the penetration depth of ultrasound waves, the pulser is typically expected to drive the transducer with large-amplitude voltage pulses, from several tens of volts to more than 100 V. This requirement typically necessitates the use of high-voltage transistors. However, high-voltage transistors tend to be costly and occupy a larger die area, which complicates the design for area-constrained applications like intravascular imaging.
The pulser is typically the most power-hungry block in the ultrasound front-end. Whether the application is implantable or wearable/portable, it is crucial to design the pulser for high energy efficiency. Closely related to the pulser energy efficiency is its pulse shape, which influences the energy spectrum of the transmitted pulse, the amount of acoustic energy being converted and the dynamic power consumption in the pulser [50]. Ideally, the energy spectrum of the transmitted pulse should be concentrated within the effective bandwidth of the transducer's transfer function for optimal response. There can be advantages in transmitting pulses with shapes other than conventional digital square wave pulses, for example, continuous sine or Gaussian-modulated sinusoidal waves. In this section, the two main classes of pulsers, arbitrary waveform pulsers and square-wave pulsers, are discussed. A comparison of the state of the art can be found in Table II.

A. Arbitrary Waveform Pulsers
A linear amplifier is designed to produce an output that is an accurate, scaled copy of the input but with increased power level. Linear amplifiers can take a variety of waveforms as input and are generally used to output arbitrary excitation waveforms for ultrasound transducers [51]-[54]. Compared to square wave pulsers, linear amplifiers are more complex and less power efficient [55]. Nevertheless, linear amplifiers are attractive because of their capability to generate complex and arbitrary waveforms as well as their low harmonic distortion [54]. Note that low second-order harmonic distortion (HD2) from the transmitter is especially valuable as it allows for tissue harmonic imaging (THI), an alternative ultrasound imaging method accidentally invented in 1997 with the benefits of reduced reverberation noise, improved border delineation and increased contrast resolution [56]. Typically, THI requires the transmitted signal to have less than −40 dB HD2 [53]. The difficulty in designing the linear amplifier lies in simultaneously achieving large signal swing, low HD2 and wide bandwidth with high-voltage transistors, which are inherently slow [52]. It is also very challenging to implement such a linear amplifier as an IC, whereas discrete, PCB-based linear amplifiers for ultrasound imaging are relatively easier to implement [57], [58].
Linear amplifiers designed for medical ultrasound applications are typically class AB or class B. A general architecture consists of a multi-stage approach. It is usually more practical to have a low-voltage supply stage that uses standard MOSFETs followed by a separate high-voltage supply stage that uses high-voltage transistors like double-diffused MOS (DMOS). The low-voltage gain stage can be realised as a two-stage Miller op-amp (voltage amplifier) [53], a transconductance amplifier [51] or a current amplifier [54]. By way of example, the transconductance amplifier in [51] uses bipolar junction transistors at the input stage for maximum $g_m/I_b$ and cascoded MOSFETs to boost the output impedance and raise current flow into the subsequent transimpedance amplifier stage. It achieves a transconductance gain of 60 mS and a quiescent current of 4.5 mA.
Unlike the linear amplifiers in [51], [53] which use voltage feedback, the low-voltage gain stage in [54] is a current amplifier because the overall linear amplifier uses current feedback. The advantage of using current feedback is that the amplifier is not restricted by a constant gain-bandwidth unlike voltage feedback amplifiers. By selecting the appropriate resistors in the feedback loop, the current feedback amplifier can achieve a high bandwidth over a wide range of gains. As a result of using current feedback, the design in [54] achieves high bandwidth (over 20 MHz) and slew rate (12 V/ns) yielding good distortion performance (−43 dBc) at a relatively low power level (20 mW).
The primary objective of the subsequent high-voltage gain stage is to maximise the output signal swing. An example design is shown in Fig. 8. This transimpedance amplifier provides high gain and uses high-voltage DMOS apart from the input transistors which are thin-oxide transistors to reduce input impedance. The transimpedance amplifier has a balanced topology to accommodate the positive and negative current waveforms at node A [51]. To avoid gain degradation at the load (transducer), it is necessary to have an output buffer. The output buffer can be designed as class B [51] or class AB [52]- [54].
An alternative linear amplifier architecture is shown in Fig. 9. It illustrates a differential design that inherently cancels out even-order harmonic distortion. The advantage of the differential design is reflected in its superior HD2 performance (−56 dBc) and increased output signal swing (180 $V_{pp}$). However, the drawback is the need for a bulky off-chip transformer to convert differential signals to single-ended signals.
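The even-order cancellation of a differential stage can be illustrated numerically. The sketch below passes a sine wave through a weakly nonlinear polynomial (a hypothetical stand-in for one amplifier half, with arbitrarily chosen coefficients) and compares the second-harmonic content of the single-ended output with that of the differential output formed from two anti-phase halves.

```python
import math

def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic of a signal sampled over exactly one period."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * harmonic * k / n) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * harmonic * k / n) for k, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def weakly_nonlinear(x, a1=10.0, a2=0.5, a3=0.1):
    """Hypothetical amplifier half: linear gain plus 2nd and 3rd order terms."""
    return a1 * x + a2 * x * x + a3 * x ** 3

n = 256
drive = [math.sin(2 * math.pi * k / n) for k in range(n)]

single_ended = [weakly_nonlinear(x) for x in drive]
# The two halves see anti-phase inputs; the output is their difference.
differential = [weakly_nonlinear(x) - weakly_nonlinear(-x) for x in drive]

hd2_single = harmonic_amplitude(single_ended, 2) / harmonic_amplitude(single_ended, 1)
hd2_diff = harmonic_amplitude(differential, 2) / harmonic_amplitude(differential, 1)
swing_gain = harmonic_amplitude(differential, 1) / harmonic_amplitude(single_ended, 1)
```

In this idealised model the even-order term cancels exactly and the fundamental doubles, mirroring the improved HD2 and larger swing noted above. Mismatch between the two halves of a real circuit leaves a finite residual HD2.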

B. Square Wave Pulsers
1) Level Shifter: In order to generate large square wave voltages from control signals, the use of a level shifter or a level shifter followed by an output stage (typically class D for high power efficiency) is arguably the most intuitive and popular approach, as seen from the numerous published designs [31], [38]-[40], [64], [66]-[73].
A basic implementation of a high-voltage level shifter [31], [64] is shown in Fig. 10. It uses only a few transistors and could be preferable in area-constrained applications. However, the drawback is that the gate control of the output PMOS device ($M_2$) and $M_3$ is not ideal, and it is very likely that on an input '1' to '0' transition, $M_2$ will not be driven completely into the cut-off region [74]. As a result, the output voltage will settle at a small positive dc offset instead of the ideal 0 V.
In order to overcome the drawbacks of the basic level shifter, the circuit in Fig. 11 was proposed in [65] and has been widely used in many ultrasound ICs [38]-[40], [67], [68]. The level shifter is implemented with two cross-coupled branches each consisting of a high-voltage common source NMOS ($M_1$, $M_2$) connected to a diode-connected PMOS load ($M_4$, $M_5$) in parallel with a PMOS transistor ($M_3$, $M_6$) that pulls nodes A [65].
Although the topology in Fig. 11 is effective, there are several drawbacks that have been addressed by recent work. Firstly, the circuit requires high-voltage transistors which are costly and have a larger die area, parasitic capacitance and on-resistance compared to standard CMOS transistors. The designs in [69], [75] attempt to circumvent the use of high-voltage transistors by stacking standard thin-oxide 1.8 V or thick-oxide 5 V CMOS transistors only. The high-voltage output stage is no longer the simple two high-voltage transistor implementation in Fig. 11 but is composed of stacked standard CMOS transistors to support a high voltage as well as a dynamic gate biasing circuit modified from that in [76] for a smooth push-pull operation.
The level shifter is also more complex and employs stacked transistors and dynamic gate biasing. The use of stacked standard CMOS transistors could potentially apply stress to the reverse diode between the n-well and the p-substrate, resulting in a potential long-term reliability hazard. Therefore, extra precaution must be taken to ensure that the stacked-transistor design is safe and reliable over the working voltage range and across process corners. The IC in [69] occupies a very small area (0.022 mm²) but has a rather small output voltage (9.8-12.8 V) compared to the several tens of volts that can be delivered with high-voltage transistors.
Secondly, a major disadvantage of the topology in Fig. 11 is the significant static power dissipation in the level shifter. More specifically, the presence of continuous power dissipation in the voltage mirrors regardless of the input voltage level is a significant wastage of power. Several high-voltage level shifters have been proposed with significantly reduced power dissipation [74], [77], [78]. For instance, in [78], the proposed level shifter was designed for wearable medical ultrasound therapeutic applications and improves on the design in [79] by modifying the level-triggered level shifter to be pulse-triggered. With a pulse-triggered approach, power is mostly consumed during the short trigger pulses. Thus, this pulse-triggered level shifter has a much lower power dissipation. It also has a smaller propagation delay.
Thirdly, the topology in Fig. 11 can only produce unipolar pulses, rendering it inapplicable to ultrasound systems that require bipolar pulses. However, by arranging multiple level shifters and output stages, it is possible to generate bipolar pulses as evidenced in [71], [80]-[84]. The design in [71] is elaborated here to explain how bipolar, return-to-zero pulses can be generated with the help of conventional level shifters and a novel high-voltage output stage.
In Fig. 12, $M_1$ and $M_4$ are turned on to generate a high output voltage ($V_{DDH}$), while $M_1$ and $M_3$ are turned on to return to zero. $M_2$ acts as the floating gate driver of $M_1$: it charges and discharges the gate of $M_1$ with the help of the parasitic capacitances $C_{GS1}$, $C_{SUB1}$ and $C_{DS2}$. The low-voltage MOSFET $M_0$ acts as the embedded T/R switch. An identical, complementary design to that shown in Fig. 12 generates the negative-going pulse and completes the 60 V$_{pp}$ bipolar, return-to-zero pulser.
In summary, the level shifter with output stage approach is the most widely used when designing square wave pulsers to drive ultrasound transducers. Undeniably, innovations in the design of level shifters and output stages have improved the pulser's performance. However, the pulsers in this category still suffer from one common drawback: the large $fCV^2$ power wasted at the ultrasonic transducer load, which can be especially capacitive. The following category of ultrasound pulsers aims to overcome this drawback.
2) Multi-Level Pulse-Shaping: Considering the significantly higher capacitive load that a PMUT/CMUT presents, pulsers for driving them have to adopt techniques to reduce the $fCV^2$ power dissipation. The stepwise charging technique [85], [86] has been successfully applied in several designs [25], [59], [61], [87]-[89]. The working principle of the stepwise charging (or adiabatic switching) technique is based on the following observation. With the switching frequency, load capacitance and voltage swing fixed, decreasing the average voltage drop $\bar{V}$ across the load capacitance is the only way to decrease the power dissipation [85]. Fig. 13(a) shows the basic stepwise charging technique for $C_L$ using $N$ uniformly distributed voltage supplies that are connected to $C_L$ in a successive, ascending order to charge $C_L$ to $V_N$, i.e. connect $V_1$, disconnect $V_1$, connect $V_2$, disconnect $V_2$ ... connect $V_N$. The discharging of $C_L$ is done in a descending order from $V_{N-1}$ to $V_1$ and then the switch 0 is closed, grounding the output. For each charging step, the voltage on $C_L$ changes by $V_N/N$, so the energy dissipated is given by

$$E_{\text{step}} = \frac{1}{2} C_L \left(\frac{V_N}{N}\right)^2. \tag{15}$$

For a total of $N$ charging steps, the energy dissipation is

$$E_{\text{charge}} = N E_{\text{step}} = \frac{C_L V_N^2}{2N}. \tag{16}$$

The overall energy dissipation taking into account both charging and discharging will be twice that in (16). The overall power dissipation will be smaller by a factor of $N$ than the conventional case (no stepwise charging/discharging) because the average voltage drop across each switch is $N$ times smaller [85].
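The factor-of-N saving described above can be checked numerically. The following is a minimal sketch, assuming an ideal supply ladder with equal voltage steps and ignoring the self-loading of the switches; the function name and the 100 pF, 30 V operating point are illustrative assumptions, not values taken from the cited designs.

```python
def stepwise_charging_energy(c_load, v_final, n_steps):
    """Energy (J) dissipated while charging c_load from 0 V to v_final in n_steps equal steps."""
    v_step = v_final / n_steps                   # each step moves the load by V_N / N
    energy_per_step = 0.5 * c_load * v_step ** 2
    return n_steps * energy_per_step             # equals C_L * V_N^2 / (2 * N)


C_LOAD = 100e-12  # assumed load capacitance (100 pF), illustrative only
V_FINAL = 30.0    # assumed final voltage (30 V), illustrative only

conventional = stepwise_charging_energy(C_LOAD, V_FINAL, 1)
for n in (1, 2, 3, 6):
    e = stepwise_charging_energy(C_LOAD, V_FINAL, n)
    saving = 100.0 * (1.0 - e / conventional)
    print(f"N = {n}: {e * 1e9:.2f} nJ per charging phase, saving {saving:.1f}%")
```

In this idealised model, N = 2 halves the dissipation relative to single-step charging; practical designs save less because the switches themselves add capacitance.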
One of the first designs that applied the stepwise charging technique to ultrasound systems is depicted in Fig. 14. This design uses a dc-dc converter with large off-chip capacitors to generate the required voltage levels (0, 15 and 30 V; N = 2) and four high-voltage transistors to switch to these voltage levels. While the theoretical power dissipation improvement is 50%, the measured results showed a 38% power dissipation improvement over the conventional two-level waveform. This discrepancy is a result of the power dissipation in the high-voltage MOSFET switches, which are large and capacitive. This self-loading effect erodes most of the power savings from having more voltage levels. In this design, three-level pulsing was determined to provide the greatest power efficiency improvement.
Fig. 15. Working principle of a pulser designed specifically for a PMUT load [25], [89]. (a) Charging phase. (b) Redistribution phase.
Although the design in Fig. 14 reduces power dissipation, there remain two areas for improvement, namely the use of off-chip capacitors and the relatively modest reduction in power dissipation (38%). Recently, several pulsers have been proposed that improve on the design in Fig. 14 [25], [61], [89]. In [61], a 7-level (including the ground level) ultrasound pulser that drives a PMUT load was presented. This design adopts a modular supply multiplying approach that enables a high-voltage output pulse several times the supply voltage (5 V) to be generated. Essentially, each module is similar to a switched-capacitor cell, in which a storage capacitor is either in the charging or transferring mode. By connecting several of these modules in series and introducing the appropriate time delays, a high-voltage multi-level output waveform can be achieved. In this case, each inter-level step is equal to the supply voltage, with a total of 6 intended steps. In comparison to the design in Fig. 14, the modular supply multiplying pulser is able to introduce more levels in the output waveform before the self-loading effect becomes non-negligible. Thus, the design in [61] is able to reap a greater percentage saving in power dissipation. Two prototypes intended for loads of 55 pF and 1 nF were fabricated. For the 55 pF load prototype, an on-chip design was presented that integrates 3 nF metal-insulator-metal capacitors as the storage capacitors. A 58% power dissipation reduction relative to $fC_LV^2$ was achieved. The 1 nF load prototype resorted to 60 nF external capacitors as the storage capacitors. Nevertheless, a peak power reduction of 75.4% relative to $fC_LV^2$ was achieved. This is one of the highest power reductions relative to $fC_LV^2$ reported so far.
A new type of ultrasound pulser designed specifically for a bimorph PMUT load was reported in [25], [89]. Its working principle is shown in Fig. 15. The PMUT is modeled as an equivalent capacitor in which the outer and inner electrodes of the PMUT correspond to the top and bottom plates of the equivalent capacitor. Initially, the top plate of the PMUT capacitor is charged to a potential of $V_{DDH}$ while its bottom plate is grounded, and the piezoelectric membrane is deformed. During the redistribution phase, the ground and supply connections are opened and the top and bottom plates are shorted together (see footnote 3). The piezoelectric membrane returns to its straight form. Assuming the plates (electrodes) have identical size and negligible leakage, the charge on the capacitor is evenly redistributed on both plates and the potential on both plates will equalise to a value of $V_{DDH}/2$ with respect to ground [89]. Hence, in the next charging (discharging) phase, each electrode only charges (discharges) from $V_{DDH}/2$ to $V_{DDH}$ (ground). The voltage step is halved, which could lead to a power saving. The measured results in [25] and [89] show power reductions of 32.8% and 42.6% relative to $fC_LV^2$, respectively.

V. ULTRASOUND RECEIVER CIRCUIT
The receiver circuit directly affects the subsequent back-end processing, and for this reason, is often the performance bottleneck in ultrasound systems. A complete ultrasound receiver architecture typically comprises i) an LNA to amplify the weak echo signals to allow for subsequent beamforming and analog-digital conversion, ii) a TGC circuit to support the large input signal dynamic range and iii) a T/R switch to protect the low-voltage receiver circuit from the high-voltage TX pulses. The LNA and the TGC circuit may be separate modules or combined. The designer of an ultrasound receiver circuit faces many trade-offs, e.g. among bandwidth, distortion, noise, power and area. The challenge in designing an ultrasound receiver is in achieving very low noise and a large dynamic range simultaneously [29].
In this section, the analysis of the receiver circuitry is divided into three parts according to its three constituent elements. In the first part, circuit topologies that implement low-noise amplification are examined. The second part introduces the concept of TGC and highlights various circuit topologies that realise TGC. T/R switch designs are explored in the third part. This section concludes with a discussion of figure-of-merit (FoM) and Table III that summarises the state-of-the-art.
Footnote 3. Although the PMUT is typically modeled as a capacitor, the reader should be aware of a slight discrepancy with this approach here. The shorting of the top and bottom plates of a charged capacitor cannot be adequately explained by ordinary circuit theory as it seemingly violates the law of conservation of energy. This is a variant of the well-discussed two capacitor paradox (see [90] for a detailed treatment). Nevertheless, this does not detract from the ability of this pulser to reduce its power dissipation by shorting the outer and inner electrodes of the bimorph PMUT load.

A. Low-Noise Amplifier
In ultrasound systems, the LNA can be implemented in a number of ways to support different transducers and applications. The LNA has been realised as a charge-based amplifier [91], transconductance amplifier [92] and current amplifier [14], [93] (modified from the transimpedance amplifier (TIA) in [94]). The charge amplifier circuit with a floating node charge adaptation circuit in [91] achieved high signal-to-noise ratio (SNR) and low-power performance for CMUTs. However, its bandwidth was limited to the kHz range, which is insufficient for the MHz range required for typical ultrasound medical imaging applications. On the other hand, the use of a current amplifier or transconductance amplifier is largely architecture-dependent. For example, in [93] the LNA was implemented as a current amplifier to be compatible with the subsequent beamforming stage that was designed in current-mode. In [14], the output signal needs to be a current given that the IC was designed for an intra-vascular ultrasound probe with only one cable available.
Given the fact that an ultrasound transducer element produces a current signal in response to impinging ultrasound waves, a TIA that performs current-to-voltage conversion is the most popular choice for the LNA in ultrasound systems. TIAs are also popular in other biomedical applications such as biosensing and blood pressure monitoring with photoplethysmography [95], [96]. Fig. 16(a) shows the basic TIA. The closed-loop amplifier adopts shunt-shunt feedback in which the negative feedback network senses a voltage at the output and returns a current back to the input. The shunt-shunt feedback helps to decrease the input and output resistances, making a more ideal TIA. A typical implementation of this amplifier for ultrasound systems is shown in Fig. 16(b), which consists of a common-source gain stage followed by a source follower and resistive feedback. The closed-loop transimpedance gain, R T of the circuit in Fig. 16(b) is given by (17). The input-referred noise current is given by (18). The topology in Fig. 16(b) is very popular and has inspired many variants. For instance, R D was replaced with an active load in [38]- [40], [42], [66] and R F was replaced with a pseudo-MOS resistor to save chip area in [42], [75]. The design in [75] also employed a resistor network to bias the body of M 1 (forward body bias technique) to reduce the threshold voltage, supply voltage and consequently, the power consumption.
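For orientation, the commonly used low-frequency textbook expressions for a common-source stage with a source follower and resistive feedback can be sketched as follows. These are stated under explicit assumptions and may omit terms that the cited works include: the source follower is taken as an ideal unity-gain buffer, its noise and the input capacitance are neglected, and $g_{m1}$ (a symbol introduced here for illustration) denotes the transconductance of the common-source transistor M 1.

```latex
% Closed-loop transimpedance gain, assuming a core voltage gain of g_{m1} R_D
R_T \approx -\frac{g_{m1} R_D}{1 + g_{m1} R_D}\, R_F

% Input-referred noise current density at low frequency:
% thermal noise of R_F plus the core amplifier's voltage noise divided by R_F
\overline{i_{n,\mathrm{in}}^{2}} \approx \frac{4kT}{R_F}
  + \frac{1}{R_F^{2}} \left( \frac{4kT\gamma}{g_{m1}} + \frac{4kT}{g_{m1}^{2} R_D} \right)
```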
where k is the Boltzmann constant, T is temperature in Kelvin, and γ is the excess noise coefficient. As a single-ended implementation, the circuit in Fig. 16(b) has the advantages of small area and power consumption. These advantages are very useful in probe-based ultrasound imaging applications. A single-ended circuit would also be acceptable where distortion concerns are less critical. However, the single-ended amplifier inevitably faces a number of problems including poor supply rejection and supply-dependent biasing [97].
To address these problems, differential amplifiers have been proposed to reap benefits such as suppression of common-mode noise, power supply noise and even-order distortion. In [59], a differential input, single-ended output amplifier (two-stage Miller op-amp followed by source follower) was designed. The amplifier was optimised for trade-offs between bandwidth, noise and power dissipation by carefully sizing its transistors in order for the location of its poles to coincide with the target bandwidth [59].
A single-ended input, differential output amplifier was designed in [98] and is depicted in Fig. 17. At its core, the TIA comprises a common-gate stage and a common-source amplifier. The novelty of this design lies in its use of negative feedback. With negative feedback at the common-gate stage, the power consumption is reduced, whereas $R_F$ provides a noise cancelling scheme at the differential outputs for the common-source transistor [98].

B. LNA With Time-Gain Compensation
For ultrasound receivers, a single LNA with a fixed gain is insufficient when handling a large input signal dynamic range. The echo signals that originate from deep tissues take a longer time to reach the receiver and will be more heavily attenuated than echo signals from nearby tissues. With a fixed-gain LNA, the strong echo signals could saturate the amplifier, whereas the weaker signals could have insufficient amplification. In ultrasound imaging, the former case shows up as a bright speck while the latter manifests as an indistinguishable feature. Therefore, it is necessary to augment the LNA with some form of automatic gain control in which weak signals that take a longer time to arrive are amplified with a larger gain whereas stronger signals are amplified with a smaller gain to achieve a relatively flat amplitude response. This automatic gain control is termed time-gain compensation (TGC) in the context of ultrasound. Ideally, the TGC network should also reduce the overall signal dynamic range (Fig. 18) to relax the circuit requirements for later stages, especially if there is subsequent analog-digital conversion.
Fig. 17. Low power, low noise, single-ended to differential TIA [98].
Furthermore, the TGC network should exhibit an exponentially varying gain (gain increases linearly in dB with time) to compensate for the exponential attenuation of ultrasound waves in human tissues (see Section II). This is challenging to achieve in CMOS technology because the MOSFET is a square-law device. On the other hand, it is easier to design dB-linear circuits with BJTs. Consequently, the CMOS circuits that implement the linear-in-dB TGC are approximations at best. These circuits can be largely classified into two categories: amplifiers with discrete gain steps, also known as programmable gain amplifiers (PGAs), and amplifiers with continuous gain control, also known as variable gain amplifiers (VGAs) [99].
Fig. 19. Circuits that perform TGC [99]. (a) PGA with resistive or capacitive feedback network. (b) VGA with variable transconductance [112]. (c) VGA with linear terms [113]. (d) VGA using an interpolated ladder attenuator [114].
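To make the linear-in-dB requirement concrete, the sketch below computes the gain needed to offset round-trip attenuation as a function of echo arrival time. It is a minimal illustration only: the attenuation coefficient of 0.5 dB/(cm·MHz), the 5 MHz centre frequency and the 1540 m/s sound speed are assumed, commonly quoted soft-tissue values, and the function names are hypothetical.

```python
def tgc_gain_db(t_s, alpha_db_per_cm_mhz=0.5, f_mhz=5.0, c_m_per_s=1540.0):
    """Gain (dB) needed to offset the attenuation of an echo that arrives t_s seconds after transmit."""
    path_cm = c_m_per_s * t_s * 100.0   # total path travelled (out and back), in cm
    return alpha_db_per_cm_mhz * f_mhz * path_cm


def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear voltage ratio."""
    return 10.0 ** (gain_db / 20.0)


for t_us in (0, 25, 50, 100):
    g = tgc_gain_db(t_us * 1e-6)
    print(f"t = {t_us:3d} us: {g:5.1f} dB (x{db_to_linear(g):.1f})")
```

The required gain grows linearly in dB with time, which corresponds to an exponential growth of the linear gain.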
1) Programmable Gain Amplifier: The most straightforward and popular approach for performing the TGC function in ultrasound systems is to use a digitally-programmable resistive feedback network [100]-[106] or capacitive feedback network [107]-[109] to approximate the exponentially varying gain with discrete gain steps. This TGC network is shown in Fig. 19(a). The discrete gain steps can also be distributed among multiple amplifier stages. For example, if the LNA and the PGA are kept separate, then the LNA can implement coarse gain steps while the subsequent PGA implements fine gain steps [107], [110]. Typical implementations of the PGA include inverter-based amplifiers, current-reuse operational transconductance amplifiers and cascoded flipped-voltage followers. Interestingly, Kelvin switches have been used to mitigate the gain inaccuracy due to the on-resistances of the switches in the feedback network [100], [102], [110].
The benefits of the PGA include ease of control and, more importantly, the accurate definition of gain steps with the ratios of feedback resistances or capacitances that are insensitive to process and temperature variations [111]. However, the inability to scale is the main drawback of this topology. For a closer approximation to the ideal exponential characteristic, more discrete gain steps are required, which means adding more resistors or capacitors. This is impractical as it would increase the chip area significantly. Other limitations include i) the changing input and output impedances of the PGA that could complicate the performance of inter-connected modules, ii) low operating bandwidth due to the closed-loop configuration, and iii) switching artifacts in the ultrasound image from one discrete gain step to the next [99].
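The scaling limitation can be illustrated with a small numerical sketch. It assumes an idealised PGA whose settings are uniformly spaced in dB and which always selects the setting nearest to the ideal gain; the 36 dB range, the step sizes and the function names are illustrative assumptions rather than figures from the cited designs.

```python
def quantise_gain_db(ideal_db, step_db):
    """Snap an ideal gain (dB) to the nearest available discrete PGA setting."""
    return step_db * round(ideal_db / step_db)


def worst_case_error_db(step_db, max_gain_db=36.0, grid_db=0.1):
    """Largest gap (dB) between the ideal dB-linear ramp and the staircase approximation."""
    n_points = int(round(max_gain_db / grid_db)) + 1
    ideal = [i * grid_db for i in range(n_points)]
    return max(abs(quantise_gain_db(g, step_db) - g) for g in ideal)


for step in (6.0, 3.0, 1.5):
    n_settings = int(36.0 / step) + 1
    print(f"step {step:3.1f} dB: {n_settings:2d} settings, worst-case error {worst_case_error_db(step):.2f} dB")
```

In this model the worst-case error is half a step, so halving the error roughly doubles the number of gain settings and, with them, the feedback elements.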
2) Variable Gain Amplifiers: The disadvantages of PGAs have prompted designers to adopt amplifiers with variable gain control in applications where a continuous gain transition is desirable. VGAs normally have a dB-linear gain that can be set with an analog control signal, typically a control voltage. In general, the design of ultrasound VGAs is more challenging than that of PGAs, and relatively few ultrasound VGAs have been published. However, there are many VGAs designed for communication systems which provide a good theoretical foundation for the design of ultrasound VGAs. Thus, this section takes a slight detour into communication system VGAs in order to better elaborate on ultrasound VGAs.
In order to achieve this dB-linear gain, VGAs can be broadly classified into two categories; amplifiers based on exponential approximation functions and amplifiers with interpolation between discrete gain steps [99]. In the first category, amplifiers achieve dB-linear gain by using the inherent linear and quadratic characteristics of MOSFETs to implement exponential approximation functions e.g. the Padé approximation or the Taylor series expansion up to second order terms. The Padé approximation is given in (19). For −0.32 ≤ x ≤ 0.32, the relative error of (19) is less than 5% [115]. This shows that the dB-linear range of VGAs using the Padé approximation is very limited. Thus, many VGAs have to be cascaded in order to extend this range. This would, however, incur power, chip area and bandwidth penalties.
The Taylor series expansion of an exponential function, truncated after the second-order term, is given by

$$e^{x} \approx 1 + x + \frac{x^{2}}{2}. \tag{20}$$

The relative error of (20) is less than 5% [115] for −0.575 ≤ x ≤ 0.815, a slight improvement compared to (19). However, in order to realise the terms in (20), special circuit blocks, e.g. a linear V-I converter and a current square circuit, are required, which increase the design complexity [113].
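The quoted range for the second-order Taylor expansion can be verified with a short numerical check. This is a minimal sketch; the function names are illustrative, and relative error is taken here as the absolute deviation divided by the exact exponential.

```python
import math


def taylor2_exp(x):
    """Second-order Taylor approximation of exp(x) around x = 0."""
    return 1.0 + x + 0.5 * x * x


def relative_error(x):
    """Relative deviation of the approximation from the exact exponential."""
    exact = math.exp(x)
    return abs(taylor2_exp(x) - exact) / exact


for x in (-0.575, 0.0, 0.815, 0.9):
    print(f"x = {x:+.3f}: relative error = {100.0 * relative_error(x):.2f}%")
```

Both endpoints of the quoted range stay just under 5%, while a point slightly outside it, such as x = 0.9, exceeds that bound.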
The limited linear input gain range of the Padé and Taylor series approximations has spurred designers to use other approximation functions [112], [113], [116], [117]. An example approximation function from [112] is presented in (21), where a and k are constants. A plot of f(x) against x shows that for k less than 1, the dB-linear range of (21) increases substantially and peaks at k = 0.12 [112]. The circuit implementation is shown in Fig. 19(b). By varying the bias currents of the differential pair and the diode-connected load as a function of the control voltage, a variable transconductance and a non-linear transfer function that follows the form given in (21) can be obtained [112]. Despite not following (21) exactly, an improved variable gain amplifier for ultrasound imaging that also varies the bias current has been proposed recently [118].
Fig. 20. T/R switches [128]. (a) Zener diode bias [120], [121]. (b) Floating latch [122], [123]. (c) Level shifter [124]. (d) Dynamic gate-source shunt [128].
In the second category, amplifiers with interpolation between discrete gain steps [99], [114], [119] can be seen as a compromise between the PGA and the approximation-based VGA. An example of this type of amplifier was first reported in [114] and shown in Fig. 19(d). The input signal is attenuated by the resistor ladder network (R-2R) in discrete steps. The attenuated input signal is then applied to an amplifier with multiple input stages. The novelty of this design lies in gradually changing the bias currents of these input stages via a current steering mechanism which would effectively lead to interpolation between the discrete gain steps imposed by the ladder network [114]. This interpolation amplifier has influenced subsequent designs. Recently, a current-interpolation TIA that uses a capacitive ladder network to avoid the additional noise associated with a resistor ladder was proposed [99]. However, a disadvantage of this category of VGA is that it requires a substantial portion of the die area to be reserved for the passive component feedback network [118].

C. Transmit/Receive Switch
The T/R switch plays a crucial role in protecting the sensitive receiver circuit from the high-voltage TX pulses. Several T/R switch designs with varying complexity have been proposed for ultrasound systems. The simplest T/R switch in ultrasound ICs is a high-voltage NMOS [38], [66]. With careful sizing, the on-resistance and capacitance of this high-voltage MOSFET can be set within tolerable margins. However, the presence of body diodes in high-voltage MOSFETs means that a single high-voltage transistor is insufficient if the TX pulse contains both positive and negative voltages. Thus, two back-to-back connected high-voltage transistors are normally used to provide bi-directional isolation as seen in Fig. 20.
The two most important attributes of the T/R switch are effective isolation and a low on-resistance, for better SNR and power efficiency. Various T/R switches have been proposed to achieve both. These T/R switches (Fig. 20) can be classified into four categories: the Zener diode bias approach [120], [121], the floating latch approach [122], [123], the level shifter approach [124]-[127] and the dynamic gate-source shunt approach [128].

VI. BEAMFORMER
The primary function of the beamformer is to establish directivity in the transmitted or received ultrasound beam by manipulating the spatial distribution of the pressure field amplitude in the target volume [129]. For instance, on the TX side, the beamformer should drive the pulsers in order for the ultrasound beam emanating from the transducers to be steered toward a certain direction and/or be focused at a specific depth. On the RX side, the beamformer performs the complementary function. Echo signals from a specific direction and/or focal depth are selectively amplified and summed whereas other echo signals are filtered out. In essence, beamforming relies on the controlled constructive and destructive interference of ultrasound waves to achieve the desired effect. In this section, an overview of beamforming is given to provide the necessary theoretical background. Subsequently, analog and digital ultrasound beamforming circuits are discussed. A comparison of the state-of-the-art can be found in Table IV.

A. Beamforming Overview
The mathematical treatment of beamforming is rather involved. The reader is referred to [130] for a complete derivation. In this section, a simple and intuitive explanation of beamforming is presented to help the reader understand what beamforming is and how it can be achieved.
Consider a phased array of ultrasound transducer elements where each element can be driven and have its response recorded separately. In the TX mode, if each element is driven identically, i.e. identical electrical pulses drive the elements at the same time, then each element acts as a point source emitting a spherical wave [130]. These spherical waves combine and propagate along the horizontal axis [Fig. 21(a)]. However, if relative time delays between the driving pulses were applied, then the phased array would steer the ultrasound beam in different directions [Fig. 21(b)]. By using more complex time delays, beam focusing on top of beam steering can be achieved [Fig. 21(c)]. Furthermore, individual amplitude weights could be given to the transducer elements in either TX or RX mode. This is known as apodisation [Fig. 21(d)] and is commonly used to reduce the effect of side lobes in the ultrasound beam pattern [130].
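The relative TX delays can be sketched from simple geometry. The example below is a minimal illustration, assuming a linear array centred on the origin, a homogeneous medium with a sound speed of 1540 m/s and point-like elements; the function name, the element pitch and the focal position are hypothetical values chosen for demonstration.

```python
import math


def focus_delays(n_elements, pitch_m, focus_x_m, focus_z_m, c_m_per_s=1540.0):
    """Per-element firing delays (s) that focus a linear array at (focus_x_m, focus_z_m)."""
    # Element positions along the array, centred on x = 0.
    xs = [(i - (n_elements - 1) / 2) * pitch_m for i in range(n_elements)]
    # Distance from every element to the focal point.
    dists = [math.hypot(focus_x_m - x, focus_z_m) for x in xs]
    farthest = max(dists)
    # The farthest element fires first (zero delay); nearer elements wait
    # so that all wavefronts reach the focal point at the same instant.
    return [(farthest - d) / c_m_per_s for d in dists]


delays = focus_delays(n_elements=8, pitch_m=0.3e-3, focus_x_m=0.0, focus_z_m=20e-3)
for i, tau in enumerate(delays):
    print(f"element {i}: {tau * 1e9:.1f} ns")
```

Placing the focal point off-axis (non-zero `focus_x_m`) yields an asymmetric delay profile, which combines steering with focusing.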
Relative time delays can also be used during RX beamforming. For instance, by applying relative time delays to the electrical signals generated by impinging ultrasound waves, the electrical signals can be time-aligned and then summed to result in one large output response. Effectively, the phased array can be viewed as a single large transducer that is oriented to face the incoming wave at normal incidence [130]. Ultrasound RX beamforming is illustrated for two cases in Fig. 22.
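The time-align-and-sum operation described above can be sketched as follows. This is a minimal illustration with integer sample delays and idealised unit-impulse echoes; practical beamformers use finer delay resolution and may add apodisation weights, and the function name is hypothetical.

```python
def delay_and_sum(channels, delays):
    """Delay channel i by delays[i] samples (non-negative integers), then sum across channels."""
    n_samples = len(channels[0])
    output = []
    for k in range(n_samples):
        total = 0.0
        for channel, delay in zip(channels, delays):
            if k - delay >= 0:   # samples before the start of the record count as zero
                total += channel[k - delay]
        output.append(total)
    return output


# Three channels see the same echo, arriving one sample later on each successive element.
channels = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
]
aligned = delay_and_sum(channels, delays=[2, 1, 0])    # earliest channel is delayed the most
unaligned = delay_and_sum(channels, delays=[0, 0, 0])
print(aligned)
print(unaligned)
```

With the correct delays the three echoes add coherently into a single peak, whereas without them they remain spread over time at the single-channel amplitude.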
The ultrasound beamformer circuit can be divided into analog and digital implementations. The two crucial circuit elements in the beamformer circuit are the variable delay cell and the adder. In an analog beamformer, the variable delay cell can be implemented as a cascaded delay cell or an analog memory cell [131], whereas the adder can often be designed as a summing op-amp. In a digital beamformer, the variable delay and adder can be implemented with FIFO registers. The analog beamformer (Fig. 23) only requires one high speed, high resolution ADC, a significant advantage in terms of power dissipation and area over digital beamformers. However, poor matching between channels remains the most significant limitation of analog beamformers [29]. In a digital beamformer (Fig. 24), every channel contains an ADC which allows for the subsequent beamforming operations to be conducted entirely in the digital domain. Consequently, the main advantage of a digital beamformer is its robustness and noise immunity.

B. Analog Beamformer
The emphasis of this section is the most crucial module in an analog beamformer, the delay element. More specifically, analog RX beamformers are discussed in this section. To the best of the authors' knowledge, all of the proposed TX ultrasound beamformers are implemented as digital blocks and are discussed subsequently. The analog delay element used in ultrasound RX beamformers can be broadly classified into two categories; cascaded delay cell and analog memory cell as shown in Fig. 25 [131]. In the cascaded delay cell [ Fig. 25(a)], the input signal is applied through a chain of delay cells (taps) and the output signal is taken after a certain number of delay cells depending on an external control signal. The amount of the delay applied to the signal is thus dependent on the number of taps it goes through. This type of cascaded delay cell has been implemented in a variety of ways e.g. an LC delay line [134], [135], a first-order, fully-differential RC all-pass filter [136], a log-domain BiCMOS all-pass filter [137] and a current mirror all-pass filter [93]. By way of example, the current mirror all-pass filter delay cell is shown in Fig. 26. This current mirror all-pass filter aims to approximate an ideal delay (22). Two biquad current mirrors are cascaded together to form a broadband all-pass filter with a transfer function given in (23). The resulting second-order low-pass filter as seen in (23) is intended for bandwidth extension by exploiting the fact that a second-order low-pass filter exhibits a flat amplitude response over a wider frequency range [93].

C. Digital Beamformer
TX beamformers are typically implemented with digital control logic. For instance, [60] uses shift registers and a global counter, whereas [63], [73] use a delay-locked loop to generate TX pulses with well-defined timing and phases. The digital TX beamformer in [63] is also one of the few designs that generates TX pulses with both programmable phases and amplitudes. The sixteen phase delays enable beam focusing and steering while the four scalable amplitude levels provide apodisation to reduce side lobes.
On the other hand, the challenges with designing digital ultrasound RX beamformers are very different compared to those of analog RX beamformers. Many of the proposed digital ultrasound RX beamformers are not designed for implantable applications and frequently involve the use of FPGAs and/or commercial DSP chips [142]- [145]. With a digital beamformer, the focus is not on realising variable delays but on implementing advanced beamforming algorithms efficiently on the FPGA. The design of ultrasound beamformers using FPGAs and/or commercial ICs is beyond the scope of this review paper. The focus of this section is directed to the work in [146]- [149], which are some of the few non-commercial, digital RX ultrasound beamforming ICs that have been published.
In [147], a 64-channel digital RX ultrasound beamformer with non-uniform ADCs was proposed. The novelty of this design is that at each channel, the received signal is non-uniformly sampled by the ADC and only the necessary data for RX beamforming is stored. A look-up table stores the non-uniform ADC sample times. This helps to shrink the FIFO memory size to 25% [147] compared to a conventional approach. This work is an important step toward miniaturising digital ultrasound beamformers that can be deployed in area-constrained applications.
In [146], an analog-digital hybrid RX beamformer was proposed as a compromise solution when interfacing with a large 2D CMUT array (64 × 128). It is impractical to wire all 8192 transducer elements to beamforming circuits. Therefore, subarray beamforming [150] is adopted in which the 2D array is divided into smaller sub-arrays (8 × 8), so that only 128 outputs remain. The sub-array beamforming is split into two stages. The first stage uses analog beamformers and the second stage is implemented in the digital domain. This two stage beamforming approach retains the advantages of performing beamforming operations in the digital domain whilst reducing the number of ADCs that consume significant chip area by using analog beamformers in the first stage.
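The channel-count reduction quoted above follows from simple arithmetic, sketched below. The function name is an illustrative assumption, and the sketch assumes that each sub-array delivers exactly one beamformed output.

```python
def subarray_outputs(rows, cols, sub_rows, sub_cols):
    """Number of outputs when a rows x cols array is tiled into sub_rows x sub_cols sub-arrays."""
    return (rows // sub_rows) * (cols // sub_cols)


n_elements = 64 * 128
n_outputs = subarray_outputs(64, 128, 8, 8)
print(f"{n_elements} elements -> {n_outputs} sub-array outputs "
      f"(reduction factor {n_elements // n_outputs})")
```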
In [148], a VLSI implementation of a 10k-channel fully digital 3-D beamformer was presented. The entire 3-D delay-and-sum beamforming operation is integrated on-chip without the need for off-chip memories. It is capable of producing 298.1 M focal points per second which allows for the creation of a high-resolution volume. This is a marked improvement over its analog counterparts and even conventional digital beamformers, which mainly perform beam steering. Nevertheless, this design is not intended for implantable or even wearable applications as its power dissipation is too large.

VII. ANALOG-DIGITAL CONVERTERS
ADCs designed for medical ultrasound systems are typically optimised for low power and compact area. These requirements are especially important for ultrasound imaging probes. Where possible, the size of the IC should be smaller than the ideal half-wavelength pitch for the transducers used so as to reduce side lobes and improve image quality. On the other hand, resolution and speed considerations can be relaxed. Among the published ultrasound ADCs, successive-approximation register (SAR), pipeline and delta-sigma architectures are the most popular. In this section, several noteworthy ultrasound ADCs are highlighted. A comparison of the state-of-the-art can be found in Table V.

A. SAR ADC
It is well-known that SAR ADCs are very often used for medium-to-high resolution applications with sample rates in the order of a few megasamples per second. SAR ADCs have low power consumption and occupy a relatively small chip area, making them a good choice when designing ultrasound ADCs [108], [109], [147]. A novel SAR ADC designed for miniature 3D ultrasound probes was proposed in [108]. In this design, the digitisation was conducted in the charge domain, instead of the conventional voltage domain. The digitisation was achieved by comparing the signal charge with binary-scaled charge references generated from a pre-charged capacitor DAC array through a successive approximation algorithm. The rationale for this is to eliminate intermediate ADC buffers in order to reduce the power dissipation and area.
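For readers unfamiliar with the successive approximation algorithm, the sketch below shows the conventional voltage-domain binary search. It is a generic behavioural illustration with an ideal comparator and DAC, not the charge-domain implementation of [108]; the function name and the 1 V reference are assumptions.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Ideal SAR conversion: resolve one bit per cycle, from MSB to LSB."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                  # tentatively set the current bit
        v_dac = v_ref * trial / (1 << n_bits)      # ideal DAC output for the trial code
        if v_in >= v_dac:                          # comparator decision
            code = trial                           # keep the bit; otherwise it stays cleared
    return code


for v in (0.0, 0.25, 0.6, 1.0):
    print(f"v_in = {v:.2f} V -> code {sar_convert(v, v_ref=1.0, n_bits=10)}")
```

An n-bit conversion takes n comparison cycles, which is one reason the architecture suits moderate sample rates at low power.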

B. Pipeline ADC
Pipeline ADCs have seen a surge in popularity for medium-to-high sampling speed applications. Pipeline ADCs have decent power and area performance, making them suitable for ultrasound systems. To the best of the authors' knowledge, only two pipeline ADCs have been published for ultrasound systems [151], [152]. For instance, in [152], a 10 b pipeline ADC was implemented in a 250 nm CMOS technology. In an attempt to deal with the growing number of channels, the ADC used two parallel multiplexing sample-and-hold stages to multiplex eight ultrasound channels. While pipeline ADCs have been applied in commercial ultrasound ICs, there has been little recent research into pipeline ADCs for medical ultrasound technologies.

C. Delta-Sigma ADC
Delta-sigma ADCs are typically used when it is especially important to have low noise or good precision. Several delta-sigma ADC designs have been reported for ultrasound applications [107], [153], [154]. For instance, in [154], an element-matched delta-sigma ADC was proposed. The novelty of this design lies in utilising the band-pass filter characteristic of PZT to remove redundant A/D conversion hardware. In this way, an entire delta-sigma ADC could be fitted under the area of a transducer element.
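For readers unfamiliar with the principle, the following is a minimal sketch of a generic first-order low-pass delta-sigma modulator. It is not the element-matched band-pass design of [154]; it assumes an ideal integrator, a 1-bit quantiser and inputs normalised to the range (-1, 1), and the function names are hypothetical.

```python
def first_order_delta_sigma(samples):
    """Return a +/-1 bitstream for inputs normalised to (-1, 1)."""
    integrator = 0.0
    feedback = 0.0  # previous 1-bit output, fed back through an ideal DAC
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the quantisation error
        bit = 1 if integrator >= 0.0 else -1
        bits.append(bit)
        feedback = float(bit)
    return bits

def bitstream_mean(bits):
    """Crude decimation: average the bitstream to recover a DC input."""
    return sum(bits) / len(bits)
```

Because the integrator state remains bounded, the average of the bitstream tracks the input more closely as more samples are taken. This oversampling trade-off is what gives delta-sigma converters their precision at the cost of conversion bandwidth.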

VIII. RECOMMENDATIONS FOR FUTURE WORK
Over the past few decades, advancements in medical ultrasound have largely been driven by the advent of enabling technologies (CMUT and CMOS) and new applications (e.g. fetal scan, intracardiac echocardiography). Therefore, in looking ahead to what will be important in future medical ultrasound research, it is worthwhile exploring new technologies and applications.
Currently, there are three new technologies that could prove to be game changers in medical ultrasound. Firstly, there is a new type of ultrasound transducer, the PMUT. A significant advantage of PMUTs over CMUTs is that PMUTs do not require a large dc bias voltage, making PMUTs more implant-friendly. However, as mentioned previously, no definitive model has been proposed for the PMUT, which has resulted in its low adoption rate. Research into PMUT design, fabrication, modelling and applications [157], [158] is important and can generate innovations in biomedical IC design.
Secondly, with Moore's law having effectively reached its limit, packaging technology has gained attention and popularity. It is true that for most ultrasound ICs, the analog circuits do not need to use very small feature sizes. However, the point to make here is that advancements in packaging technology (e.g. chiplets) can pave the way toward better heterogeneous integration. Improving the integration of CMOS circuits and transducers can revolutionise the application space of medical ultrasound.
Thirdly, the exponential rise of artificial intelligence (AI) technology opens up new possibilities for ultrasound imaging systems. Deep learning has already been applied to medical ultrasound imaging [159], [160]. Ultrasound imaging quality is largely dependent on three broad factors: transducer quality, image reconstruction algorithms, and IC performance. The IC performance, more specifically that of the RX circuit, has typically been regarded as the bottleneck in ultrasound imaging quality. Early ultrasound imaging ICs contained only the most basic functions, which constrained the imaging quality. Subsequent research into ultrasound imaging ICs faced the very challenging task of realising more advanced features such as continuous gain control and transducer element pitch-matching. These advanced integrated features had a direct positive impact on the imaging quality. For instance, in [99], the continuous gain control resulted in a clear image without saturation or blurring; in [108], the pitch-matched IC helped to improve imaging quality by greatly reducing side lobes. Given that it is very challenging to design high-performance ultrasound imaging ICs, an interesting problem to explore is whether it is possible to relax the burden of IC design and compensate with improved signal processing algorithms.
There are also new ultrasound applications being discovered. A prominent example is the discovery of ultrasound neuromodulation [161], [162], which opens the possibility for the use of ultrasound in more therapeutic applications. For decades, imaging has dominated the medical ultrasound research arena, with therapy being the undercurrent. However, this situation could change. Neuromodulation is playing a growing role in society for ameliorating diseases [163], and ultrasound neuromodulation is a valuable addition to conventional electromagnetic neuromodulation methods. The design of ICs to target ultrasound neuromodulation remains to be explored further. Closely related to the topic of neuromodulation is the use of ultrasound as a method of wirelessly powering biomedical implants [164]-[166]. This is an active field of research and should be explored in tandem with IC designs for ultrasound neuromodulation.

IX. CONCLUSION
This paper has described the design of ultrasound ICs. To the best of the authors' knowledge, this is the first comprehensive review of IC design for medical ultrasound and beyond. In this paper, a brief overview of the history and present situation of medical ultrasound research has been presented. Next, the basics of ultrasound and transducer modeling have been examined to provide the reader with the necessary foundation. The bulk of this review paper centers on IC implementations for the ultrasound transducer driving circuit, receiver circuit, beamformer and ADC. A significant number of the ultrasound circuits reviewed are part of complete ultrasound systems such as in intracardiac and transesophageal echocardiography probes. Several recommendations have been provided for future work.