A Broadband 300 GHz Power Amplifier in a 130 nm SiGe BiCMOS Technology for Communication Applications

Abstract: A broadband three-stage pseudo-differential SiGe heterojunction bipolar transistor (HBT) power amplifier (PA) for high-speed communication at around 300 GHz is presented. The amplifier is fabricated in an experimental 130-nm SiGe BiCMOS technology with an f_t/f_max of 470/650 GHz. The use of asymmetric coupled line transformers is proposed to facilitate broadband impedance transformation with device reactance compensation at all amplifier interfaces. The amplifier achieves a maximum small-signal power gain of 23.0 dB and a P_sat/OP_1dB of up to 9.7/6.7 dBm, respectively. It shows a 3-dB bandwidth of 63 GHz (239–302 GHz) in small-signal operation and 94 GHz (223–317 GHz) when saturated. The amplifier consumes about 360 mW at a 3-V supply voltage, yielding a peak power-added efficiency (PAE) of 1.95% at 260 GHz.


I. INTRODUCTION
Future wireless links, operating in the IEEE 802.15.3d-2017 band around 300 GHz [1], require several tens of gigahertz of absolute radio frequency (RF) bandwidth to support 100+ Gb/s networks [2], [3]. Limited link budget, high receiver noise figure, and other impairments resulting from circuit operation at the speed-related technology boundary limit systems to low-order modulation schemes with low spectral efficiency [4]-[7]. In addition, gain flatness and group delay stability have growing importance next to output power, power efficiency, and linearity for ultra-broadband communication, as they are directly related to the symmetry between the upper and lower sidebands [8]. This affects the I and Q leakage, and thus ultimately limits the maximum data rate even when a high absolute bandwidth is achieved.
While advancements in SiGe heterojunction bipolar transistor (HBT) technology [9], [10] have helped to increase the operating frequencies above 200 GHz [6], [7], [11]-[20], the aforementioned IEEE standard is still not addressed by silicon solutions. Notably, achievable output powers for power amplifiers (PAs) above 200 GHz are around a few milliwatts [11], [21], [22], which has to be compensated by antenna systems with high directivity (20 dBi and more) to close the link budget in practical communication applications. In this context, previous works have demonstrated highly directive silicon-only communication links at 220-260 GHz over 1 m with data rates up to 110 Gb/s [6], [7], [11]-[14]. Especially for complementary metal-oxide-semiconductor (CMOS), the limited output power and gain have been the main challenge, which is addressed by massive power combining and transistor layout optimization [9], [23]-[25]. Despite the physical limitations, only silicon-based systems can provide communication solutions in a mass market due to the existing infrastructure and integration capability. The majority of SiGe-based PAs above 200 GHz use transmission-line impedance matching networks and inductive peaking of the cascodes [21], [26], [27]. For bandwidth maximization, different stages are typically frequency staggered, while the in-band group delay variations are not considered. In [22], a 3-dB bandwidth of 67 GHz was measured; however, the in-band group delay variations are ±14 ps, which is ±35% of the symbol duration for a 100-Gb/s 16-quadrature amplitude modulation (QAM) data stream. In this article, a high-gain broadband PA operating in the IEEE 802.15.3d-2017 band [1], first presented in [28], is analyzed in detail. The measured results are based on an additional wafer run with improved RF performance. The amplifier as shown in Fig. 1 was fabricated in a 130-nm SiGe HBT BiCMOS technology with f_t/f_max of 470/650 GHz based on a development state similar to [29].
It consists of three cascaded pseudo-differential cascode stages, each using devices with an emitter area of 6 × 0.96 × 0.12 μm² with the cascode layout introduced in Section II. To achieve high absolute bandwidths both in small-signal and large-signal operation with high group delay stability, asymmetric broadside coupled line impedance transforming sections were exploited for impedance matching purposes. The transformers allow for a broadband compensation of the cascode output reactance while providing a high load impedance with minimum insertion loss. A general analysis of these transformers, their use in silicon-PA integration around 300 GHz, and their design for the interstage and output stage is presented in Section III, resulting in an amplifier providing a small-signal gain of 23 dB with a 3-dB bandwidth of 63 GHz. The 3-dB bandwidth increases to 94 GHz when operated in saturation with a peak P_sat of 9.7 dBm as shown in Section V.

Fig. 1. Block diagram and chip micrograph of the three-stage PA. On-chip baluns are used at the input and output for probing. The biasing of the used cascode cells (CCs) is shown for one cell. The base of the common-base (CB) stage is biased through a resistive voltage divider connected to the supply voltage (Vcc), which is fed through a center tap in the interstage and output transformers. The base of the common-emitter (CE) stages is biased separately through 1-kΩ resistors connected to the base.

II. DEVICE AND INTERCONNECT MODELING
The amplifier is implemented in a development variant of the SiGe BiCMOS technology SG13G3 of Leibniz Institute for High Performance Microelectronics (IHP). The main features of the HBT technology are outlined in [29] and [30]. Fig. 2 shows a comparison of the maximum available gain (G_max) for different amplifier core topologies biased at the f_max current density with devices sized to generate 10 dBm of saturated output power at 300 GHz. With a G_max of 12 dB, the cascode configuration (without inductive peaking) provides sufficient gain headroom at 300 GHz in this technology for PA implementation, given the subsequent implementation losses.
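As a rough illustration of how a maximum available gain figure such as G_max is obtained from two-port S-parameters, the sketch below evaluates the Rollett stability factor K and the textbook maximum-available-gain expression. The S-parameter values are hypothetical placeholders and are not taken from the simulated cascode; the article also does not state how G_max is defined when K < 1, so that case is simply rejected here.

```python
import math


def rollett_k(s11, s12, s21, s22):
    """Return (K, delta) for a two-port given its (complex) S-parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, delta


def max_available_gain_db(s11, s12, s21, s22):
    """Textbook maximum available gain in dB, defined only for K >= 1."""
    k, _ = rollett_k(s11, s12, s21, s22)
    if k < 1:
        raise ValueError("K < 1: maximum available gain is not defined")
    mag = abs(s21 / s12) * (k - math.sqrt(k * k - 1))
    return 10 * math.log10(mag)


# Hypothetical S-parameters for illustration only (NOT from the article).
S11, S12, S21, S22 = 0.5, 0.1, 2.0, 0.5
K, DELTA = rollett_k(S11, S12, S21, S22)
GMAX_DB = max_available_gain_db(S11, S12, S21, S22)
print(f"K = {K:.4f}, |delta| = {abs(DELTA):.3f}, G_max = {GMAX_DB:.2f} dB")
```

The same K evaluation applies to measured S-parameters when checking unconditional stability (K > 1 together with |delta| < 1).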
For the ideal cascode with an emitter area of 6 × 0.96 × 0.12 μm² without any layout, an input impedance of 17.6 − 11.5j Ω and an output impedance of 12 − 78j Ω are extracted from the simulation at 270 GHz, which is the center design frequency (see Fig. 5). The output node presents the hardest challenge for broadband impedance matching due to the high required impedance transformation ratio in the presence of the high Q-factor of the cascode output node. Therefore, the layout implementation should minimize the introduced parasitics, as they aggravate this challenge, while allowing access to matching networks implemented on higher metal layers, leading to minimized changes in S_11 and S_22 (see Fig. 5). The equivalent output resistance and capacitance of the cascode, modeled as a parallel RC circuit, are 500 Ω/7.35 fF. Therefore, the vertical and horizontal interconnects to and between the devices are optimized for minimum input and output impedance modification. These interconnects are modeled as coupled lines as shown in Fig. 3 with their respective layout extracted values shown in Table I.

Fig. 2. Simulated G_max of typical differential amplifier topologies for identical emitter areas of 6 × 0.96 × 0.12 μm² and f_max collector current density. The cascode provides 4 dB more gain at 300 GHz compared with the CB topology, whereas the gain of the ideal CE is lower over the entire frequency range due to the increased feedback.

TABLE I LAYOUT EXTRACTED VALUES AT 200 AND 300 GHz
The connection of the base of the CB transistors to the biasing networks is modeled with equivalent common- and differential-mode inductances [26], [31]. In the layout, ground planes are established on all sides of each transistor with an additional vertical ground plane along the differential symmetry plane (see Fig. 4). Thereby, the isolation between the devices is improved and the current return path is well-defined for both common- and differential-mode excitations.

Fig. 5. (a) Resulting input S_11 and output S_22 of the cascode for the ideal cascode, the implemented optimized layout, and under large-signal excitation between 200 and 350 GHz. The respective termination impedances for the matching networks can be derived from the shown Smith chart. A reference resistance of 100 Ω is used. (b) Optimum load impedances for maximum output power of a single cascode including the layout from 220 to 300 GHz. The real part of the optimum load impedance decreases with frequency.
At 300 GHz, the vertical connections at the input and output from the lowest metal layer M1 to the second highest used metal layer TM1, modeled as coupled lines shown in Fig. 3, cause an impedance transformation along an electrical length of up to 13° (see Table I). Both connections are arranged in a vertical staircase similar to [22] and [32]. For the output of the cascode, this creates an additional parallel parasitic capacitance, increasing the initial 7.35 fF to 10 fF (see Fig. 5). At the same time, the resistance decreases to 370 Ω.
The physical distance between the collector of the CE and the emitter of the CB transistors on M1 and M2 is given by the active area of the two transistors and the intrinsic transistor layout with the base of the CB stage placed below the emitter of the CB stage (see Fig. 4). These two restrictions lead to an electrical length of 17° at 300 GHz, which in combination with the low characteristic line impedance again causes an unwanted impedance transformation. Wiring this connection on higher metal layers not only increases the electrical length but also increases the coupling between the input and the output. The impedance transformation along this path helps to increase the output resistance again to 520 Ω without affecting the capacitance.
For a ×6 device, the base width of 14 μm leads to different inductance values observed from the innermost and outermost transistor fingers. As already shown in [22], this differential inductance can cause instability when an uncompensated inductance of 7 pH is applied to the base nodes of the CB stage. Therefore, the base strip and the additional wiring parasitics are compensated using bypass capacitors on the sides and an additional distributed bypass capacitor (C_bypass) across the bases of the CB stage (see Fig. 4). The base is directly vertically connected to the bottom side of the bypass capacitor. Thereby, the parasitic differential inductance seen from the outermost finger (L_diff,outer) is kept to a maximum of 1.72 pH at 300 GHz, while the innermost base finger sees a differential inductance (L_diff,inner) of 0.3 pH, which remains constant over the observed frequency range (see Table I).
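To put the quoted base inductances into perspective, the following minimal sketch converts them into series reactances at 300 GHz using X = 2πfL. Only the inductance values and the frequency come from the text; the helper name is an illustrative choice.

```python
import math


def inductive_reactance_ohm(l_henry, f_hz):
    """Series reactance X = 2*pi*f*L of an ideal inductor, in ohms."""
    return 2 * math.pi * f_hz * l_henry


F_HZ = 300e9  # evaluation frequency quoted in the text

# Inductance values quoted in the text (in henry).
cases = {
    "uncompensated (7 pH)": 7e-12,
    "L_diff,outer (1.72 pH)": 1.72e-12,
    "L_diff,inner (0.3 pH)": 0.3e-12,
}

for label, l_henry in cases.items():
    print(f"{label}: X = {inductive_reactance_ohm(l_henry, F_HZ):.2f} ohm")
```

The uncompensated 7 pH corresponds to roughly 13 Ω of base reactance, whereas the compensated values stay in the range of a few ohms or below.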
The resulting input and output impedances of the cascode including the layout can be derived from the S-parameters shown in Fig. 5. The ideal output impedance of 12 − 78j Ω at 270 GHz is decreased by the layout parasitics to 6.5 − 58j Ω, and its real part increases again for large-signal excitation to 9.5 − 58j Ω when the generated output power is saturated. The increased output capacitance creates a challenge for compensation using transmission lines at 270 GHz, as their length (≤20 μm) cannot be modeled accurately [33]. In addition, this compensation is typically narrowband, creating an additional challenge for broadband interstage and output impedance matching networks.
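The quoted series impedances follow directly from the parallel RC model of the cascode output. The sketch below converts the parallel R/C pairs given in this section (500 Ω/7.35 fF for the ideal cascode, 520 Ω/10 fF including the layout) into series impedances at 270 GHz and also reports the node Q-factor ωRC. The function names are illustrative.

```python
import math


def parallel_rc_to_series(r_ohm, c_farad, f_hz):
    """Series-equivalent impedance of a parallel RC network at f_hz."""
    admittance = complex(1.0 / r_ohm, 2 * math.pi * f_hz * c_farad)
    return 1 / admittance


def parallel_rc_q(r_ohm, c_farad, f_hz):
    """Q-factor of a parallel RC node: Q = omega * R * C."""
    return 2 * math.pi * f_hz * r_ohm * c_farad


F_HZ = 270e9  # center design frequency

Z_IDEAL = parallel_rc_to_series(500.0, 7.35e-15, F_HZ)
Z_LAYOUT = parallel_rc_to_series(520.0, 10e-15, F_HZ)

print(f"ideal : {Z_IDEAL.real:.1f} {Z_IDEAL.imag:+.1f}j ohm, "
      f"Q = {parallel_rc_q(500.0, 7.35e-15, F_HZ):.1f}")
print(f"layout: {Z_LAYOUT.real:.1f} {Z_LAYOUT.imag:+.1f}j ohm, "
      f"Q = {parallel_rc_q(520.0, 10e-15, F_HZ):.1f}")
```

The results are approximately 12.5 − 78.2j Ω and 6.6 − 58.2j Ω, in line with the extracted values, and the Q-factor rises from about 6 to about 9 once the layout capacitance is included.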
For the device and interconnect model (see Fig. 3), load-pull simulation is used to determine the optimum load impedance for a maximum output power of 10 dBm at 300 GHz (see Fig. 5). The chosen emitter area of 6 × 0.96 × 0.12 μm² represents the minimum device size for the desired output power. Further increasing the device size adds modeling inaccuracy, especially for a longer CB base strip, and further decreases the output impedance. This leads to higher required impedance transformation ratios, especially in the final stage, which ultimately limits the bandwidth.
Asymmetric broadside coupled line transformers offer a solution for broadband impedance matching with high impedance transformation ratios, the most severe being the final transformation to the external 100 Ω. In addition, they allow for common-mode (CM) rejection and improved group delay flatness in comparison to transmission-line stub-based impedance matching networks.
In the following, these asymmetric broadside coupled line sections are introduced and presented in detail to generate broadband impedance matching networks at the input, at the output, and between the three stages.

A. Theoretical Background
As classical transformers operating below their self-resonance frequency (SRF) are inaccessible at around 300 GHz, coupled transmission lines can be effectively applied to transform the impedance between two network nodes. In particular, for purely real termination/source impedances, the required line sections are typically around a quarter-wavelength. The operation bandwidth of such a transforming section can be further traded against the tolerable in-band mismatch and group delay distortion. In the simplest case, symmetric line arrangements can be exploited with regular even-/odd-mode analysis. Here, the line coupling factor, "k," controls the impedance transformation ratio, which involves a difference between even- and odd-mode line impedances [34], [35]. The required impedance difference scales not only with the transformation ratio but also with the absolute values of the source and load impedances, similar to a regular quarter-wave line section. Some impedance transformation ratios may require very high coupling factors, physically unreachable for the commonly applied edge coupled lines.
In a seven-layer back end of line (BEOL) stack, broadside coupling between lines can be used to overcome the previously mentioned limitations of symmetric lines with additional design flexibility due to variable vertical line spacing (see Fig. 4). In this case, however, a simple even/odd symmetry plane referring to a global ground cannot be applied, and a more general modal analysis valid for an asymmetric coupled line system in inhomogeneous media should be used [36]-[38].
In this case, the line behavior can be expressed in terms of two independent (normal) modes, known as "c" and "π." The corresponding modal impedance parameters become Z_c,t, Z_π,t and Z_c,b, Z_π,b for the top and bottom strips, respectively, of a broadside coupled line section. Both modes correspond to a linear combination of voltages and currents on the two lines under certain magnitude and phase relationships with the corresponding voltage ratios (R_c/R_π) between the two strips. The propagation constants, γ_c and γ_π, for the two modes of operation are in general different in an inhomogeneous dielectric stack. It can be further shown that the following holds:
$$\frac{Z_{c,t}}{Z_{c,b}} = \frac{Z_{\pi,t}}{Z_{\pi,b}} = -R_c R_\pi$$
For a class of lossless symmetrical lines, R_c = 1 and R_π = −1, and the two modes are known as "even" and "odd." The network model of a general coupled line section can be expressed by a 4 × 4 Z matrix [36], whereby each entry in the matrix is a linear superposition of two terms, each depending on either the "c" or the "π" mode. Therefore, the complete four-port network model becomes a series connection of two subnetworks, where each of them represents a transmission-line section with the corresponding set of two transformers related to either R_c or R_π.
Assuming that the impedance transformer can be considered a suitable connection of two coupled line sections with an ac ground for differential mode of operation (see Fig. 6), the general four-port line model for each coupled line section in the transformer can be reduced to two ports assuming a short-circuit at the nodes "1, 1′" and "3, 3′." Assuming further for simplicity that the propagation constants γ_c and γ_π are equal, and each line section operates around the quarter-wavelength (sinh γ_c l = sinh γ_π l = 1 and coth γ_c l = coth γ_π l = 0), the effective transforming impedance between nodes "2, 2′" and "4, 4′" providing an ideal match at both ports can be expressed, by analogy to a regular λ/4 line section, as follows:
$$Z_{\mathrm{eff}} = \sqrt{Z_g Z_l} \qquad (1)$$
where Z_g and Z_l denote the termination impedances at ports "2, 2′" and "4, 4′", respectively, and Z_eff is determined by the line modal impedances together with R_c and R_π. It can be noted that, as opposed to the symmetric line arrangement, the impedance transformation involves not only the line modal impedances but also R_c and R_π, which provides additional degrees of freedom in the design process. The same transforming section can be applied to implement different transformation ratios provided that (1) holds. It should, however, be expected that with an increasing transformation ratio, the resulting bandwidth reduces and the in-band insertion loss increases [39].

Fig. 6. Simplified coupled line section model of the transformer for differential excitation scheme.

B. Overview of Transformer Design
All implemented coupled line distributed transformers exploit the available seven-layer BEOL stack to widely vary the requested impedance transformation ratio. In particular, the four-layer buried stack enclosed between TM1 and M3 is mostly used in the transformers. Thereby an improved matching between phase velocities of "c" and "π" modes is achieved. The bottom strip is chosen to be wider, wherever possible, due to its lower thickness compared with a 2-μm-thick TM1 to realize the required characteristic impedances. For improved line coupling, the bottom ground is removed with the line sections referring to the side ground only, which is located in close proximity to make the layout compact. Its presence is fully included in the modal analysis of the coupled line sections. Imperfect isolation of the coupled strips from the surrounding global ground is commonly considered a performance degradation factor and therefore often omitted in the design process. The side ground plane enclosing the transformer layout typically spans through the complete TM2-M1 metal stack to shield it from the neighboring components.
All relevant previously defined line modal parameters were extracted from 2-D quasi-static simulations of a general multi-conductor system with the cross sections corresponding to the most relevant parts of the transformer layout. They provided an initial design guess for the consecutive full-wave optimization of the complete structure.
All buried coupled line sections are operated in the proximity of the critical point [38], wherein the capacitively and inductively defined line coupling factors "k_c" and "k_l" are nearly equal and the resulting voltage ratios R_c and R_π are in-phase (lossless case). According to Z_c,t/Z_c,b = Z_π,t/Z_π,b = −R_c R_π, this implies that some modal impedances become negative, which is not a contradiction but stands in contrast to the most common even/odd analysis with the voltage ratios out of phase. The negative −R_c R_π product indicates that the normal modes cannot be excited separately from each other by any combination of voltage sources and terminating impedances, and both modes effectively co-exist in the coupled line section [40]. It can be further shown that the effective coupling factor between the strips may differ from "k_c" and "k_l" to account for a relative difference in the velocity between the two normal modes [38]. For the chosen buried stack, the asymmetry in phase velocity is below 6% for all line implementations.
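The sign argument above can be made concrete with a small sketch that applies the relation Z_c,t/Z_c,b = Z_π,t/Z_π,b = −R_c R_π to recover the two remaining modal impedances from two known ones. All numerical values below are hypothetical and chosen only so that R_c and R_π are in-phase; they are not the extracted parameters of any transformer in this work.

```python
def modal_ratio(r_c, r_pi):
    """Common ratio Z_c,t/Z_c,b = Z_pi,t/Z_pi,b = -R_c * R_pi."""
    return -r_c * r_pi


def bottom_from_top(z_top, r_c, r_pi):
    """Bottom-strip modal impedance from the top-strip one (same mode)."""
    return z_top / modal_ratio(r_c, r_pi)


def top_from_bottom(z_bottom, r_c, r_pi):
    """Top-strip modal impedance from the bottom-strip one (same mode)."""
    return z_bottom * modal_ratio(r_c, r_pi)


# Hypothetical in-phase voltage ratios and two positive modal impedances.
R_C, R_PI = 0.8, 0.5
Z_C_T = 40.0    # ohm, assumed known
Z_PI_B = 30.0   # ohm, assumed known

Z_C_B = bottom_from_top(Z_C_T, R_C, R_PI)
Z_PI_T = top_from_bottom(Z_PI_B, R_C, R_PI)
print(f"ratio = {modal_ratio(R_C, R_PI):.2f}")
print(f"Z_c,b = {Z_C_B:.1f} ohm, Z_pi,t = {Z_PI_T:.1f} ohm")
```

With both voltage ratios positive the common ratio is negative, so each mode has one positive and one negative modal impedance; with R_c = 1 and R_π = −1 the ratio becomes +1 and the familiar even/odd case is recovered.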

C. Transformer for the Input Stage Matching
The ×6 device used for the cascode layout in Section II was used in all stages. While smaller devices provide a higher input impedance and therefore would reduce the necessary impedance transformation ratio of the input transformer, they would further increase the challenge for the interstage matching due to the increased impedance transformation ratio. According to the post-layout simulated input impedance of the cascode stage, shown in Fig. 5, the transforming coupled line section should provide broadband 100-to-20-Ω impedance conversion for differential signaling. The corresponding 3-D simulation model of the optimized transformer layout is shown in Fig. 7 with all the major dimensional parameters indicated. The coupled lines are located on TM1 and M4 with a 1.88-μm vertical spacing. The complete layout is subdivided into three line subsections, named from "1" to "3," each one implementing a slightly different transformation ratio to optimize the operation bandwidth in the presence of multiple reactive discontinuities resulting from the full wiring system. A simplified equivalent circuit model corresponding to half of the layout is included in Fig. 8.
The most relevant dimensions for each section in the model from Fig. 8 with the corresponding modal parameters extracted from 2-D simulations are shown in Table II. The parameters are, in general, complex numbers due to the presence of ohmic and dielectric losses, and only their magnitudes are given for simplicity. Only two selected modal impedances are given, which are positive. The other two are negative and can be calculated from Z_c,t/Z_c,b = Z_π,t/Z_π,b = −R_c R_π. Z_eff stands for the effective impedance of the transforming section, as defined in (1). The line coupling factor, "k," is further given for completeness and is calculated as an average of the capacitive and inductive factors. The effective electrical length of each line section at 300 GHz is calculated along the corresponding outer perimeter. The overall transformer length, as calculated from 2-D simulations, deviates considerably from 90°, as it does not account for the presence of reactive discontinuities along the tightly coupled closed line loop. Z_0,source and Z_0,load are the characteristic impedances of the differential lines (120 Ω) connecting the transformer to the external feed lines (TM1) on both sides. They contribute an additional phase shift of around 5° at 300 GHz. To minimize CM radiation into the substrate, additional vertical and horizontal ground strips on M1 were added to reduce the aperture size of the ground opening, which at the same time slightly reduced the operational bandwidth.
The input transformer terminated with purely real terminations of 100 and 20 Ω has a simulated −10-dB S_11 bandwidth of around 250-325 GHz with an in-band insertion loss of 1.2 dB.
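For orientation only, the sketch below evaluates the regular λ/4 analogy invoked in the theoretical background for the 100-to-20-Ω case: a single ideal, lossless quarter-wave line with characteristic impedance sqrt(Z_g Z_l). This is an assumption-laden stand-in, not the implemented three-section coupled-line transformer; the 300-GHz center frequency and the function names are illustrative choices.

```python
import math


def quarter_wave_impedance(z_g, z_l):
    """Characteristic impedance of an ideal lambda/4 matching line."""
    return math.sqrt(z_g * z_l)


def input_reflection(z0, z_load, z_ref, f_hz, f0_hz):
    """Reflection coefficient looking into an ideal lossless line that is
    a quarter-wavelength long at f0_hz and terminated in z_load."""
    theta = (math.pi / 2) * f_hz / f0_hz  # electrical length in radians
    c, s = math.cos(theta), math.sin(theta)
    z_in = z0 * (z_load * c + 1j * z0 * s) / (z0 * c + 1j * z_load * s)
    return (z_in - z_ref) / (z_in + z_ref)


Z_G, Z_L = 100.0, 20.0   # terminations quoted in the text
F0_HZ = 300e9            # assumed center frequency for this sketch
Z_EFF = quarter_wave_impedance(Z_G, Z_L)

for f_ghz in (250, 275, 300, 325):
    gamma = input_reflection(Z_EFF, Z_L, Z_G, f_ghz * 1e9, F0_HZ)
    # Clamp the magnitude so the perfect match at f0 does not produce log10(0).
    s11_db = 20 * math.log10(max(abs(gamma), 1e-12))
    print(f"{f_ghz} GHz: |S11| = {s11_db:.1f} dB")
```

The ideal line requires an effective impedance of about 44.7 Ω and is perfectly matched only at the center frequency; the actual transformer must additionally absorb reactive discontinuities and losses, which this sketch ignores.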

D. Inter/Output Stage Matching Transformer
The primary design goal for the interstage transformer is to provide a broadband impedance conversion from around 20 Ω at the input of the consecutive stage to sufficiently high values close to Z_opt at the output of the preceding cascode stage in the presence of a large capacitive load of the ×6 devices (high Q-factor). The design technique of the input stage transformer cannot be applied, as the effective impedance required here runs into layout feasibility limitations at 300 GHz. Therefore, to reach the higher impedance ratio needed for interstage impedance matching with an additionally improved operation bandwidth, an alternative approach was applied. A 3-D simulation model with the corresponding simplified equivalent circuit model for the impedance transformer between the first two amplification stages is shown in Fig. 9.
Here, the required impedance transformation is achieved in multiple steps. First, the high-impedance coupled-line system translates the elongated arch-like trajectory at the cascode output, defined by the capacitive load of the ×6 devices (see Fig. 5), into a small tear-drop-like loop in the inductive part of the Smith chart. The final location along the real axis of the Smith chart is achieved with an appropriate length of a 100-Ω differential line (Z_feed) in combination with a series metal-insulator-metal (MIM) capacitor, C_ser. The main part of the coupled line network is divided into two sub-quarter-wavelength long sections, "2" and "3," each one again implementing a different transformation ratio to improve the operation bandwidth. The values of the effective transforming impedance, Z_eff, with the related strip dimensions, impact both the transformer insertion loss and bandwidth. For the implementation of the required transformation ratio, a zero-length feed line, Z_feed, is needed, and a series capacitor, C_ser, of 15 fF is placed directly between layers TM1 and M5 at the transformer output port. The transformer is tuned to the middle of the amplifier frequency operation band.
A similar topology was applied to the interstage transformer between the second and third stages as well as to the output stage, with the center frequency tuned differently (see Fig. 10). The output stage presents the most challenging transformation scenario to a high-impedance load of 100 Ω. In the second interstage and the output transformer, the length of the feed line, Z_feed, is increased to 17 and 34 μm, and C_ser is set to 15 and 7 fF, respectively.
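The effect of the series MIM capacitor can be gauged from its reactance, X = −1/(2πfC). The sketch below evaluates the two C_ser values mentioned for the transformers at the 270-GHz center design frequency; treating the capacitor as ideal and evaluating at this single frequency are simplifying assumptions.

```python
import math


def capacitive_reactance_ohm(c_farad, f_hz):
    """Series reactance X = -1 / (2*pi*f*C) of an ideal capacitor, in ohms."""
    return -1.0 / (2 * math.pi * f_hz * c_farad)


F_HZ = 270e9  # center design frequency used elsewhere in the text

X_15FF = capacitive_reactance_ohm(15e-15, F_HZ)
X_7FF = capacitive_reactance_ohm(7e-15, F_HZ)
print(f"C_ser = 15 fF: X = {X_15FF:.1f} ohm")
print(f"C_ser =  7 fF: X = {X_7FF:.1f} ohm")
```

A 15-fF capacitor contributes about −39 Ω of series reactance and a 7-fF capacitor about −84 Ω, which indicates how much inductive reactance at the transformer output each value can cancel when moving the impedance onto the real axis.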
By analogy to the input stage, all the major dimensions for each coupled section of the two interstage and output stage transformers with the corresponding modal parameters de-embedded from the 2-D simulations are given in Tables III-V.

Fig. 10. Simulated small-signal response of the individual stages and the total amplifier without balun-integrated pads. As the input stage is tuned to the middle/top end of the band, the output stage compensates for this, resulting in a flat gain response. The input and output show a broadband match over the entire frequency range. The increased output impedance for large-signal operation alleviates impedance matching of the output transformer, causing a better match to the external 100-Ω load. S_11 did not change with increasing input power and therefore was left out of this figure.

E. Transformer-Integrated Common-Mode Rejection Network
For CM excitation, the ac ground, as shown previously in Fig. 6, established in the vertical transformer center is replaced with a magnetic wall symmetry. This results in open-end terminations along 1 − 1′ and 3 − 3′, which are further transformed into equivalent short-circuit connections for 2 − 2′ and 4 − 4′, assuming a perfect quarter-wave long coupled line section. This, however, does not necessarily apply to the surrounding close-proximity global ground and results in partial leakage between the input and output ports. By placing an MIM capacitor in the layout center of the transformer, this CM leakage is largely eliminated.
The capacitor is shunted on one side to the global ground provided by the center bottom strip on M1, whereas the other plate can be divided into two separate plates and connected with the respective center taps on both sides to provide two separate biases for the neighboring stages. This way, a global ac short on both transformer ports is provided, not only between the strips but also with respect to the surrounding ground. This is, however, still not optimal from the CM rejection point of view. If only one center tap is connected with the decoupling capacitor, a global short-circuit established on one side is transformed into a global open-circuit on the other port, providing a broadband CM suppression across multiple hundred gigahertz. Due to the high Q-factor at the cascode outputs, they were dc-biased from the center tap feed point, whereas the input ports of all the amplification stages were dc-supplied from the separate networks shown in Fig. 1. Given the limited transformer inner dimensions, the size of the required decoupling capacitor could not be fully accommodated in the layout center, and, therefore, two additional side-located capacitors were further implemented, as shown in Fig. 9. The initial CM rejection was increased from 4 to 30 dB in the entire frequency band, resulting in a simulated CM gain of −38 dB for the three-stage amplifier.

Fig. 11. Ideal load impedances and the respective load impedances generated by each transformer. The generated load impedances increase with frequency while the optimum load impedance decreases. This contrary behavior leads to broadband power generation. The contour curves mark the impedance area for which an output power of 10 dBm can be generated.
With respect to load-pull results, it can be seen that each transformer (see Fig. 11) provides impedances close to the Z opt impedance at each stage.

IV. INPUT-OUTPUT BALUN
To facilitate broadband measurements of the PA for both small- and large-signal operations with minimum de-embedding effort, a broadband Marchand-type balun with compensation of the pad's low-pass characteristic was applied at the amplifier input and output ports. Similar to the transformer sections, the balun takes advantage of the asymmetric coupled lines and transforms the reactive impedance provided by the pad to a differential 100 Ω along multiple sections.
The balun was measured in a back-to-back configuration for the consecutive second-tier de-embedding procedure with a maximum insertion loss of 1.7 dB and an input match better than −15 dB in the entire 220-320-GHz band.

V. MEASUREMENT
Fig. 1 shows the micrograph of the amplifier. The total chip area for this circuit is 0.26 mm², while the amplifier, excluding the baluns and signal pads, covers an area of 0.07 mm². The amplifier was characterized for both small-signal and large-signal operations, using frequency extenders in combination with a vector network analyzer (VNA) for S-parameter measurements (see Fig. 12) and high-power sources and a PM4 power meter for large-signal measurements.
The presented amplifier is based on an experimental wafer run, which is reflected in the increased overall S_21 (see Fig. 13) in comparison to [28], leading to a maximum S_21 of 20.1 dB at 254 GHz with a 3-dB bandwidth of 63 GHz, spanning from 239 to 302 GHz for a supply voltage of 3 V. Both S_11 and S_22 show a broadband match.
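The bandwidth figures can be restated as fractional bandwidths with a few lines of arithmetic. The sketch below uses the band edges reported for small-signal operation (239-302 GHz) and for saturated operation (223-317 GHz); referring the bandwidth to the arithmetic band center is an assumption of this sketch, not a definition given in the article.

```python
def bandwidth_summary(f_low_ghz, f_high_ghz):
    """Return (absolute bandwidth, arithmetic center, fractional bandwidth)."""
    bandwidth = f_high_ghz - f_low_ghz
    center = 0.5 * (f_low_ghz + f_high_ghz)
    return bandwidth, center, bandwidth / center


BW_SS, FC_SS, FBW_SS = bandwidth_summary(239.0, 302.0)     # small signal
BW_SAT, FC_SAT, FBW_SAT = bandwidth_summary(223.0, 317.0)  # saturated

print(f"small signal: {BW_SS:.0f} GHz around {FC_SS:.1f} GHz ({100 * FBW_SS:.1f}%)")
print(f"saturated   : {BW_SAT:.0f} GHz around {FC_SAT:.1f} GHz ({100 * FBW_SAT:.1f}%)")
```

Both bands are centered close to the 270-GHz design frequency, with fractional bandwidths of roughly 23% and 35%, respectively.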
Changing the supply voltage in either direction decreases the small-signal gain, as the cascode biasing depends on the supply voltage (see Fig. 1). At the same time, the input and output match remains unaffected. The measured k-factor also shows that the amplifier remains unconditionally stable in the measured frequency range.

Fig. 12. Measurement setup for both small-signal and large-signal measurements. The small-signal calibration was done using thru-reflect-line (TRL) standards. For large signal, a two-tier calibration was done including the subtraction of the waveguide losses.

Fig. 14. Measured group delay of the amplifier. Over the small-signal bandwidth (239-302 GHz), the group delay stays within ±4 ps. Similar values were measured for varying supply voltages.

Fig. 16. Saturated output power of the amplifier in the entire frequency band with and without balun losses. A peak P_sat of 7.3 dBm is reached at 259 GHz with a saturated-output-power 3-dB bandwidth of 94 GHz (223-317 GHz) for a supply voltage of 3 V. This output power is further increased to 8.2 dBm when 4 V is applied. PAE and OP_1dB remain mostly constant over frequency and biasing, while PAE slightly worsens for 4 V. When the losses of the pad and balun are de-embedded, a peak output power of 9.7 dBm is reached.
For bit rates of 100 Gb/s and more, the amplifier's in-band group delay variations have to be low to avoid intersymbol interference. As shown in Fig. 14, the measured group delay stays within ±4 ps in the entire frequency band, which is ±10% of the symbol duration in a 100-Gb/s 16-QAM data stream and shows a very high correlation to the simulation.
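The percentages quoted for the group delay variation follow from the symbol duration of the data stream. The short sketch below reproduces them for a 100-Gb/s 16-QAM signal, for both the ±4 ps measured here and the ±14 ps reported for [22]; it assumes an uncoded stream in which each symbol carries log2(16) = 4 bits.

```python
import math


def symbol_duration_ps(bit_rate_bps, constellation_size):
    """Symbol duration in picoseconds for an uncoded M-ary modulation."""
    bits_per_symbol = math.log2(constellation_size)
    symbol_rate_baud = bit_rate_bps / bits_per_symbol
    return 1e12 / symbol_rate_baud


T_SYMBOL_PS = symbol_duration_ps(100e9, 16)  # 100 Gb/s, 16-QAM

for label, delay_ps in (("this work", 4.0), ("reference [22]", 14.0)):
    share = 100 * delay_ps / T_SYMBOL_PS
    print(f"{label}: +/-{delay_ps:.0f} ps = +/-{share:.0f}% of {T_SYMBOL_PS:.0f} ps")
```

The symbol rate is 25 GBd, so one symbol lasts 40 ps; ±4 ps and ±14 ps then correspond to ±10% and ±35% of the symbol duration.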
For large-signal measurements, the input power was adjusted with a pre-calibrated mechanical attenuator (see Fig. 12). The loss of the probes and additional s-bend waveguides was subtracted from the measurement.
The thereby measured output power (P out ), large-signal transducer gain (G T ), and PAE at 270 GHz are shown in Fig. 15. G T aligns with the measured small-signal gain (see Fig. 13), while an output-referred P 1 dB compression Over frequency, these performance metrics are also shown in Fig. 16 with a maximum OP 1 dB of 5.2 dBm and maximum PAE of 1.38% at 260 GHz.
The bandwidth is extended from 63 to 94 GHz (223-317 GHz) for saturated output power. The maximum output power of 7.3 dBm is reached at 259 GHz. The amplifier consumes about 360 mW for a supply voltage of 3 V and a current density of 27.49 mA/μm², close to the f_t/f_max current density (26 mA/μm²). By de-embedding the losses of the broadband balun, the power delivered to an external 100-Ω load is estimated as shown in Fig. 16, and the de-embedded small-signal S_21 is 23 dB at its peak. A maximum de-embedded output power of 8.7/9.7 dBm for 3- and 4-V supply voltage, respectively, is achieved at 259 GHz.
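A few back-of-the-envelope checks tie the numbers in this paragraph together. The sketch below derives the balun loss implied by the measured and de-embedded figures, estimates the per-device collector current density from the dc power, and computes the ratio P_out/P_dc as an upper bound on PAE. It assumes that the whole supply current is shared equally by six cascode branches (three pseudo-differential stages); this ignores the bias dividers, so the current density is slightly overestimated.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


# Values quoted in the text for a 3-V supply.
S21_MEASURED_DB, S21_DEEMBEDDED_DB = 20.1, 23.0
PSAT_MEASURED_DBM, PSAT_DEEMBEDDED_DBM = 7.3, 8.7
P_DC_MW, V_SUPPLY_V = 360.0, 3.0
EMITTER_AREA_UM2 = 6 * 0.96 * 0.12
N_BRANCHES = 6  # assumption: 3 stages x 2 cascode branches, bias current ignored

two_balun_loss_db = S21_DEEMBEDDED_DB - S21_MEASURED_DB        # input + output
output_balun_loss_db = PSAT_DEEMBEDDED_DBM - PSAT_MEASURED_DBM  # output only

supply_current_ma = P_DC_MW / V_SUPPLY_V
current_density_ma_um2 = supply_current_ma / N_BRANCHES / EMITTER_AREA_UM2

# P_out / P_dc bounds the PAE from above because the input power is not subtracted.
efficiency_bound_pct = 100 * dbm_to_mw(PSAT_DEEMBEDDED_DBM) / P_DC_MW

print(f"implied loss of both baluns : {two_balun_loss_db:.1f} dB")
print(f"implied output balun loss   : {output_balun_loss_db:.1f} dB")
print(f"supply current              : {supply_current_ma:.0f} mA")
print(f"current density (estimate)  : {current_density_ma_um2:.1f} mA/um^2")
print(f"P_out/P_dc upper bound      : {efficiency_bound_pct:.2f} %")
```

The implied loss of about 1.4 to 1.5 dB per balun is in line with the measured maximum balun insertion loss of 1.7 dB, the estimated current density of roughly 29 per unit emitter area is numerically close to the quoted 27.49 when expressed in mA/μm², and the 2.06% bound lies just above the reported de-embedded peak PAE of 1.95%.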

VI. CONCLUSION
The presented amplifier shows high gain over a large bandwidth, covering most of the frequency band of the IEEE 802.15.3d-2017 standard [1] while generating high output power levels in a small circuit size (see Table VI). This bandwidth is further extended in large-signal operation, resulting in a 3-dB bandwidth of 94 GHz covering most of the J-band (see Fig. 16). At the same time, OP 1 dB and PAE stay mostly constant over the entire frequency band. This in combination with the generated group delay (see Fig. 14) makes this amplifier a promising key component in broadband communication systems. The advances in the transistor technology here enable a further increase in the operating frequency closer to 300 GHz, while the implemented impedance matching networks allow for a broadband output power generation.