A Cost and Power Feasibility Analysis of Quantum Annealing for NextG Cellular Wireless Networks

In order to meet mobile cellular users' ever-increasing data demands, today's 4G and 5G wireless networks are designed mainly with the goal of maximizing spectral efficiency. While they have made progress in this regard, controlling the carbon footprint and operational costs of such networks remains a long-standing problem among network designers. This article takes a long view on this problem, envisioning a NextG scenario where the network leverages quantum annealing for cellular baseband processing. We gather and synthesize insights on power consumption, computational throughput and latency, spectral efficiency, operational cost, and feasibility timelines surrounding quantum annealing technology. Armed with these data, we project the quantitative performance targets future quantum annealing hardware must meet in order to provide a computational and power advantage over complementary metal–oxide semiconductor (CMOS) hardware, while matching its whole-network spectral efficiency. Our quantitative analysis predicts that, with 82.32 $\mu$s problem latency and 2.68 M qubits, quantum annealing will achieve a spectral efficiency equal to CMOS while reducing power consumption by 41 kW (45% lower) in a large MIMO base station with 400-MHz bandwidth and 64 antennas, and a 160-kW power reduction (55% lower) using 8.04 M qubits in a centralized radio access network setting with three large MIMO base stations.


Introduction
Today's 4G and 5G Cellular Radio Access Networks (RANs) are experiencing unprecedented growth in traffic at base stations (BSs) due to increased subscriber numbers and their higher quality of service requirements [17,64]. To meet the resulting demand, techniques such as Massive Multiple-Input Multiple-Output (MIMO) communication, cell densification, and millimeter-wave communication are expected to be deployed in fifth-generation (5G) cellular standards [34]. But this in turn significantly increases the power and cost required to operate RAN sites backed by complementary metal oxide semiconductor (CMOS)-based computation. While research and industry efforts have provided general solutions (e.g., sleep mode [56] and network planning [82]) to increase energy efficiency and decrease power consumption of RANs, the fundamental challenge of power requirements scaling with the exponentially increasing computational requirements of the RAN persists. Previously (ca. 2010), this problem had not limited innovation in the design of wireless networks, due to a rapid pace of improvement in CMOS's computational efficiency. Today, however, such developments are not maintaining the pace they had in past years, due to transistors approaching atomic limits [18] and the end of Moore's Law (expected ca. 2025-2030 [44,50,68]). This calls into question the prospects of CMOS to achieve NextG cellular targets in terms of both energy and spectral efficiency.
This work investigates a radically different baseband processing architecture for RANs, one based on quantum computation, to see whether this emerging technology can offer cost and power benefits over CMOS computation in wireless networks. We seek to quantitatively analyze whether in the coming years and decades, mobile operators might rationally invest in the RAN's capital (CapEx) by purchasing quantum hardware of high cost, in a bid to lower its operational expenditure (OpEx) and hence the Total Cost of Ownership (TCO = CapEx + OpEx). The OpEx cost reduction would result from the reduced power consumption of the RAN, due to higher computational efficiency of quantum processing over CMOS processing for certain heavyweight baseband processing tasks. Figure 1 depicts this envisioned scenario, where quantum processing units (QPUs) co-exist with traditional CMOS processing at Centralized RAN (C-RAN) Baseband Units (BBUs) [1,15]. QPUs will then be used for the BBU's heavy baseband processing, whereas CMOS will handle the network's lightweight control plane processing (e.g., resource allocation, communication control interface), transfer systems (e.g., enhanced common public radio interface, mobility management entity), and further lightweight tasks such as pre- and post-processing QPU-specific computation.
Fig. 2: Projected year-by-year timeline of QA-based RAN processing. Data points (▲) in the hatched area (2011-2020) represent the historical QA qubit counts. The 2023 data point (★) with 7,440 qubits corresponds to a next-generation QA processor roadmap [24,26]. The blue filled (dark shade) area is the projected QA qubit count, whose upper/lower bounds are extrapolations of the best-case (2017-2020) and the worst-case (2020-2023) qubit growths respectively. Annotations corresponding to further data points (■) show the base station (BS) scenarios their respective qubit counts will enable (see §6). The figure shows that if future QA qubit count scales along this best-case trend, starting from the year 2036, QA may be applicable to practical wireless systems with power/cost benefits over CMOS hardware (see §6).
This paper presents the first extensive analysis on power consumption and quantum annealing (QA) architecture to make the case for the future feasibility of quantum processing based RANs. While recent successful point-solutions that apply QA to a variety of wireless network applications [8,13,14,19,42,47,48,51,55,77,78] serve as our motivation, previous work stops short of a holistic power and cost comparison between QA and CMOS. Despite QA's benefits demonstrated by these prior works in their respective point settings, an account of how these results will factor into the overall computational performance and power requirements of the base station and C-RAN remains lacking. Therefore, here we investigate these issues head-on, to make an end-to-end case that QA will likely offer benefits over CMOS for handling BBU processing, and to make time predictions on when this benefit will be realized. Specifically, we present informed answers to the following questions:
In order to realize the architecture of Figure 1, several key system performance metrics need to be analyzed, quantified, and evaluated, most notably the computational throughput and latency (§3), the power consumption of the entire system and resulting spectral efficiency (bits per second per Hertz of frequency spectrum), and operational cost (§5). Our approach is to first describe the factors that influence processing latency and throughput on current QA devices and then, by assessing recent developments in the area, project what computational throughput and latency future QA devices will achieve (§3). We analyze cost by evaluating the power consumption of QA and CMOS-based processing at equal spectral efficiency targets (§5). Our analysis reveals that a three-way interplay between latency, power consumption, and the number of qubits available in the QA hardware determines whether QA can offer a benefit over CMOS. In particular, latency influences spectral efficiency, power consumption influences energy efficiency, and the number of qubits influences both. Based on these insights, we determine properties (i.e., latency, power consumption, and qubit count) that QA hardware must meet in order to provide an advantage over CMOS in terms of energy, cost, and spectral efficiency in wireless networks.
Table 1 summarizes our results, showing that for 200 and 400 MHz bandwidths, respectively, with 1.54 and 3.08M qubits, we predict that QA processing will achieve spectral efficiency equal to today's 14 nm CMOS processing, while reducing power consumption by 8 kW (16% lower) and 41 kW (45% lower) in representative 5G base station scenarios. In a C-RAN setting with three base stations of 200 and 400 MHz bandwidths, QA processing with 4.62M and 9.24M qubits, respectively, reduces power consumption by 70 kW (41% lower) and 160 kW (55% lower). Our further evaluations compare QA against a future 1.5 nm CMOS process, which is expected to be the silicon technology at the end of Moore's law scaling (ca. 2030 [44]). In a base station scenario with 400 MHz bandwidth and 128 antennas, QA with 6.2M qubits will reduce power consumption by 30.4 kW (37% lower) in comparison to 1.5 nm CMOS, while achieving equal spectral efficiency to CMOS.
Figure 2 reports our projected QA feasibility timeline, describing year-by-year milestones on the application of QA to wireless networks. Our analysis shows that with custom QA hardware (cf. §2) and qubits growing 2.65× every three years (the 2017-2020 trendline), QA application in practical RAN settings with potential power/cost benefit is a predicted 15 years (ca. 2036) away, whereas the feasibility in processing for a base station (BS) with 10 MHz bandwidth and 32 antennas is a predicted five years away (ca. 2026) (cf. §6).
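The best-case growth trend quoted above can be turned into a simple extrapolation. The following is a minimal sketch, not the authors' projection method: it assumes exponential growth of 2.65× every three years and, as a starting point chosen here for illustration, a 2020 device with 5,436 qubits (the current-device qubit count quoted in §3.1). The function names are hypothetical.

```python
import math

def projected_qubits(year, base_year=2020, base_qubits=5436,
                     growth=2.65, period_years=3.0):
    """Qubit count in `year`, assuming `growth`x growth every `period_years`.

    The 2020 / 5,436-qubit starting point is an assumption of this sketch.
    """
    return base_qubits * growth ** ((year - base_year) / period_years)

def year_reaching(target_qubits, base_year=2020, base_qubits=5436,
                  growth=2.65, period_years=3.0):
    """Fractional year in which the projection first reaches `target_qubits`."""
    periods = math.log(target_qubits / base_qubits) / math.log(growth)
    return base_year + period_years * periods

# Year in which the assumed trend reaches one million qubits.
million_qubit_year = year_reaching(1_000_000)
```

Under these assumptions the one-million-qubit mark falls at about 2036, in line with the timeline stated above.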
Overall, our quantitative results predict that QA hardware will offer power benefits over CMOS hardware in certain wireless network scenarios, once QA hardware scales to at least a million qubits (cf. §6) and reduces its problem processing time to hundreds of microseconds, which we argue is feasible within our projected timelines. Scaling QA processors to millions of qubits will pose challenges related to engineering, control, and operation of hardware resources, which designers continue to investigate [11,12]. Recent further work demonstrates large-scale qubit control techniques, showing that control of million-qubit-scale quantum hardware is already a realistic prospect [74].
Roadmap. In the remainder of this paper, Section 2 describes background and assumptions, Section 3 analyzes QA hardware architecture and its end-to-end processing latency, and Section 4 describes power modeling in RANs and cellular computational targets. We will then be in a position to present our CMOS versus QA power comparison methodology and results in Section 5. We conclude by discussing a projected feasibility timeline of QA-based RANs in Section 6.

Background and Assumptions
While classical computation uses bits to process information, quantum computation uses qubits, physical devices that allow superposition of bits simultaneously [23]. The current technology landscape consists broadly of fault-tolerant approaches to quantum computing versus noisy intermediate scale quantum (NISQ) implementations. Fault-tolerant quantum computing [65,69] is an ideal scenario that is still far off in the future, whereas NISQ computing [66], which is available today, suffers high machine noise levels, but gives us an insight into what future fault-tolerant methods will be capable of in terms of key quantum effects such as qubit entanglement and tunneling [66]. NISQ processors can be classified into digital gate model or analog annealing (QA) architectures.
Gate-model devices [54] are fully general purpose computers, using programmable logic gates acting on qubits [81], whereas annealing-model devices [23], inspired by the Adiabatic Theorem of quantum mechanics, offer a means to search an optimization problem for its lowest ground state energy configurations in a high-dimensional energy landscape [10]. While gate-model quantum devices of size relevant to practical applications are not yet generally available [41], today's QA devices with about 5,000 qubits enable us to commence empirical studies at realistic scales [23]. Therefore we conduct this study from the perspective of annealing-model devices.

Quantum Annealer Hardware
Quantum Annealing (QA) is an optimization-based approach that aims to find the lowest energy spin configuration (i.e., solution) of an Ising model (defined in §2.2) described by the time-dependent energy functional (Hamiltonian):

$$H(s) = -\Gamma(s)\, H_I + L(s)\, H_P$$

where $H_I$ is the initial Hamiltonian, $H_P$ is the (input) problem Hamiltonian, $s \in [0, 1]$ is a non-decreasing function of time called an annealing schedule, and $\Gamma(s)$ and $L(s)$ are energy scaling functions of the transverse and longitudinal fields in the annealer, respectively. Essentially, $\Gamma(s)$ determines the probability of tunneling during the annealing process, and $L(s)$ determines the probability of finding the ground state of the input problem Hamiltonian $H_P$ [23]. The QA hardware is a network of locally interacting radio-frequency superconducting qubits, organized in groups of unit cells. Fig. 3 shows the unit cell structures of recent (Chimera) and state-of-the-art (Pegasus) QA devices. The nodes and edges in the figure are qubits and couplers respectively (detailed below) [47].

Quantum Annealing Algorithm
The process of optimizing a problem in the QA is called annealing. Starting with a high transverse field (i.e., $\Gamma(0) \gg L(0) \approx 0$), QA initializes the qubits in a pre-known ground state of the initial Hamiltonian $H_I$, then gradually interpolates this Hamiltonian over time (decreasing $\Gamma(s)$ and increasing $L(s)$) by adiabatically introducing quantum fluctuations in a low-temperature environment, until the transverse field diminishes (i.e., $L(1) \gg \Gamma(1) \approx 0$). This time-dependent interpolation of the Hamiltonian is essentially the annealing algorithm. The Adiabatic Theorem then ensures that by interpolating the Hamiltonian slowly enough$^2$, the system remains in the ground state of the interpolating Hamiltonian [7]. Thus during the annealing process, the system ideally stays in the local minima and probabilistically reaches the global minima of the problem Hamiltonian $H_P$ at its conclusion [23].
The initial Hamiltonian takes the form $H_I = \sum_i \sigma_i^x$, where $\sigma_i^x$ is the result of the Pauli-X matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ acting on the $i$-th qubit. Thus, the initial state of the system is the ground state of this $H_I$, where each qubit is in an equal superposition state $\frac{1}{\sqrt{2}}\left(|{-1}\rangle + |{+1}\rangle\right)$. The problem Hamiltonian is described by $H_P = \sum_i h_i \sigma_i^z + \sum_{i<j} J_{ij} \sigma_i^z \sigma_j^z$, where $\sigma_i^z$ is the result of the Pauli-Z matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ acting on the $i$-th qubit, and $h_i$ and $J_{ij}$ are the optimization problem inputs that the user supplies [4,23].
Input Problem Forms. QAs optimize Ising model problems, whose problem format matches the above problem Hamiltonian: $E = \sum_i h_i q_i + \sum_{i<j} J_{ij} q_i q_j$, where $E$ is the energy of the candidate solution, $q_i$ is the $i$-th solution variable, which can take on values in $\{-1, +1\}$, and $h_i$ and $J_{ij}$ are called the bias of $q_i$ and the coupling strength between $q_i$ and $q_j$, respectively. Biases represent individual preferences of qubits to take on a particular classical value ($-1$ or $+1$), whereas coupling strengths represent pairwise preferences (i.e., two particular qubits should take on same/opposite values), in the solution the machine outputs. Biases and coupling strengths are specified to qubits and couplers, respectively, using a programmable on-chip control circuitry [46,52]. The QA returns the solution variable configuration with the minimum energy $E$ at its output [47].
$^2$ If the adiabatic evolution is infinitely slow, then the annealing algorithm is guaranteed to find the global minima of $H_P$ [70].
Fig. 4: The figure shows the embedding process of Eq. 2, where the logical variable $q_3$ in (a) is mapped onto two physical qubits $q_{3a}$ and $q_{3b}$ as in (b) with a JFerro of $J_F$ (dotted).
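To make the Ising problem form concrete, the sketch below evaluates the energy $E$ of a spin configuration and finds the minimum-energy configuration by exhaustive search. It is a classical illustration of the objective the annealer minimizes, not a model of the annealing process itself; the three-variable coefficients and the function names are hypothetical values chosen only for illustration.

```python
from itertools import product

def ising_energy(spins, h, J):
    """E = sum_i h_i q_i + sum_{i<j} J_ij q_i q_j for spins q_i in {-1, +1}."""
    linear = sum(h[i] * q for i, q in enumerate(spins))
    pairwise = sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
    return linear + pairwise

def brute_force_ground_state(h, J):
    """Exhaustively search all 2^n configurations (only viable for tiny n)."""
    return min(product((-1, +1), repeat=len(h)),
               key=lambda s: ising_energy(s, h, J))

# Hypothetical 3-variable problem: biases h_i and coupling strengths J_ij.
h = [0.5, -1.0, 0.25]
J = {(0, 1): 1.0, (0, 2): -0.5, (1, 2): 0.75}

best = brute_force_ground_state(h, J)
best_energy = ising_energy(best, h, J)
```

Exhaustive search scales as $2^n$ and is shown only to pin down what "minimum energy configuration" means for a small instance.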
Assumption 1: Ising Model formulation. To enable QA computation, cellular baseband's heavy processing tasks must be formulated as Ising model problems. Recent prior work in this area has formulated the most heavyweight tasks in the baseband, such as frequency domain detection, forward error correction, and precoding problems, into Ising models [8,19,47,48,51]. Further baseband tasks (e.g., filtering) will either admit Ising model formulations via binary representation of continuous values [6,61] (which we leave for future work), or are so lightweight that they require negligible power.

Input Problem Embedding
The process of mapping a given input problem onto the physical QA hardware is called embedding. To understand embedding, let us consider an example Ising problem:

$$E = h_1 q_1 + h_2 q_2 + h_3 q_3 + J_{12}\, q_1 q_2 + J_{13}\, q_1 q_3 + J_{23}\, q_2 q_3 \qquad (2)$$

The direct/logical representation of Eq. 2 is depicted in Fig. 4(a), where nodes and edges in the figure are qubits and couplers respectively. The curved arrows in the figure are used to visualize the linear coefficients. However, observe that a complete three-node qubit connectivity does not exist in the Chimera graph (cf. Fig. 3(a)). Hence the standard approach is to map one of the logical problem variables (e.g., $q_3$) onto two physical qubits (e.g., $q_{3a}$ and $q_{3b}$) as Fig. 4(b) shows, such that the resulting connectivity can be realized on the QA hardware.
To ensure proper embedding, $q_{3a}$ and $q_{3b}$ must agree with each other. This is achieved by enforcing the condition $h_3 = h_{3a} + h_{3b}$, and chaining these physical qubits with a strong ferromagnetic coupling called JFerro ($J_F$); see the dotted line in Fig. 4(b). The physical Ising problem the QA optimizes for the example in Eq. 2 is then:

$$E = h_1 q_1 + h_2 q_2 + h_{3a} q_{3a} + h_{3b} q_{3b} + J_{12}\, q_1 q_2 + J_{13}\, q_1 q_{3a} + J_{23}\, q_2 q_{3b} + J_F\, q_{3a} q_{3b}$$

Fig. 5: Timing diagram of a quantum annealer device, showing the QPU access time (programming, anneal, readout, and readout delay) and post-processing. Machine access overheads not relevant to our proposed use case are omitted. Post-processing runs on integrated silicon, in parallel with the annealer computation [23].
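The embedding idea described above, splitting one logical variable into a chain of two physical qubits held together by a strong ferromagnetic coupling, can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions: all coefficient values, the choice of which logical coupler attaches to which chained qubit, and the value of the chain coupling `J_F` are hypothetical, and the ground states are found by classical exhaustive search rather than by annealing.

```python
from itertools import product

def energy(spins, h, J):
    """Ising energy for a dict of named spins in {-1, +1}."""
    return (sum(h[k] * spins[k] for k in h)
            + sum(Jkl * spins[k] * spins[l] for (k, l), Jkl in J.items()))

def ground_state(h, J):
    """Exhaustive search over all spin assignments (tiny problems only)."""
    names = sorted(h)
    best = min(product((-1, +1), repeat=len(names)),
               key=lambda vals: energy(dict(zip(names, vals)), h, J))
    return dict(zip(names, best))

# Logical problem: three fully connected variables (hypothetical values).
h_log = {"q1": 0.5, "q2": -1.0, "q3": 0.25}
J_log = {("q1", "q2"): 1.0, ("q1", "q3"): -0.5, ("q2", "q3"): 0.75}

# Physical problem: q3 is split into q3a and q3b with h3 = h3a + h3b.
# A negative coupling is ferromagnetic in this sign convention (assumed value).
J_F = -4.0
h_phys = {"q1": 0.5, "q2": -1.0, "q3a": 0.125, "q3b": 0.125}
J_phys = {("q1", "q2"): 1.0, ("q1", "q3a"): -0.5,
          ("q2", "q3b"): 0.75, ("q3a", "q3b"): J_F}

logical = ground_state(h_log, J_log)
physical = ground_state(h_phys, J_phys)
```

With a sufficiently strong chain coupling, the two chained qubits agree in the physical ground state and the remaining spins match the logical solution; if `J_F` is too weak relative to the problem coefficients, the chain can break and this correspondence no longer holds.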
Assumption 2: Bespoke QA hardware. Qubit connectivity significantly impacts performance, with sparse connectivity negatively affecting dense problem graphs due to problem mapping difficulties [47]. Recent advances in QA have bolstered qubit connectivity (6 to 15 to 20 couplers per qubit in the Chimera (2017), Pegasus (2020), and Zephyr (ca. 2023-24) topologies, respectively [25,26]), while further improvement efforts continue [49,58]. We assume these efforts will allow QA hardware tailored to baseband processing problems within the timescales of our predictions, resulting in a highly efficient minor embedding process.

Quantum Processing Performance
To characterize current and future QA performance, this section analyzes processing time on QA devices. A QA client sends quantum machine instructions (QMIs) that characterize an input problem computation to a QA QPU, and the QPU responds with solution data. Fig. 5 depicts the entire latency a QMI experiences from entering the QPU to the readout of the solution, which consists of programming (§3.1), sampling (§3.2), and post-processing (§3.3) times.

Programming
As the QMI reaches the QPU, the QPU programs the QMI's input problem coefficients (biases and coupling strengths, §2): room temperature electronics send raw signals into the QA refrigeration unit to program the on-chip flux digital-to-analog converters (Φ-DACs). The Φ-DACs then apply external magnetic fields and magnetic couplings locally to the qubits and couplers respectively. This process is called a programming cycle, and in current technology it typically takes 4-40 $\mu$s [22], dictated by the bandwidth of control lines and the Φ-DAC addressing scheme [11]. During the programming cycle, the QPU dissipates an amount of heat that increases the effective temperature of the qubits. This is due to the movement of flux quanta$^3$ in the inductive storage loops of Φ-DACs. Thus, a post-programming thermalization time is required to cool the QPU, ensure proper reset/initialization of qubits, and allow the QPU to maintain a thermal equilibrium with the refrigeration unit (≈20 mK). QA clients can specify thermalization times in the range 0-10 ms with microsecond-level granularity. The default value on D-Wave's machine is a conservative one millisecond [23].
QMI coefficients are programmed by using six Φ-DACs per qubit and one Φ-DAC per coupler, and the supported bit-precision is currently up to five bits (four for value, one for sign) [12]. Each Φ-DAC consists of two inductor storage loops with a pair of Josephson junctions each. The energy dissipated on chip is on the order of $I_c \times \Phi_0$ per single flux quantum (SFQ) moved in an inductor storage loop, where $I_c$ is the Φ-DAC's junction critical current and $\Phi_0$ is the magnetic flux quantum. For the worst-case reprogramming scenario, this corresponds to 32 SFQs (−16 to +16) moving into (or out of) all inductor storage loops of each Φ-DAC [12]. Table 2 reports on-chip energy dissipation values for various QPU sizes and Φ-DAC critical currents, showing that programming an example large-scale device with 10 M qubits and 75 M couplers (15 per qubit [25]) will dissipate only 36 pJ on chip. With typical ≈30 $\mu$W cooling power available at the 20 mK QPU stage [9], this accounts for 1.2 $\mu$s of QPU thermalization time. The next step resets/initializes the qubits (cf. §2.2), during which each qubit transitions from a higher energy state to an intended ground state, generating spontaneous photon emissions that heat the QPU. Reed et al. [67] demonstrate the suppression of these emissions using Purcell filters, requiring 80 ns (120 ns) for 99% (99.9%) fidelity.
$^3$ QA devices store coefficient information in the form of magnetic flux quanta and it is transferred via single flux quantum (SFQ) voltage pulses [12].
Table 2: The QPU on-chip energy dissipation values for the worst-case programming (i.e., using all qubits and couplers) and their associated thermalization time required for various choices of QPU sizes and Φ-DAC critical currents. (Columns: Qubits, Couplers, Φ-DACs, Energy, Thermalization time.)
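The worst-case programming energy and the resulting thermalization time follow from simple counting. The sketch below is a rough illustration under stated assumptions, not a reproduction of Table 2: it assumes 32 SFQs move in each of the two storage loops of every Φ-DAC, uses the $I_c \times \Phi_0$ per-SFQ energy scale quoted above, and treats the thermalization time as dissipated energy divided by cooling power. The critical current used in the example (2 $\mu$A) is a hypothetical value chosen for illustration, and the function names are not from the source.

```python
PHI_0 = 2.067833848e-15  # magnetic flux quantum in webers (h / 2e)

def num_phi_dacs(n_qubits, n_couplers, dacs_per_qubit=6, dacs_per_coupler=1):
    """Six Phi-DACs per qubit and one per coupler, as stated in the text."""
    return dacs_per_qubit * n_qubits + dacs_per_coupler * n_couplers

def worst_case_energy_joules(n_qubits, n_couplers, critical_current_amps,
                             loops_per_dac=2, sfq_per_loop=32):
    """Energy ~ I_c * Phi_0 per SFQ; 32 SFQs per loop is an assumption here."""
    n_sfq = num_phi_dacs(n_qubits, n_couplers) * loops_per_dac * sfq_per_loop
    return n_sfq * critical_current_amps * PHI_0

def thermalization_time_seconds(energy_joules, cooling_power_watts):
    """Time for the refrigerator to remove the dissipated energy."""
    return energy_joules / cooling_power_watts

# Large-scale example: 10 M qubits, 75 M couplers, hypothetical I_c = 2 uA.
energy_j = worst_case_energy_joules(10_000_000, 75_000_000, 2e-6)
# Thermalization time for 36 pJ at 30 uW cooling power (values from the text).
t_therm_s = thermalization_time_seconds(36e-12, 30e-6)
```

Under these assumptions the example device dissipates a few tens of picojoules, and 36 pJ at 30 $\mu$W cooling power corresponds to 1.2 $\mu$s.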
An $N_Q$-qubit, $N_C$-coupler, five-bit precision QA device needs to program a worst-case $5 \cdot (N_Q + N_C)$ bits of data, which is 27 Kbytes for the current QA ($N_Q$ = 5,436, $N_C$ = 37,440) and 100 Mbytes for a large-scale QA ($N_Q$ = 10M, $N_C$ = 75M). Thus, to maintain today's microsecond-level programming cycle time in a future large-scale QA, the bandwidth of the programming control lines must be increased by a factor of $10^3$ (i.e., GHz bandwidth lines are needed). With Purcell filter design integration and sufficient control line bandwidth, the overall programming time (i.e., coefficient programming time + thermalization and reset) therefore reaches 42 $\mu$s in a 10M-qubit large-scale QA device.

Sampling
The process of executing a QMI on a QA device is called sampling, and the time taken for sampling is called the sampling time. The sampling time is classified into three subcomponents: the anneal, readout, and readout delay times. A single QMI consists of multiple samples of an input problem, with each sample annealed and read out once, followed by a readout delay (see Fig. 5). Sampling a QMI begins after the QPU programming process.

Anneal.
In this time interval, the QPU implements a QA algorithm (§2.2) [23] to solve the input problem, where low-frequency annealing lines control the annealing algorithm's schedule. The bandwidth of these control lines hence limits the minimum annealing time, which is one microsecond today. Weber et al. [79] propose the use of flexible print cables with a moderate bandwidth (≈100 MHz) and high isolation (≈50 dB) for annealing, which could potentially decrease the annealing time to tens of nanoseconds.

Readout.
After annealing, the spin configuration of qubits (i.e., the solution) is read out by measuring the qubits' persistent current ($I_p$) direction. This readout information propagates from the qubits to readout detectors located at the perimeter of the QPU chip via flux bias lines. Each flux bias line is a chain of electrical circuits called Quantum Flux Parametrons (QFPs), which detect and amplify qubits' $I_p$ to improve the readout signal-to-noise ratio. These QFP chains act like shift registers, propagating the information from qubits to detectors [80]. In current QA devices with $N_Q$ qubits, there are $\sqrt{N_Q/2}$ flux bias lines, with each flux bias line responsible for reading out $\sqrt{2N_Q}$ qubits. Further, each flux bias line reads out one qubit at a time (i.e., time-division readout), thus a total of $\sqrt{N_Q/2}$ qubits are read out in parallel. Hence, the readout time depends on the qubits' physical locations, the bandwidth of flux bias lines, and the signal integration time. For the current status of technology, the readout time is 25-150 $\mu$s per sample [23]. Nevertheless, recent research demonstrates promising fast readout techniques, which we describe next.
Chen et al. [16] and Heinsoo et al. [38] describe frequency-multiplexed readout schemes that enable simultaneous readout of multiple qubits within a flux bias line. While there is no fundamental limit on the number of qubits read out simultaneously, a physical limit is imposed by the line width of qubits' readout microresonators and the 4-8 GHz operating band (6 GHz center frequency, 4 GHz bandwidth) of commercial microwave transmission line components used in the readout architecture [80]. Microresonators with quality factor $Q$ can capture line widths up to $6/Q$ GHz, thus enabling up to $4 \times Q/6$ qubits to be read out simultaneously. Table 3 reports these results, showing that a $Q$ of $10^6$ will enable up to ≈666 K-qubit-parallel readout. This analysis assumes that each microresonator can be fabricated at exactly its design frequency, which is currently not the case. Further developments in understanding the RF properties of microresonators will therefore be needed to achieve this multiplexing performance.
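The readout parallelism figures above reduce to two small formulas. The sketch below is an illustrative calculation under the assumptions stated in the text (a 6 GHz center frequency, a 4 GHz usable band, and resonator line width equal to center frequency divided by quality factor); the function names and default parameters are hypothetical.

```python
import math

def flux_bias_lines(n_qubits):
    """Number of flux bias lines in the current time-division readout layout."""
    return math.sqrt(n_qubits / 2)

def qubits_per_line(n_qubits):
    """Qubits served by each flux bias line in the same layout."""
    return math.sqrt(2 * n_qubits)

def max_multiplexed_qubits(quality_factor, center_ghz=6.0, bandwidth_ghz=4.0):
    """Upper bound on frequency-multiplexed qubits: band / resonator line width."""
    linewidth_ghz = center_ghz / quality_factor
    return bandwidth_ghz / linewidth_ghz

# A quality factor of 1e6 gives roughly 666 K simultaneous readout channels.
channels = max_multiplexed_qubits(1e6)
```

The two time-division quantities multiply back to the total qubit count, which is a quick consistency check on the layout described above.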
In order to avoid sample-to-sample readout correlation, microresonators reading out the current sample's qubits must ring down before reading the next sample's qubits. McClure et al. [62] achieve ring-down times on the order of hundreds of nanoseconds by applying pulse sequences that rapidly extract residual photons exiting the microresonators after readout. Fast ring down can also be achieved by switching off the QFP (after the readout) coupled to a microresonator, and then switching on a different QFP that couples the microresonator to a lossy line. While QFP on-off switching takes hundreds of nanoseconds [36,39], it ensures high fidelity readout.
Recent work by Grover et al. [36] shows the application of QFPs as isolators, achieving a readout fidelity of 98.6% (99.6%) in only 80 ns (1 $\mu$s). Walter et al. [76] describe a single-shot readout scheme requiring only 48 ns (88 ns) to achieve a 98.25% (99.2%) readout fidelity. Their designs are also compatible with multiplexed architectures and earlier readout schemes, implying that by design integration the readout time can reach the order of microseconds per sample.

Readout delay.
After a sample's anneal-readout process, a readout delay is added (see Fig. 5). In this time interval, qubits are reset for the next sample's anneal. QA clients can specify readout delay times in the range 0-10 ms, and the default value is a conservative one millisecond. Nevertheless, about one microsecond is sufficient for high fidelity qubit reset (§3.1) [67].

Postprocessing
This time interval is used for post-processing the solutions returned by QA for improving the solution quality [20]. Multiple samples' solutions are post-processed at once in parallel with the current QMI's annealer computation, whereas the final batch of post-processing occurs in parallel with the programming of the next QMI (see Fig. 5). Thus, the post-processing time does not factor into the overall processing time [22].
In summary, the projected overall programming time is 42 $\mu$s (programming: 4-40 $\mu$s, thermalization and reset: 2 $\mu$s), anneal time is one $\mu$s/sample, readout time is one $\mu$s/sample, and readout delay time is one $\mu$s/sample. For a target sample count $N_s$, total QMI run time is $42 + 3N_s$ $\mu$s.
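The projected run-time budget summarized above can be written as a one-line function. This is a minimal sketch with a hypothetical function name; the default values are the projected per-stage times from this section, in microseconds.

```python
def qmi_run_time_us(num_samples, programming_us=42.0, anneal_us=1.0,
                    readout_us=1.0, readout_delay_us=1.0):
    """Projected QMI run time: one programming cycle plus per-sample costs."""
    per_sample_us = anneal_us + readout_us + readout_delay_us
    return programming_us + num_samples * per_sample_us

# Run times for a few sample counts (1, 20, 50, and 100 samples).
run_times = [qmi_run_time_us(n) for n in (1, 20, 50, 100)]
```

For example, 20 samples give a projected run time of 102 $\mu$s, since the programming cost is paid once and each sample adds 3 $\mu$s.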

Power Modeling
RAN power models account for power by splitting the BS or C-RAN functionality into the components and sub-components shown in Figs. 1 and 6. This section details these components and their associated power models. We follow the developments by Desset et al. [29] and Ge et al. [32].

RAN Base Station.
A RAN BS (see Fig. 6) consists of a baseband unit (BBU), a radio unit (RU), power amplifiers (PAs), antennas, and a power system (PS). The entire BS power consumption ($P_{BS}$) is then modeled as:

$$P_{BS} = \frac{\sum_i P_i}{(1 - \sigma_{A/C})(1 - \sigma_{MS})(1 - \sigma_{DC})}$$

where $P_i$ is the $i$-th BS component's power consumption, and $\sigma_{A/C}$ (9%), $\sigma_{MS}$ (7%), and $\sigma_{DC}$ (6%) correspond to fractional losses of Active Cooling (A/C), Mains Supply (MS), and DC-DC conversions of the power system respectively [32].
The BBU performs the processing associated with digital baseband (BB), and control and transfer systems. The baseband includes computational tasks such as digital predistortion (DPD), up/down sampling or filtering, OFDM-FFT processing, frequency domain (FD) mapping/demapping and equalization, and forward error correction (FEC). The control system undertakes the platform control processing (PCP), and the transfer system processes the eCPRI transport layer. The total BBU power consumption ($P_{BBU}$) is then [29,32]:

$$P_{BBU} = \sum_j P_j + P_{Leak}$$

where $P_j$ is the $j$-th computational task's power consumption, and $P_{Leak}$ is the leakage power resulting from the hardware employed in processing these baseband tasks. FD processing is split into two parts, with linear and non-linear scaling over the number of antennas [29,32]. The RU performs analog RF signal processing, consisting of clock generation, low-noise and variable gain amplification, IQ modulation, mixing, buffering, pre-driving, and analog-digital conversions. RU power consumption ($P_{RU}$) scales linearly with the number of transceiver chains, and each chain consumes about 10.8 W power [29].
For macro-cell BSs, each PA (including antenna feeder) is typically configured at 102.6 W power consumption [32].
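The base station power model can be sketched numerically. The code below is an illustration under an explicit assumption about the model's form: the component powers are summed and divided by the product of $(1 - \sigma)$ loss terms for active cooling, mains supply, and DC-DC conversion, in the style of the models in [29,32]. The function name, the example BBU power of zero, and the choice of 64 chains and 64 PAs are hypothetical; the per-chain RU power, per-PA power, and loss fractions are the values quoted in the text.

```python
def base_station_power_w(p_bbu_w, n_chains, n_pas,
                         p_ru_chain_w=10.8, p_pa_w=102.6,
                         sigma_ac=0.09, sigma_ms=0.07, sigma_dc=0.06):
    """Total BS power: component sum scaled up by power-system losses.

    The (1 - sigma) product in the denominator is an assumed model form.
    """
    component_sum_w = p_bbu_w + n_chains * p_ru_chain_w + n_pas * p_pa_w
    efficiency = (1 - sigma_ac) * (1 - sigma_ms) * (1 - sigma_dc)
    return component_sum_w / efficiency

# RU and PA contribution alone for a hypothetical 64-chain, 64-PA site.
radio_only_w = base_station_power_w(p_bbu_w=0.0, n_chains=64, n_pas=64)
```

Because the loss terms multiply, the three fractional losses together add roughly a quarter on top of the summed component power in this sketch.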

C-RAN.
In the C-RAN architecture, BS processing functionality is amortized and shared, where Remote Radio Heads (RRHs) perform analog RF signal processing and a BBU-pool performs digital baseband computation (of many BSs) at a centralized datacenter (see Fig. 1). Fronthaul (FH) links connect RRHs with the centralized BBU-pool. To relax the FH latency and bandwidth requirements, a part of baseband computation is performed at RRH sites. Several such split models have been proposed [33,57]. We consider a split where RRHs perform low Layer 1 baseband processing, such as cyclic prefix removal and FFT-specific computation. The power consumption of C-RAN ($P_{C\text{-}RAN}$) is then:

$$P_{C\text{-}RAN} = \sum_{k=1}^{N_{RRH}} \left( P_{RRH,k} + P_{FH,k} \right) + P_{BBU\text{-}pool}$$

where $P_i$ is the $i$-th C-RAN component's power consumption and $N_{RRH}$ is the number of RRHs. Fronthaul power consumption depends on the technology, and for fiber-based ethernet or passive optical networks, it can be modeled by assuming a set of parallel communication channels as [3,27]:

$$P_{FH} = \sum_{k} \varrho_k \, \frac{T^{FH}_k}{C^{FH}_k} \, P_{FH,\max}$$

where $\varrho_k$ is a constant scaling factor, and $T^{FH}_k$ and $C^{FH}_k$ represent the traffic load and the capacity of the $k$-th fronthaul link respectively. For a link capacity of 500 Mbps, $P_{FH,\max}$ is typically ca. 37 Watts [60].

QA Qubit Count Requirements
This section describes our approach to estimating the QA qubit count required to meet the 4G/5G cellular baseband computational demand (§4.2). To compute this, we convert the target TOPS (Table 4) into target problems per second (PPS), then estimate the number of qubits QA requires to achieve this PPS, individually for each baseband computational task. We formulate it as:

$$N_Q = \sum_i N_{Q,i}$$

$$N_{Q,i} = PPS_i \times N_{Q,P,i} \times T_{P,i} \qquad (10)$$

$$PPS_i = TOPS_i \,/\, \text{Number of operations per problem} \qquad (11)$$

where $N_Q$ is the total number of qubits the QA requires for the entire baseband processing, and $N_{Q,i}$ is the qubit requirement for the $i$-th baseband task. $PPS_i$ is the target problems per second, $N_{Q,P,i}$ is the number of qubits per problem, and $T_{P,i}$ is the run time per problem, of the $i$-th baseband task. We next demonstrate how to compute these values for FD$_{nl}$ and FEC tasks with running examples.
FD$_{nl}$ Qubit Requirement. The FD$_{nl}$ task corresponds to the MIMO detection problem [2], whose objective is to demodulate the received soft symbols into bits. Solving an FD$_{nl}$ problem requires on average $80 \times (Z/64)^2$ million operations for a $Z \times Z$ ($Z$-users, $Z$-antennas) system via the state-of-the-art Sphere Decoding algorithm [45]. Solving the same problem using QA requires $N_{bps} \times Z$ qubits, where $N_{bps}$ is the number of bits per symbol in the modulation scheme (see [51] for full derivation). Thus for a typical 5G scenario, a 64 × 64 MIMO system with 64-QAM modulation (i.e., six bits per symbol), $\mathrm{PPS}_{FD_{nl}}$ is 30.72M (= 2457.6 TOPS / 80M operations), $N_{Q,p,FD_{nl}}$ is 384 qubits, and $T_{p,FD_{nl}}$ is $42 + 3N_s$ $\mu$s (§3). Substituting these values in Eq. 10 shows that the 5G FD$_{nl}$ processing requires 1.2M qubits with $N_s$ = 20 samples.

FEC Qubit Requirement. The FEC task corresponds to channel decoding, which aims to correct the bit errors that interference and the vagaries of the wireless channel inevitably introduce into the user data. We consider the Low Density Parity Check (LDPC) codes employed in the 5G-NR traffic channel for FEC evaluation [35]. The number of operations per iteration needed to decode an ($M$, $N$)-LDPC code via the state-of-the-art belief propagation algorithm depends on $M$, $N$, and the average row and column weights of the parity check matrix [31], where $M$ and $N$ are the number of rows and columns in the LDPC parity check matrix, respectively. The number of qubits QA requires to solve the same problem is derived in [47]. Thus for 5G's longest LDPC code with the base-graph-1 parity check matrix ($M$ = 4224, $N$ = 8448, average row weight 8.64, average column weight 20) [33], $\mathrm{PPS}_{FEC}$ is 600K (= 89.6 TOPS / 150M operations, for a typical 20 decoding iterations), $N_{Q,p,FEC}$ is 21,120 qubits, and $T_{p,FEC}$ is $80 + 3N_s$ $\mu$s (§3). Substituting these values in Eq. 10 shows that the 5G FEC processing requires 1.29M qubits with $N_s$ = 20 samples.

5G's FD$_{nl}$ and FEC tasks correspond to 75% of the baseband computational load. For the remaining 25% of the baseband computational load, we project a proportionate number of qubits for their respective processing requirements. Table 5 reports the number of qubits QA requires as a function of problem run time ($T_{p,i}$), showing that with $T_{p,i}$ of {45, 102, 192, 342} $\mu$s, QA requires {1.6, 1.99, 6.25, 11.16} million qubits, respectively, to satisfy the 5G baseband computational demand. The number of samples ($N_s$) represents the required QA target fidelity in terms of error performance: when $N_s$ is 20, QA must reach the ground state of the input problem in 20 anneals. Hence, QA must meet these $T_{p,i}$ and $N_s$ combinations to achieve spectral efficiency equal to CMOS processing in 5G wireless networks. While we demonstrate an example scenario with 400 MHz bandwidth, 64 antennas, 64-QAM modulation, and a 0.5 coding rate, a similar methodology can be applied to estimate network-specific qubit requirements.
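The FD nl running example can be checked numerically. The following is a minimal sketch, assuming the per-task qubit requirement is the product of problems per second, qubits per problem, and run time per problem, which is how the quoted values combine to give 1.2M qubits. The function and variable names are illustrative and do not come from the paper.

```python
def problems_per_second(target_tops, ops_per_problem):
    """Convert a target throughput in TOPS into problems per second."""
    return target_tops * 1e12 / ops_per_problem


def qubits_for_task(pps, qubits_per_problem, run_time_s):
    """Qubits needed to sustain `pps` problems per second on one task
    (assumed form: PPS x qubits per problem x run time per problem)."""
    return pps * qubits_per_problem * run_time_s


def fd_nl_example(num_samples=20):
    """FD nl example from the text: 64 x 64 MIMO with 64-QAM."""
    users = 64                      # users = antennas in the example
    bits_per_symbol = 6             # 64-QAM
    ops_per_problem = 80e6 * (users / 64) ** 2   # Sphere Decoding cost
    pps = problems_per_second(2457.6, ops_per_problem)
    qubits_per_problem = bits_per_symbol * users
    run_time_s = (42 + 3 * num_samples) * 1e-6   # microseconds to seconds
    total = qubits_for_task(pps, qubits_per_problem, run_time_s)
    return pps, qubits_per_problem, total


if __name__ == "__main__":
    pps, qpp, total = fd_nl_example()
    print(f"PPS = {pps / 1e6:.2f}M, qubits/problem = {qpp}, "
          f"total = {total / 1e6:.2f}M qubits")
```

Because the requirement is linear in run time under this assumption, halving the problem run time halves the qubit count for the same throughput target.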

Power and Cost Comparison
Our methodology compares CMOS and QA processing at equal spectral efficiency outcomes. We specify the same BBU targets (Table 4) with CMOS and QA hardware, ensuring equal bits processed per second per Hz per km$^2$.
Power consumption of CMOS hardware depends on its performance-per-watt efficiency and the amount of computation at hand. Technology scaling improves this efficiency from generation to generation, in inverse proportion to the square of the transistors' core supply voltage ($V_{dd}$) [71]. A 65 nm CMOS device ($V_{dd}$ = 1.1 V) has an efficiency of 0.04 TOPS/Watt, from which we compute the efficiency of today's 14 nm CMOS ($V_{dd}$ = 0.8 V) and future 1.5 nm CMOS ($V_{dd}$ = 0.4 V) via $V_{dd}^2$ scaling, obtaining 0.076 and 0.3 TOPS/Watt, respectively [29,43,44]. Using this hardware efficiency and the TOPS requirements of Table 4, we compute the CMOS hardware power consumption. Additional power results from leakage currents in the CMOS transistor channel, and we set this leakage power to 30% of the dynamic power [29].
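The supply-voltage scaling above can be reproduced in a few lines. This is a minimal sketch, assuming that efficiency scales with the inverse square of the supply voltage relative to the 65 nm reference and that leakage is added as 30% on top of dynamic power. The function names and the example workload are hypothetical.

```python
REF_EFFICIENCY_TOPS_PER_W = 0.04   # 65 nm CMOS reference efficiency
REF_SUPPLY_V = 1.1                 # 65 nm core supply voltage
LEAKAGE_FRACTION = 0.30            # leakage as a fraction of dynamic power


def cmos_efficiency(supply_v):
    """TOPS/Watt at a given supply voltage, via inverse-square scaling."""
    return REF_EFFICIENCY_TOPS_PER_W * (REF_SUPPLY_V / supply_v) ** 2


def cmos_power_watts(workload_tops, supply_v):
    """Dynamic plus leakage power (Watts) for a workload in TOPS."""
    dynamic = workload_tops / cmos_efficiency(supply_v)
    return dynamic * (1 + LEAKAGE_FRACTION)


if __name__ == "__main__":
    for node, vdd in (("14 nm", 0.8), ("1.5 nm", 0.4)):
        print(f"{node}: {cmos_efficiency(vdd):.3f} TOPS/Watt")
    # Hypothetical 100 TOPS workload, for illustration only.
    print(f"100 TOPS at 14 nm: {cmos_power_watts(100, 0.8) / 1e3:.2f} kW")
```

The two printed efficiencies round to the 0.076 and 0.3 TOPS/Watt figures quoted in the text.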
Power consumption of D-Wave's QA is ca. 25 kW, dominated by its refrigeration unit (see the Supplementary Information of [53]). The additional power draw due to the computation at hand is negligible compared to the QA refrigeration power, since the QPU resources used for computation are thermally isolated in a superconducting environment. This power requirement is further not expected to scale up significantly with increased qubit numbers [53,75], due to the fairly constant power consumption of the pulse-tube dilution refrigerators that are used to cool the QPU in practice [9,21,53]. More general NISQ processors such as Google's Sycamore (see the Supplementary Information of [5]) and IBM's Rochester [40] also show a similar ca. 25 kW power consumption and a fairly constant scaling with increased qubit numbers [75]. However, to maintain this 25 kW power for the entire 5G baseband processing, a sufficient number of qubits is required, all under the same refrigeration unit. This raises the question: how many qubits are allowed in a QA refrigeration unit?
To answer this question, we consider the physical size of qubits in their unit cell packaging (a die) versus the available space in the dilution refrigerator. The number of useful square dies of a given side length that can be placed onto a wafer of a given radius is approximated by the formula in [28]. A square die of eight qubits requires 335 × 335 $\mu$m$^2$ of QPU chip area, i.e., a die side length of 335 $\mu$m [12], and a dilution refrigerator's experimental space has a radius of 250 mm [9]. Substituting these values gives ≈1.75M dies, which implies that ≈14 million qubits are allowed in a refrigeration unit. Since the qubit count estimates for 5G (cf. §4.3, §6) are well below this limit, QA power consumption is 25 kW for 5G baseband processing.
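The die-count estimate can be sanity-checked with standard gross-die approximations. The exact formula from [28] is not reproduced here, so the sketch below assumes two common forms: a plain area ratio, and the same ratio with a widely used edge-loss correction. Both are assumptions, and the function names are illustrative.

```python
import math

QUBITS_PER_DIE = 8          # eight-qubit unit cell per die
DIE_SIDE_MM = 0.335         # 335 micrometres
WAFER_RADIUS_MM = 250.0     # refrigerator experimental space radius


def dies_area_ratio(radius_mm, side_mm):
    """Assumed form 1: wafer area divided by die area."""
    return math.pi * radius_mm ** 2 / side_mm ** 2


def dies_edge_corrected(radius_mm, side_mm):
    """Assumed form 2: area ratio minus a term for partial edge dies."""
    diameter = 2 * radius_mm
    die_area = side_mm ** 2
    return (math.pi * diameter ** 2 / (4 * die_area)
            - math.pi * diameter / math.sqrt(2 * die_area))


if __name__ == "__main__":
    for name, fn in (("area ratio", dies_area_ratio),
                     ("edge corrected", dies_edge_corrected)):
        dies = fn(WAFER_RADIUS_MM, DIE_SIDE_MM)
        print(f"{name}: {dies / 1e6:.2f}M dies, "
              f"{QUBITS_PER_DIE * dies / 1e6:.1f}M qubits")
```

Because the die is tiny relative to the available radius, the edge correction changes the count by well under 1%, so both forms agree with the ≈1.75M dies and ≈14 million qubits quoted above.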
Results and discussion: Applying the foregoing power analysis, Fig. 7 reports the power consumption of 4G and 5G BSs with 14 nm CMOS hardware. In Fig. 7(a), we see that the power amplifier (PA) is the dominant component of 4G BS power consumption, as identified in several prior works [3,29,32], accounting for 57-58% of the total BS power. But, as the network scales to higher bandwidth and [...]

In comparison to CMOS, QA processing reduces C-RAN power by 159 kW (55% lower). Table 6 reports the OpEx cost savings and carbon emission reductions associated with the respective power savings, computed by considering an average electricity price of 0.143 USD per kWh and 0.92 pounds of CO$_2$ equivalent emitted per kWh [72,73]. To provide an economic benefit over CMOS hardware, assuming CMOS CapEx is negligible, the CapEx of future QAs must be lower than the respective OpEx savings. For instance, if QA were to be employed in a C-RAN scenario, a CapEx lower than 200K, 400K, 1M, and 2M USD would provide an economic benefit over CMOS in one, two, five, and 10 years, respectively.
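The CapEx thresholds above follow from simple arithmetic on the power saving. The following minimal sketch assumes continuous operation (8,760 hours per year), a price of 0.143 USD per kWh, and 0.92 pounds of CO2 equivalent per kWh. The function names are illustrative, and the conversion of pounds to metric tons is added here for convenience.

```python
HOURS_PER_YEAR = 8760            # assumes the site runs continuously
PRICE_USD_PER_KWH = 0.143
LB_CO2_PER_KWH = 0.92
KG_PER_LB = 0.45359237


def energy_saved_kwh(power_saving_kw, years=1):
    """Energy saved over a number of years of continuous operation."""
    return power_saving_kw * HOURS_PER_YEAR * years


def opex_savings_usd(power_saving_kw, years=1):
    """Electricity cost avoided, in USD."""
    return energy_saved_kwh(power_saving_kw, years) * PRICE_USD_PER_KWH


def co2_reduction_tonnes(power_saving_kw, years=1):
    """CO2-equivalent emissions avoided, in metric tons."""
    pounds = energy_saved_kwh(power_saving_kw, years) * LB_CO2_PER_KWH
    return pounds * KG_PER_LB / 1000.0


if __name__ == "__main__":
    saving_kw = 159  # C-RAN power reduction quoted in the text
    for years in (1, 2, 5, 10):
        print(f"{years:>2} yr: {opex_savings_usd(saving_kw, years):,.0f} USD, "
              f"{co2_reduction_tonnes(saving_kw, years):,.0f} t CO2e")
```

Under these assumptions the one-year saving is roughly 199K USD, which is consistent with the 200K USD one-year CapEx threshold, and the longer horizons scale linearly.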

Feasibility Timeline and Discussion
This section presents our projected QA feasibility timeline, describing year-by-year milestones on the application of QA to wireless networks. Our approach is to compare the power consumption of QA and CMOS in various base station scenarios, then compute the number of qubits QA requires to match the spectral efficiency of CMOS in the same scenarios. We next project the year by which these qubit numbers become available in QA hardware by extrapolating the historical QA qubit growth trend into the future. Figures 2 and 9 report these results.

Roadmap for feasibility. The processing of a base station with 10-MHz bandwidth and 32 antennas (Point 'F' in Fig. 9(c)) requires 39K qubits in the QA hardware for QA to match the spectral efficiency of CMOS, and this qubit requirement is projected to become available by the year 2026 (Figs. 2, 9(c)). However, leveraging QA for such a system leads to increased power consumption in comparison to both 14 nm and 1.5 nm CMOS devices (Figs. 9(a), 9(b)).

A QA with at least 1.85M qubits benefits in power over 1.5 nm CMOS, and such a QA is predicted to become available by the year 2038 (Figs. 2, 9(c)). In summary, our analyses show that a power advantage of QA over CMOS is a predicted 14-17 years away. Fig. 2 summarizes Fig. 9 in a feasibility timeline, showing the years by which QA enables these base station operation scenarios along with their associated power advantage or loss.

Conclusion
While the conventional assumption that CMOS hardware will achieve NextG cellular processing targets may well hold true, this paper makes the case for the possible future feasibility and potential power advantage of QA over CMOS. Our extensive analysis of current QA technology projects quantitative targets that future QAs may well meet in order to provide benefits over CMOS in terms of performance, power, and cost. While we acknowledge that the practical deployment of quantum processors is at least tens of years away, this early study informs future quantum hardware design and RAN architecture evolution. Furthermore, fundamental physical advances in the QA technology itself, which we do not leverage in the projections given in this paper, may offer even further benefits and bring our projected timelines forward. Examples of these advances include faster annealing times (< 40 ns) and/or qubits with longer coherence lifetimes (such as the qubits in IARPA's QEO and DARPA's QAFS QA chips [59]) that enable coherent quantum annealing regimes, benefiting future QA spectral efficiency [37,83].

Fig. 1: Our envisioned deployment scenario of Quantum Processing Units (QPUs) alongside CMOS units in a C-RAN datacenter. QPUs undertake heavy baseband computation, while CMOS processing manages the network's control plane.

Fig. 3: Unit cell structures of (a) Chimera and (b) Pegasus QA hardware topologies. Nodes in the figure are physical qubits, and edges are physical couplers.
Figure 9(c) shows this qubit requirement for various bandwidths and antenna choices at 102 $\mu$s problem run time.
A C-RAN with three 400 MHz-bandwidth 64-antenna BSs.

Fig. 8: (a) Power consumption of a 5G BS where QA is used for the BBU's baseband processing. The BS power with {32, 64, 128} antennas is {37, 49, 73} kW, respectively. (b) Power consumption of CMOS (290 kW) and QA (131 kW) processing in a C-RAN scenario with three base stations. In both (a) and (b), the BBU's further computation (i.e., Control and Transfer systems) is processed by 14 nm CMOS silicon. BBU bar plots are shown with their sub-components (see legend, §4.1.1) in increasing order of power from bottom to top. The percentages (rounded to the nearest integer) correspond to the components labeled on the X-axis.

Fig. 9: Power consumption of the BBU's baseband and its associated power system using (a) 14 nm CMOS and (b) 1.5 nm CMOS hardware in various base station operation scenarios in the 5G frequency range [33]. The dotted horizontal line in (a) and (b) is the QA power consumption of 25 kW. (c) The number of qubits QA requires to match the spectral efficiency of CMOS in the same scenarios. Points A-E respectively show the smallest bandwidth at which QA benefits in power over CMOS at each antenna count, and Point F shows the smallest practically feasible scenario QA enables with 39K qubits (see §6).
Roadmap for power dominance. From Figs. 9(a) and 9(b), we see that for a given antenna count, the lowest-bandwidth systems for which QA achieves a power advantage over 14 nm CMOS are the 20 MHz-bandwidth 256-antenna (Point 'A'), 50 MHz-bandwidth 128-antenna (Point 'B'), and 160 MHz-bandwidth 64-antenna (Point 'C') systems. In comparison to 1.5 nm CMOS, such points correspond to the 60 MHz-bandwidth 256-antenna (Point 'D') and 190 MHz-bandwidth 128-antenna (Point 'E') systems. Fig. 9(c) shows the number of qubits required in the QA hardware to process these systems (Points A-E) with spectral efficiency equal to CMOS. The figure shows that to achieve power dominance over 14 nm CMOS, at least 618K qubits (Point 'A') are required in the QA hardware, and this qubit requirement is projected to become available by the year 2035 (Figs. 2, 9(c)).

Table 1: Summary of qubit requirements of QA hardware to achieve equal spectral efficiency to CMOS, and power consumption of CMOS and QA, at various bandwidths (B/W). The shaded/colored cells indicate the lesser of the two power requirements of CMOS and QA.

Table 3: The number of qubits read out in parallel by time-division (status quo) and frequency-multiplex (projected) readout schemes at various choices of QPU sizes and readout microresonator quality factors.

Table 5: QA qubit requirement at various problem run times to achieve spectral efficiency equal to CMOS processing, in a 5G BS scenario with 400 MHz bandwidth and 64 antennas.
Fig. 7: Power consumption of silicon 14 nm CMOS processing in 4G and 5G base stations. BBU bar plots are shown with their sub-components (see legend, §4.1.1) in increasing order of power consumption from bottom to top. The percentages (rounded to the nearest integer) show the power contribution of that particular BS component (labeled on the X-axis) to the total BS power. The BS power with {2, 4, 8, 32, 64, 128} antennas is {0.35, 0.71, 1.43, 34.7, 89.9, 261.3} kW, in their respective scenarios.

Table 6: Summary of OpEx electricity cost savings (in USD) and CO$_2$ emissions reduction (in metric kilotons) QA will achieve in comparison to CMOS in 5G network scenarios. The number of antennas in the C-RAN BSs is 64.