Challenges and Opportunities of Near-Term Quantum Computing Systems

The concept of quantum computing has inspired a whole new generation of scientists, including physicists, engineers, and computer scientists, to fundamentally change the landscape of information technology. With experimental demonstrations stretching back more than two decades, the quantum computing community has achieved a major milestone over the past few years: the ability to build systems that are stretching the limits of what can be classically simulated, and which enable cloud-based research for a wide range of scientists, thus increasing the pool of talent exploring early quantum systems. While such noisy near-term quantum computing systems fall far short of the requirements for fault-tolerant systems, they provide unique testbeds for exploring the opportunities for quantum applications. Here we highlight the facets associated with these systems, including quantum software, cloud access, benchmarking quantum systems, error correction and mitigation in such systems, and understanding the complexity of quantum circuits and how early quantum applications can run on near-term quantum computers.


I. INTRODUCTION
Quantum computers can potentially solve problems that are considered intractable on even the fastest classical computers [1]-[6]. They use a fundamentally different paradigm for performing calculations and solving problems compared with standard classical computers by using quantum bits (qubits). The speedup is achieved using quantum physics (including entanglement between qubits) to explore correlations in problems such that the correct answer emerges at the end of computation through constructive interference. Because quantum states are written as wave functions, classical wave interference is a reasonable analog: wave functions are steered toward the correct answer through constructive interference (several waves add up, creating another wave with a larger amplitude), while wave functions that do not correspond to the correct answer vanish by destructive interference. This differs from the standard picture of computation, wherein each fundamental information unit is definitely in either the state 0 or 1 (see [7]-[9] for more in-depth summaries of quantum computing).
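The interference picture can be made concrete with a single qubit. The following is a minimal sketch, not taken from the references: it applies a Hadamard gate twice to the state 0 and tracks the amplitudes. The helper names (apply, probabilities) are hypothetical and chosen only for this illustration.

```python
import math

# Hadamard gate as a 2x2 matrix acting on the amplitudes (a0, a1).
S = 1 / math.sqrt(2)
H = ((S, S), (S, -S))

def apply(gate, state):
    """Multiply a 2x2 gate into a single-qubit amplitude vector."""
    return tuple(sum(gate[r][c] * state[c] for c in range(2)) for r in range(2))

def probabilities(state):
    """Measurement probabilities are the squared magnitudes of the amplitudes."""
    return tuple(abs(a) ** 2 for a in state)

zero = (1.0, 0.0)
once = apply(H, zero)   # equal superposition of 0 and 1
twice = apply(H, once)  # the two paths into 1 cancel; the two paths into 0 add up
```

After one Hadamard, both outcomes are equally likely. After the second, the amplitudes leading to 1 carry opposite signs and cancel (destructive interference), while those leading to 0 reinforce each other (constructive interference).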
There is, however, a catch. The internal states of a quantum computer are fragile and susceptible to noise, introducing errors that lead to incorrect answers. Given the complexity and number of operations that are required for many typical quantum algorithms, it is believed that large-scale practical quantum computing has to incorporate at least some form of quantum error correction (QEC) [1], [10], which is analogous to classical error correction. Several physical qubits are encoded into a logical qubit such that errors on the physical qubits can be detected and corrected. By applying QEC schemes, the logical qubit error rates can be arbitrarily suppressed, provided physical error rates are below a threshold, which enables fault-tolerant quantum computing [11]-[14]. In this context, many quantum codes and techniques have been invented [15]. Because simulating the full dynamics of quantum computers quickly becomes intractable as more qubits are added, QEC codes have been studied assuming simplified noise models, such as Pauli noise. These simulations, together with assumptions about what is experimentally feasible, provide estimates of what would ultimately be required to operate various quantum algorithms using fully fault-tolerant computation [16]-[19]. Millions of qubits with relatively low physical error rates are predicted to be necessary to solve difficult problems. We will not expand further on fully fault-tolerant quantum computing for the remainder of this article but instead refer the reader to [7], [15], and [20].
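The threshold behavior can be illustrated with the simplest classical analog. The sketch below is an assumption-laden toy, not one of the quantum codes cited above: it uses an n-bit repetition code with majority voting and independent bit flips of probability p. For this toy the threshold is p = 1/2; realistic quantum codes have much lower thresholds.

```python
from math import comb

def logical_error_rate(p, n):
    """Probability that a majority vote over n noisy copies fails (n odd).

    The vote fails when more than half of the copies are flipped.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )
```

Below the threshold, adding redundancy helps: logical_error_rate(0.1, 5) is smaller than logical_error_rate(0.1, 3), which is in turn smaller than the physical rate 0.1. Above it, for example at p = 0.6, the same encoding makes the logical error rate worse than the physical one.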
This article will focus on quantum computing with devices that are currently available or expected to be available in the near future. Various devices comprising 5-79 qubits have been made available to the public or exist as prototypes in laboratories [21]-[25]. Such devices have been referred to as noisy intermediate-scale quantum (NISQ) [26] systems, i.e., non-fault-tolerant devices comprising tens or hundreds of qubits. They can be classified into two categories: 1) devices constructed for a single demonstration experiment run by the team that created the device and 2) devices built to serve as general-purpose quantum systems for use by others. Designing a general-purpose system requires consideration of many factors not relevant for a one-off demonstration. Five such factors, outlined in the following, are covered in detail in Sections II-VI and represent an IBM perspective on critical near-term system aspects.
A system first needs to be designed to accommodate its intended users, enabling the functionality that they require to do their research and providing adaptability as their needs evolve. The breadth of quantum system users (physicists, computer scientists, engineers, chemists, developers, and others) requires multiple cloud systems and access interfaces for interacting with different systems at varying levels of abstraction. Examples of these include access levels for pulse and gate control and ultimately for applications and systems comprising different connectivities between the qubits.
Second, while a demonstration can rely on special-purpose code, a system needs a complete software development kit (SDK) providing a set of tools that can be used to develop novel experiments and applications. To this end, in an open-source collaboration with the community, we have developed Qiskit [27], which consists of four fundamental elements: Terra [28], Aer [29], Ignis [30], and Aqua [31], each bringing a specific set of features to the user. Terra provides the foundation for composing quantum programs at the level of circuits and pulses, optimizing them for the constraints of a particular device, and managing the execution of batches of experiments on remote-access devices. Aer gives access to high-performance quantum simulators to help us understand the limits of classical processors by demonstrating to what extent they can mimic quantum computers. Ignis offers a set of tools to better characterize errors, improve gates, and compute in the presence of noise. Finally, Aqua is where quantum algorithms are built and ultimately used in the context of applications. It provides translators to map problems from domains such as chemistry, optimization, finance, and AI onto problems solvable with a quantum computer.
Third, it is important to establish a roadmap for the systems. Much like roadmaps for classical systems, a roadmap for quantum systems provides a community-chosen benchmark to facilitate comparisons across systems and demonstrate progress over time. For quantum computers, many individual metrics are commonly accepted as ingredients for a better quantum system, but, as of today, a single community-wide accepted benchmark does not exist. Quantum volume (QV) has been proposed as a potential benchmark that incorporates many of the individual metrics (number of qubits, connectivity, gate set performance, and compiler and software stack performance) into a single hardware-agnostic metric [32] (see Section IV). We have shown improved QV over the past two years and strive toward continued improvements.
Fourth, to extend the computational reach of short-depth quantum circuits, error mitigation has been proposed as a technique to increase the accuracy of measured observables [33], [34] (a quantum circuit is an ordered list of quantum logic gates, usually expressed as a directed graph, and the depth is the length of its critical path). Error mitigation is a term used to express methods by which the impact of error can be reduced (or mitigated) without requiring full fault-tolerant quantum codes. This approach to reducing errors in the absence of full fault tolerance is still very much exploratory, but we view it as an important component of near-term quantum systems.
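The notion of depth as a critical-path length can be sketched in a few lines. The representation below is an assumption made for illustration only (each gate is just the tuple of qubit indices it acts on, in program order); it is not how any particular SDK stores circuits.

```python
def circuit_depth(gates):
    """Return the critical-path length of an ordered gate list.

    `gates` is a sequence of tuples of qubit indices. A gate must wait for
    the latest earlier gate on any of its qubits, so its layer is one more
    than the deepest layer among those qubits.
    """
    last_layer = {}  # qubit index -> layer of the most recent gate on it
    depth = 0
    for qubits in gates:
        layer = 1 + max((last_layer.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            last_layer[q] = layer
        depth = max(depth, layer)
    return depth

# Two single-qubit gates in parallel, an entangling gate, then a chain to qubit 2.
example = [(0,), (1,), (0, 1), (2,), (1, 2)]
```

In the example, the single-qubit gates on qubits 0, 1, and 2 all fit in the first layer, the gate on (0, 1) forms the second, and the gate on (1, 2) must wait for it, giving a depth of 3 even though the list contains five gates.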
Fifth, quantum computers without fault tolerance will likely be limited to implementing algorithms with short-depth quantum circuits [35]. In these algorithms, a series of quantum gates are applied, and then the qubits are measured. The outcomes of these circuits are used to compute an observable or sample from a probability distribution of interest. This limited model is believed to be computationally hard for classical machines [36], and recently, an unconditional separation between classical and quantum computers has been shown within it [35]. Using this model, researchers have explored applications in quantum machine learning [37] and quantum chemistry [38].
From these five aspects, it is evident that developing a complete, user-friendly, cloud-accessible quantum system necessitates considering a rich landscape of design aspects. Successfully implementing all of the ingredients simultaneously to achieve this goal is no small task. This article reviews these considerations in Sections II-VI. For quantum applications, particular emphasis is placed on describing algorithms for quantum machine learning and quantum chemistry because they are examples of applications that can be mapped to short-depth circuits that are believed to be hard for a classical computer to simulate and are currently areas of great interest. As far as hardware is concerned, we only make brief mention of it in Section VII.

II. CLOUD QUANTUM SYSTEMS AND USER ACCESS LEVELS
Although experimental research in quantum computing has been active for over two decades, it was not until the mid-2010s that it became possible to physically connect a handful of superconducting qubits together to implement small multiqubit tests with sufficient fidelity for meaningful results. In 2016, IBM built a quantum processor composed of five superconducting qubits and integrated it into a system called the IBM Quantum Experience [39], available for use via cloud access. Almost immediately after launch, research articles were published based on results obtained through this cloud access to a quantum device. This demonstrated one key aspect that we believe is important for the future as well: there is already demand from physicists, scientists, developers, and many others to access and test various aspects of quantum computing even if the systems are still small, comprise only a few qubits, and suffer from noise levels worse than the fault-tolerance threshold. Over time, other groups in industry and academia have also started to offer quantum cloud services to varying degrees.
Since the initial release of the five-qubit backend, the IBM Quantum Experience has hosted 18 unique quantum systems made available as backend services either to the public or to members of the IBM Q Network [43]. Over time, user executions have increased to 28 million, culminating in over 200 research articles exploring areas in quantum information science. The proliferation and quality of enabled research serve as an affirmation of the community-wide demand for a variety of quantum systems.
The qubit connectivity maps and the distribution of two-qubit error rates for a few of the available IBM backends are shown in Figs. 1 and 2, respectively. The variety of devices allows us to explore user preferences and device performance for different connectivities.
In these present-day systems, imperfections and connectivity can have a large impact on the performance of different algorithms. More connectivity allows users to explore circuits that entangle the qubits in fewer steps but often at the price of hurting gate fidelities or inducing spectator errors [44], [45], i.e., errors that can occur on qubits that are still passively connected but otherwise not directly involved in a particular quantum operation.  [46], [47], as well as a better understanding of spectator qubit errors.
As we progress through this period of near-term quantum systems, we must evolve toward co-design of the quantum circuits users want to implement, as well as the connectivity that is physically built into the systems.
Furthermore, user feedback showed a clear interest in more than one level of access to a quantum device. We consider three definable fundamental user classes for various levels of cloud access (see Fig. 3): the quantum physicist, the quantum information scientist, and the quantum developer.
The quantum physicist possesses a deep understanding of the underlying device physics and would like to explore more practical technical details, such as optimal control techniques, novel pulse-shaping approaches, techniques to quantify the underlying system Hamiltonian, and error mitigation methods. These users want more of the nitty-gritty details and the ability to examine device-level properties of the system, e.g., control over the frequency, timing, pulse shapes, and measurement integration kernels that are sent to the experiment. To properly meet this level of user access, we have defined the OpenPulse framework [48], along with a corresponding set of tools in Qiskit described later in this review. Succinctly, OpenPulse provides the bare metal access level for users.
The quantum information scientist has a deep understanding of quantum circuits and wants to explore how these circuits run on near-term devices. A circuit can be implemented in multiple ways in terms of the fundamental gates. Finding the optimal implementation is a computationally difficult task and is, therefore, an important research topic. These users are also interested in exploring error-correction primitives, such as parity checks and conditional operations, which depend on these multiqubit measurements to investigate how entropy is taken from the system. For this level, we have defined OpenQASM [49] and the corresponding tools in Qiskit.
The third user class, the quantum developer, wants to see how quantum applications work on quantum computers. They want to run circuits based on an application and receive the outcome as quickly as possible. They are not necessarily interested in how the circuit is implemented; they are focused on the results of the quantum computation and how it can be used in an application of interest.
Fig. 3 (caption fragment). The quantum hardware is accessible by hardware engineers. At the conclusion of an experiment, the hardware passes the readout signal in the form of readout voltages to a signal integrator. The integrator signal is represented as the I and Q quadratures of the readout signal, which get digitized at the physicist access level to a logic 0 or 1. The logical bitstream is passed to output analytics for the quantum information scientists to analyze results of the implemented quantum circuits. Finally, the answer to a full quantum application is sent back to the quantum developer. A domain expert is the expected end user of a fully developed stack. This representative stack demonstrates how different levels of access are possible.

In parallel, it is also important to provide to each of these users the data appropriate for their respective level. The physicist needs access to device-level specifications, while the quantum information scientist needs the error rates for the calibrated quantum gates and operations. Device specifications are the fundamental properties of the device (e.g., coherence times, qubit frequencies, and crosstalk), while error rates depend on the pulses that represent the gates and include metrics such as average single-qubit errors, two-qubit gate errors, spectator errors, assignment (or readout) errors, and readout crosstalk errors. Cloud-enabled devices typically list most (if not all) of these metrics; see the example in Fig. 4 for the backend properties needed by a user of the quantum information scientist persona. By tracking the metrics, we can gauge the importance of any particular metric (or a combination thereof) in order to improve the overall quality of experiments.
Of course, not yet discussed is the hardware engineer who resides at the bottom of the stack. The hardware engineer designs, tests, and interfaces with the quantum device. The interface is both physical (installing the device) and virtual (coding and testing control electronics along with associated device drivers). While the hardware engineer is, of course, a critical user class, we envision this role as a more separate functional role to provide the lowest level of required infrastructure for all the users higher in the stack.

III. QISKIT AND COMPILATION
An unprecedented acceleration of research and development in quantum computing has occurred in recent years, chiefly enabled by wide public access to cloud quantum computers. The software stack plays a key role in taking advantage of these systems and enabling quantum information science as a whole [27], [51]-[54]. In this section, we discuss Qiskit, a software suite for near-term quantum computing.
We will pay special attention to the compiler as an indispensable part of any quantum computing system. Our description is focused on compilation strategies tailored to near-term noisy systems. Compiling for fault-tolerant machines is a vast area of research in itself, and we refer the interested reader to [55]-[59] for further reading.

A. Qiskit Architecture
Fig. 5 shows the overall architecture of Qiskit. As an SDK and research tool, Qiskit comprises elements that we deem important in the journey toward quantum advantage. For the sake of completeness, we review them again. The first element, Terra [28], provides the foundations and language to describe quantum computations at different abstraction levels (circuits or pulse schedules) and to compile and optimize them for specific machines. The second element, Aer [29], provides scalable and realistic simulations of quantum systems and is invaluable to understanding the complexity of different computations, as well as how they behave under certain noise assumptions. The third element, Ignis [30], provides tools to characterize quantum devices and mitigate the effects of noise on them. The fourth element, Aqua [31], is a library of quantum algorithms and translators from near-term application domains (such as chemistry and AI) to quantum circuits.
Since quantum software is so new, much is unknown about how the software stack should be configured for each particular setting. In addition, active research is in progress in all aforementioned areas. For this reason, Qiskit has a highly modular architecture, easily extensible at all levels. This includes adding new circuit optimization passes, new noise models for simulation, new algorithms, and new noise characterization and mitigation methods.
The Qiskit compiler is primarily composed of two parts: the transpiler and the scheduler. The transpiler is a circuit-rewriting toolchain, designed to optimize circuits, both in the abstract and for particular backends. The scheduler converts circuits written for a given device into the sequence of pulses executed on that device.
In the transpiler, multiple circuit analysis and transformation "passes" can be strung together to yield a custom circuit optimization pipeline. A typical sequence might include unrolling the circuit gates to a particular native gate set, allocating ancilla qubits, swapping qubits so that entangling interactions match the device topology, merging consecutive gates into simpler ones, analyzing commutation relations and canceling nonadjacent gates, analyzing the circuit depth, and repeating a couple of optimization passes until the circuit depth reaches a fixed point. A "pass manager" sequences the user's desired passes, keeps internal context about the progress, and ensures that the control flow of passes is implemented correctly. Similarly, the scheduler contains passes that convert a circuit to a sequence of pulses with specific timing (for example, using an as-soon-as-possible or as-late-as-possible scheduling method) and can optimize them further using methods such as dynamical decoupling [60], [61] or optimal control [62].
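The idea of chaining passes and repeating them until nothing changes can be sketched without reference to any real compiler API. Everything below is hypothetical and deliberately simplified: a circuit is a list of (name, qubits) tuples, the two passes are toy stand-ins for real optimizations, and the loop stops when the whole circuit stops changing rather than when the depth does.

```python
SELF_INVERSE = {"h", "x", "cx"}  # gates that cancel when applied twice in a row

def cancel_adjacent_pairs(circuit):
    """Remove back-to-back identical self-inverse gates on the same qubits."""
    out = []
    for gate in circuit:
        if out and out[-1] == gate and gate[0] in SELF_INVERSE:
            out.pop()  # the pair multiplies to the identity
        else:
            out.append(gate)
    return out

def remove_identities(circuit):
    """Drop explicit identity gates, which do nothing."""
    return [gate for gate in circuit if gate[0] != "id"]

def run_until_fixed_point(circuit, passes, max_rounds=100):
    """Apply the passes in order, repeating until a full round changes nothing."""
    for _ in range(max_rounds):
        before = circuit
        for transform in passes:
            circuit = transform(circuit)
        if circuit == before:
            break
    return circuit

circuit = [("h", (0,)), ("id", (0,)), ("h", (0,)), ("cx", (0, 1))]
optimized = run_until_fixed_point(circuit, [cancel_adjacent_pairs, remove_identities])
```

The example shows why repetition matters: the two Hadamard gates cannot be canceled in the first round because an identity gate sits between them. Only after the second pass removes it does the cancellation pass, run again, reduce the circuit to the single entangling gate.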

B. Compiling for Near-Term Machines
Near-term quantum hardware is severely limited in what it can compute. Errors can build up rapidly during the execution of a program and can render a computation useless. In contrast to classical compilers, for which the goal is to transform a program to run faster, the primary goal of a compiler for near-term quantum computers is to combat these errors. Therefore, a good quantum compiler must ensure that an input program is translated into the most efficient equivalent of itself, squeezing the most out of the available hardware.
Fig. 5. Architecture of Qiskit. Aqua and Ignis produce circuits for different tasks (algorithms and applications, or device quantum characterization, verification, and validation (QCVV), respectively). The IBM Q systems and Aer simulators are backends that execute quantum circuits or pulse schedules. The Terra compiler is the bridge that translates and optimizes for a given backend and comprises modular pass-based circuit optimizers (Transpiler) and pulse optimizers (Scheduler). Some example passes are shown. Efficient high-level synthesis methods, access to a library of precomputed gate and pulse equivalents, and information about device constraints and properties all increase compilation quality.

Some steps in the compilation process are necessary to run the program in the first place. For example, high-level program routines, such as an abstract unitary evolution, must first be synthesized into a quantum circuit [63]-[66]. A circuit must be transformed to conform to the hard constraints of a device, such as which qubits can interact with one another, or which gates are natively supported [67]-[71]. Finally, circuits must be translated into pulses that control the qubits [46], [50].
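To make the connectivity constraint concrete, the sketch below estimates how many SWAP gates are needed before a two-qubit gate can act on two physical qubits that are not directly coupled. It is a simplification under stated assumptions: the coupling map is an undirected edge list, a single gate is considered in isolation, and qubits are moved along one shortest path. It does not solve the general routing problem and is not the method used by any particular compiler.

```python
from collections import deque

def distance(coupling, a, b):
    """Breadth-first-search distance between physical qubits a and b."""
    neighbors = {}
    for u, v in coupling:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return dist[u]
        for v in neighbors.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("qubits are not connected")

def swaps_needed(coupling, a, b):
    """SWAPs required to make a and b adjacent along a shortest path."""
    return max(distance(coupling, a, b) - 1, 0)

line = [(0, 1), (1, 2), (2, 3), (3, 4)]
ring = line + [(4, 0)]
```

On the five-qubit line, a gate between the two end qubits needs three SWAPs first, whereas closing the line into a ring makes them neighbors and removes the overhead entirely.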
Beyond this, an optimizing compiler should focus on the soft constraints given by the physics of the device and optimize within that space. For near-term quantum computers, seemingly small optimizations, such as reducing the two-qubit entangling gate (e.g., CNOT) count by 15%, can yield dramatic improvements in the final fidelity of computation. The compilation problem, in general, is NP-hard [67]. Finding optimal layouts of program qubits on the device, or finding optimal swapping routes between the hardware qubits, can be done by solving subgraph isomorphism and token-swapping problems, respectively. We may be able to find optimal solutions for small systems, but soon we need to devise effective heuristics.
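As a rough illustration of why a 15% reduction in entangling gates matters, the sketch below assumes a deliberately crude model that is not taken from the article: every CNOT fails independently with probability eps, and all other error sources are ignored. The gate counts and the value of eps are illustrative.

```python
def estimated_fidelity(num_cx, eps):
    """Crude estimate: each CNOT succeeds independently with probability 1 - eps."""
    return (1.0 - eps) ** num_cx

eps = 0.02                             # assumed two-qubit gate error (illustrative)
before = estimated_fidelity(100, eps)  # original circuit
after = estimated_fidelity(85, eps)    # same circuit with 15% fewer CNOTs
gain = after / before                  # equals (1 - eps) ** -15
```

Under these assumptions the relative gain depends only on the number of gates removed, and for the values chosen here the estimated fidelity improves by roughly a third.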
An optimizing compiler must generally be aware of the set of constraints and parameters within which it is trying to optimize. To first order, these can be generic truths, such as the fact that two-qubit gates have higher errors than single-qubit gates or that qubits lose their information if the program length (i.e., quantum circuit depth) is too long. Given these constraints, general optimization objectives are defined for a quantum compiler, such as minimizing the circuit depth or the number of entangling gates. This has been the traditional approach to circuit optimization for more than a decade [72]-[75].
While effective, circuit depth and gate count are only pseudo-objectives that simplify reasoning about the quality of a compiler's optimizations. In reality, what matters is the fidelity of computation when running on actual quantum hardware. Every quantum device is different, and thus, benchmarking and characterizing the system are critical to successful compilation. As an example, it is often taken for granted that lower circuit depth is better. This has resulted in trying to parallelize gates as much as possible [76]-[78], which may yield bad results on a high-crosstalk system. Conversely, randomized compiling [79] increases circuit depth by inserting extra gates, yet the effect of these gates is to randomize and mitigate coherent errors, achieving a better overall fidelity.
The key takeaway is that compilers for noisy quantum computers excel when more information is made available to them from the device. With cloud-access quantum computers, the field of quantum computer science is moving toward evaluating the effect of compilation strategies on real hardware, rather than based on objectives that may not be comprehensive. In the IBM Q ecosystem, device properties are shared openly and can be benchmarked, resulting in a flurry of recent compiler innovations [80]-[84]. Pertinent hardware characteristics include, but are not limited to, qubit topology, native gate sets, gate error rates, latencies of gates, readouts and feedforward, qubit lifetimes (decoherence and relaxation), and crosstalk errors.
A key observation in compiling for noisy quantum computers is that since errors always exist, it may not always be worth performing a numerically exact compilation. Instead, approximate compilation aims to approximate a unitary by some numerically close alternative in order to potentially save significant resources [32], [85]. If the reduction in error due to the shorter alternative outweighs the loss of precision in the approximation, then this tradeoff is worthwhile. Evaluating this tradeoff again depends on the exact characteristics of the device.
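The tradeoff can be phrased as a break-even condition. The sketch below is one possible way to reason about it, under an assumed model that is not from the article: the overall fidelity is the product of a per-CNOT success probability (1 - eps) for each entangling gate and a factor (1 - delta) for the approximation infidelity delta. The function names are hypothetical.

```python
def max_tolerable_approx_error(cx_exact, cx_approx, eps):
    """Largest approximation infidelity for which the shorter circuit still wins.

    Assumed model: fidelity = (1 - eps) ** num_cx * (1 - delta).
    The approximation wins when (1 - delta) > (1 - eps) ** (cx_exact - cx_approx).
    """
    return 1.0 - (1.0 - eps) ** (cx_exact - cx_approx)

def prefer_approximation(cx_exact, cx_approx, delta, eps):
    """True if the approximate circuit has the higher estimated fidelity."""
    return delta < max_tolerable_approx_error(cx_exact, cx_approx, eps)
```

For instance, if dropping one of three CNOTs costs an approximation infidelity of 0.001 while each CNOT has an error of 0.01, the approximate circuit is preferable in this model; with an infidelity of 0.05 it is not. On a device with better gates the break-even point shifts, which is why the decision depends on device characteristics.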
Finally, verification of the compiler becomes a serious challenge even in the near term, as verification of general circuit transformation on circuits of roughly 50 or more qubits is impractical. Consequently, expansive testing of smaller cases or formal verification methods [86], [87] will be essential.
Quantum compilers have benefited from decades of classical compiler design, yet the new domain creates new challenges and opportunities. For example, commutation relationships among quantum gates provide additional flexibility, compared with a classical program, for instructions to be reordered, merged, or canceled [76]-[78], [88]. In contrast to classical computing, in which a program can be compiled once and reused thereafter, quantum programs must often be recompiled, as device properties change over time. Finally, traditional abstraction boundaries for separating logical computation from physical implementation may need to be blurred on near-term devices, as they may sacrifice some efficiency in favor of abstraction [89]. Designing an industrial-scale compiler suitable for the coming generation of quantum computers remains an exciting and hard task.

IV. BENCHMARKING NEAR-TERM DEVICES
Benchmarking quantum systems will be a necessity to measure progress. Assuming several physical realizations of quantum computers will emerge over time, there must be a way to quantify their respective performance much like classical benchmarks. While benchmarking appears to be an obvious requirement, it is far less obvious how to devise a rigorous set of metrics applicable to quantum computers.
Important factors for formulating an appropriate quantum benchmark include the following.
1) Number of Qubits: More qubits are required to solve increasingly difficult problems, and thus, everything else being equal, the more qubits a quantum computer has, the more computational power it has. Systems with several tens of qubits can be simulated on a classical computer, and therefore, having a few qubits will not be beneficial in the long run.
2) Connectivity: How qubits are connected to one another matters. At one extreme, qubits connected on a line would require a significant overhead for any randomly selected gate between any random pair of qubits. At the other extreme, if all qubits are connected to each other, there is no additional overhead for a randomly selected gate between any random qubit pair. However, at the hardware level, connectivity matters a lot, and it greatly influences metrics such as crosstalk and fidelity. It is important to strike a balance between connectivity and overhead for a given application.
3) Error Rates: Quantum operations that feature lower error rates are generally better. A critical component associated with error rates is spectator errors (errors on qubits that are not participating in the applied quantum gate). The spectator errors can significantly degrade or even dominate overall circuit performance. For example, while a two-qubit gate might lead to very low errors on the two qubits involved, it is possible that another qubit might undergo significant errors due to the application of the two-qubit gate.
4) Gate Set: The choice and performance of the underlying gate set are important. A large set of gates reduces the overhead to synthesize arbitrary gates or move quantum information but also requires far more complexity from a calibration and stability standpoint.
5) Compilers and Software Stack Performance: Compilers are critical for optimal translation of circuits to the underlying hardware. A compiler needs to consider and optimize over the device connectivity and potentially even variations of gate fidelities and spectator errors across the device.
IBM has devised a benchmark called quantum volume (QV) that balances all of the abovementioned ingredients [32]. We believe that this system-agnostic metric provides a way to compare devices across different physical implementations and incorporates qualities, such as low error rates, that are ultimately necessary for a practical quantum computer.
The QV measures the largest model circuits the quantum computer can successfully run. A model circuit consists of random two-qubit gates acting on random pairs of qubits and has as many parallelized layers of these gates as it has qubits. The model circuits are compiled to the particular quantum system. A given run is considered successful if the observed measurement outcome is in the upper half of the ideal output probability distribution. To claim that the QV exceeds some value for some system, we are required to succeed for more than two-thirds of the runs on a given number of qubits.
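The success criterion can be sketched as follows. This is a simplified reading of the description above, with assumptions labeled: "upper half" is taken to mean ideal probability above the median over all bitstrings, the ideal distribution is assumed to be supplied by a classical simulation and to list every bitstring (including those with zero probability), and any statistical confidence requirement on the two-thirds test is ignored. The function names and the numbers in the example are hypothetical.

```python
from statistics import median

def heavy_outputs(ideal_probs):
    """Bitstrings whose ideal probability lies above the median."""
    cutoff = median(ideal_probs.values())
    return {bits for bits, p in ideal_probs.items() if p > cutoff}

def heavy_output_fraction(ideal_probs, counts):
    """Fraction of measured shots that landed on a heavy output."""
    heavy = heavy_outputs(ideal_probs)
    shots = sum(counts.values())
    return sum(n for bits, n in counts.items() if bits in heavy) / shots

def exceeds_threshold(ideal_probs, counts, threshold=2 / 3):
    """True if the heavy-output fraction is above the two-thirds mark."""
    return heavy_output_fraction(ideal_probs, counts) > threshold

# Synthetic two-qubit example (not experimental data).
ideal = {"00": 0.40, "01": 0.10, "10": 0.35, "11": 0.15}
good_device = {"00": 380, "01": 120, "10": 330, "11": 170}
noisy_device = {"00": 250, "01": 250, "10": 250, "11": 250}
```

In the synthetic example, the heavy outputs are 00 and 10. The first set of counts places 71% of shots on them and passes, while the fully depolarized second set places only half of its shots there and fails, which is the value expected from a device that outputs uniformly random bitstrings.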
We wish to particularly highlight the role of circuit compilation in the QV because a full quantum system is the combination of the qubits, gates, control electronics, and the software stack that optimizes for those components. The QV provides a way to benchmark the whole quantum computing system, including the optimizing components of the software stack.
Choices for universal benchmarks will evolve as the community continues to learn more about near-term devices [90]-[92], but the QV is an accessible and measurable quantity that tracks progress on current devices. We have released an open-source library for measuring QV in Qiskit. The largest QV measured thus far is 16, measured first on the 20-qubit Johannesburg system [32], and more recently on Boeblingen (see Fig. 4).

V. ERROR MITIGATION AND CORRECTION
For decades, researchers have understood that decoherence would limit the duration of useful quantum computation [93] and have devised many techniques for overcoming the noise. Today, we are keenly aware of the impact of decoherence and control error on the size and accuracy of quantum computations. The way forward is necessarily a mixture of approaches. Foremost, we must understand and reduce the fundamental error sources in our hardware, control systems, and environment. Beyond this, we must correct the remaining errors and/or mitigate their effect on the quantum computer's accuracy in a resource-efficient way. These are among the central research challenges for the foreseeable future.
It is by now well known that the principles of QEC allow errors to be dramatically and efficiently suppressed in theory [1], [10]-[14] so that the computational time can extend well beyond the coherence time. This requires physical error rates to be low enough and noise to be sufficiently uncorrelated. When that happens, fault-tolerant gates can successfully limit the spread of errors, and QEC procedures can remove entropy faster than it accumulates. If error rates are only modestly below the threshold error rate, the additional space and time to implement fault-tolerant gates can be prohibitively large. Furthermore, topological codes [94], which are among the most well suited to planar quantum computing architectures, are expected to correct errors very well but protect qubits by encoding each of them into a large number of physical qubits.
In the near term, we are unlikely to have both sufficiently low error rates and sufficiently many qubits to implement a fault-tolerant quantum computer. Nevertheless, these near-term systems present an early opportunity to research error mitigation and error correction in real noise environments. On one hand, QEC experiments spur development, confirm predictions, and expose facts about detecting realistic errors. On the other hand, error mitigation experiments have low overhead and significantly improve computational results today, so they are eminently practical. Error mitigation can improve estimates of expectation values, which can be important in explorations of quantum advantage, for example, as eigenvalues of the molecular Hamiltonians or kernels in classification problems addressed by quantum machine learning algorithms. However, unlike QEC that removes entropy, error mitigation cannot extend the computation far beyond the coherence time.
We now focus our attention on error mitigation schemes, which are more recent and less well known than error correction schemes. To date, two general-purpose error mitigation schemes have been developed. The first, zero-noise extrapolation, was developed independently in the works of [33] and [34], and the second, probabilistic error cancellation, was introduced in [33]. In zero-noise extrapolation, the output from a circuit of interest is remeasured under different amplified noise strengths. The measured expectation values from these noisy runs can then be recombined to extrapolate to an estimate of the expectation value at the zero-noise limit that is more accurate than the best individual run. Using measurements at an increasing number of noise strengths, increasingly higher order noise contributions to the zero-noise estimate are suppressed by extrapolation [95]. Temme et al. [33] showed that such noise amplification could be achieved by stretching the time evolution of the quantum state, under the influence of the time-dependent drives that constitute the quantum circuit. Under the assumption of time-invariant noise, the stretch factor for the time evolution is equivalent to the noise amplification factor. Beyond this assumption, no further characterizations of the noise models are required, making this extremely attractive for experimental implementations. This method was demonstrated and integrated into a variational algorithm in the experiment of [38], using superconducting qubits and all-microwave gates. It was also employed to improve the performance of a binary classifier realized on the same device [37].
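The extrapolation step can be sketched in a few lines. The example below assumes that expectation values have already been measured at several noise amplification (stretch) factors and combines them with Richardson-style weights, that is, a polynomial fit evaluated at zero noise. The function names and the toy noise model are hypothetical and serve only to illustrate the idea.

```python
def richardson_zero_noise(scale_factors, expectation_values):
    """Extrapolate measured expectation values to the zero-noise limit.

    Uses Lagrange interpolation weights evaluated at scale factor 0, which
    cancels noise contributions up to order len(scale_factors) - 1.
    """
    estimate = 0.0
    for i, (c_i, e_i) in enumerate(zip(scale_factors, expectation_values)):
        weight = 1.0
        for j, c_j in enumerate(scale_factors):
            if j != i:
                weight *= c_j / (c_j - c_i)
        estimate += weight * e_i
    return estimate


def toy_noisy_expectation(c, ideal=1.0, a=-0.2, b=0.03):
    """Hypothetical noise model: E(c) = ideal + a*c + b*c**2."""
    return ideal + a * c + b * c * c


scales = [1.0, 2.0, 3.0]  # stretch factors; 1.0 is the unamplified run
values = [toy_noisy_expectation(c) for c in scales]
mitigated = richardson_zero_noise(scales, values)
print("best single run:", values[0], "mitigated:", mitigated)
```

In an experiment, the measured values carry statistical uncertainty, and the extrapolation weights amplify it, so the number of stretch factors is a tradeoff between bias and variance.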
A second, general-purpose error mitigation scheme, also proposed in [33], is termed probabilistic error cancellation or quasi-probability decomposition. In this method, every well-characterized noise channel in a quantum circuit is acted upon by its inverse. While implementing the inverse noise channel is in itself an unphysical task, it was shown that an "average" error-mitigated estimate of the outcome can instead be obtained by sampling from an ensemble of noisy circuits with probabilities related to the coefficients of the inverse noise map. The variance of the error-mitigated estimate is related to the number of noisy circuits sampled and measured. In contrast to the zero-noise extrapolation technique, a key experimental challenge here lies in the characterization of noisy gates employed in the quantum circuit. For up to two-qubit experiments, this method was recently realized for superconducting qubit [96] and trapped-ion architectures [97], both employing gate set tomography for noise characterization.
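As a minimal sketch of the sampling procedure, consider a single qubit prepared in |0> and subject to a depolarizing channel of known strength p, with the goal of estimating the expectation value of Z. The inverse channel is written as a quasi-probability mixture of Pauli operations, one of which is sampled in each shot, and the measured outcome is reweighted by the sign and total weight of the mixture. The noise model, the function names, and the classical simulation of the measurement are assumptions made for illustration.

```python
import random


def inverse_depolarizing_quasiprobs(p):
    """Quasi-probabilities of the Pauli mixture that inverts depolarizing noise.

    Noise model (assumed): rho -> (1 - p) rho + (p / 3) (X rho X + Y rho Y + Z rho Z).
    """
    lam = 1.0 - 4.0 * p / 3.0  # factor by which the noise shrinks <X>, <Y>, <Z>
    q_pauli = (1.0 - 1.0 / lam) / 4.0  # negative for p > 0
    return {"I": 1.0 - 3.0 * q_pauli, "X": q_pauli, "Y": q_pauli, "Z": q_pauli}


def pec_estimate_z(p, shots, rng):
    """Error-mitigated estimate of <Z> for |0> followed by depolarizing noise."""
    quasi = inverse_depolarizing_quasiprobs(p)
    gamma = sum(abs(q) for q in quasi.values())  # sampling overhead factor
    labels = list(quasi)
    weights = [abs(quasi[label]) / gamma for label in labels]
    noisy_z = 1.0 - 4.0 * p / 3.0  # <Z> of the noisy state
    total = 0.0
    for _ in range(shots):
        label = rng.choices(labels, weights=weights)[0]
        z = -noisy_z if label in ("X", "Y") else noisy_z  # X and Y flip <Z>
        outcome = 1 if rng.random() < (1.0 + z) / 2.0 else -1
        sign = 1.0 if quasi[label] >= 0 else -1.0
        total += gamma * sign * outcome
    return total / shots


rng = random.Random(7)
estimate = pec_estimate_z(p=0.05, shots=20000, rng=rng)
print("unmitigated <Z>:", 1.0 - 4.0 * 0.05 / 3.0, "mitigated estimate:", estimate)
```

Each reweighted sample has magnitude gamma, so the variance of the estimate grows with gamma squared, which is one way to see why the number of sampled circuits controls the accuracy.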
In addition to these techniques, other methods have been proposed that are more problem-specific. The quantum subspace expansion method [98], [99] involves the measurement of additional excitation operators for variational ground states and, in addition to providing excited state energies, also mitigates errors in energy estimates. Other recent approaches to error mitigation for fermionic problems rely on the conservation of "known quantities," such as particle number [100]- [102]. Such symmetries can be enforced by using ancillary qubits to perform stabilizer checks.
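A simpler post-processing variant of the same idea, shown here in place of the ancilla-based stabilizer check, discards measured bitstrings that violate a known conserved quantity. The sketch assumes an encoding in which the particle number equals the number of 1s in the bitstring (as in a Jordan-Wigner-type mapping), and the measurement counts are made up for illustration.

```python
def postselect_particle_number(counts, n_particles):
    """Keep only bitstrings whose number of 1s equals the known particle number."""
    return {bits: n for bits, n in counts.items() if bits.count("1") == n_particles}


def discarded_fraction(counts, kept):
    """Fraction of shots rejected by the symmetry check."""
    return 1.0 - sum(kept.values()) / sum(counts.values())


# Hypothetical measurement counts for a four-qubit, two-particle problem.
raw_counts = {"0101": 430, "1010": 410, "0111": 60, "0001": 70, "0110": 30}
kept_counts = postselect_particle_number(raw_counts, n_particles=2)
print(kept_counts, discarded_fraction(raw_counts, kept_counts))
```

Post-selection of this kind removes only errors that change the conserved quantity, and it reduces the number of usable shots.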
Error mitigation is still in its infancy but has shown some promising first steps. As we look forward, we hope that access to near-term systems will enable new error mitigation and correction techniques at the intersection of theory and practice. Ultimately, we believe that QEC and fault-tolerant design will still be necessary. Therefore, continued experiments such as demonstrations of the Bell state parity measurements [103], stabilizer measurements [45], error detecting codes [104], [105], and other codes [106] are critical for understanding how to protect encoded quantum information in the long term [107]. We anticipate the theory and practice of error correction and mitigation to continue to develop together in the future and that new ideas will emerge, particularly with respect to many well-known types of errors, such as correlated errors, leakage, and fluctuations of these errors.

VI. QUANTUM APPLICATIONS ON NEAR-TERM QUANTUM SYSTEMS
The relevance of a quantum computer is derived from the algorithms that can be performed on it. For some problems, such as factoring integers [2] or simulating quantum mechanics [108], quantum algorithms are theoretically guaranteed to drastically outperform any known classical algorithm. It is important to state that not every problem that is challenging for a classical computer will benefit from a quantum speedup. This means that the applications for a quantum computer need to be identified individually, and a specific quantum algorithm must be developed for them. Up to this point, the set of algorithms that can be shown to outperform classical computers [109] all depend on an architecture that is fully fault tolerant. The quantum hardware that is currently available is not yet at a stage to run fault-tolerant computations. Nevertheless, making current hardware available to the research community allows for the investigation of quantum algorithms that have the potential to run on near-term quantum devices. Due to device imperfections and decoherence, we expect that such algorithms will be composed of shallow-depth quantum circuits. To tackle a complex computational task, some of the computation that does not benefit from a quantum speedup can be outsourced to a classical computer. Examples of such quantum-classical hybrid schemes are variational algorithms for quantum many-body systems [38], [110], [111] and machine learning [37]. Such shallow-depth variational hybrid algorithms can be understood from the following picture: the classical computer tries to find the best quantum circuit, limited in size to a depth determined by the noise, to perform a particular computational task. The task could be the preparation of an approximation to the ground state of a Hamiltonian or the construction of a classifier in machine learning. This simple scheme has opened up a pathway to trying, on current quantum hardware, heuristic algorithms that do not come with any performance guarantees.
The development of classical algorithms has greatly profited from the wide availability of computational hardware. Many heuristic algorithms were found by trial and error and come without performance guarantees. It is, therefore, reasonable to follow an experimental route in the search for applications that could benefit from a quantum computer. However, there is an important difference between the development of classical algorithms and quantum algorithms. Not all quantum circuits can lead to quantum advantage. If the quantum algorithm can be efficiently simulated on classical hardware [112]- [118], it cannot provide a computational advantage. The advantage of quantum computers is based on the complexity of the algorithm and not on the quantum computer's ability to perform fast operations. It is, therefore, paramount to ensure that the quantum algorithm is based on a circuit that cannot be efficiently simulated on a classical computer.
Ensuring the classical hardness of simulation is of particular importance when performing algorithms on a small number of qubits that are subject to noise since a quantum advantage may not be immediately apparent. The first fundamental question that arises is whether, and under which circumstances, a shallow-depth quantum circuit can provide a computational advantage. This question was recently addressed and answered [35] by demonstrating an unconditional separation in computational power between shallow quantum and classical circuits. An unconditional separation means that the complexity of the classical problem is understood and the separation to quantum can be proven. Further results [119]- [121] that are based on computational complexity assumptions show the existence of elementary quantum circuits that are likely difficult to simulate on a classical computer. While these results are encouraging, we need to continue researching quantum circuit complexity to point toward meaningful quantum applications that offer a speedup over classical approaches.
A more systematic path toward quantum applications for near-term quantum devices that exhibit a reliable advantage is based on the complexity-theoretic hardness of quantum circuits. In this approach, it is the quantum circuit that determines the application, placing the formal complexity result at the beginning of the development.

A. Quantum Machine Learning
One example of a quantum-classical hybrid algorithm that relies on quantum circuits believed to scale inefficiently for classical methods has been presented in [37]. In this work, the authors describe and implement two methods of binary classification using supervised training. These classification algorithms are related to standard SVMs. The idea in this work is to implement a nonlinear feature map that brings the data to be classified into a space in which it can be linearly separated. The key aspect exploited by a quantum processor is that the feature map is implemented as a quantum circuit, mapping the initial data to the high-dimensional quantum state space, so that the data can be separated by a linear binary classifier. The use of a quantum feature map has also been proposed in [122]. For this algorithm to provide a quantum advantage, the quantum circuit must have transition amplitudes that cannot be estimated classically to an additive sampling error. The feature map circuit used in [37] can be related to a hardness result derived in [123], which guarantees an exponential separation in query complexity to the best classical algorithm.
Havlíček et al. [37] explore two methods to construct a binary classifier based on the hard feature map circuit. In the first method, the feature map circuit is directly followed by a variational circuit. The circuit can be used as a classifier that implements a binary measurement on the quantum feature space. This variational algorithm is, therefore, directly related to a classical SVM. The second method directly exploits the connection to classical SVMs by estimating the kernel matrix directly on the quantum computer and then using a conventional SVM. The hardware implementation of these two methods showed that even on a modest quantum processor, some sort of error mitigation [38] was needed. We have discussed a few error mitigation proposals in Section V.
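The second method can be illustrated with a kernel of the form K(x, y) = |<Phi(x)|Phi(y)>|^2, where |Phi(x)> is the state prepared by the feature map circuit. The sketch below computes this kernel matrix for a toy feature map that encodes each feature into one qubit. This product-state map is an assumption chosen so that the example runs classically; it is easy to simulate and is therefore not the hard feature map of [37]. On hardware, each entry would instead be estimated from repeated measurements.

```python
import cmath
import math


def encode_feature(x):
    """Toy single-qubit encoding (assumed): cos(x/2)|0> + e^{ix} sin(x/2)|1>."""
    return (complex(math.cos(x / 2)), cmath.exp(1j * x) * math.sin(x / 2))


def state_overlap(x, y):
    """<Phi(x)|Phi(y)> for a product of single-qubit encodings."""
    amplitude = 1 + 0j
    for x_k, y_k in zip(x, y):
        a, b = encode_feature(x_k), encode_feature(y_k)
        amplitude *= a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return amplitude


def kernel_matrix(data):
    """Kernel entries K[i][j] = |<Phi(x_i)|Phi(x_j)>|^2."""
    return [[abs(state_overlap(x, y)) ** 2 for y in data] for x in data]


training_data = [(0.1, 0.7), (1.2, 0.3), (2.5, 1.9)]  # made-up two-feature points
K = kernel_matrix(training_data)
for row in K:
    print(["%.3f" % entry for entry in row])
```

The resulting matrix is symmetric with unit diagonal and can be passed to any conventional SVM solver that accepts a precomputed kernel.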
A key observation of this proposal is the existence of quantum circuits that give rise to feature maps that are hard to evaluate classically, relative to complexity-theoretic assumptions. However, a hard feature map circuit is only a necessary condition for obtaining a quantum advantage on a practically relevant machine learning problem. To make it sufficient, more circuits need to be explored that can be tied to complex real-world classification problems.

B. Quantum Chemistry
A second example uses quantum-classical hybrid algorithms with short-depth quantum circuits for quantum chemistry. The VQE [110] has been implemented on a number of different quantum hardware platforms [38], [110], [111], [124]- [127]. Here, the central objective is to obtain a good estimate of the ground state energy of a chemistry or, more generally, a many-body Hamiltonian. Although typically fermionic Hamiltonians are considered, these Hamiltonians can be readily mapped to qubit/spin degrees of freedom by a common procedure [128]. To obtain ground state energy estimates for a target Hamiltonian, variational trial states are prepared on a quantum computer by using shallow circuits with free parameters that are experimentally adjustable. The quantum computer is used to estimate the mean energy of the target Hamiltonian by measuring the operators of the Hamiltonian directly on the quantum computer with respect to the trial state. The energy associated with each trial state is then fed to a classical computer, which runs an optimization routine that supplies a new set of parameters. The goal of the optimization procedure is to prepare a new trial state that will tend to lower the energy. This process is then iterated until some convergence condition is met.
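The loop described above can be sketched for a one-qubit toy problem. The example assumes a Hamiltonian H = 0.5 Z - 0.8 X with made-up coefficients, a trial state Ry(theta)|0>, and exact expectation values in place of hardware estimates. The classical optimizer is plain gradient descent with gradients from the parameter-shift rule. These are illustrative choices, not the setup used in the cited experiments.

```python
import math

HAMILTONIAN = {"Z": 0.5, "X": -0.8}  # hypothetical Pauli coefficients


def measure_term(pauli, theta):
    """Expectation value of one Pauli term on the trial state Ry(theta)|0>.

    Computed exactly here; on hardware it would be estimated from shots.
    """
    if pauli == "Z":
        return math.cos(theta)
    if pauli == "X":
        return math.sin(theta)
    raise ValueError("unsupported Pauli term: " + pauli)


def energy(theta):
    """Mean energy: weighted sum of the separately measured Pauli terms."""
    return sum(coeff * measure_term(p, theta) for p, coeff in HAMILTONIAN.items())


def run_vqe(theta=0.1, learning_rate=0.4, tol=1e-10, max_iter=1000):
    """Iterate energy evaluation and parameter updates until convergence."""
    current = energy(theta)
    for _ in range(max_iter):
        # Parameter-shift rule for a single Ry rotation angle.
        gradient = 0.5 * (energy(theta + math.pi / 2) - energy(theta - math.pi / 2))
        theta -= learning_rate * gradient
        updated = energy(theta)
        if abs(updated - current) < tol:
            return theta, updated
        current = updated
    return theta, current


theta_opt, e_min = run_vqe()
exact = -math.sqrt(sum(c * c for c in HAMILTONIAN.values()))
print("VQE energy:", e_min, "exact ground state energy:", exact)
```

For this Hamiltonian the trial state can represent the exact ground state, so the optimized energy matches the exact value. For larger problems, the quality of the result depends on the expressiveness of the trial circuit and on the noise in the measured energies.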
In this approach to near-term quantum algorithms, the quantum computer's utility lies in the preparation of trial states and the measurement of associated expectation values. Depending on the trial state, these tasks may be hard on a classical computer, e.g., [130] and [131]. Preparing and measuring trial states on a quantum computer can provide additive error approximations to expectation values. However, as highlighted previously, the algorithm can only yield a quantum advantage if the circuits employed for trial state preparation are difficult to simulate classically. Most VQE implementations to date have focused on small (<10 qubit) molecular Hamiltonians in quantum chemistry. These implementations have employed circuits that implement "hardware-efficient" trial states [38], [111] or a UCC ansatz [131]. While the UCC approach offers a structured ansatz that maintains physical symmetries, the hardware-efficient circuits employ interactions that are native to the quantum hardware. Other important considerations for the practical implementation of VQE for quantum chemistry are qubit-efficient fermionic mapping schemes [128], the robustness of classical optimizers to hardware noise, and, most importantly, the effect of decoherence [33], [34], [38] and the measurement cost for the molecular Hamiltonians [132], [133].
One specific example is the combination of the previously mentioned error mitigation techniques and quantum chemistry. For VQE, the quantum subspace expansion technique also provides access to error-mitigated energy estimates as well as to the excited state energies [98], [99]. As discussed previously, more general-purpose schemes, such as the zero-noise extrapolation technique and the quasi-probability method, have been proposed to access noise-free estimates of expectation values. Even without significant improvements to coherence times and gate fidelities or any additional qubit overhead, the experimental implementation of the zero-noise extrapolation [38] showed an otherwise inaccessible level of accuracy in the ground state estimates of H2 and LiH, as shown in Fig. 6.

VII. HARDWARE
To this point, not many hardware details have been discussed. Hardware is of course required to have a system at all, but much of the system-level work can be done in a manner that is somewhat hardware-agnostic, increasingly so the further from the hardware the user operates in the stack. This emphasizes the need for some modularity of the associated system components, which is particularly important as the underlying hardware may change over time, potentially significantly.
A detailed discussion of specific challenges associated with our quantum hardware of choice (superconducting qubits) is beyond the scope of this article. However, we briefly describe the devices that are publicly accessible or were used in the described experiments (e.g., the quantum chemistry [38], [111] and machine-learning experiments [37]). These devices are single-junction transmon [134] qubits coupled to each other according to the specified connectivity. Single-qubit gates are implemented using microwave pulses and benchmarked using the standard randomized benchmarking protocols (see, for example, [135]). Two-qubit gates are implemented using the cross-resonance gate [136], [137]. Qubits are measured using dispersive readout techniques [138]. For a broad review of superconducting circuits, we refer the reader to [7], [8], [140], and [141], which provide extensive information about types of superconducting qubits, single- and two-qubit operations, and readout, as well as some examples of anticipated engineering challenges.

Fig. 6. Molecular simulations of LiH using four qubits from [38] and [112]. The experimental data (black circles) are compared with the exact energy curve (green dotted line). Left: results from [111]. Right: data from [38], which uses error mitigation. The error-mitigated estimates (black circles) on the right are obtained by extrapolating the results from experiments of varying noise (colored circles) and display far superior accuracies, without significant hardware improvements to the processors used for these computations. Figures modified from [38] and [112].
Moving forward, the engineering challenges for any quantum computer are enormous, not just for superconducting circuits. Depending on the technology, various hurdles must be overcome to implement increasing numbers of qubits. Within superconducting qubits, it is widely acknowledged that microwave engineering plays a crucial role (e.g., [141]). As chip sizes increase beyond the wavelength of interest (qubit frequencies are typically in the GHz regime), it is critical to account for parasitic microwave modes [142], microwave crosstalk, and microwave filtering [143] in order to ensure that the qubits are not unintentionally coupled to spurious modes or each other, to ensure the qubit coherence times are not impacted, and to allow for fast readout [138], all while still retaining precise microwave control (or flux biasing in the case of flux-biased operation). In addition, advanced packaging is generally considered a requirement to allow vertical signal delivery to the qubits. Therefore, there is much interest in fabrication integration, including superconducting bump bonds, through-silicon vias, and chip stacking. A recent comprehensive article describes 3-D integration for superconducting qubits, including many of the aforementioned technologies [144]. Moving beyond the qubit chip, the infrastructure to support large qubit systems needs to be developed [145]. This includes transitioning away from standard microwave coaxial lines in favor of flex coax lines [146] and low-cost solutions for microwave (or flux bias) control either at room temperature or at cryogenic temperatures [147], [148]. Finally, overall system reliability also plays a crucial role. We stress the importance of engineering systems capable of operating over prolonged periods of time. Rudimentary demonstrations are notoriously unstable and often barely capable of gathering sufficient data for a scientific publication; they are essentially physics experiments. Both external influences and internal device noise can cause parameters to quickly drift on experimental timescales, potentially rendering a device unsuitable for use in a quantum system due to the prohibitive amount of device calibrations required.
Needless to say, tackling all of the engineering challenges is a substantial undertaking and should not be underestimated. We anticipate that substantial research needs to be directed toward solving these issues while, at the same time, fostering essential basic research contributions from the broader community. We believe fundamental contributions will be critical for the success of long-term quantum computing.

VIII. CONCLUSION
This article described the various challenges and opportunities associated with near-term quantum systems, highlighting the necessary components to bring practical quantum computers closer to reality. A unique interplay between hardware, hardware access, software, benchmarking, applications, and error mitigation techniques is required to develop a quantum system capable of one day executing practical calculations or simulations.
We have also laid out our software approach, which is heavily user-oriented. We feel strongly that a healthy user base will be a guiding force, helping to shape the technical direction for future quantum devices. We have presented the available toolset (i.e., various access levels, SDKs, Qiskit, and so on), and we observe appreciable demand for such tools. Integrating those tools with stable hardware is a significant effort, but we feel it is worth the challenge, as we are exploring and developing systems that have not yet been built! It is duly acknowledged, however, that the full potential of near-term quantum systems is presently unknown. While no fundamental roadblocks have yet materialized, it is possible that the application range lags behind some of the more enthusiastic expectations. We are optimistic that the near-term quantum systems will be capable of at least shedding new light on unexplored physics lying just out of reach for modern simulation tools or other derivative applications related to, for example, quantum sensing. There are many areas to study and explore, and by giving the right access and tools to researchers, we hope to accelerate the pace of discovery.