Analysis of a Programmable Quantum Annealer as a Random Number Generator

Quantum devices offer a highly useful capability: generating random numbers non-deterministically, since the measurement of a quantum state is not deterministic. This means that quantum devices can be constructed that prepare qubits in a uniform superposition and then measure the state of those qubits. If the preparation of the qubits in a uniform superposition is unbiased, then quantum computers can be used to create high-entropy, secure random numbers. Typically, preparing and measuring such quantum systems requires more time than classical pseudo-random number generators (PRNGs), which are inherently deterministic algorithms. Therefore, the typical use of quantum random number generators (QRNGs) is to provide high-entropy, secure seeds for PRNGs. Quantum annealing (QA) is a type of analog quantum computation that is a relaxed form of adiabatic quantum computation and uses quantum fluctuations to search for ground-state solutions of a programmable Ising model. Here we present extensive experimental random number results from a D-Wave 2000Q quantum annealer, totaling over 20 billion bits of QA measurements, significantly more than previous D-Wave QA random number generator studies. Current quantum annealers are susceptible to noise from environmental sources and calibration errors, and are not in general unbiased samplers. Therefore, it is of interest to quantify whether noisy quantum annealers can effectively function as unbiased QRNGs. The amount of data collected from the quantum annealer allows a comprehensive analysis of the random bits using the NIST SP 800-22 Rev 1a testsuite, as well as min-entropy estimates from NIST SP 800-90B. The randomness tests show that the generated bits from the D-Wave 2000Q are biased and are not unpredictable random bit sequences. With no server-side sampling post-processing, the 1 microsecond annealing time measurements had a min-entropy of 0.824.


Introduction
Random number generation (RNG) is a very important capability in information processing; in particular, unbiased random number generation is critical in many computing applications. Pseudo-Random Number Generators (PRNGs) are deterministic, very fast, software-level algorithms that can reliably generate random numbers. True Random Number Generators (TRNGs) are based on a physical property of a system that makes the random number generation inherently non-deterministic. Quantum systems have this property of non-determinism: it is not possible to know deterministically what the measured state of a quantum system will be before it has been measured.
Testing for randomness, in particular secure and unbiased randomness, is not directly possible. Instead, one tests for patterns and biases that are clearly not random [1][2][3][4][5]. If a proposed RNG passes enough of these tests, each of which can detect non-random data, then one can be reasonably confident in the ability of the RNG to generate uniformly random numbers.
One of the types of programmable quantum computers that have become available to test, typically as cloud computing resources, is the D-Wave quantum annealer. Quantum annealing is a specialized type of quantum computation that aims to sample the optimal solution(s) of a combinatorial optimization problem, ideally using adiabatic evolution [6][7][8][9][10]. Quantum annealing hardware is typically implemented using the transverse driving Hamiltonian, where the system is initialized in the ground state of the transverse-field Hamiltonian [7][8][9][10][11][12]. D-Wave quantum annealers are physically implemented using programmable superconducting flux qubits [13][14][15][16][17][18][19]. Quantum annealers, and more generally quantum computers, are potentially interesting as secure entropy sources for generating random numbers because of the inherent stochasticity of measuring quantum states: there is no deterministic mechanism to compute what the measured state of an arbitrary quantum state will be. For this reason, quantum computers, and more generally physical sources of measurements of quantum information, are True Random Number Generators (TRNGs), or QRNGs [4,20,21]. Importantly, there exist current technologies which are secure, high bit-rate QRNGs [22,23].
The primary reason that modern quantum annealers are not perfect random number generators is that there are a large number of sources of error and bias in the computation; for example, the spin bath polarization effect [24,25] can cause sequential anneal-readout cycles to have self-correlations in time, and programmed coefficients (even if they are 0) have slightly different effective weights on the hardware [26]. Furthermore, it has been shown that modern D-Wave quantum annealers have a measurable performance change over time [27,28]. There have also been cross-qubit correlations observed on a D-Wave 2000Q chip [29,30]. There have been studies which aim to reduce biases and noise present in minor-embedded QA computations; in the case of reducing biases in the constraint of the graph partitioning problem, this effectively amounts to attempting to create an unbiased quantum annealing random number sampler, see ref. [31]. Interestingly, quantum annealing (even in an ideal computation with no noise) does not in general sample degenerate ground states (i.e., optimal solutions of the combinatorial optimization problem) uniformly, due to the transverse-field driving Hamiltonian [32][33][34][35][36]; this means that QA sampling of the ground states of non-trivial Hamiltonians would not be a good source of unbiased random numbers. Instead, the much simpler case of an all-zero-coefficient Ising model is the most direct way to program these devices to produce random numbers (see more details in Section 2.1). D-Wave quantum annealers have been evaluated, on somewhat small problem sizes, for the possibility of utilizing them as TRNGs in previous studies [37,38].
Quantum random number generators in general are a topic of much interest; for example, there have been several studies which examined using gate-model quantum computers as random number generators [39][40][41][42][43][44], boson sampling [45], using quantum walks to generate random numbers [46,47], and device-independent secure random number generation [48]. The idea of using random dense quantum volume circuits (see refs. [49][50][51] for details on quantum volume circuits) as random number generators in the gate-model setting has also been proposed [52].
This paper presents the most comprehensive evaluation to date of using a quantum annealer as a random number generator, totalling over 20 billion bits of qubit measurements, and testing 8 different QA device settings for how they impact the measured bits. In particular, this very large dataset allows all of the NIST SP 800-22 randomness tests, and all of the NIST SP 800-90B min-entropy (non-IID) tests, to be executed on the data (some of the tests have a minimum bit-length requirement). This has not previously been possible for quantum annealing random bits [37,38], or more generally for cloud-accessible quantum computers.
All data from this study are publicly available as a Zenodo dataset [53].

Methods
Section 2.1 details the Quantum Annealing implementation details, and Section 2.2 details the randomness testsuites that are used.

Quantum Annealing
The computation performed by D-Wave quantum annealers is described by eq. (1), and eq. (2) describes the discrete-optimization Ising model that a user can program to be sampled by the quantum annealer (the quadratic coefficients are subject to the constraint of the native connectivity of the quantum annealing hardware). The functions A(s) and B(s) define the transverse-field driving Hamiltonian strength and the programmed Ising model Hamiltonian strength, respectively, parameterized by the variable s. In standard quantum annealing, which is the setting used in this study, s defines a linear schedule as a function of anneal time, and the strengths of A(s) and B(s) at each s step are system-defined quantities. At the beginning of the anneal the A(s) term dominates; over the course of the anneal A(s) is reduced in strength and B(s) is increased in strength. Eq. (2) is a slight re-formulation of the Ising model defined in the second summation term of eq. (1). The goal of the quantum annealer is to find a minimizing variable assignment vector z given the objective function of eq. (2). The variable states can be either {0, 1}^n, in which case the combinatorial optimization problem is a Quadratic Unconstrained Binary Optimization (QUBO) problem, or the variables can be spins {+1, −1}^n, in which case the combinatorial optimization problem is an Ising model.
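For reference, the standard form of the transverse-field annealing Hamiltonian and the classical Ising objective described here can be written as follows (a sketch following the usual D-Wave convention, not necessarily the exact normalization of eq. (1)):

```latex
H(s) = -\frac{A(s)}{2} \sum_i \hat{\sigma}_x^{(i)}
     + \frac{B(s)}{2} \Big( \sum_i h_i \hat{\sigma}_z^{(i)}
     + \sum_{i<j} J_{ij} \hat{\sigma}_z^{(i)} \hat{\sigma}_z^{(j)} \Big),
\qquad
C(\mathbf{z}) = \sum_i h_i z_i + \sum_{i<j} J_{ij} z_i z_j .
```

Here the second expression is the classical Ising objective corresponding to eq. (2), with h_i the linear (qubit) coefficients and J_ij the quadratic (coupler) coefficients.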
The D-Wave quantum annealer used to generate random bits is DW 2000Q LANL; the Chimera hardware graph for this device is shown in Figure 1. The simplest way to generate random bits using a quantum annealer is to set the user-programmed coefficients for all linear terms (i.e., hardware qubits) and quadratic terms (i.e., hardware couplers) to 0, meaning that only the transverse-field Hamiltonian is (ideally) present in the computation; the qubits are then in a uniform superposition while the computation is coherent during the anneal. There are certainly more complicated approaches that could be used with the goal of extracting good random bits, such as random circuit sampling on gate-model devices [52] or tuning device biases to improve sampling of balanced partitions for the graph partitioning problem [31]; however, in general RNGs need to be as fast as possible, and therefore minimizing the complexity of the computation is a good goal to aim for. Explicitly, the Ising model (variable states ∈ {+1, −1}) that we sample is given in eq. (3).
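The all-zero programming described above can be sketched in a few lines of Python. This is an illustrative sketch, not the Ocean SDK API: the function names are hypothetical, and `mock_anneal_readout` is a stand-in for an actual single-read hardware job.

```python
import random

def all_zero_ising(num_qubits):
    """All linear (qubit) and quadratic (coupler) coefficients set to 0,
    so ideally only the transverse-field driver acts during the anneal."""
    h = {q: 0.0 for q in range(num_qubits)}
    J = {}  # no couplers programmed
    return h, J

def spins_to_bits(spins):
    """Map measured Ising spins {+1, -1} to bits {1, 0}."""
    return [(s + 1) // 2 for s in spins]

# Stand-in for one hardware anneal-readout cycle; on the real device this
# would be a single-read sample of the all-zero Ising model over the
# 2032 active qubits.
def mock_anneal_readout(num_qubits, rng=random):
    return [rng.choice((+1, -1)) for _ in range(num_qubits)]

h, J = all_zero_ising(2032)
bits = spins_to_bits(mock_anneal_readout(2032))
```

On hardware, the `h` and `J` dictionaries would be submitted to the annealer and each returned spin vector converted to bits with `spins_to_bits`.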
In many QRNG systems readout time is much longer than in comparable PRNGs, and for cloud-based quantum annealers this is also an important aspect to consider. In the experiments we perform, the total annealing time is varied from 1 microsecond up to 2000 microseconds (1 microsecond is the shortest annealing time available on the D-Wave 2000Q devices). A potentially relevant question is whether the 1 microsecond annealing times can give high-quality random bits, because this utilizes a relatively small amount of compute time (certainly compared to longer annealing times).
The main D-Wave parameter that will be varied is the annealing time, which will be set to 1, 10, 100, and 2000 microseconds. These annealing times span the range allowed on the DW 2000Q LANL chip: 1 microsecond is the shortest and 2000 microseconds is the longest available annealing time. The other parameter that will be tested is turning on server-side classical post-processing, which aims to improve sampling (although in this specific case, the sampling is being done on an all-zero-coefficient Ising model). In order to turn on this server-side post-processing, the user-facing postprocess option was set to sampling [54]. The sampling server-side post-processing option performs local bit changes on the measurements (before sending the results to the user) with the goal of obtaining a post-processed set of samples that corresponds to a Boltzmann distribution with inverse temperature β, where β is set to a value near the inverse temperature corresponding to the raw samples (see ref. [54] for more details). We test turning this server-side post-processing option on and off. The reasoning is that we would ideally want the quantum annealer to be able to produce unbiased random bits without this post-processing; however, it may be the case that the classical post-processing helps reduce bias in the samples with a small computational overhead, in which case it would be interesting to quantify this. Therefore, in total there will be 8 datasets, each using a different quantum annealer parameter choice. Each of the datasets will be strictly sequential in time, i.e., the order of the bits will not be changed by some other entropy source. This time-series representation of the data is especially important since it has been shown that there are long-term trends that can be observed in current D-Wave quantum annealing processors [27]. Additionally, the exact ordering of the bits within each anneal (i.e., whole-chip readout cycle) is strictly based on the logical qubit indexing within the hardware, which is fixed for all samples, but is arbitrarily set. Each anneal-readout cycle is concatenated with the next anneal-readout cycle that was executed in time; no other source of entropy is present in the data.
With the goal of mitigating self-sample correlations from the spin bath polarization effect [24], all of the data is constructed by sequentially calling the D-Wave backend for a single anneal-readout cycle (i.e., instead of measuring many anneals in a single job). Each job is sampled as an Ising model, meaning that spins are the measured states (although whether the model was specified as QUBO or Ising would in principle have no impact on the results). Additionally, although the bits are all time-ordered, because of network interruptions or device power losses there are gaps in the time of the sequential random bit sampling.
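The collection protocol (one single-read job per backend call, with cycles concatenated strictly in execution order) can be sketched as follows; `sample_once` is a hypothetical stand-in for the backend call, not a real SDK function.

```python
def collect_time_ordered_bits(sample_once, num_cycles):
    """Concatenate single anneal-readout cycles in execution order.

    sample_once() represents one single-read job submitted to the
    annealer; the qubit ordering within each cycle is fixed, and no
    other entropy source touches the bit ordering."""
    stream = []
    for _ in range(num_cycles):
        stream.extend(sample_once())
    return stream

# Deterministic stand-in readout of a 4-qubit cycle, repeated 3 times:
stream = collect_time_ordered_bits(lambda: [1, 0, 1, 0], 3)
```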
The parameters used for the 8 datasets are as follows; the 8 datasets cover each combination of the four annealing times (1, 10, 100, and 2000 microseconds) with server-side post-processing turned on or off. 1. Test 1 uses server-side classical post-processing, and an annealing time of 1 microsecond. For all 8 datasets, the programming thermalization and readout thermalization are set to 0 microseconds so as to remove any thermalization effects (beyond thermalization that occurs after the qubits lose coherence; the qubit coherence times are estimated to be on the order of 10's of nanoseconds [55]). All other parameters are set to default. The motivation for evaluating these different parameter choices is the following.

1. Although in general increasing annealing times on D-Wave quantum annealers results in better sampling success rates for combinatorial optimization problems, these long annealing times are much longer than the qubit coherence times of current D-Wave quantum annealers [55,56]. Therefore, the longer annealing times are using thermalization to marginally improve the sampling success probability [57]. However, in this case there is no combinatorial optimization problem being sampled; therefore, in principle it may be the case that longer anneal times accumulate more errors in the computation, in particular more biases in the random sampling computation we aim to perform. Therefore, it may be advantageous to sample using the shortest annealing times available on the hardware; whether this is true or not is the aim of varying the annealing times. Furthermore, the sampling rate for random numbers is extremely important: faster random bit sampling is more useful than slower sampling. Therefore, if the shorter annealing times produce high-entropy random bits, this would be better than using longer annealing times.
2. The server-side classical post-processing is not ideal, since we wish to evaluate whether the bare quantum annealing hardware can produce good random bits. However, this post-processing is intended to improve sampling of combinatorial optimization problems, and in this case there is nothing to optimize with respect to energy. Nevertheless, it is an interesting question whether there is a clear difference between the QA sampling with and without the server-side post-processing.

Testing for randomness
The randomness tests that will be applied to the data are all of the tests from NIST SP 800-22 Rev 1a [58], titled A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. This testsuite contains 15 randomness tests, two of which contain several sub-tests. In total, each of the 8 datasets will be tested against 38 randomness tests, each giving a p-value output. For consistency with the original NIST SP 800-22 test definitions [58], a computed p-value ≥ 0.01 accepts the sequence as being random; otherwise we consider it to be non-random. This p-value threshold criterion is applied to all of the randomness tests. In the tests where there are multiple computed p-values, such as cumulative sums, where there is a forward and a backward mode of operation, all p-values must be greater than or equal to 0.01 for the sequence to be considered random. The serial test also outputs two p-values.
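As a concrete illustration of the p-value criterion, the frequency (monobit) test can be written directly from its published definition. This is a minimal sketch of the textbook formula, not the nistrng implementation used in this study.

```python
import math

def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test.

    bits: sequence of 0/1 values.  Converts bits to +1/-1, sums them,
    and compares the normalized excess to the complementary error
    function, as in the published test definition."""
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# A sequence is accepted as random by this test when the p-value >= 0.01.
```

A perfectly balanced sequence yields a p-value of 1.0, while a heavily biased one drives the p-value toward 0 and fails the 0.01 threshold.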
The implementation used for this analysis is the Python 3 package nistrng; this package was chosen primarily for its compatibility with NumPy [59] arrays, which was necessary for the size of the datasets being tested. Other implementations that are, for example, based on casting the bits to integers do not scale well to these large dataset sizes.
The randomness test implementation details and references are not enumerated here; all details can be found in ref. [58], along with the linked open-source code implementations.
In the context of verifying entropy sources, a useful measure is the min-entropy, which is a conservative measure of an entropy source. The min-entropy metric gives a clear way of determining how unpredictable a set of random variable samples is, and is therefore another way of quantifying bitstring randomness. Min-entropy is maximized for a uniform distribution, as with standard Shannon entropy [60], and approaches 0 for heavily biased distributions. In this case, we apply the NIST SP 800-90B testsuite (titled Recommendation for the Entropy Sources Used for Random Bit Generation) [61] in order to compute min-entropy estimates on the quantum annealing bitstrings. In particular, the non-IID testsuite is executed in order to obtain the H_original min-entropy estimates from the 10 tests in the testsuite. The NIST SP 800-90B non-IID track is intended to be applied to noise sources that do not generate Independent and Identically Distributed (IID) samples. The testsuite was executed with the help of Charliecloud containerization [62].
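As one example of how such a min-entropy estimate is produced, the Most Common Value estimator from SP 800-90B can be sketched as follows. This is a simplified sketch of the published formula; the full non-IID track runs ten such estimators and takes the minimum.

```python
import math
from collections import Counter

def mcv_min_entropy(samples):
    """Most Common Value min-entropy estimate (in the style of
    NIST SP 800-90B): upper-bound the probability of the most common
    symbol with a 99% confidence interval, then take -log2."""
    n = len(samples)
    p_hat = Counter(samples).most_common(1)[0][1] / n
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1 - p_hat) / (n - 1)))
    return -math.log2(p_upper)
```

For a well-balanced bitstring the estimate approaches 1 bit per sample; heavy bias toward one symbol drives it toward 0.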

Results
Table 1 shows the total size of each of the 8 datasets, and Table 2 shows the complete randomness testsuite data for the 8 QA implementation variations. The threshold for failing each randomness test varies depending on the test, but a p-value less than 0.01 shows that the dataset fails that randomness test. The result is that there is no QA device setting that generates random bit strings passing all of the randomness tests. Notably, the server-side classical post-processing did improve the random bitstrings, in the sense that more of the tests passed when that post-processing was applied. Also very notable is that the raw, non-post-processed QA data failed the monobit test, arguably the most fundamental randomness test that can be applied; this shows that the raw device measurements are measurably biased. Lastly, Table 3 enumerates the min-entropy estimates over the entirety of the QA bitstrings for all 8 device settings, which shows that the post-processed datasets have a higher (better) min-entropy compared to the raw measurements. For each of the 8 datasets, the overall min-entropy is the minimum H_original computed across the suite of tests. The min-entropy, like standard information entropy, is maximized for a uniform distribution; in this case (for bitstrings) the maximum entropy is 1, and entropy closer to 0 corresponds to more biased samples.
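To put these min-entropy values in concrete terms, a per-bit min-entropy of h bounds an adversary's best single-guess success probability at 2^(-h). A short check, using the 0.824 figure reported for the raw 1 microsecond data:

```python
# Min-entropy h bounds the best single-guess success probability:
# p_max = 2 ** (-h).
ideal = 2 ** -1.0      # unbiased bit source (h = 1) gives p_max = 0.5
raw_1us = 2 ** -0.824  # raw 1 microsecond data gives p_max ~ 0.565
```

So the raw measurements allow roughly a 56.5% best-guess rate per bit, versus 50% for an ideal unbiased source.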

Discussion and Conclusion
Even if QRNGs based on near-term devices are fundamentally non-deterministic, noise present in the computation can still produce biased random bitstrings. This is what is observed in this D-Wave quantum annealer data. This is not unexpected, especially given the observed trends over time on multiple D-Wave quantum annealers [27]. However, it is important to note that this type of biased random sampling is very likely to occur with other Noisy Intermediate-Scale Quantum (NISQ) computers [63]. Extensive tests, such as the ones presented in this paper, must be executed in order for such quantum devices to pass the threshold of being unbiased random bit samplers. Importantly, even large testsuites cannot absolutely determine that the generator is indeed random; there are only tests which can show that a bit sequence is not random. In other words, the null hypothesis can never be proven to be true; it can only be observed to fail. Indeed, it has been shown that the NIST SP 800-22 testsuite is not sufficiently rigorous for verifying randomness [2,64,65]. However, it does serve as a reasonable minimum-threshold test, which in this case the D-Wave 2000Q device did not pass. Interestingly, within each QA dataset there are sometimes only a few tests which failed, and on the whole many of the tests were passed with p-values much greater than 0.01. In terms of the min-entropy measure, all 4 device settings which used server-side sampling post-processing had a higher (i.e., better) min-entropy compared to the raw results. Across the battery of tests on the QA bitstrings with no post-processing, the 1 microsecond annealing time bitstrings had a min-entropy of 0.824339, the 10 microsecond annealing time bitstrings had a min-entropy of 0.828144, the 100 microsecond annealing time bitstrings had a min-entropy of 0.888519, and the 2000 microsecond annealing time bitstrings had a min-entropy of 0.847210.
Evaluating this source of random numbers using more comprehensive testsuites, such as dieharder [66], would be valuable; the limitation is that those tools require a significant amount of data to be analyzed (more than used in this analysis), which is currently not feasible to obtain using cloud-based quantum computer access. In general, we expect that longer coherence times [55,56] and lower error rates of manufactured quantum annealers would correspond to being able to produce higher-quality random bit strings.
A potentially interesting analysis on this existing data from DW 2000Q LANL would be to determine if there are strong cross-qubit correlations on the chip.If such correlations exist, then this could indicate cross-talk errors from the control system.

Figure 1: LANL D-Wave 2000Q hardware connectivity graph (this type of connectivity graph is called Chimera, which is in general a sparse but scalable hardware implementation of quantum annealing). This device has 2032 active qubits (due to hardware defects, the full Chimera lattice of 2048 qubits is not active). The chip id of this device is DW 2000Q LANL.

Figure 2 :
Figure 2: Bit plot of a subset of the D-Wave QRNG measurements visually showing +1 and −1 qubit states, for the 1 microsecond annealing time and no server-side post-processing runs (Test 5). There are 2032 row indices, corresponding to the 2032 qubit indices, and 300 column indices corresponding to 300 time-ordered anneal-readout cycles. Noticeably, there are clear correlations in the time-ordered bit sequences.

Figure 3 :
Figure 3: Bit plot of a subset of the D-Wave QRNG measurements visually showing +1 and −1 qubit states, for the 2000 microsecond annealing time and no server-side post-processing runs (Test 6). There are 2032 row indices, corresponding to the 2032 qubit indices, and 300 column indices corresponding to 300 time-ordered anneal-readout cycles.

Table 1 :
Each dataset with different D-Wave settings contains at least 2.5 billion time ordered bits.

Table 3 :
Min-entropy estimates (H_original) for all quantum annealing experiment binary datasets, executed on DW 2000Q LANL.