Probing Quantum Telecloning on Superconducting Quantum Processors

Quantum information cannot be perfectly cloned, but approximate copies of quantum information can be generated. Quantum telecloning combines approximate quantum cloning, more typically referred to as quantum cloning, and quantum teleportation. Quantum telecloning allows approximate copies of quantum information to be constructed by separate parties, using the classical results of a Bell measurement made on a prepared quantum telecloning state. Quantum telecloning can be implemented as a circuit on quantum computers using a classical coprocessor to compute classical feedforward instructions using if statements based on the results of a midcircuit Bell measurement in real time. We present universal symmetric optimal <inline-formula><tex-math notation="LaTeX">$1 \rightarrow M$</tex-math></inline-formula> telecloning circuits and experimentally demonstrate these quantum telecloning circuits for <inline-formula><tex-math notation="LaTeX">$M=2$</tex-math></inline-formula> up to <inline-formula><tex-math notation="LaTeX">$M=10$</tex-math></inline-formula>, natively executed with real-time classical control systems on IBM Quantum superconducting processors, known as dynamic circuits. We perform the cloning procedure on many different message states across the Bloch sphere, on seven IBM Quantum processors, optionally using the error suppression technique X–X sequence digital dynamical decoupling. Two circuit optimizations are utilized: one that removes ancilla qubits for <inline-formula><tex-math notation="LaTeX">$M=2, 3$</tex-math></inline-formula>, and one that reduces the total number of gates in the circuit but still uses ancilla qubits. Parallel single-qubit tomography with maximum likelihood estimation density matrix reconstruction is used in order to compute the mixed-state density matrices of the clone qubits, and clone quality is measured using quantum fidelity. 
These results present one of the largest and most comprehensive noisy intermediate-scale quantum computer experimental analyses on (single qubit) quantum telecloning to date. The clone fidelity sharply decreases to 0.5 for <inline-formula><tex-math notation="LaTeX">$M > 5$</tex-math></inline-formula>, but for <inline-formula><tex-math notation="LaTeX">$M=2$</tex-math></inline-formula>, we are able to achieve a mean clone fidelity of up to 0.79 using dynamical decoupling.


Introduction
One of the fundamental properties of quantum mechanics is that unknown arbitrary quantum information cannot be cloned [1,2]; more specifically, it cannot be perfectly cloned. This result is highly consequential, both as a tool and as a hurdle, and it causes quantum information processing to be handled very differently from classical information processing in many domains of quantum information, including quantum networking communication [3][4][5], quantum cryptography [6][7][8][9][10][11][12][13][14][15], and quantum error correction [16][17][18]. However, it was subsequently shown that approximate quantum cloning is possible [19].
Since then, a large number of variants of approximate quantum cloning (copying) protocols have been proposed [20][21][22][23][24]. Optimal quantum cloning refers to any process that reaches the theoretical upper limit, allowed by the laws of quantum mechanics, on the quality of the clones generated for an input quantum state [25][26][27][28]; clone quality is usually measured as the fidelity [29]. How close a cloning protocol is to the theoretical limit can be established by computing the fidelity between the prepared cloned quantum state and the original pure quantum state. Eq. (1) defines the optimal universal (i.e., state-independent) quantum cloning fidelities when copying N quantum states to M clones. Universal quantum cloning refers to a cloning process where the quality of the clone, i.e., how closely it represents the state that is copied, is independent of the quantum state being cloned, so that all generated clones have the same ideal state overlap with the original input state [30][31][32][33][34].
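As a quick numerical reference for the bound referred to as Eq. (1), the sketch below evaluates the optimal universal symmetric N → M qubit cloning fidelity. It assumes the standard closed form F = (M(N + 1) + N) / (M(N + 2)) from the optimal cloning literature; the function name `optimal_clone_fidelity` is a hypothetical helper and is not taken from the paper's code.

```python
from fractions import Fraction


def optimal_clone_fidelity(n_copies: int, m_clones: int) -> Fraction:
    """Optimal universal symmetric N -> M qubit cloning fidelity.

    Assumed closed form: F = (M(N + 1) + N) / (M(N + 2)).
    """
    if not 1 <= n_copies <= m_clones:
        raise ValueError("expected 1 <= N <= M")
    n, m = n_copies, m_clones
    return Fraction(m * (n + 1) + n, m * (n + 2))


if __name__ == "__main__":
    # Single-copy input (N = 1), as used for telecloning in this article.
    for m in (2, 3, 4, 10):
        f = optimal_clone_fidelity(1, m)
        print(f"1 -> {m}: F = {f} (about {float(f):.4f})")
```

For N = 1 the expression reduces to (2M + 1) / (3M), which decreases toward 2/3 as M grows.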
State-dependent quantum cloning [32,34,35,36,37], in contrast to universal quantum cloning, performs a copying process that results in clones whose quality depends on the state that is cloned. For example, phase-covariant quantum cloning generates copies of equatorial single qubit states on the Bloch sphere with higher fidelity than universal cloning allows [38,39]. Symmetric quantum cloning refers to a cloning process which generates indistinguishable clones, and asymmetric quantum cloning generates clones of varying quality (i.e., not identical clones). Probabilistic quantum cloning refers to a cloning process where the expected clone quality is not deterministic, although on average it will provide the same clone quality as a deterministic quantum cloning process [40][41][42]. Quantum cloning can be applied to discrete variable quantum information, as well as continuous variable quantum information [12,43]. Quantum cloning, of all variants enumerated above, can be applied to any type of quantum information including qudits [13,44,45,46,47,48,49], where the quantum information unit is a discrete d-dimensional system. There have been a large number of experimental realizations of different quantum cloning variants using a number of different quantum technology platforms; see refs. [10,50,51,52,53,54,55,56,57] for examples of physical quantum cloning experiments. Quantum telecloning, which is the focus of this paper, is a combination of a quantum cloning process and quantum teleportation [58][59][60][61][62]. Studying quantum cloning is of fundamental interest for the field of quantum information processing, and has relevance to quantum information in physical phenomena [63].
For this article, we use an existing quantum telecloning circuit method [64] along with a telecloning circuit ancilla qubit optimization technique [65] to construct and execute quantum telecloning circuits using the dynamic circuit functionality on IBMQ systems [66][67][68][69], in order to directly execute the telecloning protocol using mid-circuit measurements and real-time classical co-processing. These quantum telecloning circuits are optimal, universal, symmetric, and act on qubit-based quantum information systems (i.e., d = 2 quantum systems, as opposed to general d-dimensional systems). Importantly, these quantum telecloning circuits [64,65] are scalable to any number of clones M for 1 → M; the limitation is that as M increases, the resulting quantum telecloning machine (i.e., circuit) has increased gate depth, total gate count, and number of qubits used, making experimental demonstrations of very large quantum telecloning circuits difficult because of errors and noise present in the computation [70]. Expanding on the capabilities of OpenQASM 2 [71], OpenQASM 3 [66] provides the capability to specify these dynamic circuits; in particular, we use mid-circuit measurements and classical conditional feed-forward operations, which subsequently cause different quantum gate instructions to be applied to the circuit while it is executing (i.e., while the state of the circuit is not measured). During the classical processing, the quantum processor could be performing additional operations, or could be idling and waiting for the result of the classical processor in order to proceed. In this case, the quantum processor is idle for a period of approximately several hundred nanoseconds [67], which could introduce errors into the computation. Because quantum telecloning circuits require native classical if statements conditioned on mid-circuit measurements, running quantum telecloning circuits using the dynamic circuit capability of OpenQASM 3 on IBMQ devices provides a reasonable analysis of the currently achievable device performance when classical conditional operations and mid-circuit measurements are used. Note that Quantinuum quantum computers also provide mid-circuit measurement and classical conditioning capability, which was used to experimentally demonstrate 1 → 9 telecloning on Quantinuum H1-1 [65].
More precisely, quantum telecloning circuits with ancilla qubits, using the ancilla optimization from ref. [65], for N = 1 → M = 2, 3, 4 are executed on seven IBM Quantum computers. For M = 2 and M = 3, gate model circuit constructions are known which do not use ancilla qubits [64], and for M = 2, 3 we implement quantum telecloning circuits without ancilla qubits. Importantly, these quantum telecloning circuits [64,65] rely on Dicke state preparation unitaries, and recent advancements in optimized Dicke state preparation circuits [72][73][74] allow for quantum telecloning circuits that use fewer gate operations. Specifically, for the experiments we present, we utilize the optimized Linear Nearest Neighbors (LNN) connectivity Dicke state preparation circuits. Dicke states are uniform superpositions of all n-qubit computational basis states with Hamming weight k, and they turn out to be a critical building block for quantum telecloning states.
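To make the definition of a Dicke state concrete, the following sketch enumerates the computational basis states of Hamming weight k on n qubits and assigns each the same real amplitude. It only illustrates the state itself, not the LNN preparation circuits of refs. [72][73][74]; the helper name `dicke_state` is hypothetical.

```python
from itertools import combinations
from math import comb, sqrt


def dicke_state(n: int, k: int) -> dict[str, float]:
    """Return the n-qubit Dicke state of Hamming weight k.

    The state is a dict mapping each bitstring to its (real) amplitude.
    """
    if not 0 <= k <= n:
        raise ValueError("expected 0 <= k <= n")
    amplitude = 1.0 / sqrt(comb(n, k))  # uniform over C(n, k) basis states
    state = {}
    for ones in combinations(range(n), k):
        bits = ["0"] * n
        for position in ones:
            bits[position] = "1"
        state["".join(bits)] = amplitude
    return state


if __name__ == "__main__":
    for bitstring, amp in dicke_state(3, 1).items():
        print(bitstring, round(amp, 4))
```

For n = 3 and k = 1 this gives the three bitstrings 100, 010, and 001, each with amplitude 1/√3.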
These quantum telecloning results are the most comprehensive experiments reported to date of quantum cloning executed on quantum computers. For instance, there have been a number of experimental demonstrations of 1 → 2 quantum cloning [50], but here we present circuits which can implement arbitrary 1 → M quantum telecloning, and present experimental results for up to M = 10 on IBM Quantum fixed-frequency superconducting qubit processors [75]. Importantly, we provide explicit circuit instructions for building quantum telecloning machines on quantum computers, which have not generally been presented before. We believe that these quantum telecloning circuits, and future constructions of quantum cloning circuits, can be used in quantum computations for simulating quantum information networking protocols. These experiments also demonstrate the capability of mid-circuit measurements with feed-forward classical co-processor control (also known as dynamic circuits) on IBMQ superconducting quantum processors. The use of dynamic circuits, or real-time classical co-processor feedback, means that direct quantum telecloning can be executed, as opposed to processors where this classical feedback is missing and alternative methods such as deferred measurement and post-selection must be utilized [64].
Data, code, and extra figures are available on a public GitHub repository.

Quantum Telecloning circuits
Algorithm 1 describes the Quantum Telecloning protocol for distributing M single qubit clones of an arbitrary N = 1 qubit state to M separate parties. For general 1 → M quantum telecloning, a telecloning state on the (M − 1) ancilla qubits A, the port qubit P, and the M clone qubits C is prepared in the form given in ref. [58], where |D^M_i⟩ denotes the uniform superposition, with real amplitudes, over all M-qubit states of Hamming weight i. Dicke state unitaries DSU(M) and Split & Cyclic Shift unitaries SCS(m) are defined in ref. [72]; Dicke state unitaries are constructed recursively from SCS unitaries. The observation from ref. [65] is that previous quantum telecloning circuits with ancilla applied a Dicke state unitary DSU(M) on the port and ancilla qubits [64]; however, only the first SCS unitary SCS(M) acts on the port qubit, while the remaining SCS(m) unitaries, which together comprise a DSU(M − 1) unitary, act only on the to-be-discarded ancilla qubits. Therefore, the remaining SCS(m) unitaries can be removed from the circuit without affecting the quantum telecloning state, thus reducing the number of two-qubit gates significantly (especially for large M). Figure 1 details the quantum telecloning circuit construction for 1 → 3 quantum telecloning, using the ancilla optimization provided in ref. [65]. Importantly, the circuits with ancilla qubits, including the ancilla optimization provided in ref. [65], can be fully generalized to 1 → M telecloning. Figure 1 also describes the 1 → 3 quantum telecloning circuit with no ancilla qubits from ref. [64]. A telecloning circuit without ancilla qubits is also known for 1 → 2 [64]. More detailed circuit descriptions, including compiled circuit examples, are given in Appendix A.
Note that the clones produced by the quantum telecloning circuits are weakly entangled, which we will describe using the entanglement measures of negativity [82][83][84] and concurrence [85,86]. Importantly, the will-be clone qubits are not entangled before the Bell measurement is performed [58], but after the Bell measurement they become weakly entangled. Negativity and concurrence are two quantum information measures of entanglement that are defined on a density matrix ρ: the quantum state is not separable (i.e., the qubits are entangled) if the negativity or concurrence of ρ is greater than 0, and if the measures are 0 then the states are not entangled. Concurrence takes values in [0, 1] and negativity takes values in [0, 0.5]. We denote negativity by N(ρ) and concurrence by C(ρ).
For a symmetric universal quantum telecloning machine, the 1 → 2 quantum cloning process produces a subsystem of 2 clones which has a concurrence of C(ρ_{M=2}) = 1/3 and a correspondingly nonzero negativity N(ρ_{M=2}). The computed entanglement measures for the quantum clones are independent of the single qubit state that is cloned. The entanglement measures were computed using Qiskit to numerically simulate the circuits classically and compute the density matrices of the 2-qubit clone subsystem. These entanglement measures are independent of whether the quantum telecloning circuit is the variant with or without ancilla qubits. The Python 3 packages Qiskit [76] and Toqito [87] were used for the numerical computation of the negativity and concurrence measures from the density matrices. Concurrence is defined only for two-qubit states, whereas negativity is defined for any multi-qubit state.
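The entanglement of the two clones can be checked by hand in one simple case. The sketch below is not the Qiskit and Toqito computation used for the article; it assumes the message state is |0⟩ and that the two-clone reduced state of an ideal optimal universal symmetric 1 → 2 cloner is ρ = (2/3)|00⟩⟨00| + (1/3)|Ψ+⟩⟨Ψ+|, which is an X-shaped density matrix for which concurrence and negativity have closed forms. All function names are hypothetical.

```python
from math import sqrt


def two_clone_x_state() -> dict[str, float]:
    """Nonzero entries of rho = 2/3 |00><00| + 1/3 |Psi+><Psi+|.

    Basis order is 00, 01, 10, 11; r23 and r14 are the off-diagonal
    entries <01|rho|10> and <00|rho|11> (assumed message state |0>).
    """
    return {"r11": 2 / 3, "r22": 1 / 6, "r33": 1 / 6, "r44": 0.0,
            "r23": 1 / 6, "r14": 0.0}


def concurrence_x(r: dict[str, float]) -> float:
    """Concurrence of a two-qubit X state."""
    return 2 * max(0.0,
                   abs(r["r23"]) - sqrt(r["r11"] * r["r44"]),
                   abs(r["r14"]) - sqrt(r["r22"] * r["r33"]))


def negativity_x(r: dict[str, float]) -> float:
    """Negativity of a two-qubit X state.

    The partial transpose exchanges r23 and r14, leaving two 2x2 blocks:
    {00, 11} coupled by r23 and {01, 10} coupled by r14.
    """
    def smallest_eigenvalue(a: float, d: float, b: float) -> float:
        return (a + d) / 2 - sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)

    eigenvalues = [
        smallest_eigenvalue(r["r11"], r["r44"], r["r23"]),
        smallest_eigenvalue(r["r22"], r["r33"], r["r14"]),
    ]
    # Sum of the magnitudes of the negative eigenvalues.
    return sum(-e for e in eigenvalues if e < 0)


if __name__ == "__main__":
    rho = two_clone_x_state()
    print("concurrence:", concurrence_x(rho))
    print("negativity:", negativity_x(rho))
```

Under these assumptions the concurrence evaluates to 1/3, matching the value quoted above, and the negativity evaluates to (√5 − 2)/6, which is small but nonzero.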

IBM Quantum computer implementation and result characterization details
Algorithm 1: Quantum 1 → M Telecloning Protocol
State Preparation:
1: A message qubit q_m is prepared by a sender (i.e., N = 1).
2: A quantum telecloning state TC is constructed with
- (up to) (M − 1) ancilla qubits A,
- 1 port qubit P, and
- M clone qubits C (symmetrized with Dicke state unitaries; sent to the receivers).
Teleportation:
3: A Bell measurement is made between q_m and P, and the results are communicated over a classical channel (assumed to be noiseless) to the clone holders.
4: The clone holders use the two classical bits from the Bell measurement to decide whether to apply X and/or Z gates to the clone qubits so as to construct the approximate clones:
- Φ+: apply no gate
- Φ−: apply Z gate
- Ψ+: apply X gate
- Ψ−: apply X then Z gate
Result:
5: M approximate clones of q_m have been generated by the M clone holders, with theoretical maximal single qubit fidelity given by eq. (1).
Figure 1: Quantum telecloning circuits for 1 → 3, using ancilla qubits (left) and no ancilla qubits (right). The quantum telecloning circuit with ancilla (left) uses the gate reduction optimization introduced in ref. [65] and can be scaled to any value of M (number of clones), whereas the circuits with no ancilla are only known for M = 2 and M = 3. The circuit on the left shows that where previously a DSU(3) unitary would act on the port and ancilla qubits, we only apply an SCS(3) unitary, discarding the following DSU(2) unitary and thereby saving gate operations (especially when going to DSU(M) unitaries for larger M). The top wire q_m defines the single qubit which we want to clone, in this case parameterized by two single qubit gates (in general these two single qubit gates are not required for the protocol to work). A gate labeled √(a/b) stands for R_y(2 arccos(√(a/b))). Qubit wires labeled A are ancilla qubits, P is the single port qubit, and C are clone qubits. The end of the circuit shows where the mid-circuit Bell measurement is made between the port qubit and the message qubit, followed by the classically conditioned Pauli gate operations on the clone qubits.
All circuits are optimized and adapted to the IBM Quantum hardware gateset using the Qiskit [76] transpiler with optimization_level=1 (which is the highest optimization setting that can be applied to dynamic circuits for the version of Qiskit used in these experiments, which was qiskit-terra==0.24.1). All circuit executions used all default settings (except the number of shots), meaning that all experiments used meas_level=2. The IBM Quantum processor native gateset for all of the devices used in these experiments is rz, cx, sx, x. Note that on the IBM Quantum devices, the rz gate is virtual [88], meaning that it can be implemented with an error rate of 0, so it does not contribute to the error encountered when these compiled circuits are executed. The dynamic circuit control instruction blocks are all if statements. These if statements are conditioned on the results of the Bell measurement being 1 (i.e., if the bit is 1, then the corresponding Pauli X or Z gate is applied to the clone, and if not, nothing is applied), for each of the M clones (this is step 4 of Algorithm 1). This means that there are 2 classical if statements for each clone (conditioned on the two bits from the Bell measurement), and therefore for every 1 → M quantum telecloning circuit that is programmed and executed on an IBM Quantum computer, a total of 2M if statements are applied. Figure 1 shows these conditional instructions, and Figures 12, 13, and 14 in Appendix A show the complete quantum telecloning circuits with the if-else control blocks. Note that although we do not use this feature (all classical control mechanisms are programmed using if / else statements), Qiskit [76] currently supports switch statements, which, once available on IBM Quantum hardware, could also be used to execute the quantum telecloning protocol more efficiently, without as many separate classical conditional instructions.
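The classical feed-forward logic of step 4 of Algorithm 1, and the count of 2M conditional blocks, can be sketched in plain Python as follows. This is only a classical model of the control flow, not the OpenQASM 3 or Qiskit code that was run on hardware. The bit convention is an assumption: `z_bit` is taken to be the measurement result of the message qubit and `x_bit` that of the port qubit, as in the textbook teleportation circuit, and all function names are hypothetical.

```python
def clone_corrections(z_bit: int, x_bit: int) -> list[str]:
    """Pauli corrections for one clone, given the two Bell measurement bits.

    Mirrors two independent if statements: one conditioned on each bit.
    """
    gates = []
    if x_bit == 1:  # Psi+ or Psi-: apply X
        gates.append("x")
    if z_bit == 1:  # Phi- or Psi-: apply Z (after X when both are set)
        gates.append("z")
    return gates


def feedforward_schedule(m_clones: int, z_bit: int, x_bit: int) -> list[tuple[int, str]]:
    """List of (clone index, gate) operations applied across all M clones."""
    schedule = []
    for clone in range(m_clones):
        for gate in clone_corrections(z_bit, x_bit):
            schedule.append((clone, gate))
    return schedule


def count_if_statements(m_clones: int) -> int:
    """Two conditional blocks per clone, regardless of the measured bits."""
    return 2 * m_clones


if __name__ == "__main__":
    labels = {(0, 0): "Phi+", (1, 0): "Phi-", (0, 1): "Psi+", (1, 1): "Psi-"}
    for (z_bit, x_bit), name in labels.items():
        print(name, clone_corrections(z_bit, x_bit))
    print("if statements for M = 10:", count_if_statements(10))
```

Every clone receives the same pair of conditionals because all clone holders act on the same two broadcast classical bits.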
For each of the initial pure quantum states that we prepare, the goal is to characterize how well the quantum telecloning process performs optimal universal symmetric approximate cloning. To this end, we use the standard quantum measure of fidelity, defined in eq. (3), which computes the state overlap between two density matrices ρ_1 and ρ_2. The optimal universal symmetric quantum cloning bound in terms of fidelity is given in eq. (1) for cloning 1 pure quantum state into M approximate quantum clones. A fidelity of 1 means that the states are exactly overlapping. A fidelity of 0.5 means that although there is state overlap, the overlap is no better than choosing pairs of random density matrices and measuring their state overlap, meaning that the single qubit clones do not provide a meaningful representation of the original pure quantum state.
Figure 2: 7 different qubit subgraph isomorphic layouts for compiling the LNN quantum telecloning circuits to the relatively sparse heavy-hex connectivity, with 27 qubits. These layouts are mapped to a telecloning circuit with N = 1 and M = 10, which is the largest telecloning circuit that can be fit onto this architecture without the introduction of significant overhead due to qubit swapping. The message qubit (N = 1) is colored red, the port qubit is dark blue, the 10 clone qubits are colored cyan, and the 9 ancilla qubits are colored green, using a total of 21 qubits. The unused qubits and CNOT connections are grey. This connectivity graph is identical between the IBMQ processors ibmq_kolkata, ibmq_mumbai, ibm_geneva, ibm_hanoi, ibm_algiers, ibm_cairo, and ibm_auckland, meaning that the same compiled circuits can be used across all of the 27 qubit backends we used in the experiments. The quantum telecloning circuits without ancilla can be compiled to these connectivity graphs by simply not using the LNN qubit line that would have been used for the Dicke state preparation of the ancilla qubit state.
To compute the quality of the single qubit clones generated by the quantum telecloning circuits, we need to perform quantum state tomography (QST). We use Pauli basis state tomography, although there are other types of state tomography that can be used. This procedure generally has a cost of 3^n measurement settings for an n-qubit system, but in this case we are only interested in single qubit tomography of the clone qubits. The procedure we follow is to prepare the quantum telecloning circuit for a specific state we want to clone, then at the end of the circuit we insert the basis change gates to put all M clones into the Pauli X, Y, or Z basis, and then we measure the state of all M clones. We refer to this procedure as parallel single qubit state tomography [50,64,65]. Detailed circuit examples of the Pauli basis parallel single qubit state tomography circuits are shown in Appendix A. We execute each such circuit with 10,000 shots, and repeat this for each of the three Pauli bases. Therefore, for the purpose of characterizing the clone quality of a single pure quantum state, a total of 30,000 samples are taken in order to reliably compute the density matrix representing the physical state that was constructed (for each clone qubit), specifically to mitigate finite sampling effects. Using these three Pauli basis measurements, we can then compute the density matrix of the mixed state for each of the M quantum clones, using maximum likelihood estimation [89] in Qiskit [76] Ignis (with slight modifications) with sequential least squares programming optimizer fitting. With the single qubit density matrices having been computed, the fidelity of the quantum clones can be computed using eq. (3).
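The path from Pauli basis counts to a clone fidelity can be illustrated with a deliberately simplified estimator. The sketch below uses plain linear inversion rather than the maximum likelihood estimation used in the article, so it can return Bloch vectors of length greater than 1 for noisy data. It also assumes the target is a pure state with known Bloch vector n, in which case the fidelity reduces to F = (1 + r · n) / 2. The function names and the example counts are hypothetical.

```python
def pauli_expectation(counts: dict[str, int]) -> float:
    """Expectation value of a Pauli operator from single-qubit counts.

    counts maps the outcomes "0" and "1" (after the basis change) to shots.
    """
    n0 = counts.get("0", 0)
    n1 = counts.get("1", 0)
    total = n0 + n1
    if total == 0:
        raise ValueError("no shots recorded")
    return (n0 - n1) / total


def bloch_vector_from_counts(counts_x, counts_y, counts_z) -> tuple[float, float, float]:
    """Linear-inversion estimate r = (<X>, <Y>, <Z>) of a single clone qubit."""
    return (pauli_expectation(counts_x),
            pauli_expectation(counts_y),
            pauli_expectation(counts_z))


def fidelity_with_pure_state(r, n) -> float:
    """Fidelity between a qubit state with Bloch vector r and a pure state n."""
    overlap = sum(ri * ni for ri, ni in zip(r, n))
    return 0.5 * (1.0 + overlap)


if __name__ == "__main__":
    # Made-up counts for a clone of |0> with 10,000 shots per basis.
    r = bloch_vector_from_counts({"0": 5000, "1": 5000},
                                 {"0": 5000, "1": 5000},
                                 {"0": 8000, "1": 2000})
    print("Bloch vector:", r)
    print("fidelity with |0>:", fidelity_with_pure_state(r, (0.0, 0.0, 1.0)))
```

Because each clone qubit is measured separately in the same shot, the same three circuits yield one such estimate for every one of the M clones.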
Because the error rates of the different operations on the quantum hardware can vary significantly, an additional method we use is to map the circuits identically to different subgraph isomorphisms of the chip hardware, thus obtaining a range of the possible device performance, since performance can vary significantly both over time and across a fixed superconducting quantum processor; see for example ref. [90]. Figure 2 shows the 7 layouts that we use on the heavy-hex architecture [91] in order to go up to M = 10 quantum telecloning circuits (with ancilla qubits). This construction makes use of the LNN Dicke state preparation circuits to prepare the two Dicke states on two different linear lines of qubits on the heavy-hex architecture. There are other possible layouts besides the 7 shown in Figure 2 on the 27 qubit heavy-hex lattice; however, we used these because they are the easiest to implement. In particular, they do not require the use of SWAP networks to move ancilla qubits around the lattice. Without a reduction in the number of ancilla qubits required for the telecloning circuits, and without the use of SWAP gates to move ancilla qubits around the lattice, M = 10 is the largest number of clones that can be generated on these 27 qubit heavy-hex lattices. The 7 circuit layouts shown in Figure 2 can be easily adapted to any quantum telecloning circuit, with or without ancilla qubits, for M < 10 by removing unused qubits down the nearest-neighbor line for both the clone qubits (cyan nodes) and ancilla qubits (green nodes), and fixing the location of the port and message qubits. These are the fixed layouts that are used for executing these circuits on the various IBM Quantum computers. We note in passing that while there are larger IBM Quantum devices with heavy-hex lattices available, which could implement larger circuits, our results suggest (as we will see) that at M = 10 clones most of the signal is lost, as we measure fidelities very close to 0.5; therefore, larger telecloning circuits on current hardware will likely not produce high quality clones.
Figure 3 (left) shows all pure quantum states which we aim to clone in the following experiments, represented by combining all of the vectors onto a single Bloch sphere. These states are computed by generating 20 linearly spaced angles in each of the ranges R_y ∈ [0, π] and R_z ∈ [0, 2π], giving 400 angle pairs, where R_y parameterizes the ry single qubit rotation on qubit 0 in Figure 1, and R_z parameterizes the subsequent rz on qubit 0. Using these simple qubit rotations we can reach any point on the Bloch sphere, and in particular, this range of angles allows the experiments to cover a range of states across the entire Bloch sphere, although not uniformly.
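The grid of message states described above can be reproduced with a few lines of Python. The sketch assumes that both endpoints of each angle range are included (as with a default linear spacing routine) and that the state rz(φ) ry(θ)|0⟩ has Bloch vector (sin θ cos φ, sin θ sin φ, cos θ); the function names are hypothetical.

```python
from math import cos, pi, sin


def linspace(start: float, stop: float, num: int) -> list[float]:
    """Evenly spaced values including both endpoints (assumed convention)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def message_states(num_angles: int = 20):
    """All (theta, phi, bloch_vector) triples on the angle grid."""
    states = []
    for theta in linspace(0.0, pi, num_angles):        # ry angle
        for phi in linspace(0.0, 2.0 * pi, num_angles):  # rz angle
            vector = (sin(theta) * cos(phi),
                      sin(theta) * sin(phi),
                      cos(theta))
            states.append((theta, phi, vector))
    return states


if __name__ == "__main__":
    grid = message_states()
    print("number of message states:", len(grid))
    print("example:", grid[45])
```

A regular grid in the two angles places proportionally more points near the poles than near the equator, which is why the coverage of the sphere is not uniform.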
Figure 3: The left-hand plot shows a Bloch sphere representation of all vectors of the pure quantum states that are cloned in the quantum telecloning circuits on the quantum hardware. The middle plot shows the corresponding Bloch sphere for those same quantum states (for a single qubit) after they have been copied by a universal symmetric 1 → 2 cloning machine, and the right-hand plot shows a Bloch sphere representation of these same states after having been cloned by a 1 → 10 universal symmetric quantum cloning machine. Because the cloning is symmetric, there would be 2 and 10 identical copies of the qubits corresponding to the middle and right-hand plots, respectively.

Dynamical Decoupling
With the goal of improving the computation of the quantum telecloning circuits, we test digital dynamical decoupling sequences on IBM Quantum hardware. Dynamical decoupling is an error suppression technique that can mitigate certain types of noise on idle qubits by keeping the qubits isolated from environmental interactions using sequences of gates (or pulse sequences) that are equivalent to identity gate operations [92][93][94][95][96][97][98]. For our experiments, we use the gate sequence X-X, i.e., pairs of Pauli X gates, since the Pauli X gate is a native gate on the IBM Quantum devices we use. The sequences are scheduled using a Qiskit pass manager with the As Late As Possible (ALAP) algorithm. The X-X sequences are not added during the classical control if / else statement blocks; this capability was not yet available on the IBM Quantum hardware when these experiments were executed. As a point of comparison, for the M = 10 quantum telecloning circuits (scheduled using the ALAP algorithm), approximately 800 Pauli X gates (approximately 400 dynamical decoupling sequences) can be expected to be scheduled, depending on the backend timing properties and the exact properties of the circuit (including, for instance, which Pauli basis rotations are being applied). For the M = 2 quantum telecloning circuits with no ancilla qubits, we can expect on average 14 Pauli X gates to be inserted by the dynamical decoupling pass.
Table 1: For each device (ibm_geneva, ibmq_mumbai, ibmq_kolkata, ibm_auckland, ibm_hanoi, ibm_cairo, ibm_algiers), the telecloning circuits were compiled to 7 different hardware layouts; only the best mean fidelity out of these is reported in these table entries. The results using X-X dynamical decoupling sequences are marked in bold. The top four rows show the clone fidelity results for the optimized telecloning circuits with no ancilla qubits, and the bottom 10 rows show results for the telecloning circuits with ancilla (for which a general 1 → M circuit exists). The fourth column shows the theoretical fidelity that can be achieved by universal quantum cloning circuits (given by eq. (1)).

Bloch sphere representation of single qubit clones
An optimal universal symmetric quantum cloning machine has the property that, in a geometric Bloch sphere representation, the clones that are generated retain the same vector direction as the original pure quantum state, but with a smaller magnitude. Specifically, the generated clones are each a mixed quantum state. The shrinking factor of the generated universal, symmetric, optimal clones is given in refs. [21,27,32], and is shown in eq. (4) for a d = 2 quantum system, i.e., a qubit.
Visually, this means that the computed density matrices of the clones can be plotted on a 3-d Bloch sphere representation. The input pure quantum states are an arbitrarily selected distribution on the Bloch sphere, and are shown aggregated together onto a single plot in Figure 3 (left). The different vector colors in Figure 3 do not correspond to anything; they are intended only to make the vectors visually distinct. We expect that if the quantum cloning operation is ideal, then plotting the experimentally computed density matrices, represented as vectors on the Bloch sphere, would appear as Figure 3 (left) but with every vector having been shrunk towards the origin of the Bloch sphere by the factor η in eq. (4), while keeping the same angle. The middle and right-hand Bloch sphere plots in Figure 3 show what a single clone qubit for M = 2 and M = 10 would look like with ideal universal quantum cloning. This representation of cloned quantum states is, as far as we are aware, a novel way to show an aggregated distribution of the 400 quantum states. This method of visualizing experimentally measured density matrices allows for a concise summary of any asymmetry or bias of the quantum states compared to the ideal case. Visually seeing the Bloch vectors gives some additional information on the density matrix, namely the angle and magnitude, compared to purely looking at state overlap. Combined, the state overlap (fidelity) and the Bloch vector plots give a good summary of the measured cloned quantum states on the quantum computers.
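The ideal shrinking of Bloch vectors described above can be sketched numerically. The block assumes the standard 1 → M qubit shrinking factor η = (M + 2) / (3M), which is consistent with the optimal universal fidelity through F = (1 + η) / 2; since the explicit form of eq. (4) is not reproduced in this text, treat the formula and the function names as assumptions for illustration.

```python
def shrink_factor(m_clones: int) -> float:
    """Assumed ideal shrinking factor for optimal universal 1 -> M qubit cloning."""
    if m_clones < 1:
        raise ValueError("expected M >= 1")
    return (m_clones + 2) / (3 * m_clones)


def ideal_clone_bloch_vector(message_vector, m_clones: int):
    """Scale the message Bloch vector towards the origin, keeping its direction."""
    eta = shrink_factor(m_clones)
    return tuple(eta * component for component in message_vector)


def fidelity_from_shrink_factor(eta: float) -> float:
    """Fidelity of the shrunk state with the original pure state."""
    return 0.5 * (1.0 + eta)


if __name__ == "__main__":
    message = (1.0, 0.0, 0.0)  # the |+> state as an example
    for m in (2, 10):
        eta = shrink_factor(m)
        print(f"M = {m}: eta = {eta:.4f}, clone vector = "
              f"{ideal_clone_bloch_vector(message, m)}, "
              f"F = {fidelity_from_shrink_factor(eta):.4f}")
```

Under this assumption the ideal clone vectors have length 2/3 for M = 2 and 0.4 for M = 10, which matches the qualitative picture of the middle and right-hand plots of Figure 3.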

Results
Table 1 presents the mean fidelity measures of the experimentally computed clones, across 7 IBM Quantum processors and for a variety of quantum telecloning circuit sizes and implementations (including X-X dynamical decoupling passes). Each data entry in Table 1 represents a total of 8,400 dynamic circuits having been executed on the quantum computer, which corresponds to 84,000,000 dynamic circuit measurements for each entry. Appendix B contains aggregated statistics on the QPU time and queue times required to gather these measurements, along with examples of the types of server-side errors that were encountered. The empty cells in the data table represent sets of experiments which were not completed or performed on the device. This incomplete data is due to a variety of factors, including devices being shut down before more circuits could be executed, server-side errors requiring circuit re-submission and thus increasing compute time, and the overall compute and queuing time being quite high for this ensemble of quantum cloning fidelity characterizations. See Appendix B for more detailed information. The fidelities in Table 1 are the maximum mean measured fidelity across the 7 hardware layouts (see Figure 2), where the mean is taken across the 400 initial states. Table 1 shows that the use of dynamical decoupling improves the best mean clone fidelity, with the single exception of M = 2 on ibm_cairo. The clearest trend observed across all of the fidelity measures shown in Table 1 is that there is a steep decrease of clone fidelity as M increases, and that without the circuit optimization that removes the need for ancilla qubits in the computation, the mean fidelities are lower. At M = 10, the mean clone fidelity converges to the noise limit of 0.5 for all devices on which that circuit size was tested. Additionally, the mean clone fidelity is typically far from the theoretical optimal fidelity. These results are consistent with the gate-level error rates of these devices.
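The circuit and shot counts quoted per Table 1 entry follow from the experimental settings described earlier (400 message states, 3 Pauli bases, 7 hardware layouts, 10,000 shots per circuit). A short check, with hypothetical variable and function names:

```python
N_MESSAGE_STATES = 20 * 20   # 20 ry angles times 20 rz angles
N_PAULI_BASES = 3            # X, Y, and Z tomography circuits
N_LAYOUTS = 7                # hardware layouts of Figure 2
SHOTS_PER_CIRCUIT = 10_000


def circuits_per_table_entry() -> int:
    """Number of dynamic circuits behind one Table 1 entry."""
    return N_MESSAGE_STATES * N_PAULI_BASES * N_LAYOUTS


def shots_per_table_entry() -> int:
    """Number of circuit measurements (shots) behind one Table 1 entry."""
    return circuits_per_table_entry() * SHOTS_PER_CIRCUIT


if __name__ == "__main__":
    print("circuits per entry:", circuits_per_table_entry())
    print("shots per entry:", shots_per_table_entry())
```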
Figure 4 shows the computed clone fidelity, and the computed single qubit clone density matrices (represented as aggregated vectors plotted on a Bloch sphere), for M = 2, with the optimization of no ancilla qubits and with dynamical decoupling sequences having been applied, run on ibm_auckland. Figure 5 shows the results for the same experimental setup as Figure 4 (M = 2 no ancilla qubit telecloning circuits, and X-X dynamical decoupling sequences added) but on ibmq_mumbai. Figure 6 shows the computed clone fidelity representation for M = 2, but with ancilla qubits used and no dynamical decoupling used, run on ibmq_kolkata. Figure 7 shows the fidelity results for M = 2 run on ibm_cairo with the removed ancilla qubit optimization, and no dynamical decoupling.
Figure 8 shows the computed clone fidelity for M = 3 circuits, with the removed ancilla qubit optimization, run on ibm hanoi with added dynamical decoupling sequences.Figure 9 shows the results for the same experimental settings (M = 3, no ancilla qubits optimization used, and with dynamical decoupling), but run on ibm algiers.
Figure 10 shows complete single qubit clone results for M = 4 quantum telecloning circuits, necessarily using ancilla qubits, with dynamical decoupling sequences and run on ibm hanoi.Figure 11 shows the fidelity results for those same experimental settings (M = 4, with dynamical decoupling), but executed on ibm auckland.The mean fidelity for the best used hardware layout for the experiments shown in Figures 4, 5, 6, 7, 8, 9, 10, and 11 are summarized, along with all of the other hardware experiments that were performed, in Table 1.
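As an illustration of how a Bloch vector and a fidelity value of the kind plotted in these figures can be obtained from measurement counts in the three Pauli bases, the following is a minimal sketch. It uses plain linear inversion with a rescaling of the Bloch vector onto the unit ball, which is a simplified stand-in and not the maximum likelihood estimation used in this study; all function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_from_counts(counts_x, counts_y, counts_z):
    """Estimate a Bloch vector from (n0, n1) counts in the X, Y, Z bases.

    Outcome 0 is taken to be the +1 eigenvalue of the measured Pauli.
    """
    r = np.array([(n0 - n1) / (n0 + n1)
                  for n0, n1 in (counts_x, counts_y, counts_z)], dtype=float)
    norm = np.linalg.norm(r)
    if norm > 1.0:
        # Simplified physicality fix: pull the vector back onto the Bloch sphere.
        r = r / norm
    return r

def density_matrix(r):
    """Single-qubit density matrix with Bloch vector r."""
    return 0.5 * (I2 + r[0] * PAULI_X + r[1] * PAULI_Y + r[2] * PAULI_Z)

def fidelity_with_pure(rho, psi):
    """Fidelity <psi|rho|psi> between a density matrix and a pure state."""
    return float(np.real(np.conj(psi) @ rho @ psi))

# Example: a clone that is measured as |0> in every Z-basis shot.
ket0 = np.array([1, 0], dtype=complex)
rho = density_matrix(bloch_from_counts((5000, 5000), (5000, 5000), (10000, 0)))
print(fidelity_with_pure(rho, ket0))

# A fully depolarized clone (zero Bloch vector) has fidelity 0.5 with any pure state.
rho_mixed = density_matrix(np.zeros(3))
print(fidelity_with_pure(rho_mixed, ket0))
```

The second example shows why 0.5 acts as the noise limit for single qubit clone fidelity: a maximally mixed clone overlaps equally with every pure message state.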
In theory, the clones generated by these quantum telecloning circuits are optimal, meaning that they attain the maximum clone fidelity of Eq. (1); symmetric, meaning that the clones are indistinguishable from each other; and universal, meaning that the clone fidelity is independent of the state that is cloned. We can examine the experimental results to see to what degree these theoretical properties are retained. Across all of the fidelity heatmap plots there is clear state dependence due to the noise characteristics of the quantum computers, and the clone fidelity does vary across the generated clones. This is not unexpected, given the highly variable noise sources on these devices. In particular, the clone fidelity heatmaps contain visible features that appear somewhat random and out of place, for instance horizontal or vertical bands of high or low fidelity. The underlying cause of this instability is that the circuits were executed on the devices at times potentially far apart from each other, up to several months. This occurs because of the total QPU and job time, queue times, backend down times, and transient job errors that required jobs to be re-submitted. Current superconducting quantum computers have reasonably high variability over time in gate fidelity and qubit characteristics, which has been studied on a number of different devices [90, 99-103], and this leads to potentially variable results for experiments that span a large amount of time. Examples of this variability can be seen in the fidelity heatmaps shown here (for example in Figure 8). Note that the ordering of the sub-figures in Figures 4 through 11 is the same between the Bloch sphere vector representations and the fidelity heatmap representations of the data. This allows similarities between the fidelity heatmaps and the Bloch vectors to be seen for each individual clone qubit.
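For reference when reading the measured fidelities, the theoretical optimum can be tabulated directly. The sketch below assumes that Eq. (1), which is not reproduced in this section, is the standard optimal fidelity for universal symmetric 1 → M qubit cloning, F = (2M + 1)/(3M); the function names are illustrative.

```python
from fractions import Fraction

def optimal_clone_fidelity(M):
    """Assumed form of Eq. (1): optimal universal symmetric 1 -> M clone fidelity."""
    return Fraction(2 * M + 1, 3 * M)

def bloch_shrinking_factor(M):
    """Factor by which an ideal clone's Bloch vector is shortened: 2F - 1."""
    return 2 * optimal_clone_fidelity(M) - 1

for M in range(2, 11):
    F = optimal_clone_fidelity(M)
    print(f"M = {M:2d}: F_opt = {F} ~ {float(F):.4f}, shrink = {float(bloch_shrinking_factor(M)):.4f}")
```

Under this assumption the optimum falls from 5/6 at M = 2 to 7/10 at M = 10 and approaches 2/3 as M grows, so even an ideal device never reaches the 0.5 noise limit; the experimental convergence to 0.5 therefore reflects hardware noise rather than the cloning bound.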

Discussion and Conclusion
We have demonstrated the largest and most comprehensive analysis of quantum cloning, specifically the quantum telecloning variant, that has been implemented to date with respect to the size of the experimental parameters studied, the largest quantum telecloning circuit size (M = 10), and the number of quantum computers used. This demonstration was performed using the real-time classical conditional operations available on IBM Quantum devices, known as dynamic circuits. Dynamic circuits allow the quantum telecloning protocol to be implemented natively on the hardware, as opposed to using post-selection or deferred measurement techniques [64]. We found that the clone fidelity drops sharply to the noise level of approximately 0.5 for M greater than 5. We also found that X-X digital dynamical decoupling sequences improved the best mean clone fidelity of the dynamic quantum telecloning circuits in all but one case (M = 2 on ibm_cairo). However, there exist many other dynamical decoupling schemes [95, 96, 104-106] that could suppress errors on idle qubits more effectively than the relatively simple pair of Pauli X gates that we tested, especially if tailored to the noise profile of these superconducting qubit processors.
There are also other low-overhead protocols that could suppress errors in future telecloning experiments on noisy quantum computers, such as randomized compiling (also known as Pauli twirling) [107-109].
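The refocusing idea behind an X-X sequence can be illustrated with a toy single-qubit model. The sketch below assumes that the only error on an idle qubit is a static, unknown Z rotation accumulated during the idle window, and that the X pulses are ideal and instantaneous; it is not a model of the device noise in this study, nor of how the sequences were scheduled on the hardware.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)

def rz(theta):
    """Z rotation exp(-i * theta * Z / 2), used here as a toy dephasing error."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

def idle_plain(theta):
    """Idle window with no decoupling: the full phase error accumulates."""
    return rz(theta)

def idle_xx(theta):
    """Idle window split in two halves, with an X pulse after each half.

    Matrices act right to left: half idle, X, half idle, X.
    """
    return X @ rz(theta / 2) @ X @ rz(theta / 2)

def plus_state_fidelity(U):
    """Fidelity of U|+> with |+>, a state that is sensitive to Z errors."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    return float(abs(np.vdot(plus, U @ plus)) ** 2)

theta = 0.8  # arbitrary toy error angle
print("no decoupling:", plus_state_fidelity(idle_plain(theta)))  # cos^2(theta / 2)
print("X-X sequence: ", plus_state_fidelity(idle_xx(theta)))     # approximately 1
```

In this idealized setting the second X pulse undoes the first, and the phase acquired in the second half of the window cancels the phase from the first half. Real devices have time-dependent noise and imperfect pulses, which is one reason more elaborate sequences can outperform the simple X-X pair.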
As quantum processor hardware continues to scale, both in number of qubits and in lower error rates, these quantum telecloning circuits will allow ever larger quantum cloning systems to be implemented and empirically probed for the purpose of testing the fundamental properties of quantum mechanics. Specifically, probing to what degree quantum cloning can be approximately performed is of fundamental interest for quantum information processing.
These experiments serve as a benchmark of current superconducting qubit processor capabilities with respect to mid-circuit measurement and real-time classical feedback control mechanisms. This is an important capability to evaluate, since it is a critical ingredient in quantum error correction [67, 110-113].
The Bloch sphere vector representations of the cloned qubits show clearly that the NISQ computations are better when the qubit being cloned is closer to the states |1⟩ and |0⟩, and that states at the equator of the Bloch sphere result in clones of lower fidelity. This points to a property of the superconducting qubits on the IBM Quantum computers that is interesting to observe, and that is also reflected in the calibrated T1 and T2 coherence times of the devices. Similar single qubit fidelity patterns on IBM Quantum superconducting transmon qubits have been observed in previous studies [64, 114].
Given that quantum cloning machines, such as quantum telecloning, can now be instantiated on quantum computers, there is an opportunity for quantum computers to serve as a test bench for quantum networking protocols, as well as a simple way to understand the capabilities of current quantum computers [115]. Future quantum networking may need to utilize quantum cloning in some form, and quantum computers can serve as a mechanism to test those protocols on-chip so as to evaluate their effectiveness for use in real-world quantum networks. Few use cases have so far been proposed for quantum cloning in communication networks. We encourage research into use cases of quantum cloning in communication networks and in information processing in general, since the inability to perfectly copy quantum information is an important fundamental limit. Several proposals do exist for ways that quantum cloning could be used to improve aspects of quantum-classical information transmission; for example, ref. [116] suggests an application of quantum cloning that can be used to enhance the transmission fidelity over a noiseless lossy quantum channel. Ref. [117] gives two examples where quantum cloning could be used to improve quantum computation for certain tasks. As quantum error correction and quantum error suppression techniques continue to improve, and as hardware continues to scale in both fidelity and number of qubits, larger quantum cloning machines can be implemented to test these types of protocols beyond what could be verified classically, and therefore the circuit descriptions for the various types of quantum cloning will need to be developed. There are several open algorithmic questions regarding quantum telecloning, and more generally quantum cloning, specifically for quantum computation:
1. There does not yet exist a complete circuit model description for any type of quantum cloning, with the exception of quantum telecloning. Specific types of quantum cloning circuit models that would be interesting to develop include a universal quantum cloning machine, asymmetric quantum cloning, probabilistic quantum cloning, and qudit (d-dimensional quantum system) cloning.
2. It is not known whether it is possible to construct a 1 → M quantum telecloning circuit that does not use ancilla qubits for M ≥ 4.
3. We believe that the 1 → 2 and 1 → 3 telecloning circuits without ancilla qubits that are used in this study are highly optimized, but there could exist further optimizations to reduce the gate count or gate depth of these circuits.
4. A majority of the quantum cloning experiments that have been performed to date were for cloning a single qubit to multiple clones (i.e., 1 → M). However, it is an interesting question how to generally clone N → M, where for instance the N qubits are entangled, and what those circuit descriptions are. The specific topic of cloning entangled states has largely not been studied [118].
5. The reverse of quantum telecloning, remote information concentration [119-121], also has not been implemented in a circuit model description.
6. It remains to be determined whether the optimized Dicke state preparation circuits in refs. [72-74] could be used for preparing other types of quantum cloning circuits besides quantum telecloning.

Acknowledgments
This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218CNA000001). This work was supported by the NNSA's Advanced Simulation and Computing Beyond Moore's Law Program at Los Alamos National Laboratory. This research used resources provided by the Los Alamos National Laboratory Institutional Computing Program, which is supported by the U.S. Department of Energy National Nuclear Security Administration under Contract No. 89233218CNA000001. We acknowledge the use of IBM Quantum services for this work. The views expressed are those of the authors, and do not reflect the official policy or position of IBM or the IBM Quantum team. E.P. thanks the folks at the 2023 IBM Quantum Internal Developer forum for helpful discussions on dynamic circuits and dynamical decoupling in Qiskit. This work has been assigned the LANL report number LA-UR-23-29397.

The detailed quantum telecloning circuits shown in Figures 12, 13, and 14 are all divided into 6 distinct segments using barriers, representing different stages of the quantum telecloning protocol and the parallel single qubit state tomography. The first stage prepares the telecloning state, which is comprised of Dicke state unitaries (and possibly SCS unitaries); this is the computationally intensive stage and is step 2 in Algorithm 1. The first stage is also where the message qubit is introduced (in this case, generated by parameterized R_y and R_z single qubit rotations), which is step 1 in Algorithm 1. The second stage applies a CNOT and a Hadamard gate to the port qubit and the message qubit, which maps the Bell basis onto the computational basis. The third stage measures the state of the port qubit and the message qubit, and stores the two classical bits in two classical registers. Combined, the second and third stages comprise the Bell measurement, which is step 3 in Algorithm 1. The fourth stage uses a classical co-processor to execute classical channel conditional operations in order to optionally apply Pauli Z and/or X gates (this is step 4 in Algorithm 1), represented as if-else control blocks. The if-else control blocks do not visually show which gates are (potentially) applied; they are Pauli X and then Z gates on each qubit that will become a clone. The fifth stage applies single qubit gates to the clone qubits so as to put each of the clones into a Pauli X, Y, or Z basis; all of the circuit figures in this section put the clone qubits into the Pauli Y basis. In the sixth stage, all of the clone qubits are measured and the results are stored in classical registers. Note that in the figures with ancilla qubits, the ancilla qubits are discarded and their states are not measured. Experimentally, when these circuits are executed on the quantum computers, each circuit is executed in the three different Pauli bases with multiple circuit measurements (specifically 10,000 samples per Pauli basis) in order to reconstruct the density matrices of the clone qubits that were prepared on the device. The quantum telecloning circuits in Figures 12, 13, and 14 that have ancilla qubits all use the circuit optimization introduced in [65], which reduces the total number of gate operations. Figure 14 uses ancilla qubits because there is no known telecloning circuit that does not require ancilla qubits for M ≥ 4 (for single qubit cloning).
In Figures 12, 13, and 14 the message qubit is prepared with arbitrarily chosen R_y = π/5 and R_z = π/5 for demonstration purposes.

B IBM Quantum Computer Circuit Execution Details
The total compute time utilized in this study was quite intensive. A total of 8,400 dynamic circuits were executed for each entry in Table 1. There are 47 entries in Table 1, meaning that 394,800 dynamic circuits were executed to produce the data shown in this study. This is equal to a total of 3,948,000,000 individual circuit measurements. All of these circuits were run between December 2022 and February 2024. The total Quantum Processing Unit (QPU) time needed to obtain these measurements (the time taken on the device to run the circuits, not including queue times, networking communication time, etc.) was 5,493,107 seconds (1,526 hours). The cumulative queue wait time, across all jobs, was 24,731,058,876 seconds (784 years), with a minimum queue wait time of 10.6 seconds, a maximum queue wait time of 2,239,317 seconds (26 days), and a mean job queue wait time of 65,426 seconds (18 hours) with a standard deviation of 172,050 seconds. These queue and compute times do not include jobs that were re-executed because they ended in an error status.
The scale of this study meant that numerous jobs encountered a variety of server-side or job-related errors. In these cases, the circuits needed to be re-executed (sometimes multiple times) so that the complete set of data could be obtained. In all cases, the errors that were encountered were stochastic: there was typically no clear cause, and the same identical job would not deterministically cause the same error. The types of errors changed based on the backend used, and also changed over time as the software on the devices was updated (along with Qiskit version releases). This is a short, and likely incomplete, list of the error messages encountered when executing these quantum telecloning circuits:
1. Internal Error. This was the most frequently encountered error.
2. Error preprocessing job.
3. Error queueing job.
4. Unknown error.
5. Timed out waiting for job to finish.
6. JSONDecodeError: Expecting value: line 1 column 1 (char 0)
7. Internal Error while executing OpenQASM 3 circuit.
8. not able to get the queue name for ibmq_mumbai
9. BackendPropertyError: 'Unable to process backend properties.'
10. "" (an empty string)
There were also two types of errors that did not explicitly cause the jobs to be in an error state, but were nevertheless unrecoverable and required re-executing the job:
1. Jobs that were in a Cancelled state, likely due to their QPU time usage exceeding an expected pre-defined limit.
2. Jobs that were in a completed, error-free state, but for which the result object data structure was not in the correct format when querying the job results in Qiskit, so that a JSON decoding error was raised: json.decoder.JSONDecodeError in Python 3.

Figure 4: Single qubit cloning results for M = 2 telecloning circuits with no ancilla qubits, executed with dynamical decoupling. Each column corresponds to one of the 7 compiled hardware layouts. Bottom two rows show the fidelity heatmaps of the 2 single qubit clones, where the numerical fidelity is shown in the colormap legend below the figures. The x and y axes of each heatmap are the R_y and R_z rotation angles used to prepare the message qubit state (which is then fed into the quantum telecloning circuit), and therefore encode the varying pure quantum states that are cloned. Top two rows show Bloch sphere vector representations of the density matrices computed by single qubit state tomography. The heatmaps use bicubic interpolation, and each plot represents 400 fidelity measures. The Bloch sphere vector representations and the fidelity heatmap representations share the same ordering with respect to column and row positions of the clone numbers and hardware layout positions. The average clone fidelity over all message qubit states for each clone qubit and hardware layout is shown in the fidelity heatmap plot titles. Data from ibm_auckland.

Figure 5: Single qubit cloning results for M = 2 telecloning circuits with no ancilla qubits, executed with dynamical decoupling. Each column corresponds to one of the 7 hardware layouts. Bottom two rows show the fidelity heatmaps of the 2 single qubit clones, where the numerical fidelity is shown in the legend below the figures. Top two rows show Bloch sphere vector representations of the density matrices computed by single qubit state tomography. The x and y axes of the heatmaps in the bottom two rows encode the varying pure quantum states that are cloned. All heatmaps use bicubic interpolation, and each sub-plot represents 400 separate fidelity measures. Data from ibmq_mumbai.

Figure 6: Single qubit clone fidelity heatmaps for M = 2 quantum telecloning circuits with ancilla qubits, executed without dynamical decoupling pulses. Each column corresponds to one of the 7 compiled hardware layouts. Bottom two rows show the fidelities of the 2 single qubit clones. Top two rows show Bloch sphere vector representations of the density matrices computed by single qubit state tomography. Data from ibmq_kolkata.

Figure 7: Single qubit clone fidelity heatmaps for M = 2 quantum telecloning circuits with no ancilla qubits, executed without dynamical decoupling sequences. Each column corresponds to one of the 7 compiled hardware layouts. Bottom two rows show the fidelities of the 2 single qubit clones. Top two rows show Bloch sphere vector representations of the density matrices computed by single qubit state tomography. Data from ibm_cairo.

Figure 8: Bloch sphere vector representations of the computed density matrices (top 3 rows) and single qubit clone fidelity heatmaps (bottom 3 rows) of the single qubit clones for M = 3 with no ancilla qubits, executed with dynamical decoupling. Each column corresponds to one of the 7 compiled hardware layouts. Each row corresponds to one of the 3 single qubit clones. The row and column ordering of the sub-figures is the same between the Bloch sphere vector representations and the clone fidelity heatmaps. Data from ibm_hanoi.

Figure 9: Bloch sphere vector representations of the computed density matrices (top 3 rows) and single qubit clone fidelity heatmaps (bottom 3 rows) of the single qubit clones for M = 3 with no ancilla qubits, executed with dynamical decoupling. Each column corresponds to one of the 7 compiled hardware layouts. Each row corresponds to one of the 3 single qubit clones. The row and column ordering of the sub-figures is the same between the Bloch sphere vector representations and the clone fidelity heatmaps. Notice that several of the Bloch vectors are clearly outlier density matrices, due to noise drift on the device. Data from ibm_algiers.

Figure 10: Bloch sphere vector representations of the computed density matrices (top 4 rows) and single qubit clone fidelity heatmaps (bottom 4 rows) of the single qubit clones for M = 4 (necessarily with ancilla qubits), executed with dynamical decoupling. Each column corresponds to one of the 7 compiled hardware layouts. Each row corresponds to one of the 4 single qubit clones. Data from ibm_hanoi.

Figure 11: Bloch sphere vector representations of the computed density matrices (top 4 rows) and single qubit clone fidelity heatmaps (bottom 4 rows) of the single qubit clones for M = 4 (necessarily with ancilla qubits), executed with dynamical decoupling. Each column corresponds to one of the 7 compiled hardware layouts. Each row corresponds to one of the 4 single qubit clones. Data from ibm_auckland.

Figure 12: 1 → 2 quantum telecloning circuits with no ancilla qubits (top diagram) and with 1 ancilla qubit (bottom diagram). Compiled to a Linear Nearest Neighbors (LNN) hardware graph, specifically targeting subsets of a heavy-hex graph (see Figure 2).

Table 1: Cloning fidelity computed on 7 IBM Quantum devices, averaged over the 400 different message qubit states and the M clones. Table 1 shows that, in general, newer QPU generations resulted in improved clone fidelities compared to older generation QPUs.
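The Bell measurement and classical feedforward stages of the telecloning circuits can be reproduced in a small noiseless statevector simulation. The sketch below is illustrative only: it writes down the 1 → 2 telecloning state with one ancilla qubit directly, in the form commonly given in the telecloning literature (an assumption here, since the state is not spelled out in this section), instead of preparing it with the Dicke state unitaries used in the actual circuits. Qubit ordering, function names, and the message parameterization are likewise choices made for this example.

```python
import numpy as np

def message_state(theta, phi):
    """R_z(phi) R_y(theta)|0>, up to a global phase."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def telecloning_state():
    """Assumed 1 -> 2 telecloning state, tensor axes (port, ancilla, clone1, clone2)."""
    phi0 = np.zeros((2, 2, 2))
    phi1 = np.zeros((2, 2, 2))
    phi0[0, 0, 0] = np.sqrt(2 / 3)
    phi0[1, 0, 1] = phi0[1, 1, 0] = np.sqrt(1 / 6)
    phi1[1, 1, 1] = np.sqrt(2 / 3)
    phi1[0, 0, 1] = phi1[0, 1, 0] = np.sqrt(1 / 6)
    return np.stack([phi0, phi1]) / np.sqrt(2)

def teleclone_fidelities(theta, phi):
    """Return {(message_bit, port_bit): (probability, F_clone1, F_clone2)}."""
    msg = message_state(theta, phi)
    # Stage 1: message qubit plus telecloning state, axes (m, P, A, C1, C2).
    psi = np.einsum("m,pabc->mpabc", msg, telecloning_state())
    # Stage 2: CNOT (message controls port), then Hadamard on the message qubit.
    psi = np.stack([psi[0], psi[1][::-1]])
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    psi = np.einsum("nm,mpabc->npabc", H, psi)
    z = np.array([1.0, -1.0])
    results = {}
    for i in (0, 1):          # Stage 3: measured message bit
        for j in (0, 1):      # Stage 3: measured port bit
            branch = psi[i, j]
            prob = float(np.sum(np.abs(branch) ** 2))
            branch = branch / np.sqrt(prob)
            # Stage 4: feedforward on the clone qubits only, X first and then Z.
            if j == 1:
                branch = branch[:, ::-1, ::-1]
            if i == 1:
                branch = branch * z[None, :, None] * z[None, None, :]
            # Reduced density matrices of each clone (ancilla is discarded).
            rho1 = np.einsum("acd,aed->ce", branch, branch.conj())
            rho2 = np.einsum("adc,ade->ce", branch, branch.conj())
            f1 = float(np.real(msg.conj() @ rho1 @ msg))
            f2 = float(np.real(msg.conj() @ rho2 @ msg))
            results[(i, j)] = (prob, f1, f2)
    return results

for outcome, (prob, f1, f2) in teleclone_fidelities(np.pi / 5, np.pi / 5).items():
    print(outcome, round(prob, 4), round(f1, 4), round(f2, 4))
```

In this idealized model each of the four Bell measurement outcomes occurs with probability 1/4, and after the conditional Pauli corrections both clones have fidelity 5/6 with the message state, independent of the message. The tomography stages (basis change and final measurement) are replaced here by direct access to the reduced density matrices.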