Artificial Intelligence-Assisted Design and Virtual Diagnostic for the Initial Condition of a Storage-Ring-Based Quantum Information System

Developments in Artificial Intelligence (AI) are helping to solve complex physical problems that may otherwise be too computationally demanding for traditional approaches. Universal Approximation Theorems tell us that we can model any physical system if we can approximate the system with some continuous function (i.e., compact convergence topology and algorithmically generated sets of functions, such as the convolutional neural network), whether for an arbitrary-depth or arbitrary-width neural network. We consider the problem of solving a set of $N$ coupled algebraic equations as $N$ becomes very large and apply machine learning (ML) to solve this problem for any value of $N$. The physical problem we are focusing on is to model the equilibrium positions of ions in an ion trap. A storage ring quantum computer could contain well over tens of thousands of ions. Quickly determining the equilibrium positions will be important to minimize the time to target and observe each ion. As each ion serves as a single qubit, this is important for setting and measuring the individual qubit states. The phonon modes from a collection of ions act as another qubit, useful for gate operations. Measuring the phonon modes, where ions are oscillating around their respective equilibrium positions, also requires knowing the equilibrium positions very well. Turning all of this into a virtual diagnostic allows real-time prediction and comparison to ensure unique definition of each ion.


I. INTRODUCTION
Quantum information systems (QIS) refers to developing technologies, such as quantum communication and quantum computing (QC) [1]. QIS promises to revolutionize our current communication and computational paradigms; this has prompted a race between countries and institutions alike towards building practical QIS. The United States, for example, launched in 2018 a national initiative on developing technologies that will help enable QIS [2]. The technological leap enabled by QIS comes from harnessing the quantum properties of atomic, electronic, and photonic systems to perform calculations that are virtually impossible even with state-of-the-art computational systems.
The associate editor coordinating the review of this manuscript and approving it for publication was Derek Abbott.
In classical computing, information is encoded and streamlined as binary bits that can exist in one of two mutually exclusive states, generally represented by the digits 0 and 1. More complex logical gates can be formed from this binary set to process complex information and to perform calculations. The computational power of classical computers comes from their ability to perform numerous consecutive tasks very fast. Quantum computing is fundamentally different in that it exploits the principle of superposition. The principle of superposition in quantum mechanics (QM) is conceptually different than in classical physics [3]. Here we describe it in the context of the QM mathematical framework. In QM, it is postulated that measurements can affect the state of the system and that they are mathematically described by Hermitian operators [4]. The operator $M$ has eigenvectors $|m_k\rangle$ and eigenvalues $m_k$. If the general state of a system is represented by the state-vector $|\psi\rangle$ and we perform the measurement $M$, then it is useful to re-write the initial state as a linear combination of the eigenvectors of $M$ [4]:

$$|\psi\rangle = \sum_k a_k |m_k\rangle. \tag{1}$$

Performing the measurement $M$ on the system will force the initial state-vector to "collapse" into $|m_k\rangle$ with probability proportional to the modulus squared of the coefficient $a_k$. Equation (1) can be interpreted as the original state of the system existing as a superposition of all the possible states corresponding to the measurement $M$, the eigenvectors of $M$. In the context of QC, a quantum bit, or qubit, exists as a superposition of the normal states represented by $|0\rangle$ and $|1\rangle$. Measuring the qubit will result in either 0 or 1. Note how this situation is different from classical computing, where the state of the classical bit is either 0 or 1 at any specific time, but not both of them. QM deals with the smallest physical systems: atoms, electrons and photons [5].
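As a small illustration of the measurement postulate described above, the following sketch expands a state in the eigenbasis of a Hermitian operator and returns the collapse probabilities $|a_k|^2$. The function name `measurement_probabilities`, the choice of the Pauli-Z matrix as the measurement, and the equal-superposition test state are assumptions made for this example only.

```python
import numpy as np


def measurement_probabilities(operator, state):
    """Return the eigenvalues m_k of a Hermitian operator and the
    probabilities |a_k|^2 of collapsing `state` onto each eigenvector."""
    operator = np.asarray(operator, dtype=complex)
    state = np.asarray(state, dtype=complex)
    if not np.allclose(operator, operator.conj().T):
        raise ValueError("measurement operator must be Hermitian")
    state = state / np.linalg.norm(state)  # normalize |psi>
    # eigh returns real eigenvalues (ascending) and eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(operator)
    coefficients = eigenvectors.conj().T @ state  # a_k = <m_k|psi>
    return eigenvalues, np.abs(coefficients) ** 2


# Illustrative qubit: equal superposition of |0> and |1>, measured with Pauli-Z.
pauli_z = np.array([[1, 0], [0, -1]])
plus_state = np.array([1, 1]) / np.sqrt(2)
outcomes, probabilities = measurement_probabilities(pauli_z, plus_state)
```

For this state the two outcomes are equally likely, which is the qubit analogue of "either 0 or 1" upon measurement.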
Consequently, different QIS can be built based on different qubits, e.g., electronic spin, photon polarization, and trapped ions. For a thorough review on the current state of QC technology, see [6]. This work concerns a particular QIS: the storage ring quantum computer (SRQC) [7], which uses an unprecedentedly long chain of ions as qubits.

A. THE STORAGE RING QUANTUM COMPUTER
Ions have been trapped and studied for years using Paul [8], [9] and Penning traps [10]. These experiments use electromagnetic fields and laser pulses to isolate and manipulate ions [11]. Similarly, storage rings use electromagnetic fields to accumulate and to store beams of charged particles in a closed trajectory. The SRQC is effectively an ion trap based QC that can potentially store thousands of ions in a rotating frame. Using ions as qubits is particularly interesting since they have longer coherence time 2 [7]. The SRQC has many advantages over conventional ion traps, e.g., it can store a significantly higher number of ions and it opens up the possibility of storing multiple ion chains.
The SRQC concept has two pairs of inner electrodes to tailor a quadrupolar electric field that helps to focus the beam transversely. In the longitudinal direction, the beam is not bounded, but radiofrequency potentials can be introduced to break the beam into many parts, opening the possibility of parallel computing [7]. The dynamics of individual ions in the beam are driven by external fields and by interactions with other ions. The statistical motion of the ions forming the beam is the thermal energy of the beam [12]. For SRQC to be feasible, we need to reduce the beam thermal energy to a level where it is possible to identify and manipulate the external and internal quantum states of individual ions. Laser pulses can be used for beam cooling. For example, Doppler cooling [13], a process in which an ion absorbs and re-emits photons, results in a lower energy state of the ion [11]. By cooling the ion beam below the Doppler limit, but not to the point of the Lamb-Dicke limit [7], the thermal motion is reduced so that the dominant interaction force is the Coulomb interaction, and a crystalline beam is formed [14]. If the crystalline beam is cooled enough and it is possible to identify phonon modes, then an Ion Coulomb Crystal (ICC) is formed. The SRQC will store an ICC, and use both the external and internal quantum states as qubits. The external modes are the phonon modes, and the internal modes are the hyperfine spin states [7].
The use of a storage ring for trapping an ion crystalline beam was investigated at LMU Munich in 2002. The experiment, PAul Laser cooLing Accelerator System (PALLAS), stored a beam of $^{24}$Mg$^{+}$ and cooled it enough to produce a crystalline beam [15]. For the SRQC, we expect to produce homogeneous beams of $^{24}$Mg$^{+}$ or $^{7}$Li$^{+}$ [7], and cool them to form an Ion Coulomb Crystal (ICC). 3

B. ARTIFICIAL INTELLIGENCE FOR SRQC
The use of artificial intelligence (AI) in support of experimental facilities has been identified as a competitive technology [16]. The use of AI-enabled controllers, for example, has resulted in enhanced beam quality and performance at large-scale particle accelerators [17]-[19]. Here we discuss possible applications of AI to SRQC [20]-[23], particularly on manipulating an unprecedentedly large number of ions and on storing multiple ICC. One example of a task that the SRQC should perform regularly is a reset of the ICC to its baseline state: if the ICC is in an arbitrary excited state, the lasers should be accurately triggered to lower the energy state of the ICC, i.e., phonon excitations are suppressed and individual ions return to their equilibrium positions.
2 Since it is virtually impossible to completely isolate the qubits from the rest of the Universe, quantum decoherence is a limitation of QC and there is an overarching quest to achieve longer coherence times.
3 The ICC abbreviation will be used to denote either single or multiple crystals.
VOLUME 10, 2022
Random variations in the different sub-systems of the SRQC will produce a different ICC every time. The simplest difference is in the number of ions forming the ICC. We envision a physics-informed [24] virtual diagnostic tool [25], [26] that can accurately predict when a specific ion is approaching the SRQC window so that the corresponding laser can be triggered to excite the ion. A virtual diagnostic tool can be enabled by multiple supervised Machine Learning (ML) algorithms, such as Neural Networks (NN). The accuracy of the ML model depends on the quality of the data used for training, i.e., there should be enough data that correctly represents the expected solutions, particularly as the number of ions becomes large.

II. ION COULOMB CRYSTAL
In the crystalline beam state, individual ions are locked into their corresponding equilibrium positions and can exhibit vibrations around this point. In the ICC state, the ion vibrations are small enough that the phonon modes of the crystal become dominant. The phonon modes are important for QC [27]. In general, the ICC is free to form a crystal in 2 or 3 dimensions and exhibits topological phase transitions [28]-[30]. If the ions are strongly bound by the transverse fields, then the ions form a longitudinal 1D ICC, or an ion chain, where the distance between adjacent ions is determined only by the trap and Coulomb potentials. Qualitatively, we expect the behavior of the ion chain to be as follows: ions near the center of the chain experience a quasi-symmetric Coulomb interaction from both sides and become nearly equidistant, separated by a minimum distance $u_{\min}$. However, the ions closer to the edges of the chain are less tightly bound by the Coulomb interaction and are spaced farther apart. We will restrict the rest of our discussion to the 1D ICC, which is the ideal ICC case to have in SRQC.

A. THE EQUILIBRIUM POSITIONS OF AN ION CHAIN
In the SRQC, it will be useful to distinguish between individual ions in the ICC. We use ordered labels: if $z_k(t)$ is the position of ion $k$ in the ICC, then $z_{k+1}(t) > z_k(t)$. The potential $V$ for an ion chain with $N$ ions is [9], [30]:

$$V = \sum_{k=1}^{N} \frac{1}{2} m \omega^2 z_k(t)^2 + \sum_{\substack{k,j=1 \\ k \neq j}}^{N} \frac{e^2}{8\pi\epsilon_0} \frac{1}{|z_k(t) - z_j(t)|}, \tag{2}$$

where the first term corresponds to the trap potential and the second term to the Coulomb interaction between any pair of ions. In (2), $m$ is the mass of the ions, $e$ the fundamental electric charge, $\epsilon_0$ the permittivity of free space, and $\omega$ the trap frequency, which quantifies the strength of the trap potential [9]. The equilibrium positions of the ions in the chain, where $z_k^0$ corresponds to the equilibrium position of ion $k$, are such that they minimize the potential (2), i.e.,

$$\left[ \frac{\partial V}{\partial z_k} \right]_{z_k = z_k^0} = 0. \tag{3}$$

When evaluating explicitly for the potential $V$ in (3), it is convenient to introduce a length scale $l$ that groups all the constants, where $l^3 = e^2 / (4\pi\epsilon_0 m \omega^2)$. This results in dimensionless coordinates

$$u_k = z_k^0 / l. \tag{4}$$

For a given $N$, the equilibrium position $u_k$ of ion $k$ in the ICC can be determined by simultaneously solving the nonlinear system of algebraic equations:

$$u_k - \sum_{j=1}^{k-1} \frac{1}{(u_k - u_j)^2} + \sum_{j=k+1}^{N} \frac{1}{(u_k - u_j)^2} = 0, \tag{5}$$

for $k = 1, \ldots, N$.
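The following sketch evaluates the quantities defined in this subsection: the length scale $l$, the potential in dimensionless form, and the left-hand side of the equilibrium equations (5). It is meant as an illustration under stated assumptions, not as the code used for this work: the function names are hypothetical, and the example trap frequency of $2\pi \times 1$ MHz and the mass of about 23.985 u for $^{24}$Mg$^{+}$ are values chosen only to obtain a representative number.

```python
import numpy as np

E_CHARGE = 1.602176634e-19       # fundamental charge e, in C
EPSILON_0 = 8.8541878128e-12     # permittivity of free space, in F/m
ATOMIC_MASS = 1.66053906660e-27  # atomic mass unit, in kg


def length_scale(mass_kg, omega):
    """Length scale l, from l^3 = e^2 / (4 pi eps0 m omega^2)."""
    return (E_CHARGE**2 / (4 * np.pi * EPSILON_0 * mass_kg * omega**2)) ** (1 / 3)


def dimensionless_potential(u):
    """Potential (2) divided by m omega^2 l^2, in the coordinates u = z / l.
    Each ion pair is counted once, which absorbs the factor 1/2 in (2)."""
    u = np.asarray(u, dtype=float)
    first, second = np.triu_indices(len(u), k=1)
    return 0.5 * np.sum(u**2) + np.sum(1.0 / np.abs(u[first] - u[second]))


def equilibrium_residual(u):
    """Left-hand side of (5) for every ion k; zero at equilibrium."""
    u = np.asarray(u, dtype=float)
    diff = u[:, None] - u[None, :]   # diff[k, j] = u_k - u_j
    np.fill_diagonal(diff, np.inf)   # drop the j = k term
    return u - np.sum(np.sign(diff) / diff**2, axis=1)


# Assumed example values: 24Mg+ in a trap with omega = 2 pi x 1 MHz.
l_example = length_scale(23.985 * ATOMIC_MASS, 2 * np.pi * 1.0e6)

# For N = 2, (5) reduces to a^3 = 1/4 with the ions at -a and +a.
a_two_ions = 0.25 ** (1 / 3)
residual_two_ions = equilibrium_residual([-a_two_ions, a_two_ions])
```

With these assumed values the length scale comes out at a few micrometers, and the two-ion check confirms that the residual vanishes at the analytic solution.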

B. NUMERICAL SOLUTION
We solve (5) using a nonlinear equation solver based on a minimization of least squares with the Levenberg-Marquardt method [31]. For representative values of $N$, Fig. 1 shows the corresponding solutions $u_1, \ldots, u_N$, where the vertical axis is the equilibrium position $u_k$, and there is a common horizontal axis for comparison. Fig. 2 shows the normalized equilibrium positions $u_k / u_{\max}$, with $u_{\max}$ the position of the last ion in the chain. Fig. 3 shows the separation between adjacent ions, where the ions near the ICC center converge to a minimum separation value [9]

$$u_{\min} \approx \frac{2.018}{N^{0.559}},$$

and the ions near the edge of the chain are spaced farther apart from each other. Finally, Fig. 4 shows the time needed to solve (5) as a function of $N$.
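A minimal sketch of this kind of numerical solution is given below, assuming SciPy's `least_squares` routine with `method="lm"` as the Levenberg-Marquardt solver; the solver package actually used for this work is the one cited in [31]. The function names and the evenly spaced starting guess are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import least_squares


def equilibrium_residual(u):
    """Left-hand side of (5) for every ion k; zero at equilibrium."""
    diff = u[:, None] - u[None, :]   # diff[k, j] = u_k - u_j
    np.fill_diagonal(diff, np.inf)   # drop the j = k term
    return u - np.sum(np.sign(diff) / diff**2, axis=1)


def solve_equilibrium(n_ions):
    """Solve (5) for n_ions ions with a Levenberg-Marquardt least-squares fit."""
    # Assumed starting point: ordered, evenly spaced ions spanning +/- sqrt(N).
    guess = np.linspace(-1.0, 1.0, n_ions) * np.sqrt(n_ions)
    result = least_squares(equilibrium_residual, guess, method="lm")
    if not result.success:
        raise RuntimeError(f"solver did not converge: {result.message}")
    return np.sort(result.x)


u_three = solve_equilibrium(3)   # analytic solution: 0 and +/- (5/4)^(1/3)
u_ten = solve_equilibrium(10)
center_spacing = np.diff(u_ten).min()
spacing_estimate = 2.018 / 10**0.559
```

For small chains the result can be checked against the closed-form solutions for $N = 2$ and $N = 3$, and the smallest spacing for $N = 10$ can be compared with the estimate for $u_{\min}$ quoted above.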
This numerical approach quickly becomes impractical as the computation time grows as $\sim N^3$, which is expected for matrices with $N^2$ elements. 5 This motivates the exploration of alternative methods to determine the equilibrium positions of ions in an ICC. This is important for SRQC, which will manipulate long ICC formed by thousands of ions.
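One simple way to check a scaling claim of this kind is a straight-line fit in log-log space, sketched below. The timing values in the example are synthetic numbers generated from an exact cubic law purely to exercise the function; they are not the measurements shown in Fig. 4, and the function name is hypothetical.

```python
import numpy as np


def fit_power_law(n_values, times):
    """Fit times ~ prefactor * N**exponent by linear regression on logarithms."""
    slope, intercept = np.polyfit(np.log(n_values), np.log(times), 1)
    return slope, np.exp(intercept)


# Synthetic placeholder data following an exact N^3 law (not measured timings).
synthetic_n = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
synthetic_times = 2.0e-7 * synthetic_n**3
exponent, prefactor = fit_power_law(synthetic_n, synthetic_times)
```

Applied to measured solve times, an exponent close to 3 would be consistent with the cubic growth described above.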
For this work, we used two different CPU computation systems to expedite the calculations: a workstation (Intel Core i9, 8 cores, 3.6 GHz), and the THETA supercomputer at the Argonne Leadership Computing Facility (ALCF) 6 (Intel Xeon Phi 2nd generation, various core allocations). We find similar computation times with these two processors, as shown in Fig. 4.
5 Matrix inversion typically needs $O(n^3)$ floating point operations, where $n$ is the dimension of the $n \times n$ matrix.
6 The ALCF is a U.S. Department of Energy User Facility. It provides an invaluable computational resource that enables a multidisciplinary scientific and engineering community with the expertise and supercomputing resources to support large-scale projects in order to solve some of the world's most complex and challenging scientific problems, including those involving artificial intelligence.

III. RAPID CALCULATION OF THE ICC BASELINE STATE USING A NEURAL NETWORK
One alternative approach to solve (5) faster is to use ML tools such as NN. NN are non-linear mappings between the input variables and the target output [32]. NN have a fixed topology 7 with layers of interconnected unit cells. Each unit cell corresponds to a non-linear function, called the activation function, and the connections between unit cells are characterized by a weight. Training a NN means finding appropriate weight values so that the NN correctly reproduces the target output of the training data examples [32]. Once trained, the NN should accurately predict generalized solutions not present in the training set. To populate a training data set for the NN, we solve (5) for different cases of $N$ using numerical methods [31]. For the NN to learn the relevant data features, the training data set needs to be representative of the range of solutions that are of interest for the SRQC.
The solution vector corresponding to an ICC formed by $N$ ions has $N$ coordinates. This means that the solutions for different values of $N$ in the training set have different sizes and cannot be directly accommodated by a fixed-size NN.
A way around this is to introduce a fitting model with a fixed number of parameters to approximate the numerical solutions. With this approach, the NN predicts the fitting parameters corresponding to $N$. Fig. 5 illustrates the approach for rapid calculation of the equilibrium state using NN and a fitting model. This approach effectively reduces the calculation time at the cost of an approximation error. The approximated solution should be acceptable as long as the following physical condition is met: the error on the approximated solution needs to be smaller than the physical distance between adjacent ions [27]. This ensures that the right ion is excited with the laser pulse.
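The acceptance condition above can be written as a short check, sketched here under one possible reading: the error of each ion is compared with the distance to its nearest neighbor in the reference solution. The function names are hypothetical, and the three-ion positions used in the example are the analytic solution of (5) for $N = 3$.

```python
import numpy as np


def nearest_neighbor_spacing(u_sorted):
    """Distance from each ion to its closest neighbor in an ordered chain."""
    gaps = np.diff(u_sorted)
    left = np.concatenate(([np.inf], gaps))    # first ion has no left neighbor
    right = np.concatenate((gaps, [np.inf]))   # last ion has no right neighbor
    return np.minimum(left, right)


def approximation_is_acceptable(u_reference, u_approx):
    """True if every ion's position error is below its nearest-neighbor spacing."""
    u_reference = np.asarray(u_reference, dtype=float)
    u_approx = np.asarray(u_approx, dtype=float)
    error = np.abs(u_approx - u_reference)
    return bool(np.all(error < nearest_neighbor_spacing(u_reference)))


# Three-ion chain: ions at 0 and +/- (5/4)^(1/3) in dimensionless units.
edge = 1.25 ** (1 / 3)
reference = np.array([-edge, 0.0, edge])
small_error_ok = approximation_is_acceptable(reference, reference + 0.1)
large_error_ok = approximation_is_acceptable(reference, reference + 1.2)
```

A stricter variant could require the error to stay below half the spacing, so that an approximated position is always closest to the intended ion; which margin is appropriate depends on the laser spot size and is not specified here.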

A. TRAINING ON A PARAMETERIZED MODEL
Here we describe a NN implementation using PyTorch [33] that quickly produces the solutions to (5) given a limited amount of explicitly calculated solutions. This NN takes the number of ions $N$ as input and returns a set of five fitting parameters. We exploit the symmetry of the solutions, see Fig. 1 or Fig. 2, by training the NN to predict the five fitting parameters that accurately fit the positive half of the solution, reducing the NN training time.
FIGURE 5. Calculate the solution to (5) using a numerical solver, then introduce a fitting model that describes the solutions through a fixed set of fitting parameters. Finally, train a NN to predict the fitting parameters corresponding to solutions that have not been calculated numerically.
7 The topology of a NN is described by hyper-parameters: the number of layers, the number of unit cells per layer, and the types of activation function.

1) MODEL FITTING
We use the non-linear least-squares method fitting package lmfit [34] to approximate the positive side of the solutions shown in Fig. 2. We use a model with quadratic and exponential terms to fit the non-linear behavior far from the chain center. Fig. 6 shows the resulting normalized fitting parameters as a function of the number of ions N . Using a polynomial fit alone is not ideal since it gives oscillating values around the points at the edge of the curve, which increases the error in the fitting. Similarly, trying to extrapolate to different values of N from the model g N alone does not perform as well as using NN [35]. The NN is effectively a model describing more complex features than the fitting model g N .
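The fitting step can be sketched as follows. Two substitutions are made here and should be read as assumptions: SciPy's `curve_fit` stands in for the lmfit package used in this work, and `toy_model` is a hypothetical five-parameter function with quadratic and exponential terms, not the model $g_N$ of the paper. The horizontal coordinate (ion index divided by the number of ions in the positive half) and the parameter bounds are also illustrative choices.

```python
import numpy as np
from scipy.optimize import curve_fit, least_squares


def equilibrium_residual(u):
    """Left-hand side of (5) for every ion k; zero at equilibrium."""
    diff = u[:, None] - u[None, :]
    np.fill_diagonal(diff, np.inf)
    return u - np.sum(np.sign(diff) / diff**2, axis=1)


def positive_half(n_ions):
    """Normalized positive half of the solution to (5) for an even n_ions."""
    guess = np.linspace(-1.0, 1.0, n_ions) * np.sqrt(n_ions)
    u = np.sort(least_squares(equilibrium_residual, guess, method="lm").x)
    half = u[u > 0]
    x = np.arange(1, len(half) + 1) / len(half)  # normalized ion index
    return x, half / half[-1]                    # u_k / u_max


def toy_model(x, a, b, c, d, f):
    """Hypothetical stand-in: linear + quadratic + exponential edge term."""
    return a * x + b * x**2 + c * np.exp(d * (x - 1.0)) + f


def fit_positive_half(n_ions):
    x, y = positive_half(n_ions)
    start = [1.0, 0.0, 0.1, 1.0, 0.0]
    lower = [-10.0, -10.0, -10.0, 0.0, -10.0]
    upper = [10.0, 10.0, 10.0, 50.0, 10.0]
    params, _ = curve_fit(toy_model, x, y, p0=start,
                          bounds=(lower, upper), max_nfev=5000)
    return x, y, params


x_half, y_half, fitted_params = fit_positive_half(30)
fit_error = np.max(np.abs(toy_model(x_half, *fitted_params) - y_half))
```

Repeating this fit over many values of $N$ yields the table of five parameters per $N$ that serves as the training target for the NN.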

2) DATA PROCESSING
After fitting the solutions in the training data set according to the model (8), we find that some data points change significantly with respect to the general trend. If this change is more than 50% relative to the adjacent point, we remove the point from the data set. The resulting data set is then split randomly so that 70% of the data points are used for training, and the remaining 30% are used for validation of the NN, i.e., to test that the trained NN model correctly predicts the examples in the validation set. Fig. 6 also shows which data points were used for training and for validation.
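The two processing steps can be sketched as below. The interpretation of "adjacent point" as the previous retained point, the function names, and the fixed random seed are assumptions made for this illustration; the sample values are invented to show one point being rejected.

```python
import numpy as np


def keep_smooth_points(values, max_relative_jump=0.5):
    """Indices of points whose change relative to the previous retained
    point does not exceed max_relative_jump (50% by default)."""
    values = np.asarray(values, dtype=float)
    if len(values) == 0:
        return np.array([], dtype=int)
    kept = [0]
    for index in range(1, len(values)):
        reference = values[kept[-1]]
        scale = max(abs(reference), np.finfo(float).tiny)  # avoid dividing by zero
        if abs(values[index] - reference) / scale <= max_relative_jump:
            kept.append(index)
    return np.array(kept)


def split_train_validation(n_points, train_fraction=0.7, seed=0):
    """Random split of point indices into training and validation sets."""
    order = np.random.default_rng(seed).permutation(n_points)
    n_train = int(round(train_fraction * n_points))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


# Invented example: the third value jumps by about 90% and is dropped.
example_parameter = [1.0, 1.05, 2.0, 1.1, 1.15]
kept_indices = keep_smooth_points(example_parameter)
train_indices, validation_indices = split_train_validation(10)
```

In practice the filter would be applied to each of the five fitting parameters, and a value of $N$ would be discarded if any of its parameters is rejected; that policy is a design choice not stated in the text.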

3) NEURAL NETWORK AND PREDICTIONS
We propose a feed-forward NN with a single hidden layer: the single input $N$ feeds a hidden layer with 64 unit cells, followed by an output layer with 5 output cells, one for each of the fitting parameters $(a_N, b_N, c_N, d_N, f_N)$. We choose the hyperbolic tangent tanh as the activation function. This gives almost identical results to the sigmoid function, while the commonly used Rectified Linear Unit (ReLU) function is not well suited for shallow networks. The loss function to be minimized is the mean square error (MSE), and we use the built-in Adam method for optimization [36], which generally works well in regression problems. Additionally, we tried a NN with two hidden layers, which results in longer training time and no significant improvement in prediction.
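A minimal PyTorch sketch of the architecture described above is shown here: one input, a hidden layer of 64 tanh units, five outputs, MSE loss, and the Adam optimizer. Everything else is an assumption for illustration, including the rescaling of the input by the largest $N$, the learning rate, the number of epochs, and in particular the training targets, which are smooth placeholder curves rather than the fitted parameters of this work.

```python
import torch
from torch import nn


def build_network(hidden_units=64, n_outputs=5):
    """Single hidden layer with tanh activation, as described in the text."""
    return nn.Sequential(
        nn.Linear(1, hidden_units),
        nn.Tanh(),
        nn.Linear(hidden_units, n_outputs),
    )


def train(network, inputs, targets, epochs=300, learning_rate=1e-2):
    """Minimize the mean square error with Adam; returns the loss history."""
    optimizer = torch.optim.Adam(network.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()
    history = []
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(network(inputs), targets)
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return history


torch.manual_seed(0)
n_ions = torch.linspace(10.0, 2000.0, 50).unsqueeze(1)
inputs = n_ions / n_ions.max()  # assumed input rescaling to (0, 1]
# Placeholder targets: five smooth functions standing in for (a, b, c, d, f).
targets = torch.cat(
    [inputs, inputs**2, torch.sqrt(inputs), torch.exp(-inputs), torch.ones_like(inputs)],
    dim=1,
)
network = build_network()
history = train(network, inputs, targets)
with torch.no_grad():
    prediction = network(torch.tensor([[1000.0]]) / n_ions.max())
```

The prediction for a requested $N$ is a row of five numbers, which would then be inserted into the fitting model to reconstruct the positive half of the solution.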
Once the NN model is trained, we can make predictions on the fitting parameters that correspond to the input $N$ of our choice. Figs. 7 and 8 show the results for $N = 1000$ and $N = 2000$, respectively. The top plot in both figures shows the numerical solution, compared to the resulting approximated solution that uses the NN-predicted fitting parameters. The bottom plots in both figures show the three errors, defined as the following simple differences: between predicted and true, between fitted and true, and between fitted and predicted solutions. The errors are larger at the edge of the chain and smaller at the center. Similarly, the error is larger when the fitted model is used, i.e., when reconstructing the real solution. Fig. 9 shows the error distribution for the two examples of reconstructed solutions. The corresponding sums of squared errors for $N = 1000$ and $N = 2000$ are 3.41 and 8.65, respectively. This error becomes larger for increasing values of $N$, possibly due to less available training data in this regime.
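The error measures discussed above can be computed with a few lines, sketched here with hypothetical names. The sign convention of each difference is an assumption, and the three short arrays are invented numbers used only to exercise the function; they are unrelated to the values reported for $N = 1000$ and $N = 2000$.

```python
import numpy as np


def error_report(true_solution, fitted_solution, predicted_solution):
    """Pointwise differences between the three solutions and their
    sums of squared errors."""
    true_solution = np.asarray(true_solution, dtype=float)
    fitted_solution = np.asarray(fitted_solution, dtype=float)
    predicted_solution = np.asarray(predicted_solution, dtype=float)
    differences = {
        "predicted - true": predicted_solution - true_solution,
        "fitted - true": fitted_solution - true_solution,
        "fitted - predicted": fitted_solution - predicted_solution,
    }
    squared_sums = {name: float(np.sum(diff**2)) for name, diff in differences.items()}
    return differences, squared_sums


# Invented three-ion example.
differences, squared_sums = error_report(
    true_solution=[0.0, 1.0, 2.0],
    fitted_solution=[0.0, 1.1, 2.1],
    predicted_solution=[0.1, 1.1, 1.9],
)
```

Here "true" denotes the numerical solution of (5), "fitted" the model evaluated with directly fitted parameters, and "predicted" the model evaluated with NN-predicted parameters.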

IV. DISCUSSION
Quantum computers based on ion traps are limited by the small number of ions that can be used as qubits. The storage ring quantum computer can potentially store thousands of ions to form an Ion Coulomb Crystal in a closed trajectory. For SRQC, we anticipate unique challenges associated with the unprecedentedly large number of ions in the ICC. One example is timing the laser pulses to efficiently cool the ICC beam to its equilibrium energy configuration, from which ion manipulations can be initiated.
The equilibrium positions of the ICC can be computed numerically for an arbitrary $N$, but the calculations quickly become lengthy and impractical as $N \gg 1$, taking from hours to days. Here we described a simple NN that is able to predict the equilibrium positions of ions in the ICC, given a limited amount of numerical solutions available for training. Our NN model can predict multiple solutions within the range of the existing training data. Training the NN takes approximately two hours on a laptop (Nvidia GeForce GTX 1050 GPU). With the NN trained, prediction of the fitting parameters and reconstruction of the equilibrium positions takes only a few seconds, thus proving a valuable resource for determining multiple solutions that have not been calculated using numerical methods, e.g., all the possible solutions in between the ones used for training the NN, which took long times as illustrated in Fig. 4. The error in the NN prediction grows as the requested solution drifts away from the training data set. We are looking into extrapolating the NN prediction range to numbers beyond the range of available training data. We are also interested in expanding our model to include deviations from the ideal 1D ICC.
this manuscript. Thanks also to Dr. Michael Papka and David Martin of the Argonne Leadership Computing Facility for expanding our knowledge of high-performance computing and for the invitation to take part in the AI for Science town hall [16] that sparked additional discussions on how to utilize the ALCF for the SRQC design. They also acknowledge Element Aero for computational resources.
CLIO GONZÁLEZ-ZACARÍAS received the B.S. degree (Hons.) in physics from the University of Puebla, Mexico, and the joint master's degree in complex systems science from the University of Warwick, England, and Gothenburg University, Sweden. She is currently pursuing the Ph.D. degree with the Neuroscience Graduate Program and the Viterbi School of Engineering, University of Southern California (USC). She accepted an internship at the Laboratory of Neuroimaging, USC, where she started doing research in the human brain using MRI. Specifically, she worked with epileptic patients and did post-processing of neuroimaging data using different computational techniques. Her current research is in collaboration with the Children's Hospital Los Angeles, looking into the microstructural white matter damage on patients with chronic anemia. In 2010, she received the Scholarship from the University of Puebla to work on her dissertation on quantum chaos. In 2012, she received the Full Scholarship from the European ERASMUS-MUNDUS Program to conduct her master's degree. She has 13 years of teaching experience in different organizations and three years of research experience at COMSATS University Abbottabad. She taught object oriented programming, introduction to computer, and digital image processing. Currently, she is working to train CNN for laser wave-front optimization and modeling the relationship between input and output parameters of the femtosecond laser mainly in order to optimize pulse temporal width. The model was developed in MATLAB by means of feed-forward back-propagation neural network using second-, third-, and fourth-order input phase, and hole position, width, and depth as input parameters. The output parameter was the pulse temporal width. Other output parameters can be modeled using the same approach. Furthermore, she worked as a Research Scholar with the University of Illinois Chicago, from 2017 to 2018, with Dr. Rashid Ansari. 
Her research interest includes digital image processing. She served as a Technical and Management Consultant on the successful FERMI free-electron laser project with the Sincrotrone Trieste. She has myriad archival and conference papers and technical documents, and holds two U.S. patents and an international trademark. She recently served as the Co-Lead for the Department of Energy's report-Basic Research Needs for Compact Accelerators for Security and Medicine for the Computing, Controls and Design Technical Group. Her interests are many and include particle accelerator systems; laser systems; the use of artificial intelligence in controls, modeling, and prediction of complex systems; sensors and detectors; and applications of these technologies in science, security, and defense. She is a fellow of the American Physical Society (APS) and SPIE, a Senior Member of the Optical Society of America (OSA), and a member of the Italian Optical Society (SIOF). In 2010, she was presented a Letter of Commendation by the Chief of Naval Research for her technical efforts. In 2018, she received the IEEE Nuclear and Plasma Sciences Society's Particle Accelerator Science and Technology Award. She serves as a reviewer for several journals and served on the Editorial Board of the Physical Review Accelerators and Beams (American Physical Society) and IEEE ACCESS. She recently served as a Senior Guest Editor for IEEE TRANSACTIONS ON NUCLEAR SCIENCE and an Associate Editor for IEEE PHOTONICS JOURNAL. She also serves as a Technical Reviewer on projects worldwide, including as the Chair for the Program Advisory Committee (APAC) of the Brookhaven National Laboratory's Accelerator Test Facility (ATF). 
She also serves as the Director of Knowledge Transfer for the National Science Foundation's Center for Bright Beams, Cornell University, and an Advisor for the Associate Laboratory Directorship for Physical Sciences at Los Alamos National Laboratory on accelerator science, technology, and engineering, and next generation x-ray light sources for national security applications. Furthermore, she served on a NATO Panel for sensors and electronics.
KEVIN BROWN (Senior Member, IEEE) is currently a Physicist with the Brookhaven National Laboratory (BNL); an Adjunct Professor with the Electrical and Computer Engineering Department, Stony Brook University; and the Head of the Control Systems for the Relativistic Heavy Ion Collider (RHIC), BNL. He is the co-inventor of the storage ring quantum computer, a quantum information system based on particle accelerator and particle beam storage ring technologies. He began his career in experimental high-energy spin physics and moved into accelerator physics when BNL's Alternating Gradient Synchrotron (AGS) was converted (in the mid-1980's) to provide polarized beams for targeting polarized protons on polarized targets to study spin analyzing power in high-energy collisions. He was a member of the AGS Accelerator Physics (AP) Group for many years where he was a member of the Design and Commissioning Team, NASA Space Radiation Laboratory (NSRL). He was responsible for AGS extraction systems for fixed target experiments, including the G-2 experiment that ran from the AGS. He was a part of the Spin Dynamics Team that designed and commissioned the polarized proton systems for RHIC and he was a member of the RHIC Commissioning Team. Before moving to lead the Control Systems for RHIC, he was a member of the Collider-Accelerator Department AP Group, serving as a Scheduling Physicist for RHIC and an RHIC Run Coordinator.
His current research interests include quantum information systems, the use of machine learning (ML) and artificial intelligence (AI) in design and simulation of accelerator systems, ML and AI in accelerator control systems, and developing the infrastructure for control system simulations.
TRUDY BOLIN received the B.S. degree in 1995 and the M.S. degree in physics from the Illinois Institute of Technology, in 1997. She is currently pursuing the Ph.D. degree with the Electrical and Computer Engineering Department, The University of New Mexico, Albuquerque, NM, USA. She has 12 years of synchrotron experience as a Beamline Scientist with the Advanced Photon Source, Argonne National Laboratory, and 18 years of experience at the facility. She served as a Senior Researcher with the Department of Electrical and Computer Engineering Group, Colorado State University, from 2015 to 2017. She has attended numerous sessions of the United States Particle Accelerator School. Since 2019, she has been a Research Scholar with the Electrical and Computer Engineering Department, The University of New Mexico. She is currently using numerical simulations to design RF cavities for hard x-ray free-electron lasers. She is serving as a member for the NA-PAC 2022 Local Organizing Committee.