Quantum Machine Learning for Next-G Wireless Communications: Fundamentals and the Path Ahead

A comprehensive coverage of the state-of-the-art in quantum machine learning (QML) methodologies, with a unique perspective on their applications to wireless communications, is presented. The paper begins with the fundamental principles of quantum computing, and then goes through the operations and techniques involved in QML deployments. Subsequently, it provides an in-depth look at various methods peculiar to quantum computing, such as quantum search algorithms, and discusses their potential for maximizing the performance of wireless systems. The integration of quantum-based learning models into existing machine learning methodologies, such as within the frameworks of unsupervised learning and reinforcement learning, is then examined. Taking the viewpoint of wireless communications, diverse studies in the literature that employ QML-based optimization methods are also highlighted. Finally, to ensure the applicability and feasibility of QML for optimizing wireless systems, potential solutions for deployment challenges are addressed.


I. INTRODUCTION
THE NEXT generation of wireless communication networks is expected to meet an unprecedented volume of demand for wireless as a main commodity in a plethora of economic sectors. In fact, not only is the number of network clients, including users' mobile terminals and Internet-of-Things (IoT) devices, expected to increase significantly in the upcoming years [1], but new use cases and services with stringent quality-of-service requirements will also continue to emerge. For instance, requirements of up to 1 terabit per second of peak data rate, over 1 gigabit per second of user data rate, and less than 1 millisecond of end-to-end latency will need to be supported by 6G [2], [3], [4]. Concurrently, the growing concern about the impact of energy consumption on the environment calls for a higher energy efficiency of wireless communication networks, mandating 6G to achieve a hundred times higher energy efficiency compared to that of the previous generation [3], [5].
These rigorous demands call for innovative approaches in radio access technologies. Among many promising directions, several technologies have been extensively explored to satisfy the service requirements of future wireless systems. In particular, the reconfigurable intelligent surface (RIS) has been regarded as a key enabling technology [6]. Chief among the envisioned advantages of RIS deployment is the enhancement of the coverage of base stations and access points by manipulating the radio propagation, mitigating the presence of blockages along the way. In addition, extra-large antenna arrays, each employing a massive number of antenna elements and thereby expanding the Fresnel zone [7], can enable diverse communication scenarios, such as those that exploit near-field communication links [8], [9]. Moreover, beyond-terrestrial networks allow various non-terrestrial communication modes to be utilized. For instance, aerial base stations, typically implemented via unmanned aerial vehicles (UAVs) [10], [11], can be considered as alternative transmission platforms due to their attractive benefits, including rapid deployment, flexible coverage, support for temporary events, infrastructure augmentation, and surveillance capabilities [12].

A. CHALLENGES RELATED TO EMERGING WIRELESS TECHNOLOGIES
With the emergence of these new wireless technologies come unprecedented challenges that need to be addressed to ensure successful deployment and operation. In the forthcoming discussion, we first outline overarching challenges that arise with the advent of emerging wireless technologies, followed by a discussion of specific solutions based on artificial intelligence (AI) that can effectively tackle these challenges.

1) TOWARDS NEXT-GENERATION WIRELESS COMMUNICATIONS
Currently, 5G is in its roll-out phase, whilst 6G is expected to be deployed by the 2030s [13]. In comparison to its 4G predecessor, 5G offers a three-fold increase in spectral efficiency, along with improved latency and reliability, allowing for massive user connectivity and a diverse range of services such as tactile Internet and augmented and virtual reality [14], [15]. As the next leap, the forthcoming 6G aims to achieve over a ten-fold increase in spectral efficiency and more than a ten-fold improvement in energy efficiency, compared to 5G [15]. With the advent of emerging technologies such as Terahertz communications, 6G is poised to possess a substantial advantage in terms of system bandwidth, with over 300 MHz of bandwidth [15]. 6G is also expected to support an even higher level of device connectivity, with a connection density of over $10^7$ devices per square km [14]. Even more fascinating, 6G will also change the paradigm of coverage, due to the advent of non-terrestrial networks, departing from area-based coverage towards 3D wireless coverage [14]. The aforementioned factors are the driving force behind the development of emerging technologies for various aspects of wireless communications, especially since the traffic growth will originate not only from users' equipment but also from machine-type communication devices [16]. Further elaboration on the vision of next-generation wireless communications is presented in [14], [15]. A summary of the evolution is illustrated in Fig. 1.

2) INTERPLAY BETWEEN EMERGING TECHNOLOGIES
The interplay between the aforementioned enabling technologies can provide performance gains in terms of capacity, energy efficiency, reliability, and more. For instance, previous studies [17], [18] have proposed integration scenarios involving RISs and non-orthogonal multiple access (NOMA) to enhance spectral efficiency via RIS-enabled signal partitions among different user terminals. However, using several technologies concurrently can result in a significant increase in the computational complexity associated with the optimization of the underlying parameters. For example, in the case of integrating RIS and power-domain NOMA, multiple parameters such as RIS phase shifting, NOMA user pairing, and NOMA power allocation need to be jointly optimized to minimize the inter-user interference and enhance the system performance.

3) GROWING SCALE OF THE WIRELESS SYSTEMS
Moreover, the expanded scale of wireless communication systems can lead to a rise in the signaling and computational overhead. This is especially true for RISs and extra-large antenna arrays, as they may require a significant number of pilot signals for precise channel estimation when conventional estimation techniques, e.g., maximum-likelihood approaches, are employed.

4) DYNAMIC WIRELESS COMMUNICATION ENVIRONMENTS
Furthermore, the ever-changing nature of wireless communication environments necessitates flexible resource allocation strategies. In addition, considering the growing number of wireless clients, including user terminals, IoT nodes, and smart vehicles, it is essential to effectively allocate the limited network resources for the massive number of network devices in real time, while ensuring communication reliability and satisfying the latency constraints in complex deployment scenarios [19].

B. AI FOR NEXT-GENERATION WIRELESS COMMUNICATIONS
In light of the challenges mentioned earlier, there is a growing need to embrace different approaches for optimizing wireless communication systems. Fortunately, in recent years, significant progress has been made in the fields of AI and machine learning (ML), accelerating the transformation of wireless communication systems into natively intelligent systems capable of overcoming various challenges [20], as discussed in the following sub-sections.

1) AI FOR THE INTEGRATION OF DIFFERENT WIRELESS TECHNOLOGIES
Firstly, AI can facilitate the interplay between various emerging wireless communication technologies, for instance in the design of the user grouping when leveraging the integration of RISs and NOMA [18]. In this context, the work in [21] employed ML to support the interplay between RISs and multiple-input multiple-output (MIMO) systems. Specifically, deep learning based approaches were utilized to optimize the RIS phase shifting and the MIMO hybrid precoders. In addition, AI can be applied to maximize the performance of heterogeneous wireless communication systems, in which different multiple access techniques, modulation schemes, and transmission technologies are used to meet the service demands of multiple user terminals.

2) AI TO TACKLE SCALABILITY ISSUES
Secondly, AI-based approaches can mitigate the scalability issue by providing estimations based on a limited amount of information. Deep learning, for example, can be employed in channel extrapolation, thereby reducing the need for a large number of pilot signals [22]. In addition, in distributed wireless communication networks that employ multiple access points, a decentralized or a distributed AI framework can be adopted to optimize the operation parameters of the network, thereby avoiding the computational and signaling bottlenecks that generally occur in a centralized AI framework [23].

3) AI FOR ADAPTING TO DYNAMIC ENVIRONMENTS
Thirdly, AI can serve as a key tool for adapting to the dynamic wireless landscapes while preserving satisfactory levels of performance, which can be a demanding task when performed via conventional analytical approaches [24]. For instance, in drone-based wireless communications, a moving UAV acting as an access point needs to dynamically adapt its transmit power and precoding in order to maintain coverage and minimize interference. For this purpose, AI can maintain a global observation of the wireless network, e.g., the coordinates of the network nodes, the instantaneous channel conditions, and the number of available transmitters, while accounting for changes in the communication network, e.g., the movements of the end-users' terminals and the variations of the propagation environment. The communication variables, such as the transmit precoding and the power control, can then be dynamically optimized with the aid of different AI-based approaches such as online learning and reinforcement learning [24].
Nevertheless, the computational complexity of a classical learning model/algorithm generally grows with the dimension of the input data, e.g., channel state information, as well as with the complexity of the learning model, e.g., the number of layers composing the model, and with the number of iterations [25]. Although a single inference upon a trained learning model could be processed within an acceptable computational time, the parameter training of the model may require a high number of iterations, each composed of a high number of inferences. This leads to an extended training duration and limits the applicability of high-dimensional learning models, especially for time-sensitive applications such as those of real-time wireless systems with ultra-reliable low latency communications (URLLC) constraints [26].

C. QUANTUM-BASED AI FOR NEXT-GENERATION WIRELESS COMMUNICATIONS
Quantum computing [27], [28], on the other hand, can provide considerable computational benefits to address the problems of next-generation wireless communications. In particular, the increasing numbers of devices and transceiver antennas lead to a substantial expansion of possible combinations, potentially causing computational overhead for signal detection and channel estimation, among other things. The 3D coverage, due to the deployment of non-terrestrial networks, might introduce challenges in beamforming, localization, and sensing. Beyond those factors, 6G development calls for consideration of a vast array of performance factors. To effectively manage future wireless systems, we will need to consider variables such as rate fairness, transceiver availability, energy consumption, and user throughput, among many others, all of which impose significant computational burdens.
In contrast to classical computations, which process a string of classical bits, each containing a value of a computational basis (either 0 or 1), quantum-based computations assume quantum bits, a.k.a. "qubits", as the smallest unit of computation, each representing a superposition of computational bases enabled by a property called quantum superposition. Via the processing of multiple qubits, quantum-based computations can leverage other quantum properties such as quantum entanglement, which allows the state of a qubit to alter the state of another qubit, and quantum parallelism, which enables simultaneous information processing using a number of inter-connected qubits [29]. The following discusses the motivation behind the utilization of quantum-based AI for wireless systems.

1) PERFORMANCE BENEFITS PRESENTED IN THE LITERATURE
From the viewpoint of quantum-based ML, the quantum properties can be leveraged to achieve various benefits. In particular, leveraging performance gains enabled by quantum computing, prior studies have shown that quantum-based learning models can attain faster training convergence [30]. Quantum-based ML methods have also been shown to yield more accurate predictions compared to their classical counterparts [31]. Furthermore, quantum-based ML models have showcased performance comparable to that of classical learning models while using fewer learning experiments [32].

2) AVAILABILITY OF QUANTUM PROCESSING PLATFORMS
The availability of general-purpose quantum processing units has supported recent studies on quantum-based ML.
In particular, IBM [33] and Google [34], among other companies, have successfully operated multi-qubit quantum processors. For instance, an IBM quantum computing processor named "Condor", slated for deployment in 2023 at the time of writing, is expected to be able to process more than a thousand qubits [35]. In parallel, D-Wave has made its specific-purpose quantum processing units commercially available since 2011 [36]. These milestones have spurred researchers to explore quantum-based ML on functioning quantum computing platforms, targeting various use cases. For example, a recent study [37] employed the D-Wave quantum computing platform for quantum annealing to optimize vector perturbation for transmit precoding in MIMO systems.
Nonetheless, in considering the potential benefits of quantum-based ML, we identified a noticeable scarcity of literature exploring the applications of quantum-based ML methodologies that are specifically tailored for the upcoming wireless systems, as detailed in the subsequent discussion.

D. EXISTING SURVEYS ON QML UTILIZATIONS IN WIRELESS SYSTEMS

1) QML FOR GENERAL APPLICATIONS
Prior survey works have covered various QML schemes and their application in various scenarios [39], [40], [41], [42]. The authors in [40] presented an overview of different quantum learning models, such as quantum neural networks and quantum perceptrons. The said study also covered alternative quantum learning frameworks such as quantum adiabatic learning, which employs continuous interference operation instead of discrete interference operation, consisting of sequential quantum gates. In [41], the authors explored various quantum-based models including quantum Hopfield networks, and focused on optimizing parameters for QML schemes. Besides, the authors in [42] discussed potential applications of QML in different scenarios such as object detection and control. However, the research landscape reveals a noticeable gap in the utilization of QML techniques for optimizing wireless communication systems. So far, the majority of examples in the literature have focused on quantum-based approaches applied to image processing tasks.

2) QML FOR NEXT-GENERATION WIRELESS SYSTEMS
Several surveys covered various applications of QML for the optimization of wireless communication systems. In particular, [27] provided an early outlook on how to employ QML for future wireless networks. Interestingly, the study also addressed the increase in computational requirements w.r.t. the size of learning models, e.g., in terms of the number of layers of a neural network. Nonetheless, readers might benefit from a more detailed explanation about how to apply QML for wireless applications, as certain aspects, such as input encoding, deviate from the conventions of classical ML. In [27], the authors discussed the potential use of ML methods for different layers of the wireless communication protocol stack, including the physical layer (e.g., RIS phase shifting and spectrum allocation), data-link layer (e.g., for latency constraint), network layer, and application layer, and described how QML could provide benefits such as the reduction of the computational complexity. As discussed in [28], it is worth considering the possibility of incorporating quantum-based optimization techniques as short- and long-term objectives in future wireless communication systems. To realize this, the possibility of integrating quantum computations and future radio access networks, e.g., for resource management, was also discussed in [28]. Meanwhile, the authors of [39] explored the potential role of QML in optimizing 6G communication systems, and the authors of [38] highlighted the potential applications of QML in improving channel estimation and enabling multi-user communications. A brief comparison between the survey content of this paper and the above-mentioned works is presented in Table 1.

E. THE PAPER'S CONTRIBUTIONS
The surveys mentioned above mainly focused on exploring the potential of utilizing QML in wireless communications. However, we assert that there is a pressing need for more comprehensive discussions that also delve into specific use cases to demonstrate the practicality and efficacy of QML for enhancing wireless communications. Building upon the aforementioned discussions, the contributions of this work can be summarized as follows:
• This study addresses the possible gains of employing QML in optimizing wireless communications while also discussing the existing limitations. Additionally, the study fills the research gap by addressing the lack of surveys in the current literature on the applications of QML specifically focused on optimizing wireless communications.
• To effectively convey the benefits of QML in the context of wireless communications, this paper presents a comprehensive tutorial designed to inspire researchers to apply QML techniques to maximize the performance of wireless systems. This paper covers various techniques in QML, including encoding methods for classical-valued inputs, strategies for obtaining QML outputs, and parameter optimization techniques. The fundamental principles of quantum computing, such as quantum superposition and quantum gates, are also discussed to assist readers who are drawn to this captivating research area.
• This study not only explores the potential of using QML for optimizing wireless communications but also demonstrates the implementation feasibility through various use cases. Deployment challenges and solutions are also discussed.
The remainder of the paper is structured as follows. In Section II, the basic concepts of quantum computation and QML are presented. In Section III, different types of QML-based optimization methods are discussed. In Section IV, various use cases of QML applications for wireless communications are addressed. In Section V, the challenges with applying QML to wireless communications are covered. The paper is concluded in Section VI.
Notation: The operations of norm, conjugate, transpose, and conjugate transpose are indicated by $\|\cdot\|$, $(\cdot)^*$, $(\cdot)^T$, and $(\cdot)^\dagger$, respectively. Vectors and matrices are denoted as bold lower- and upper-case characters, respectively. $\mathcal{N}(\mu, \sigma^2)$ and $\mathcal{CN}(\mu, \sigma^2)$ indicate normal and circular normal distributions, respectively, with mean $\mu$ and variance $\sigma^2$, while $\mathcal{U}(x_0, x_1)$ symbolizes the uniform distribution within $x_0$ and $x_1$. $\mathrm{vec}(\cdot)$ denotes the vectorization operator. The hyperbolic tangent operation is indicated by $\tanh(\cdot)$. The notation $\mathrm{Var}(\cdot)$ stands for the variance of a random variable. An identity matrix is denoted as $I$, and $\otimes$ denotes the Kronecker operator. Finally, $j = \sqrt{-1}$.
Key Terms: Here, we define four key paradigms of the quantum technology:
• Quantum Computing: Quantum computing exploits quantum bits, i.e., qubits, each of which can represent both 0 and 1 in superposition. When multiple qubits are involved, this property allows quantum systems to leverage quantum parallelism, enabling them to solve complex computation problems.

II. FUNDAMENTALS OF THE QUANTUM MACHINE LEARNING PROCESS

A. QUANTUM BITS, SUPERPOSITION, AND ENTANGLEMENT

1) QUANTUM BITS AND QUANTUM SUPERPOSITION
The fundamental distinction of quantum computing from classical computing lies in the utilization of qubits as the smallest units of computation. In contrast with a classical bit, which can only represent either the computational basis $|0\rangle = [1 \;\, 0]^T$ or $|1\rangle = [0 \;\, 1]^T$, each qubit can hold a superposition of the orthonormal bases $|0\rangle$ and $|1\rangle$. Therefore, as shown in Fig. 2, the state of each qubit can be represented in a multi-dimensional space, known as Hilbert space. In the case of binary representation [43], each n-th qubit can be represented as
$$|q_n\rangle = \rho_0 |0\rangle + \rho_1 |1\rangle = \cos(\theta/2)\,|0\rangle + e^{j\phi}\sin(\theta/2)\,|1\rangle, \quad (1)$$
where $|\rho_0|^2$ and $|\rho_1|^2$ are related to the probabilities of obtaining 0 and 1 during observation, respectively, and where $\theta$ and $\phi$ are the polar and azimuth angles for angular representation, respectively. Here, Dirac's ket notation, as in $|q_n\rangle$, represents the state in a column vector form, while Dirac's bra notation, such as $\langle q_n|$, corresponds to the conjugate transpose of the ket.
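As a minimal sketch of the angular representation, the snippet below builds the two amplitudes of a single qubit from the polar and azimuth angles, assuming the standard Bloch-sphere form $\cos(\theta/2)|0\rangle + e^{j\phi}\sin(\theta/2)|1\rangle$. The function names `qubit_state` and `measurement_probabilities` are illustrative choices, not taken from the paper.

```python
import cmath
import math


def qubit_state(theta, phi):
    """Return (rho_0, rho_1) for cos(theta/2)|0> + e^{j*phi} sin(theta/2)|1>."""
    rho_0 = complex(math.cos(theta / 2))
    rho_1 = cmath.exp(1j * phi) * math.sin(theta / 2)
    return rho_0, rho_1


def measurement_probabilities(state):
    """Probabilities of observing 0 and 1, i.e., |rho_0|^2 and |rho_1|^2."""
    return tuple(abs(amplitude) ** 2 for amplitude in state)


# Hypothetical example: a qubit on the equator of the Bloch sphere.
state = qubit_state(theta=math.pi / 2, phi=math.pi / 4)
p0, p1 = measurement_probabilities(state)
print(p0, p1)  # the two probabilities sum to one
```

Note that the azimuth angle only changes the relative phase, so it does not affect the two measurement probabilities.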
Employing Multiple Qubits: The adoption of a greater number of qubits, the core elements that constitute quantum computing, could significantly improve the quantum computational capability. In particular, the joint state of $N_{qubit}$ qubits can be expressed as
$$|\psi\rangle = |q_1\rangle \otimes |q_2\rangle \otimes \cdots \otimes |q_{N_{qubit}}\rangle = \sum_{n=0}^{2^{N_{qubit}}-1} \rho_n |n\rangle, \quad (2)$$
where $\rho_n$ is the amplitude associated with basis $|n\rangle$. The Kronecker product is denoted by '$\otimes$'. As can be observed from (2), the number of computational bases increases exponentially with the number $N_{qubit}$ of qubits, thereby emphasizing the computational gain that can be obtained.
Quantum Entanglement: Further, when qubits are entangled, the measurement of one qubit affects another. As an example, an entangled two-qubit state can be expressed as $|\psi\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$ [44]. As a result, it is not possible to decompose an entangled state as a tensor product of individual states of different qubits [40].
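The exponential growth of the state space can be illustrated with a short sketch that combines single-qubit states through the Kronecker product. It assumes states stored as plain Python lists of amplitudes; the helpers `kron` and `joint_state` are hypothetical names introduced only for this illustration.

```python
def kron(a, b):
    """Kronecker product of two state vectors given as lists of amplitudes."""
    return [x * y for x in a for y in b]


def joint_state(qubits):
    """Combine single-qubit states into one vector with 2**len(qubits) entries."""
    state = [1.0]
    for qubit in qubits:
        state = kron(state, qubit)
    return state


plus = [2 ** -0.5, 2 ** -0.5]  # (|0> + |1>)/sqrt(2)
zero = [1.0, 0.0]              # |0>

psi = joint_state([plus, zero, plus])
print(len(psi))  # 8 amplitudes for 3 qubits
```

Each additional qubit doubles the number of amplitudes that a classical simulation has to store, which is the exponential scaling referred to in the text.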
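One way to make the non-decomposability concrete is the following sketch for two-qubit pure states. It relies on the fact that a state with amplitudes $(a_{00}, a_{01}, a_{10}, a_{11})$ is a tensor product of two single-qubit states exactly when $a_{00}a_{11} - a_{01}a_{10} = 0$. The function name `is_product_state` and the example states are illustrative assumptions.

```python
def is_product_state(psi, tol=1e-12):
    """Check whether a two-qubit pure state factorizes into single-qubit states.

    psi holds the amplitudes of |00>, |01>, |10>, |11>, in that order.
    """
    a00, a01, a10, a11 = psi
    return abs(a00 * a11 - a01 * a10) < tol


bell = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]  # (|00> + |11>)/sqrt(2), entangled
uniform = [0.5, 0.5, 0.5, 0.5]           # |+> (x) |+>, not entangled

print(is_product_state(bell), is_product_state(uniform))
```

The check only covers the two-qubit case; larger systems need more general tools such as the Schmidt decomposition.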
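2) QUANTUM GATES
Quantum gates are the building blocks of quantum operations that can be applied to qubits to alter their states. Some examples of quantum gates include Hadamard gates, denoted as $H \equiv \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, which can be used to introduce quantum superposition. For example, applying a Hadamard gate to $|0\rangle$ yields $H|0\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$. Pauli X, Y, and Z gates, denoted as $X \equiv \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $Y \equiv \begin{bmatrix} 0 & -j \\ j & 0 \end{bmatrix}$, and $Z \equiv \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$, respectively, can also be utilized to introduce rotations by $\pi$ radians around the x, y, and z axes, respectively. In particular, a Pauli X gate can be used to switch the amplitudes corresponding to different basis states. For example, applying a Pauli X gate to a quantum state $\rho_0|0\rangle + \rho_1|1\rangle$ yields $\rho_1|0\rangle + \rho_0|1\rangle$. There are different rotation gates that allow for state rotation around the x, y, and z axes, denoted by $R_x(\theta) \equiv \begin{bmatrix} \cos(\theta/2) & -j\sin(\theta/2) \\ -j\sin(\theta/2) & \cos(\theta/2) \end{bmatrix}$, $R_y(\theta) \equiv \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}$, and $R_z(\theta) \equiv \begin{bmatrix} e^{-j\theta/2} & 0 \\ 0 & e^{j\theta/2} \end{bmatrix}$, respectively. These rotation gates can be used to implement rotations by $\theta$ radians, which makes them useful in a variety of parameterized quantum operations.
Furthermore, various controlled gates can be employed for connecting different qubits and facilitating quantum entanglement, as the state of one qubit can influence the state of the other qubit. In particular, the controlled X gate, controlled Y gate, and controlled Z gate, denoted by $\mathrm{Cx} \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$, $\mathrm{Cy} \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -j \\ 0 & 0 & j & 0 \end{bmatrix}$, and $\mathrm{Cz} \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$, respectively, act on two qubits, one being the control qubit $|q_c\rangle$ and the other being the target qubit $|q_t\rangle$.
Most quantum gates used in quantum computing are unitary operations, which means that they can be represented as unitary matrices. A quantum operation $U$ is unitary if it satisfies the condition $U^\dagger U = I_N$, where $I_N$ is an identity matrix with $N = 2^{N_{qubit}}$ diagonal entries, with $N_{qubit}$ being the number of qubits used for the quantum operation. This condition ensures that the gate preserves the norm of the quantum state and is reversible, meaning that the original quantum state can be recovered from the output state by applying the conjugate transpose of the gate, which is useful in many quantum computing applications, including quantum communication protocols.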
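The effect of these single-qubit gates can be checked numerically. The sketch below assumes the standard matrix forms of the Hadamard gate, the Pauli X gate, and the z-axis rotation gate $R_z(\theta)$; the helper `apply` and the example amplitudes are hypothetical and serve only as an illustration.

```python
import cmath
import math


def apply(gate, state):
    """Multiply a 2x2 gate (nested lists) with a single-qubit state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]


S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # Hadamard gate
X = [[0, 1], [1, 0]]   # Pauli X gate


def rz(theta):
    """Rotation around the z axis by theta radians."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


zero = [1, 0]
plus = apply(H, zero)                  # (|0> + |1>)/sqrt(2)
flipped = apply(X, [0.6, 0.8])         # amplitudes are swapped
phased = apply(rz(math.pi / 2), plus)  # only the relative phase changes

print(plus, flipped, [abs(a) ** 2 for a in phased])
```

The z-axis rotation leaves both measurement probabilities at one half, which shows that such a gate acts on the phase rather than on the magnitudes of the amplitudes.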
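A short sketch of how a controlled gate creates entanglement: a Hadamard gate on the control qubit followed by a controlled X gate turns $|00\rangle$ into $(|00\rangle + |11\rangle)/\sqrt{2}$. The ordering convention (control qubit as the most significant bit) and the helper names `kron` and `apply` are assumptions made for this example.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def apply(gate, state):
    """Multiply a gate matrix with a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]


S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
I2 = [[1, 0], [0, 1]]
# Controlled X gate with the first (most significant) qubit as control.
CX = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 1],
      [0, 0, 1, 0]]

state_00 = [1, 0, 0, 0]
superposed = apply(kron(H, I2), state_00)  # (|00> + |10>)/sqrt(2)
bell = apply(CX, superposed)               # (|00> + |11>)/sqrt(2)

print(bell)
```

After the controlled gate, observing the control qubit as 1 implies that the target qubit is also 1, which is the correlation that the text describes.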
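The unitarity condition and the resulting reversibility can be verified with a few lines of code. This is a sketch under simple assumptions: matrices are nested lists, the y-axis rotation gate serves as the example operation, and the helper names (`dagger`, `matmul`, `is_unitary`) are illustrative.

```python
import math


def dagger(matrix):
    """Conjugate transpose of a matrix given as nested lists."""
    rows, cols = len(matrix), len(matrix[0])
    return [[complex(matrix[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_unitary(u, tol=1e-12):
    """Check the condition U^dagger U = I."""
    product = matmul(dagger(u), u)
    size = len(u)
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(size) for j in range(size))


def ry(theta):
    """Rotation around the y axis by theta radians."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


U = ry(0.7)
psi = [[0.6], [0.8j]]                  # an arbitrary normalized column vector
psi_out = matmul(U, psi)
psi_back = matmul(dagger(U), psi_out)  # the conjugate transpose undoes U

print(is_unitary(U), psi_back)
```

A non-unitary matrix such as `[[1, 1], [0, 1]]` fails the same check, since it would change the norm of the state.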

3) QUANTUM MEASUREMENTS
When classical values are required from a given quantum circuit, quantum measurements can be performed to obtain the expected output of the quantum system. Presenting a quantum measurement as the state projection onto the z-axis, given $O_0 = |0\rangle\langle 0|$ and $O_1 = |1\rangle\langle 1|$ as the projection operators for the computational basis states $|0\rangle$ and $|1\rangle$, respectively, the probabilities of obtaining classical bits 0 and 1 from the measurement can be specified by leveraging Born's rule. That is,
$$p(0) = \langle\psi_{init}| U^\dagger O_0 U |\psi_{init}\rangle, \quad p(1) = \langle\psi_{init}| U^\dagger O_1 U |\psi_{init}\rangle,$$
where $|\psi_{init}\rangle$ is the initial state of the quantum system, e.g., $|\psi_{init}\rangle = |0\rangle$, and $U$ is a unitary quantum operation.
Once a quantum measurement is performed on a particular qubit, its quantum state collapses into one of its basis states, either $|0\rangle$ or $|1\rangle$, based on the probabilities presented before, and the state of the qubit prior to the measurement cannot be retrieved. Eventually, assuming $M$ as the measurement operator, the classical-valued output can be obtained as $o = \mathrm{Tr}\left(M U |\psi_{init}\rangle\langle\psi_{init}| U^\dagger\right)$. Specifically, using the mentioned z-axis measurement with $M = O_0 - O_1$, the output can be acquired as $o = p(0) - p(1)$.
In the context of QML-based optimization for wireless communication systems, it is necessary to perform quantum measurements to convert the quantum states of the quantum system into classical-valued wireless variables, e.g., power control coefficients, that can be interpreted by the classical components of the communication system.
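The measurement step can be sketched as follows, assuming a single qubit initialized as $|0\rangle$, a y-axis rotation as the unitary operation $U$, and the z-axis observable $O_0 - O_1$ as the measurement operator. The function names, the rotation angle, and the number of shots are hypothetical choices for illustration.

```python
import math
import random


def ry(theta):
    """Rotation around the y axis by theta radians."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def apply(gate, state):
    """Multiply a gate matrix with a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]


def born_probabilities(state):
    """Born's rule for the projectors |k><k|: p(k) = |amplitude of |k>|^2."""
    return [abs(amplitude) ** 2 for amplitude in state]


def sample_bits(probabilities, shots, seed=0):
    """Simulate repeated measurements; each shot collapses to 0 or 1."""
    rng = random.Random(seed)
    return [0 if rng.random() < probabilities[0] else 1 for _ in range(shots)]


theta = math.pi / 3
psi = apply(ry(theta), [1, 0])  # U|psi_init> with |psi_init> = |0>
p0, p1 = born_probabilities(psi)
z_expectation = p0 - p1         # expectation of the observable O_0 - O_1

bits = sample_bits([p0, p1], shots=10000)
z_estimate = sum(1 - 2 * bit for bit in bits) / len(bits)

print(p0, p1, z_expectation, z_estimate)
```

On hardware only the sampled bits are available, so the expectation value has to be estimated from a finite number of shots, as done by `z_estimate` here.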

B. THE TRAINING PROCESS IN QML
To put the discussed concepts of quantum computing into use, this part provides a comprehensive overview of the various steps involved in a typical QML process, as illustrated in Fig. 3, while covering different aspects of the QML framework such as data processing and parameter training.

1) INITIALIZATION
Prior to the training, initialization processes involving data preparation and qubits preparation are typically performed to improve the learning convergence and accuracy, and to ensure consistent performance over a wide range of data sets:
(a) Data Preparation: The acquired data needs to be pre-processed, e.g., by using (i) dimension reduction to reduce the computational complexity, and (ii) data normalization to avoid bias towards particular values that have larger/smaller magnitudes. In the context of wireless communications, sparse channel data can be pre-processed to reduce the data dimension [45].
In particular, when supervised learning is assumed, it is necessary to prepare reference points beforehand (Section II-B1 will further elaborate on this).The collected training data can be divided into separate batches, which can be subsequently utilized for different training phases or allocated among different learning models, e.g., in the instance of distributed learning [46].
(b) Qubits Preparation: For the quantum system, the states of the qubits need to be prepared beforehand [47].
Typically, the states of $N_{qubit}$ qubits can be prepared as basis states, such as $|0\rangle^{\otimes N_{qubit}}$. Each of these qubits can also be prepared as a superposition of states, e.g., $\left((|0\rangle + |1\rangle)/\sqrt{2}\right)^{\otimes N_{qubit}}$, which can be attained by applying a Hadamard gate $H$ to each qubit in the basis state, i.e., $H^{\otimes N_{qubit}} |0\rangle^{\otimes N_{qubit}}$.
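The qubit preparation step can be sketched with a small state-vector simulation that applies a Hadamard gate to every qubit of the all-zero basis state. The bit-ordering convention (qubit 0 as the most significant bit) and the function names are assumptions made for this example.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # Hadamard gate


def apply_single_qubit_gate(gate, state, target, n_qubits):
    """Apply a 2x2 gate to qubit `target` (0 = most significant) of a state."""
    new_state = [0j] * len(state)
    shift = n_qubits - 1 - target
    for index, amplitude in enumerate(state):
        in_bit = (index >> shift) & 1
        for out_bit in (0, 1):
            # Same basis index, with the target bit replaced by out_bit.
            out_index = (index & ~(1 << shift)) | (out_bit << shift)
            new_state[out_index] += gate[out_bit][in_bit] * amplitude
    return new_state


def prepare_uniform_superposition(n_qubits):
    """Start from |0...0> and apply a Hadamard gate to each qubit."""
    state = [0j] * (2 ** n_qubits)
    state[0] = 1 + 0j
    for target in range(n_qubits):
        state = apply_single_qubit_gate(H, state, target, n_qubits)
    return state


psi = prepare_uniform_superposition(3)
print([round(abs(a) ** 2, 3) for a in psi])  # eight equal probabilities
```

All $2^{N_{qubit}}$ basis states end up with the same amplitude, which is the uniform superposition described in the text.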

2) FEATURE PROCESSING WITH ENCODING OPERATION
The goal of encoding is to map classical-valued data features into Hilbert space [48]. For instance, an encoding operation to process an input vector $\mathbf{x}$, composed of $N_{input}$ input elements, can be realized by using a set of parameterized gates [48]:
$$U_{encode}(\mathbf{x}) = \bigotimes_{i=1}^{N_{input}} R\left(f_{pre}(x_i)\right), \quad (3)$$
where $R(\cdot)$ is a parameterized rotation gate and $f_{pre}$ indicates the pre-processing operation. As expressed in (3), to cover all the input feature elements, the number of qubits for the encoding process can be defined by $N^{encode}_{qubit} = N_{input}$. We map each pre-processed input, represented as $f_{pre}(x_i)$ in (3), to the state of its associated qubit. Specifically, employing the angle encoding methodology [44], and initializing each qubit as $|0\rangle$, the state of each i-th qubit can be altered as
$$|\psi^{[i]}\rangle = R\left(f_{pre}(x_i)\right)|0\rangle, \quad (4)$$
which, e.g., for $R = R_y$, gives $\cos(f_{pre}(x_i)/2)\,|0\rangle + \sin(f_{pre}(x_i)/2)\,|1\rangle$. Such encoding operations can also be expanded as parameterized quantum operations, used for different machine learning tasks. In particular, the study in [49] employs trainable quantum embedding operations to classify input data features into different clusters.
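A minimal sketch of angle encoding is given below. It assumes a y-axis rotation as the parameterized gate and a min-max scaling to $[0, \pi]$ as the pre-processing operation; both choices, as well as the function names and the feature values, are illustrative assumptions rather than the paper's specific design.

```python
import math


def min_max_to_angles(x, lower, upper):
    """Hypothetical f_pre: map each feature linearly from [lower, upper] to [0, pi]."""
    return [math.pi * (value - lower) / (upper - lower) for value in x]


def angle_encode_qubit(angle):
    """R_y(angle)|0> = cos(angle/2)|0> + sin(angle/2)|1>."""
    return [math.cos(angle / 2), math.sin(angle / 2)]


def angle_encode(x, lower, upper):
    """Encode one feature per qubit and return the joint state vector."""
    state = [1.0]
    for angle in min_max_to_angles(x, lower, upper):
        qubit = angle_encode_qubit(angle)
        state = [a * b for a in state for b in qubit]  # Kronecker product
    return state


features = [0.0, 1.0, 0.5]  # e.g., normalized channel gains (made-up values)
encoded = angle_encode(features, lower=0.0, upper=1.0)
print(len(encoded))  # 8 amplitudes for 3 input features
```

With this scaling, the smallest feature value maps to $|0\rangle$ and the largest to $|1\rangle$, so the encoded qubits use the full range between the two basis states.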

3) QUANTUM PREDICTION MODELS
In QML, quantum prediction models can be defined as quantum operations that are compiled using different quantum gates, and designed to learn patterns from the training data and provide estimations on new data. A range of quantum-based prediction models has been proposed in the literature, e.g., quantum perceptrons, quantum convolutional neural networks, and quantum graph neural networks, each utilizing different combinations of quantum gates to perform ML tasks. These models can be specified as follows.
(a) Quantum Perceptrons: Quantum perceptrons are quantum-based learning models that draw inspiration from biological neurons, similar to the classical perceptrons, and learn from the training data by iteratively adjusting their parameters, commonly referred to as weights.
Given $\boldsymbol{\theta}$ as its set of $N_{param}$ parameters, where $N_{param}$ can be set as $N_{param} = N_{input}$, a generalized quantum perceptron can be formulated as in [50], with an activation operation, comparable to the activation function in a classical perceptron, that models non-linear responses of the assumed objective function. Here, $|\psi^{[j]}_{percep}\rangle$ signifies the state of the j-th qubit of $|\psi_{percep}\rangle \equiv |\psi^{[1]}_{percep}\rangle \otimes \cdots \otimes |\psi^{[N_{input}]}_{percep}\rangle$, specified in (4), while $|\psi_{out}\rangle$, which is initialized as $|\psi_{out}\rangle = |0\rangle$, is the designated output state. In addition, a number of inter-connected quantum perceptrons can be assembled to form a quantum neural network capable of performing even more complex ML tasks [30].
(b) Quantum Convolutional Neural Networks: Quantum convolutional neural networks hold an advantage when processing high-dimensional data, such as during the optimization process of a MIMO system, thanks to their capacity to reduce the dimensionality of the processed input. Each of them processes two distinct layers, called convolution and pooling layers, in an alternating manner, which can be expressed as
$$U_{QCNN} = \prod_{l=1}^{L} U^{[l]}_{pool}\, U^{[l]}_{conv},$$
where $L$ is the number of layers, and where $U^{[l]}_{pool}$ and $U^{[l]}_{conv}$ are the l-th pooling and convolutional layers, respectively, for $l \in \{1, \ldots, L\}$.
Next, we provide a detailed description of the convolution and pooling layers: (i) Convolution layers $U^{[l]}_{conv}$, formulated in [51], are employed as kernels to extract the features of the embedded training data and change the initial quantum states towards the desired states (particularly towards the Hamiltonian of the optimization problem). (ii) Pooling layers $U^{[l]}_{pool}$ are employed to reduce the dimensionality of the processed information. Issues may occur when we decode the information processed in the quantum system into the classical system, as high-dimensional quantum states now need to be presented as high-dimensional classical-valued matrices. Quantum-based convolutional neural networks can solve such dimensionality issues by utilizing pooling layers to reduce the dimension of the processed information [52].
(c) Quantum Graph Neural Networks (QGNNs): Such neural networks can be used to process the relations between different data features that are presented as quantum states. For example, a QGNN can be used to reveal the relations between the coefficients of channels pertaining to the user terminals in a wireless system. These relations can be represented as a graph with several nodes and edges. In quantum circuits, these nodes and edges can be realized using parameterized quantum gates [53], [54]. Let $N_{node}$ be the total number of nodes, and let $N^{[i]}_{neighbor}$ be the number of nodes neighbouring each i-th node, $i \in \{1, \ldots, N_{node}\}$. As shown in [53], [54], a QGNN operation can then be expressed using parameterized rotation gates, as given in (8). In this instance, the value of each i-th node is represented by the term $R_y(\theta_i)$, where $\theta_i$ is the parameter associated with that specific node. In addition, each j-th neighbouring node of the said i-th node is denoted by $R_y(\theta_{i,j})$, where $\theta_{i,j}$ corresponds to the parameter associated with the j-th neighbour. Moreover, in (8), a set of $\mathrm{Cx}(\cdot)$ gates is employed to provide connections between different qubits.

4) OBTAINING CLASSICAL-VALUED OUTPUTS VIA QUANTUM MEASUREMENTS
In order to bridge quantum- and classical-based computations, measurement operations can be performed. If the QML model optimization is done classically, one needs to obtain a classical value as the output of the QML operation. In particular, given the output state $|\psi_{\text{out}}\rangle$, the measurement of the quantum system with respect to the basis state $|0\rangle$ can be expressed as $\langle\psi_{\text{out}}|A_0^{\dagger}A_0|\psi_{\text{out}}\rangle$.
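As a minimal illustration of this measurement step, the following Python sketch evaluates $\langle\psi_{\text{out}}|A_0^{\dagger}A_0|\psi_{\text{out}}\rangle$ for a single qubit. It assumes that $A_0$ is the projector $|0\rangle\langle 0|$ and that the state is given as a pair of amplitudes; the function name `prob_zero` and the example angle are hypothetical choices made for illustration only.

```python
import math

def prob_zero(state):
    """Return <psi|A0^dagger A0|psi> for a single-qubit state [a0, a1].

    Assumption: A0 = |0><0|, so the quantity reduces to |a0|^2
    (the state is normalized first to be safe).
    """
    a0, a1 = state
    norm = abs(a0) ** 2 + abs(a1) ** 2
    return abs(a0) ** 2 / norm

# Example output state R_y(theta)|0> with a hypothetical angle theta = pi/3.
theta = math.pi / 3
psi_out = [math.cos(theta / 2), math.sin(theta / 2)]

p0 = prob_zero(psi_out)       # classical value passed to the classical optimizer
expectation_z = 2 * p0 - 1    # Pauli-Z expectation, equal to cos(theta) here
```

On hardware, this probability would be estimated from repeated measurement shots rather than computed from the amplitudes directly.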

5) OPTIMIZING THE PREDICTION MODELS
In general, before performing the gradient-based optimization processes of the QML model via training, we need to first define the calculation of the training loss, and then identify how to obtain the gradient of the loss. Both are required within algorithms aiming to adjust the parameters of the given learning model towards better prediction capability, such as those with gradient-descent approaches.
Training Loss: The goal of the optimization process of the QML model is the minimization of the defined training loss, which is designed to reflect the capability of the given QML model to estimate the given optimization variables. Given $N_{\text{data}}$ as the number of features of the training data, the training loss for a quantum-based model, denoted as $L$, can be defined based on the fidelity, which quantifies the closeness between the desired outputs and the actual outputs of the model [55], where $|\hat{\psi}^{j}_{\text{out}}\rangle$ is the desired output state, $|\psi^{j}_{\text{out}}\rangle$ is the instantaneous output state of the quantum-based model, and $\theta$ is the set of parameters of the quantum-based model. To mitigate the issue of over-fitting and to prevent the model from becoming excessively specialized to the given training datasets, which can negatively impact the generalization performance on new datasets, regularization terms can be introduced in the loss/cost function, where $\lambda\|\theta\|^{2}$ is the regularization term that employs L2 regularization with $\lambda$ being the regularization parameter, e.g., $\lambda = 0.001$, to penalize overly large parameter values [56].
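The following Python sketch shows one plausible way to compute a fidelity-based training loss with L2 regularization. The exact expression used in [55] is not reproduced here; the sketch assumes that the loss is the average infidelity, i.e., one minus the squared overlap between desired and actual output states, plus $\lambda\|\theta\|^{2}$. All function names are hypothetical.

```python
import math

def inner(u, v):
    """Inner product <u|v> of two state vectors given as lists of amplitudes."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def fidelity(target, actual):
    """Squared overlap |<target|actual>|^2 between two pure states."""
    return abs(inner(target, actual)) ** 2

def training_loss(targets, outputs, theta, lam=0.001):
    """Average infidelity plus L2 penalty (assumed form, see lead-in)."""
    infidelity = sum(1.0 - fidelity(t, o) for t, o in zip(targets, outputs))
    infidelity /= len(targets)
    l2_penalty = lam * sum(p * p for p in theta)
    return infidelity + l2_penalty

# Two training samples: the first output matches its target exactly,
# the second is an equal superposition and overlaps its target by 1/2.
s = 1 / math.sqrt(2)
targets = [[1, 0], [0, 1]]
outputs = [[1, 0], [s, s]]
loss = training_loss(targets, outputs, theta=[1.0, 2.0], lam=0.001)
```

Here the average infidelity is 0.25 and the penalty is 0.001 × 5, so the regularizer only mildly shifts the loss unless the parameters grow large.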

III. QUANTUM MACHINE LEARNING METHODS FOR OPTIMIZING WIRELESS SYSTEMS
Since we have covered the general pipeline of QML processes, we will now proceed to the analysis of various quantum-based optimization methods that have been proposed in the literature, delving into their respective procedures and elaborating on their potential utility for wireless communication systems.

A. QUANTUM ALGORITHMS FOR OPTIMIZING WIRELESS SYSTEMS
This subsection explores a range of quantum algorithms that exploit the unique properties of quantum mechanics, such as superposition and entanglement, and are, therefore, exclusive to quantum computing. Examples include quantum algorithms that perform better than classical algorithms in solving some problems, such as quantum search algorithms.

1) ORACLE-BASED OPTIMIZATION METHODS
Oracle-based algorithms, e.g., Grover's algorithm and Simon's algorithm [60], [61], employ Oracle operators to identify the intended solutions. In classical computing, to determine whether a certain entry satisfies the desired condition, the classical processor typically needs to evaluate each input in sequence. In quantum computing, on the other hand, a quantum Oracle can evaluate the condition for all possible entries simultaneously, leveraging quantum parallelism through superposed quantum states. Consequently, Oracle-based quantum algorithms may offer significant computational advantages over classical methods. In particular, Grover's algorithm offers a quadratic speed-up over classical search algorithms, allowing for the search of an entry among $M$ items with a complexity of $O(\sqrt{M})$, as opposed to the $O(M)$ complexity of typical classical search algorithms [60].
To clarify the inner working of a quantum Oracle, we now discuss it in the context of the aforementioned Grover's algorithm [60]. Given $M$ entries of possible solutions, let each state in $\{|1\rangle, \dots, |x\rangle, \dots, |M\rangle\}$ represent the index of a corresponding solution. Let us now assume the indexes of the data entries to be represented as a uniform superposition of quantum states, expressed as $|\Psi\rangle = \frac{1}{\sqrt{M}}\sum_{x=1}^{M}|x\rangle$. Here, each index has equal probability, signifying that each of the indexes holds an equal likelihood of being the desired entry. Subsequently, an Oracle operator, denoted by $U_{\text{oracle}}$, is applied upon the state $|\Psi\rangle$, transforming each index state as $U_{\text{oracle}}|x\rangle \equiv (-1)^{f(x)}|x\rangle$, where $f(x)$ is a function that results in $f(x) = 1$ only for the particular index state $|x\rangle$ that meets the search criterion, thereby flipping the sign of the amplitude of the desired index state. A reflection operator, denoted by $U_{\text{ref}}$, is then applied to amplify the amplitude of the desired index, thus enhancing its chance of being identified as the desired solution, while suppressing the amplitudes of the rest of the indexes.
The above-described process is iterated approximately $\frac{\pi}{4}\sqrt{M}$ times to increase the amplitude of the desired index state relative to the rest of the states. Eventually, a quantum measurement is performed to obtain the index of the desired entry.
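The iteration described above can be sketched with a small classical state-vector simulation in Python. This is an illustrative sketch rather than a quantum implementation: the Oracle is modelled as a sign flip on a known index, the reflection is implemented as an inversion about the mean amplitude, and the names `grover_search`, `num_items`, and `marked` are hypothetical.

```python
import math

def grover_search(num_items, marked):
    """Simulate Grover's algorithm on real amplitudes.

    Returns the measurement probabilities and the number of iterations.
    """
    # Uniform superposition over all index states.
    amp = [1 / math.sqrt(num_items)] * num_items
    iterations = int(math.floor(math.pi / 4 * math.sqrt(num_items)))
    for _ in range(iterations):
        # Oracle: phase flip on the amplitude of the desired index.
        amp[marked] = -amp[marked]
        # Reflection: inversion of every amplitude about the mean.
        mean = sum(amp) / num_items
        amp = [2 * mean - a for a in amp]
    return [a * a for a in amp], iterations

probs, iters = grover_search(num_items=16, marked=5)
most_likely = max(range(len(probs)), key=probs.__getitem__)
```

For 16 items, three iterations are used and the marked index is measured with a probability of roughly 0.96, compared with 1/16 before amplification.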
Applications for Optimizing Wireless Communications: The potential of Oracle-based algorithms to significantly enhance the efficiency of search processes makes them promising candidates for accelerating search-based tasks such as those of maximum-likelihood based MIMO detectors, as in the study of [62], which often require searching through large spaces of possible solutions. Moreover, the authors of [63] and [64] cover the applications of different Oracle-based algorithms, among other quantum algorithms, for the purpose of optimizing wireless communications.

2) VARIATIONAL QUANTUM EIGENSOLVER (VQE) BASED OPTIMIZATION METHODS
VQE algorithms are commonly employed to determine the state that corresponds to the ground energy of a quantum system. Such a problem may be converted into the QML task of finding the parameter set $\theta$ that leads to the energy minimization of the quantum system at hand. Let us define $U_{\text{VQE}}(\theta)$ as the utilized VQE operator, with $\theta$ as its parameter set. The best set of parameters can then be identified by solving $\hat{\theta} = \arg\min_{\theta} \langle 0|(U_{\text{VQE}}(\theta))^{\dagger} O\, U_{\text{VQE}}(\theta)|0\rangle$, where $O$ denotes the observable, namely the Hamiltonian of the system [65], [66].
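To make the minimization above concrete, the following Python sketch runs a single-qubit VQE loop classically. It is a toy example under stated assumptions: the ansatz $U_{\text{VQE}}(\theta)$ is taken to be a single $R_y(\theta)$ rotation, the Hamiltonian is the hypothetical observable $Z + 0.5X$, and the gradient is obtained with the parameter-shift rule, which is exact for this ansatz.

```python
import math

def ry_state(theta):
    """Amplitudes of R_y(theta)|0> (assumed single-parameter ansatz)."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def energy(theta, obs):
    """<0|U(theta)^dagger O U(theta)|0> for a real symmetric 2x2 observable."""
    psi = ry_state(theta)
    return sum(psi[i] * obs[i][j] * psi[j] for i in range(2) for j in range(2))

def vqe(obs, theta=0.1, lr=0.4, steps=200):
    """Gradient descent on the energy using the parameter-shift rule."""
    for _ in range(steps):
        shift = math.pi / 2
        grad = 0.5 * (energy(theta + shift, obs) - energy(theta - shift, obs))
        theta -= lr * grad
    return theta, energy(theta, obs)

# Hypothetical Hamiltonian O = Z + 0.5 X written as a 2x2 matrix.
hamiltonian = [[1.0, 0.5], [0.5, -1.0]]
theta_opt, e_min = vqe(hamiltonian)
exact_ground_energy = -math.sqrt(1.0 + 0.5 ** 2)
```

The optimized energy coincides with the smallest eigenvalue of the 2×2 matrix, which is what the VQE objective is designed to recover; on a quantum device, each call to `energy` would be replaced by repeated circuit executions and measurements.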
Applications for Optimizing Wireless Communications: The task of minimizing the energy of a particular quantum system can be recast as the problem of minimizing a cost function associated with the considered performance metric of the wireless system. Specifically, this cost function can be formulated as a quadratic unconstrained binary optimization (QUBO) function [65]. Such a transformation allows for the application of VQE algorithms in addressing diverse wireless communication problems. One such problem is channel detection based on maximum likelihood, which has been examined in [66].

3) QUANTUM APPROXIMATE OPTIMIZATION ALGORITHM (QAOA) AND ADIABATIC QUANTUM COMPUTING
The QAOA harnesses the synergistic capability of quantum and classical computing, integrating a quantum circuit composed of a set of unitary operators to estimate the desired quantum states, alongside a classical optimization algorithm tasked with finding the optimal parameters for the quantum circuit. This combined approach allows QAOA to find approximate solutions to optimization tasks, positioning it as a promising methodology to tackle a variety of combinatorial problems encountered in the field of wireless communications [67], [68].
Applications for Optimizing Wireless Communications: In [67], QAOA is employed to solve the channel decoding problem in digital communication channels. For this purpose, a formulation of the Hamiltonian cost function based on the channel decoding problem is presented, with the objective of generating a codeword that minimizes the Hamming distance. Using the QAOA, which aims to minimize the energy cost of the given Hamiltonian function, an optimized codeword can be obtained. In [68], QAOA is employed to solve a scheduling task in a satellite-assisted wireless system, aiming to minimize the occurrence of overlapping coverage areas. To achieve this objective, the scheduling task is reformulated as a combinatorial problem known as the max-weight independent set (MWIS) problem, which can then be tackled using the QAOA.
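To illustrate how a problem such as MWIS can be cast as a binary cost function of the kind minimized by QAOA, the Python sketch below uses a common penalty-based QUBO form, namely the negative total weight plus a penalty for every edge whose two endpoints are both selected. This form is an assumption and is not necessarily the formulation of [68]; the exhaustive search is only a classical stand-in for the quantum optimizer, and the graph and weights are hypothetical.

```python
from itertools import product

def mwis_cost(x, weights, edges, penalty):
    """QUBO-style cost: -sum_i w_i x_i + penalty * sum_{(i,j)} x_i x_j."""
    reward = sum(w * xi for w, xi in zip(weights, x))
    conflicts = sum(x[i] * x[j] for i, j in edges)
    return -reward + penalty * conflicts

def brute_force_minimum(weights, edges, penalty):
    """Exhaustively search all binary assignments (small graphs only)."""
    candidates = product((0, 1), repeat=len(weights))
    return min(candidates, key=lambda x: mwis_cost(x, weights, edges, penalty))

# Hypothetical 4-node path graph: an edge marks two overlapping coverage areas.
weights = [3.0, 2.0, 4.0, 1.0]
edges = [(0, 1), (1, 2), (2, 3)]
best = brute_force_minimum(weights, edges, penalty=10.0)
best_weight = sum(w for w, xi in zip(weights, best) if xi)
```

Choosing the penalty larger than any single weight ensures that the minimizer is a valid independent set; here it selects nodes 0 and 2 with a total weight of 7.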

4) OTHER APPLICATIONS OF QUANTUM TECHNOLOGIES
In addition to quantum-based optimization methods, other promising quantum technologies hold prospects for wireless communications. One such example is quantum cryptography, which utilizes the principles of quantum mechanics to provide secure communications. Moreover, quantum communication protocols can provide direct transmission of quantum states between different parties. In particular, the authors of [69] provided interesting communication cases using drones with quantum processing capabilities. By leveraging quantum entanglement, a communication network can be established using drones, such as UAVs, as the nodes for quantum-based communications.

B. INTEGRATION OF QUANTUM-BASED OPERATIONS WITH EXISTING ML METHODOLOGIES
As depicted in Fig. 4, quantum-based prediction models, each of which leverages quantum operations to produce estimated outputs, hold the potential to empower the existing ML methodologies, often classified as supervised learning, unsupervised learning, and reinforcement learning. These methodologies have progressed along with the advancement of wireless systems [70], [71], [72], [73], [74]. In particular, some distributed ML approaches, such as ensemble learning and federated learning, have incorporated NOMA and over-the-air computation to augment resource efficiency [75]. Hence, they can exploit the computational gain offered by quantum computing, since factors such as the increase in the number of devices generally translate into spikes in the computational complexity when classical computing approaches are used.

1) SUPERVISED LEARNING
Overall, the training process of a supervised learning framework works by comparing the outputs of the learning model and the references from the training datasets. Accordingly, the selection of suitable parameters for a learning model relies on maximizing the likelihood between the outputs of the model and the corresponding references from the acquired dataset. By adopting this approach, some of the popular schemes in supervised QML are discussed as follows.
(a) Quantum-Based Support Vector Machines: Similar to classical support vector machines (SVMs), a quantum-based SVM can be employed to classify a number of multi-dimensional data points by defining a multi-dimensional separator called a hyperplane. The hyperplane is drawn in a manner that maximizes the margin, which corresponds to the distance between the hyperplane and the closest data points belonging to each class, referred to as the support vectors. Nonetheless, quantum SVMs differ from classical SVMs in that they encode the data points as quantum states [76], leveraging the inherent properties of quantum computations such as quantum parallelism and superposition, which can potentially lead to computational speed-up [77].

Applications for Optimizing Wireless Communications:
Thanks to their innate capability to handle high-dimensional input data, quantum SVMs with kernel methods can facilitate accurate predictions for decoding problems in wireless systems that entail high-dimensional channel information, such as massive MIMO systems [78], [79].
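The kernel used by a quantum SVM can be illustrated by the overlap between encoded quantum states. The Python sketch below is a classical simulation under a simple assumption: each real-valued feature is angle-encoded into one qubit as $\cos(x/2)|0\rangle + \sin(x/2)|1\rangle$, so the kernel is the squared overlap of two product states. The encoding and the function names are hypothetical and are not taken from [76], [77].

```python
import math

def encode(x):
    """Angle-encode each feature into one qubit (assumed encoding)."""
    return [(math.cos(xi / 2), math.sin(xi / 2)) for xi in x]

def quantum_kernel(x, y):
    """Fidelity kernel |<phi(x)|phi(y)>|^2 for product-state encodings."""
    overlap = 1.0
    for (a0, a1), (b0, b1) in zip(encode(x), encode(y)):
        overlap *= a0 * b0 + a1 * b1   # per-qubit inner product
    return overlap ** 2

def gram_matrix(points):
    """Kernel (Gram) matrix that a classical SVM solver would consume."""
    return [[quantum_kernel(p, q) for q in points] for p in points]

# Hypothetical two-feature samples, e.g., normalized channel gains.
samples = [[0.3, 1.2], [0.4, 1.0], [2.8, 0.1]]
K = gram_matrix(samples)
```

The resulting matrix is symmetric with a unit diagonal, and nearby samples yield kernel values close to one; on a quantum device the overlaps would be estimated from measurements instead of being computed in closed form.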

(b) Quantum-Based Decision Trees and Random Forests:
Quantum decision trees employ tree-like prediction models, in which the branches of the trees represent the decisions based on inputs while the leaves return the corresponding outputs. Distinguished from classical decision trees, quantum decision trees are processed as quantum-based operations involving a number of quantum gates. In particular, the quantum decision tree method presented in [80] determines node splitting based on the quantum entropy, which captures the uncertainty inherent in the quantum system, instead of using stochastic variables as in their classical counterparts. Furthermore, a quantum random forest can be constructed as a compilation of multiple quantum decision trees, with each tree presented as a quantum-based model processing quantum states.

Applications for Optimizing Wireless Communications:
Low-complexity quantum decision trees and quantum random forests can be applied to relay selections in ultra-dense networks [81], which are characterized by a high number of access points and thereby require real-time decisions over a vast space of possible solutions.

2) UNSUPERVISED LEARNING
Unsupervised learning methodology, as opposed to supervised learning, has the potential to discover the underlying patterns in the data without requiring the inclusion of predefined reference points, commonly referred to as labels in the training dataset.
When dealing with an unlabeled data set $D = \{x_i\}_{i=1}^{N}$, the optimization algorithms face the challenge of autonomously estimating the desired solution $y_i$ corresponding to each input $x_i$. In tasks such as classification or segmentation, this self-guided process is commonly known as clustering. It aims to identify clusters among data points, and each cluster can then be interpreted as a certain class or label denoted by $y_i$. Note that unsupervised learning can encompass generative tasks, such as those in generative adversarial neural network frameworks, which have also been explored as quantum algorithms [82], [83].

QUANTUM-BASED CLUSTERING ALGORITHMS
In clustering tasks, the k-means algorithm is arguably one of the most notable algorithms. It is an iterative, non-deterministic heuristic that can tackle NP-hard clustering problems with commendable accuracy in different cases [84].
The k-means algorithm is widely utilized in various problem domains. It typically has a complexity of $O(N_{\text{data}})$ per iteration given $N_{\text{data}}$ data points, which is desirable. Nonetheless, it may have limitations in terms of flexibility and may be more effective when applied to well-structured data sets. The algorithm takes data points compiled as $v = \{v_i\}_{i=1}^{N_{\text{data}}}$, with $v_i \in \mathbb{R}^{\eta}$ and $\eta$ indicating the data dimension, as inputs. It aims to partition them into $k$ subsets named clusters, according to a similarity measure defined a priori, typically the Euclidean distance between points. It then produces a set of $k$ cluster centers, referred to as centroids.¹ During deployment, the following two steps alternate until convergence is achieved: First, each data point is assigned a label to indicate that it belongs to a certain cluster, determined based on its proximity to the different centroids. Second, each centroid is adjusted by taking the average of the data points assigned to the corresponding cluster, ensuring that it accurately represents the cluster.
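The two alternating steps described above can be sketched in Python as follows. This is a plain classical k-means for illustration, with random selection of the initial centroids from the data points; the function names and the toy data are hypothetical, and quantum variants would replace the distance evaluation with a quantum subroutine.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=100, seed=0):
    """Classical k-means: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # random initial centroids
    labels = None
    for _ in range(max_iters):
        # Step 1: assign each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:           # converged: assignments unchanged
            break
        labels = new_labels
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels

# Two well-separated toy clusters in the plane.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(data, k=2)
```

Each iteration visits every data point once per centroid, which is consistent with the linear dependence on the number of data points noted above.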
In addition, quantum Gaussian mixtures employ probability distributions to cluster the data points, while the probability density values of the data points are represented as quantum wave functions, yielding possible performance gains thanks to quantum parallelism [86], [87].

3) REINFORCEMENT LEARNING
Unlike supervised and unsupervised learning methods, reinforcement learning (RL) methods continuously learn and adapt to changing conditions by leveraging ongoing feedback from the environment, which in this case is the wireless system, and applying a set of actions, e.g., altering the transmit precoding, in accordance with the state of the wireless system, e.g., the channel conditions. The actions taken by the RL model result in a cumulative reward, which serves as a measure of the learning progress of the learning agent. As an example in a practical wireless scenario, a reward of +1 can be awarded if there are no outages at any mobile terminals in the considered system, while any outage could result in a penalty, e.g., −5. The following is a discussion of quantum-based learning techniques that employ the aforementioned methodology.

(a) Quantum-Based Q-Learning: Q-learning methods are commonly utilized in RL to deduce optimized Q-tables, which serve as mapping tables that return the ideal action $A^{(t)}$ given the current state $S^{(t)}$. In general, considering the Bellman equation, the update process of the Q-table can incorporate the expected future reward, and can be expressed as [88], [89], [90]:

$$Q\big(S^{(t)}, A^{(t)}\big) \leftarrow Q\big(S^{(t)}, A^{(t)}\big) + \alpha\Big[R^{(t)} + \gamma\, Q\big(S^{(t+1)}, \hat{A}^{(t+1)}\big) - Q\big(S^{(t)}, A^{(t)}\big)\Big], \qquad (9)$$

where $\alpha$ and $\gamma$ indicate the learning rate and the discount rate, respectively, and $\hat{A}^{(t)}$ is the ideal action for time $t$. In wireless communication systems, Q-learning can be used to determine the transmit precoding in beam tracking, for instance, given the channel information of the moving user as the state [88], [89]. Moreover, trainable deep learning models can function as Q-tables, allowing the handling of continuous, complex-valued input variables, such as channel information. In this regard, quantum-based learning models can also be employed [90].

(b) Quantum-Based Experience Replay: The experience replay methodology utilizes a record of the previous training sequences, called a buffer, as a reference to aid the current training process. By randomly taking samples from this buffer during the training process, the learning model can circumvent the possibility of over-fitting to the current experience. To do so, an experience replay buffer stores a record of past data points, generally presented as a tuple of information comprised of elements related to the prior learning iterations. For each t-th time step, it can be expressed as $\{S^{(t)}, A^{(t)}, R^{(t)}; S^{(t+1)}\}$, in which $S^{(t)}$, $A^{(t)}$, and $R^{(t)}$ respectively denote the state of the environment, the action applied upon the environment, and the acquired reward [91], [92], [93]. In addition, $S^{(t+1)}$ denotes the state at the (t + 1)-th time step. Due to this advantage, several works assumed the use of experience replay to maximize the reward. For example, the authors in [91] exploited an experience replay buffer in a quantum-based RL framework. It is worth noting that experience replay buffers have been utilized in various classical RL frameworks as well. For instance, the work of [92], [93] prioritizes stored experiences based on their relevance to the learning process, e.g., giving higher priority to the frequently occurring experiences.

(c) Quantum-Based Actor-Critic Methods: Unlike other RL methods that use a single model, e.g., Q-learning, the actor-critic methods employ two different learning models: the actor, which is responsible for determining the actions to take using a policy function, and the critic, which is utilized to evaluate those selected actions by examining the current state using a state-value function. Such a configuration allows for different benefits such as enhanced capability in handling continuous action spaces. In classical ML, conventional learning models, e.g., classical deep neural networks, are typically used to represent the actor and the critic [94]. However, in real-time wireless communication scenarios with high data processing requirements, e.g., larger state spaces in optimizing the energy efficiency of an IoT network with a high number of nodes, classical actor-critic methods may encounter constraints regarding the processing time, as two separate learning models (actor and critic neural networks) need to be trained, which can pose a challenge in learning convergence [95]. To mitigate this issue, variational quantum circuits, which leverage the fundamental characteristics of quantum mechanics, such as superposition and entanglement, can be used for the actor and critic models, e.g., as done in [95], offering potential benefits such as improved prediction accuracy thanks to the enhanced learning capability of the quantum models.

¹ As starting points, k initial centroids can be selected randomly or using efficient heuristics such as those in the k-means++ algorithm [85].
Applications for Optimizing Wireless Communications: Different quantum learning models can be employed in actor-critic frameworks, taking advantage of the continuous action space offered by the quantum models to handle optimization problems pertaining to dynamic wireless environments. For example, in the recent work [91], quantum-based actor and critic models were involved in the optimization of the flying trajectory of UAVs to maximize the QoS of the UAV-based wireless communication, and showed faster learning convergence compared to the classical model.
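To make the Q-table update and the experience replay buffer discussed in this subsection concrete, the following Python sketch implements a classical tabular version of both. It assumes the standard Q-learning target, in which the ideal next action is the one maximizing the Q-value, and a toy two-state environment reusing the +1 and −5 rewards from the outage example; the environment, the class `ReplayBuffer`, and all parameter values are hypothetical. In a quantum-based variant, the table would be replaced by a parameterized quantum model.

```python
import random
from collections import deque

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update with learning rate alpha and discount gamma."""
    best_next = max(q[s_next])            # value of the ideal next action
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) tuples."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng):
        size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), size)

def step(state, action):
    """Toy environment: action 0 avoids an outage, action 1 causes one."""
    if action == 0:
        return 0, 1.0      # next state 0 (no outage), reward +1
    return 1, -5.0         # next state 1 (outage), penalty -5

rng = random.Random(0)
q_table = [[0.0, 0.0], [0.0, 0.0]]       # q_table[state][action]
buffer = ReplayBuffer(capacity=100)
state = 0
for _ in range(200):
    action = rng.randrange(2)            # purely exploratory behaviour policy
    next_state, reward = step(state, action)
    buffer.store(state, action, reward, next_state)
    # Replay a random mini-batch of past experiences.
    for s, a, r, s_next in buffer.sample(8, rng):
        q_update(q_table, s, a, r, s_next)
    state = next_state

greedy_policy = [max(range(2), key=row.__getitem__) for row in q_table]
```

Sampling past transitions at random decouples the updates from the most recent experience, which is the over-fitting safeguard described in item (b).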

IV. APPLICATIONS OF QUANTUM MACHINE LEARNING IN WIRELESS SYSTEMS

A. QML FOR PHYSICAL LAYER
Due to their advantage in efficiently solving complex optimization problems in polynomial time, quantum-based methods can be utilized in a variety of physical-layer optimizations, including those with high-dimensional optimization spaces such as those encountered in massive MIMO systems.
Leveraging the said advantages, quantum-based optimization methods have been employed to efficiently manage the transmit power allocation in MIMO systems, enabling optimal allocation of the limited resources in scenarios requiring high data speeds under complex channel conditions. For instance, the work of [96] presented a quantum-based decomposition of the Vandermonde matrix, which holds potential in a range of applications involving high-dimensional computation, such as signal recovery and channel modeling for MIMO systems.
A quantum bacterial foraging optimization algorithm was proposed in [97] for optimizing the power coefficients and tilt angles of the transmit antennas in MIMO communication systems. The results demonstrated superior performance in solving combinatorial optimization problems compared to classical bacterial foraging optimization algorithms, especially in the case of parallel non-gradient optimization, in which multiple variables with unknown gradient functions are involved.
Another quantum-inspired heuristic algorithm, termed quantum-behaved particle swarm optimization (QPSO), was employed in [98] towards maximizing the data rate of a railway wireless communication system. The study considered factors such as transmit power and spectral allocation, as well as practical restrictions such as the Doppler shift caused by the train's velocity.
Another notable application of QML for physical-layer design is to accurately estimate the channel state information, which is crucial for reliable data transmission. Quantum algorithms, such as the quantum support vector machine (QSVM) [79], can be employed to improve channel estimation. QML algorithms can utilize the unique properties of quantum systems to enhance the accuracy and efficiency of channel estimation tasks [99].

B. QML FOR SIGNAL INTELLIGENCE
Owing to their ability to efficiently solve complex optimization problems in polynomial time, quantum-based optimization methods have emerged as promising approaches to enhance signal intelligence in emerging wireless communication models which typically involve high-dimensional optimization spaces, such as in massive MIMO systems, and high-order signal modulation. The subsequent discussion outlines how different quantum-based methods can be beneficial for signal intelligence, including enhanced accuracy and improved data processing time.

1) QUANTUM-BASED METHODS FOR SIGNAL PROCESSING AND DETECTION
Signal processing could also benefit from different quantum-based algorithms. In particular, quantum search algorithms can be adopted in signal detection, as done in [100], which utilized a modification of the Dürr & Høyer algorithm, a quantum search algorithm based on Grover's algorithm described in Section III-A1, to determine possible signal realizations in MIMO systems, based on the maximum-likelihood principle. In addition, the work of [101] incorporated a modified Dürr & Høyer algorithm for multi-user detection, designed for direct-sequence spreading in space-division multiple access (SDMA). In other instances, quantum-based heuristic algorithms have been applied to address the issue of a high peak-to-average power ratio (PAPR). Specifically, the study of [102] employed a quantum-inspired evolutionary algorithm that has a low computational complexity and is capable of reducing the PAPR in orthogonal frequency-division multiplexing (OFDM) systems. The work of [103] expands the application of quantum-inspired evolutionary algorithms by utilizing them in multi-objective PAPR reduction.

2) QUANTUM-BASED TECHNIQUES FOR SIGNAL CLASSIFICATIONS
Thanks to their capabilities in handling complex combinatorial problems, quantum-based methods are particularly well-suited for the classification of signal constellations. For example, in the study of [104], a quantum annealing-based optimization method was utilized to classify signal constellations in a large-scale MIMO system. To this end, the maximum-likelihood-based classification problem at hand was initially converted into a quadratic unconstrained binary optimization (QUBO) problem, and then solved by using a quantum annealing methodology. The proposed approach has been adopted for different modulation techniques, such as quadrature phase-shift keying (QPSK) and quadrature amplitude modulation (QAM). The work of [105] extended this approach to accommodate higher order modulation, namely, 64-QAM.

C. QML FOR HIGHER LAYERS
Different quantum-based optimization methods have been specifically crafted for the data link layer, to provide secure communication links and error corrections, among others.

1) QUANTUM TECHNOLOGIES TO ENHANCE COMMUNICATION SECURITY
As quantum computing platforms advance, thereby becoming capable of exponentially faster calculation than classical processors for certain problems, they pose a threat to many of the cryptographic techniques currently in use, as these techniques rely on computationally hard mathematical problems, such as the factorization of large numbers, which serves as the basis of the Rivest-Shamir-Adleman (RSA) encryption methodology. The work in [106], [107] suggests that quantum routines such as Shor's algorithm have the capability to provide the solution to factoring problems and thereby, in theory, can break this popular encryption approach, posing threats to institutional services, many of which rely on RSA for securing information such as user passwords.
In response to this concern, different counter-measures have been developed in the field of cryptography, including quantum-based cryptography, such as quantum key distribution (QKD). In particular, QKD allows a secure exchange of the secret key used for the encryption and decryption of a message, as any eavesdropping attempts on the key exchange will inevitably introduce errors that can be detected by both the sender and the receiver, alerting them to the threat [108].
Interestingly, QML can also be employed to enhance the security aspects of wireless networks. In particular, the authors of [109] addressed the utilization of QML to mitigate quantum-based security attacks on wireless communication networks. Moreover, in [28], blind quantum computing, which allows transmission of information to a remote quantum processor using an encrypted quantum state, was advocated to maintain the confidentiality of the data.

2) QUANTUM ERROR CORRECTION TECHNIQUES
In addition to providing enhanced security, quantum technologies are particularly useful for error correction, especially in quantum-based communications, which rely on quantum mechanisms such as quantum entanglement to transmit information, or in quantum-inspired communications, which use classical systems to mimic the behavior of quantum systems. In particular, a quantum-based error-correction technique called Shor's code can correct one error in a set of nine physical qubits [110]. The method encodes each logical qubit into multiple physical qubits, and detects errors by identifying the occurrence of a bit-flip or sign-flip of the transmitted qubits.
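The bit-flip detection mentioned above can be illustrated with the three-qubit repetition code, which is the bit-flip building block that Shor's nine-qubit code combines with an analogous phase-flip protection. The Python sketch below is a simplified classical simulation that covers only this bit-flip layer, not the full nine-qubit code; states are stored as dictionaries mapping bit strings to amplitudes, and all names are hypothetical.

```python
def encode(alpha, beta):
    """Encode alpha|0> + beta|1> as alpha|000> + beta|111>."""
    return {(0, 0, 0): alpha, (1, 1, 1): beta}

def apply_x(state, qubit):
    """Apply a bit-flip (Pauli-X) to one physical qubit."""
    flipped = {}
    for bits, amp in state.items():
        new_bits = list(bits)
        new_bits[qubit] ^= 1
        flipped[tuple(new_bits)] = amp
    return flipped

def syndrome(state):
    """Parities (q0 xor q1, q1 xor q2); identical for every basis component
    of a code state that suffered at most one bit-flip."""
    bits = next(iter(state))
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

# Each syndrome points to the qubit that must be flipped back.
SYNDROME_TO_QUBIT = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def correct(state):
    """Undo a single bit-flip error indicated by the syndrome."""
    qubit = SYNDROME_TO_QUBIT[syndrome(state)]
    return state if qubit is None else apply_x(state, qubit)

logical = encode(0.6, 0.8)
recovered = [correct(apply_x(logical, q)) for q in range(3)]
```

The syndrome reveals which qubit was flipped without revealing the encoded amplitudes, which is why the logical information survives the correction.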

D. QUANTUM CONVOLUTIONAL NEURAL NETWORK FOR MAXIMIZING ENERGY EFFICIENCY IN RIS-AIDED MULTIPLE ACCESS
To showcase the feasibility and merits of using QML to maximize the performance of wireless communication systems, let us consider the problem of energy-efficiency maximization of RIS-aided multiple access as a particular use case. Thanks to RISs, we now have enhanced control over wireless propagation, enabling us to effectively manage the transmission for different user terminals by mitigating interference and obstructions.
Several studies considered quantum-based optimization methods for RIS-assisted systems. In [111], a quantum Ising model was adopted to minimize the number of required time slots in various RIS-enabled communication scenarios, taking the RIS phase-shift coefficients as the optimization variables. The authors of [112] employed a quantum optimization scheme based on quadratic unconstrained binary optimization to maximize the power of the received signal in different scenarios; different numbers of base stations (BSs), RISs, and user terminals were assumed [112, Tab. 1]. In that work, the D-Wave hybrid solver demonstrated a significantly lower computational time, especially for a larger number of RIS elements, in comparison to the Nelder-Mead simplex scheme [112, Fig. 10]. Besides, the authors of [113] employed Ising models for optimizing RIS parameters.
With regard to the integration of RIS and multiple access, various works have exploited RISs to enhance the performance of the next-generation multiple access techniques, such as those of NOMA and rate-splitting multiple access (RSMA) [114]. In [115], RIS was deployed to assist RSMA, whereas different variables were also optimized to enhance energy efficiency, defined by the ratio between the sum-rate and the total consumed power. In [116], it is shown that the deployment of RISs within an RSMA-based system allows for various performance benefits, such as enhanced spectral efficiency. In [117], the outage probability of a RIS-assisted RSMA system was investigated. Further, the authors of [117] examined the occurrence of outage in a simultaneous transmitting and reflecting RIS (STAR-RIS) aided system, evaluating the influence of the RIS phase-shift and RSMA power splitting coefficients. Besides, various studies explored the utilization of RIS in multiple access protocols, e.g., NOMA [118]. The work of [18] introduced a multiple access method based on the integration between RIS and NOMA. Also, the work of [119] examined the capacity of a STAR-RIS aided NOMA system.

1) APPLICATION SCENARIO
Now, let us consider a wireless communication system that employs uplink power-domain NOMA for K user terminals, with each having L antennas, to transmit messages to the BS equipped with M antennas. Due to the presence of blockages in the service area, there is a lack of possibilities for establishing direct LoS links between the user terminals and the BS. To overcome this challenge, a RIS is exploited to provide indirect links from the user terminals to the BS through passive beamforming.
Optimization Objective: Due to the growing concern of energy consumption and environmental sustainability [120], our focus here is on the maximization of the energy efficiency of the communication system. The optimization problem is formulated in (10), where $R_{\text{sum}}$ denotes the sum-rate, $P$ is the total power consumption of the system, $P^{\text{Tx}}_{k}$ and $\lambda_k$ denote the maximum transmit power and the NOMA power ratio coefficient of the k-th user terminal, respectively, $W = \{w_k^{T}\}_{k=1}^{K} \in [0, 1]^{L \times K}$ is the power control matrix, $w_k$ denotes the transmit precoder of each k-th user, $\forall k \in \{1, \dots, K\}$, and $\mathrm{diag}(\tau) \in \mathbb{C}^{N \times N}$ denotes the phase-shift matrix pertaining to the passive beamforming, with $\tau = \{e^{i\phi_i}\}_{i=1}^{N}$ denoting the phase-shift elements of the RIS. In (10), constraint C1 signifies that each i-th phase-shift element takes its phase from the available discrete phase set $E = \{2\pi(r-1)/N_{\text{opt}}\}_{r=1}^{2^{s}}$, where $s$ denotes the number of classical bits used to process phase shifting. The constraint C2 ensures that the power budget of each user terminal is satisfied.
Channel Model and Input Feature: Considering the utilization of a RIS to allow indirect links from the user terminals to the BS, cascaded propagation channels are assumed, where $\mathbf{h}_k$ and $\mathbf{G}$ denote the channels from the k-th user to the RIS and from the RIS to the BS, respectively. Therefore, the end-to-end channel between the k-th user terminal and the receiving BS, via the RIS, can be expressed as $\mathbf{H}_k = \mathbf{G}\,\mathrm{diag}(\boldsymbol{\tau})\,\mathbf{h}_k$ [121]. Herein, the normalized $\hat{\mathbf{H}}_k$ is utilized as the classical input, denoted by $\mathbf{x}_{\mathrm{in}} = \mathrm{vec}(\hat{\mathbf{H}}_k / \|\hat{\mathbf{H}}_k\|) \in \mathbb{C}^{1 \times K N_{\mathrm{Tx}}}$, for our quantum-based learning model.
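A minimal NumPy sketch of the cascaded channel and the normalized classical input is given below. The i.i.d. complex Gaussian channels are placeholders rather than the channel models adopted in the simulation, and the Frobenius norm, the column-stacking vec(·), and the concatenation over users are assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L, M, N = 2, 1, 4, 8  # users, antennas per user, BS antennas, RIS elements

def random_complex(shape):
    # Placeholder i.i.d. complex Gaussian entries; NOT the channel models of the cited works.
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

G = random_complex((M, N))                          # RIS -> BS channel
h = [random_complex((N, L)) for _ in range(K)]      # user k -> RIS channels
tau = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N)) # unit-modulus RIS phase shifts

def cascaded_channel(G, tau, h_k):
    """End-to-end channel G diag(tau) h_k of one user via the RIS."""
    return G @ np.diag(tau) @ h_k

def classical_input(G, tau, h):
    """Stack the normalized, vectorized cascaded channels of all users into a row vector."""
    parts = []
    for h_k in h:
        H_k = cascaded_channel(G, tau, h_k)
        H_k = H_k / np.linalg.norm(H_k)            # Frobenius norm (assumed)
        parts.append(H_k.reshape(-1, order="F"))   # vec(): column stacking
    return np.concatenate(parts)[None, :]

x_in = classical_input(G, tau, h)
print(x_in.shape)
```

With the set-up K = 2, L = 1, and M = 4, this sketch produces a row vector with K·M·L = 8 complex entries.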
Utilizing Quantum Convolutional Neural Network for Dimensional Reduction: The approach to solve the problem specified in (10) is described as follows. First, the phase-shift matrix is optimized using the approach presented in [122], where the phase-shift coefficients are selected from set $\mathcal{E}$ to satisfy constraint C1. Subsequently, to account for the increase in input dimensionality due to the utilization of the RIS, we utilize a quantum convolutional neural network for dimensional reduction. The performance of the proposed design in terms of training convergence is demonstrated in Fig. 5. Here, the training approach described in Section V is assumed, with the loss function defined as $L = -\log(\mu) = -\log(R_{\mathrm{sum}}) + \log(P)$. The initial value of each parameter of the quantum model is set as $\pi/2$. The numerical simulation was conducted under the following set-up: $K = 2$, $L = 1$, $M = 4$, and $N = 8$; the calculation of the power consumption $P$ follows that of [123], while the computation of the channels $\mathbf{G}$ and $\mathbf{h}_k$, $\forall k \in \{1, \ldots, K\}$, follows those of [117] and [124].
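To make the loss function concrete, the short sketch below evaluates the energy efficiency as the ratio of the sum-rate to the total power and the corresponding loss $-\log(R_{\mathrm{sum}}) + \log(P)$. The function names and the numerical inputs are hypothetical, and the computation of the sum-rate and power themselves is outside the scope of this sketch.

```python
import math

def energy_efficiency(r_sum, p_total):
    """Energy efficiency: ratio of the sum-rate to the total consumed power."""
    if r_sum <= 0 or p_total <= 0:
        raise ValueError("sum-rate and power must be positive")
    return r_sum / p_total

def training_loss(r_sum, p_total):
    """Loss -log(R_sum) + log(P), i.e., the negative log of the energy efficiency."""
    return -math.log(r_sum) + math.log(p_total)

# Hypothetical values: a higher sum-rate at equal power gives a lower loss.
print(training_loss(2.0, 2.0), training_loss(4.0, 2.0))
```

Minimizing this loss is equivalent to maximizing the energy efficiency, since the logarithm is monotonically increasing.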
In Fig. 5, the label "quant. var. + conv" indicates the integration of a quantum convolutional layer and a quantum variational layer, while the label "quant. conv" indicates the utilization of the convolutional layer alone. The results show that the combination of variational and convolutional layers can produce a lower training loss, although this configuration requires a higher number of parameters. Indeed, the integrated model has 16 parameters, while the model with a single convolutional layer has 8 parameters. The figure also demonstrates that the employed quantum convolutional neural network is able to reduce the input dimension significantly. For instance, considering the aforementioned parameters and a quantum convolutional neural network with one layer, the process has been streamlined from using $N_{\mathrm{qubit}} = 8$ qubits for input encoding to performing measurement on two qubits as the output, thereby enabling more efficient measurement and parameter optimization. Despite the availability of higher numbers of qubits in today's quantum computing systems, the parameters of the learning models, one of which yields the result shown in Fig. 5, are still predominantly trained using classical algorithms. Introducing a substantially large number of parameters could potentially impose a computational burden on the classical system (this concern will be discussed in Section V-B2). Nonetheless, quantum systems involving large numbers of model parameters will be featured in future works.

V. CHALLENGES IN QUANTUM MACHINE LEARNING
Moving forward, we discuss the challenges related to the training and deployment of QML algorithms, particularly with regard to their viability for optimizing wireless systems.

A. TRAINABILITY OF THE QUANTUM LEARNING MODELS
Numerous QML techniques employ model-driven estimators, which are implemented as a sequence of quantum operators with adjustable parameters, to generate desired solutions. As an illustration, given the information about prior spectral occupancies, the quantum-based model can be employed to minimize uplink collisions between multiple devices, as the authors of [125] did with classical models. Thus, our task is to maximize the accuracy of the parameterized model via training.
The ability of the model to be trained using the available dataset therefore becomes paramount. Although complex QML models involving a high number of qubits and quantum gates can represent a wide range of optimization solutions, they may encounter a trainability issue, referred to as the barren plateau phenomenon, which limits their performance.
For better understanding, we can consider the gradient of the training loss as a random variable whose values concentrate ever more tightly around zero as the number of qubits increases, resulting in the vanishing gradient problem. The barren plateau phenomenon, which occurs during model training, can be expressed through the variance of the loss gradient, $\mathrm{Var}(\nabla_{\boldsymbol{\theta}} L(\boldsymbol{\theta}))$, as a function of $O$, $V$, and the number of qubits $N$ [126], [127], where $\nabla_{\boldsymbol{\theta}} L(\boldsymbol{\theta})$ describes the gradient of the loss function w.r.t. the parameter set $\boldsymbol{\theta}$, $O$ is the Hermitian representation of the state of the quantum system at hand, $V$ symbolizes a fixed Hermitian operator, and $\mathrm{Tr}(\cdot)$ is the trace operator. Based on the study of [128], the variance of the loss gradient can be approximated as $\mathrm{Var}(\nabla_{\boldsymbol{\theta}} L(\boldsymbol{\theta})) \approx 2^{-N_{\mathrm{qubit}}}$, where $N_{\mathrm{qubit}}$ is the number of entangled qubits employed by the system.
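The exponential scaling can be made tangible with a few lines of arithmetic. The sketch below tabulates the approximate gradient variance $2^{-N_{\mathrm{qubit}}}$ together with a rough estimate of the number of measurement shots needed to resolve such a gradient. The shot estimate rests on the heuristic assumption, introduced here, that the shot-noise variance (about 1/shots) must fall below the gradient variance; it is not a result of the cited studies.

```python
def gradient_variance(n_qubits):
    """Approximate variance of the loss gradient, 2**(-n_qubits)."""
    return 2.0 ** (-n_qubits)

def shots_to_resolve(n_qubits):
    """Heuristic (assumed): shot-noise variance ~ 1/shots must not exceed the gradient variance."""
    return 2 ** n_qubits

for n in (2, 8, 16, 32):
    print(f"N_qubit={n:2d}  variance~{gradient_variance(n):.3e}  shots>~{shots_to_resolve(n)}")
```

Under this heuristic, each added qubit doubles the measurement effort required to distinguish the gradient from noise.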
In light of the above examination, one can observe that the ability of the quantum-based model to process different outcomes will be reduced as the scale of the quantum processing system increases. Unfortunately, for most wireless scenarios, we do not have the luxury of downsizing the quantum model, especially in the case of heterogeneous networks. Faced with this dilemma, one strategy is to reduce the number of parameters to be optimized during a given training phase. This approach has been adopted in various methodologies, including layer-wise training [129], [130]. Still, it is non-trivial to develop quantum learning models that can be trained effectively to solve different ML tasks. This may involve improving existing algorithms, such as variational quantum algorithms, and identifying potential advantages that can be attained by employing quantum systems.
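The idea of optimizing only a subset of parameters per training phase can be sketched as follows. This is a toy illustration and not the procedure of the cited layer-wise training works: the loss is a hypothetical, classically computed stand-in for a QML loss, gradients are obtained by central differences, and the parameter array of shape (2, 8) merely mirrors a model with two layers of eight parameters each, initialized at $\pi/2$.

```python
import numpy as np

def toy_loss(theta):
    """Hypothetical stand-in for a QML training loss (not the loss of any cited work)."""
    return float(np.sum(np.sin(theta / 2.0) ** 2))

def layer_gradient(loss, theta, layer, eps=1e-6):
    """Central-difference gradient w.r.t. the parameters of one layer only."""
    grad = np.zeros(theta.shape[1])
    for j in range(theta.shape[1]):
        shift = np.zeros_like(theta)
        shift[layer, j] = eps
        grad[j] = (loss(theta + shift) - loss(theta - shift)) / (2.0 * eps)
    return grad

def layerwise_train(loss, theta, layer_order=None, steps_per_layer=40, lr=1.0):
    """Train one layer at a time; all other layers stay frozen during that phase."""
    theta = np.array(theta, dtype=float)  # work on a copy
    if layer_order is None:
        layer_order = range(theta.shape[0])
    for layer in layer_order:
        for _ in range(steps_per_layer):
            theta[layer] -= lr * layer_gradient(loss, theta, layer)
    return theta

theta0 = np.full((2, 8), np.pi / 2)       # rows: layers, columns: parameters per layer
theta_trained = layerwise_train(toy_loss, theta0)
print(toy_loss(theta0), toy_loss(theta_trained))
```

In each phase only eight of the sixteen parameters are updated, which halves the number of gradient entries that must be estimated at a time. The toy loss is separable across layers, so this sketch does not capture the inter-layer dependencies of a real quantum circuit.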

B. QUANTUM COMPUTING PROCESSORS
The advancements in quantum computing have opened the door to its deployment in communication networks. It is possible to imagine that scalable quantum processors could be placed close to microcells, empowering machine-to-machine communications. In particular, quantum processors that are co-located with local servers can be utilized to maximize the reliability of massive machine-type communication (mMTC) systems [131].
Nonetheless, quantum computers are still in their early stages of development and face some hardware limitations, which include high error rates, short coherence times, and limited connectivity between qubits. These factors make it challenging to implement and scale QML algorithms effectively, and dedicated R&D efforts are needed to reap the full benefits of QML.

1) DECOHERENCE AND NOISE IN QUANTUM SYSTEMS
The current generation of quantum processors is prone to quantum decoherence and quantum noise, which can have adverse effects on their accuracy. Quantum decoherence refers to the loss of the ability to maintain quantum properties such as superposition and entanglement. Quantum noise refers to the random fluctuations that can affect the quantum state, arising from factors such as thermal fluctuations. The authors of [132] investigated the impact of decoherence on the performance of QML, in particular for performing classification tasks on an image dataset, showing that decoherence has a negative influence on accuracy. Developing robust error mitigation techniques, including error correction and error suppression, is therefore indispensable.
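As a simple numerical illustration of how noise degrades a measured quantity, the sketch below applies a single-qubit depolarizing channel, $\rho \mapsto (1-p)\rho + p\,I/2$, repeatedly to the state $|0\rangle\langle 0|$ and tracks the expectation value of the Pauli-Z observable. The depolarizing channel is only one of many possible noise models and is chosen here as an assumption; the error probability p = 0.1 is likewise hypothetical.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)   # Pauli-Z observable
I2 = np.eye(2, dtype=complex)

def depolarize(rho, p):
    """Single-qubit depolarizing channel: keep rho w.p. 1-p, replace by I/2 w.p. p."""
    return (1.0 - p) * rho + p * I2 / 2.0

def expectation(rho, observable):
    """Expectation value Tr(rho * observable) of a Hermitian observable."""
    return float(np.real(np.trace(rho @ observable)))

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # noiseless state |0><0|
p = 0.1                                          # hypothetical error probability per step
for step in range(1, 4):
    rho = depolarize(rho, p)
    print(step, expectation(rho, Z))
```

Each application of the channel scales the expectation value by the factor (1 − p), so the measured signal decays geometrically with the number of noisy steps while the trace of the state remains one.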
Interestingly, the presence of noise in a quantum system can be leveraged for the benefit of the optimization process. In particular, the authors of [133] advocate taking advantage of noise in Oracle-based quantum optimization (cf. Section V-B1), potentially making optimization processes more efficient. Moreover, the authors of [134] propose a quantum-based ML framework that can effectively handle noise and decoherence, while even utilizing noise as a random variable to mitigate over-fitting, which can otherwise hinder its ability to perform classifications on different data sets. Furthermore, the authors of [135] reported the utilization of properly adjusted noise in quantum systems to enhance the performance of quantum annealing methodologies, specifically in relation to convergence time. Nonetheless, it should be noted that a higher noise level in quantum annealing could compound errors in certain wireless applications, such as multi-user MIMO detection [136]. This amplification of errors may, in turn, affect its detection accuracy.

2) QUANTUM-CLASSICAL INTEGRATION
An integration issue can arise when quantum prediction models are incorporated into existing AI frameworks: the quantum model may still rely on classical computers to optimize its parameters, which can limit the computational advantages achievable by quantum computing. One potential solution is to use quantum-based algorithms to optimize the parameters of the quantum prediction models. For instance, [137] proposed a quantum-based algorithm for optimizing the parameters of classical neural networks, opening up the possibility of extending it to quantum neural networks. Nevertheless, the development of scalable quantum algorithms for parameter optimization remains an active area of research. Here, it is important to note that most of the high-dimensional data about wireless systems, such as signal constellations and sparse channel information, is conventionally stored in classical memory, and thereby requires encoding operations to be processed by QML models. Furthermore, as of the time of writing, there is still a lack of studies on the implementation of quantum systems for optimizing RF hardware in practical conditions, apart from those studies assuming simulated wireless environments, e.g., [91], [138]. In addition, in order to efficiently integrate classical and quantum systems and take advantage of their complementary strengths, the development of hybrid algorithms and software frameworks is essential.
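The reliance on a classical optimizer can be illustrated with a minimal hybrid training loop. In the sketch below, a hypothetical single-qubit model applies an RY rotation to $|0\rangle$ and returns the Pauli-Z expectation value, a classical routine estimates the gradient with the parameter-shift rule, and plain gradient descent updates the parameter. The quantum evaluation is simulated classically with NumPy, and the learning rate and step count are assumptions chosen for illustration.

```python
import numpy as np

def ry(theta):
    """Matrix of the single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def expectation_z(theta):
    """'Quantum' part (simulated): <Z> after applying RY(theta) to |0>."""
    state = ry(theta) @ np.array([1.0, 0.0])
    return float(state[0] ** 2 - state[1] ** 2)

def parameter_shift_grad(f, theta):
    """Parameter-shift rule for a rotation gate: two extra circuit evaluations."""
    return 0.5 * (f(theta + np.pi / 2.0) - f(theta - np.pi / 2.0))

def train(theta=np.pi / 2.0, lr=0.4, steps=100):
    """Classical part: gradient descent minimizing <Z> over the rotation angle."""
    for _ in range(steps):
        theta -= lr * parameter_shift_grad(expectation_z, theta)
    return theta

theta_opt = train()
print(theta_opt, expectation_z(theta_opt))
```

Every update requires two circuit evaluations per parameter, which indicates how the classical optimization cost grows with the number of model parameters.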
Regarding current achievements in wireless communications, classical technologies have made remarkable contributions in supporting signal processing, enabling high data rate transmissions, and facilitating robust error correction [4], [72]. However, there are certain computational problems in next-generation wireless communications that will pose challenges for classical computing systems, such as tackling large-scale optimization problems and handling post-quantum security [108], [109]. These problems can be addressed using quantum computing processors, which provide computational efficiency and have the unique ability to handle quantum communication protocols.

C. QUANTUM HARDWARE
Quantum computers are still in their early stages of development and face hardware limitations. These limitations include high error rates, short coherence times, and limited qubit connectivity [34], [110], [139]. Such factors make it challenging to implement and scale QML algorithms effectively. In addition, quantum computers that are available today are prone to errors due to various noise sources, such as decoherence and gate imperfections. The effects of noise can significantly impact the accuracy and reliability of QML algorithms. Therefore, developing robust error mitigation techniques, including error correction and error suppression [108], is crucial to ensure the correctness of the quantum computations.
Wireless systems can benefit from employing quantum computations, with examples such as the use of quantum annealing for low-density parity check (LDPC) decoding [140]. Quantum computing functionalities can complement edge and central computing platforms, offering quantum-aided task offloading for different purposes in wireless communications [141]. Nevertheless, current quantum computing hardware typically requires extensive physical space and additional equipment, such as for cryogenic purposes. These factors could affect its practicality and necessitate additional human resources for operations [142].
Due to the aforementioned limitations, it is anticipated that early deployments of quantum computing for wireless communications may take the form of cloud computing features that facilitate the functioning of many BSs in certain regions [42]. The advent of quantum fog computing, which can manage several macro and micro BSs, may represent the next major development. It can be argued that the ideal implementation of quantum computing occurs when it is utilized at the edge level. This approach allows a specific macro BS to directly access low-latency quantum computations, thereby providing immediate support for time-sensitive tasks such as device localization and channel estimation [23]. In addition, the concept of task offloading between a central quantum computing unit in the cloud and its subsidiary quantum computing processors at the edge or in the fog is intriguing. Specifically, the cloud can effectively manage centralized tasks such as the activation and power control of BSs in a given area. On the other hand, the quantum computing units in the fog can handle ad hoc tasks, such as facilitating device-to-device communications.

VI. CONCLUSION
This paper provided a comprehensive overview of the state of the art of QML methods and their applications in wireless communication systems, exploring various aspects of QML, such as quantum-based operations and techniques, and discussing their merits in meeting the efficiency requirements of wireless communication systems. It has been shown that different quantum-based models can also empower existing ML methodologies, such as supervised, unsupervised, and reinforcement learning methods, emphasizing the fact that QML techniques should be positioned synergistically with their classical counterparts to meet the requirements of future wireless communication systems. The paper has highlighted the versatility of QML in addressing diverse optimization tasks in wireless communications by featuring various QML applications, such as resource allocation, signal processing, and security enhancement. The paper has also covered challenges that need to be addressed to allow practical applications of quantum-based optimization for wireless systems and networks, such as those related to training, quantum computing hardware, and integration. Tackling these challenges eventually calls for interdisciplinary collaboration across the fields of quantum computing, machine learning, and wireless communications.

FIGURE 1. Towards the next-generation of wireless communications.

FIGURE 2. Quantum state representation in Hilbert space, using the Bloch sphere. The surface of the sphere represents all possible states of an arbitrary qubit with state $|\psi\rangle$, with the poles denoting the two computational basis states, i.e., $|0\rangle$ and $|1\rangle$.

FIGURE 3. Overview of a generic QML process.

FIGURE 5. The training convergence of the quantum-based learning model employing quantum variational and convolutional layers, aiming to maximize the energy efficiency of an uplink RIS-assisted NOMA system.