Hybrid Quantum-Classical Convolutional Neural Network Model for Image Classification

Abstract: Image classification plays an important role in remote sensing. Earth observation (EO) has inevitably arrived in the big data era, but the high demand for computation power has already become a bottleneck for analyzing large amounts of remote sensing data with sophisticated machine learning models. Exploiting quantum computing might contribute to a solution to this challenge by leveraging quantum properties. This article introduces a hybrid quantum-classical convolutional neural network (QC-CNN) that applies quantum computing to effectively extract high-level critical features from EO data for classification purposes. In addition, the adoption of the amplitude encoding technique reduces the required quantum bit resources. The complexity analysis indicates that the proposed model can accelerate the convolutional operation in comparison with its classical counterpart. The model's performance is evaluated with different EO benchmarks, including Overhead-MNIST, So2Sat LCZ42, PatternNet, RSI-CB256, and NaSC-TG2, through the TensorFlow Quantum platform. It achieves better performance than its classical counterpart and higher generalizability, which verifies the validity of the QC-CNN model on EO data classification tasks.


Fan Fan, Yilei Shi, Member, IEEE, Tobias Guggemos, and Xiao Xiang Zhu, Fellow, IEEE

NOMENCLATURE
qL  Qubits for the spatial information.
qC  Qubits for the color information.
qK  Qubits for the index of the kernels and feature maps.
qR  Qubits for the values of the feature maps.

I. INTRODUCTION
In the Earth observation (EO) domain, image classification is an active research field that contributes to deriving land use and land cover information from remote sensing imagery [1]. However, due to advances in remote sensing technologies, EO has irreversibly arrived in the big data era. Given the rapid growth of the data and the complexity of the machine learning models used for analysis, the required computation capacity has become a primary barrier to using classical machine learning algorithms to automatically interpret remote sensing images. Leveraging the power of quantum computing might overcome this challenge in the future, since it is expected to efficiently solve problems that are prohibitively expensive for classical computers [2]. Investigating how to apply quantum computing in remote sensing is therefore important and promising.
Quantum machine learning (QML) is an interdisciplinary field that integrates machine learning and quantum computing. Researchers have attempted to exploit its potential for various tasks, such as data processing [3], [4], classification [5], [6], [7], segmentation [8], optimization [9], and quantum entanglement indication [10]. In the noisy intermediate-scale quantum (NISQ) era [11], applying QML to image classification is highly suitable. The reason is that quantum computing can speed up the complicated computation for image comprehension, and the classification output is simple but decisive, which suits the probabilistic results of QML algorithms [12]. Several QML models have been proposed to classify images, using either quantum annealers [13] or parameterized quantum circuits (PQCs) [14]. However, whether quantum neural networks (QNNs), a subfield of QML, can outperform their classical counterparts remains an open question. Moreover, only a few studies have applied QML to classify remote sensing images [15], [16]. Therefore, further investigations on QML for image classification are necessary and beneficial, especially in remote sensing.
However, current quantum machines are not fully fault-tolerant, and only a few qubits are supported, which restricts the application of QML algorithms in practice. Most proposed quantum algorithms can only be verified via simulation on a small scale. Despite that, the significance and necessity of research on QML algorithms should not be underestimated, because it is fundamental for practical applications once more advanced quantum devices are available.
Regarding remote sensing image classification, deep learning has been widely investigated [17], [18], [19]. Among deep learning models, the convolutional neural network (CNN) is essential for feature extraction due to its performance and adaptability. However, only a few proposals related to a quantum version of the CNN have been made. We seek to investigate a quantum CNN model that not only leverages quantum computing to extract critical features for classifying remote sensing data but is also qubit-efficient, to meet the constraints on available qubits in quantum machines or simulators in the NISQ era.
Inspired by the study introduced in [20], a new hybrid quantum-classical CNN (QC-CNN) is developed, in which the quantum part is a parameterized circuit that extracts essential features from images, and the classical part conducts the classification accordingly. In addition, our model exploits the amplitude encoding technique for image classification tasks, which requires fewer qubits than computation basis encoding.
To evaluate the effectiveness of our model, we used various EO benchmarks in our experiments with the TensorFlow Quantum (TFQ) platform [21], i.e., Overhead-MNIST [22], So2Sat LCZ42 [23], PatternNet [24], RSI-CB256 [25], and NaSC-TG2 [26]. The experimental results suggest that the QC-CNN model outperforms its classical counterpart and exhibits greater generalizability. Furthermore, we also studied our model's performance with different quantum gates, measurements, model structures, and noise effects to gain deeper insights into its properties and validity.
Regarding efficiency, due to quantum parallelism, the proposed model can perform the elementwise product efficiently and speed up the feature extraction process by simultaneously transforming all desired quantum states.
The main contributions of this work are given as follows.
1) This work introduces a new hybrid QC-CNN for multicategory image classification, which can effectively extract critical features from images by using quantum circuits and achieve classification performance superior to that of classical CNN models.
2) This work presents a quantum convolution layer that can reduce the number of qubits and simplify the model's structure for classification by applying amplitude encoding.
3) This work investigates the impacts of quantum gates, measurement strategies, the structure of the model, and the noise effects on the classification performance.
This article is structured as follows. Section II introduces basic concepts of quantum computing, and Section III presents related work on image classification with quantum computing. Section IV details the structure of the QC-CNN model. The experimental evaluation of the model's performance is presented in Section V. Then, we analyze the scalability and efficiency of the model in Section VI. Finally, the conclusion and future work are discussed in Section VII.

II. BACKGROUND
A quantum is the minimum discrete unit of any physical entity, and it exhibits many special phenomena, such as superposition and entanglement, which can be utilized to perform computation tasks. This computation methodology is referred to as quantum computing.
The qubit is the elementary unit of quantum computing, analogous to the bit in classical computation. What distinguishes a qubit is that its state is a linear combination of the basis states (e.g., |0⟩ and |1⟩), which is called superposition, as shown in (1), where the coefficients α and β are complex numbers indicating the amplitudes of the quantum state, and the states |0⟩ and |1⟩ are the computational basis states. When a qubit is measured in the computational basis, it collapses to either the state |0⟩ or |1⟩ with probability |α|^2 and |β|^2, respectively

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle. \tag{1}$$

Another quantum property is entanglement. Entangled qubits are correlated, and the state of one qubit affects the state of the other. An entangled quantum state cannot be separated into two states, which means that the entangled state |ψ_AB⟩ cannot be written as a tensor product of its component states |ψ_A⟩ and |ψ_B⟩. Thus,

$$|\psi_{AB}\rangle \neq |\psi_A\rangle \otimes |\psi_B\rangle.$$

To perform quantum computing, the quantum circuit model is one of the most popular models, and it generally consists of three sequential parts.
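As a minimal numerical sketch of the two properties above, the following NumPy snippet computes the measurement probabilities |α|^2 and |β|^2 for an illustrative qubit and checks whether a two-qubit state is a tensor product. The amplitudes and the helper name `schmidt_rank` are assumptions made for this illustration; they do not come from the article.

```python
import numpy as np

# Single-qubit state alpha|0> + beta|1>; the amplitudes are illustrative values.
alpha, beta = 0.6, 0.8j
psi = np.array([alpha, beta], dtype=complex)
probs = np.abs(psi) ** 2  # probabilities of measuring |0> and |1>


def schmidt_rank(state, tol=1e-12):
    """Count nonzero singular values of a two-qubit state reshaped to 2 x 2.

    Rank 1 means the state is a tensor product; rank 2 means it is entangled.
    """
    matrix = np.asarray(state, dtype=complex).reshape(2, 2)
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(singular_values > tol))


plus = np.array([1.0, 1.0]) / np.sqrt(2)
product_state = np.kron(np.array([1.0, 0.0]), plus)       # |0> (x) |+>
bell_state = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)

print(probs)
print(schmidt_rank(product_state), schmidt_rank(bell_state))
```

The rank test works because a two-qubit pure state factorizes into |ψ_A⟩ ⊗ |ψ_B⟩ exactly when its 2 × 2 amplitude matrix has rank one.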

A. Information Encoding
The input of any quantum algorithm is a quantum state, so classical data need to be encoded in such a state first. There are two basic encoding techniques: computation basis encoding and amplitude encoding. Computation basis encoding maps the classical data to a basis state of a quantum system, so it requires the classical data to be converted into binary string form and uses the corresponding quantum basis state to represent it. With amplitude encoding, the classical data are represented by the amplitudes of a quantum state. The classical data have to be normalized to guarantee that the sum of the squared amplitudes of the quantum state is equal to 1.
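The difference in qubit cost between the two techniques can be sketched with plain state vectors. The function names `basis_encode` and `amplitude_encode` and the sample values below are hypothetical and chosen only for illustration.

```python
import numpy as np


def basis_encode(value, n_bits):
    """Computation basis encoding: an integer becomes one basis state.

    The returned vector has 2**n_bits entries with a single 1 at index `value`.
    """
    if not 0 <= value < 2 ** n_bits:
        raise ValueError("value does not fit into n_bits")
    state = np.zeros(2 ** n_bits)
    state[value] = 1.0
    return state


def amplitude_encode(data):
    """Amplitude encoding: normalize the data so the squared amplitudes sum to 1."""
    data = np.asarray(data, dtype=float)
    norm = np.linalg.norm(data)
    if norm == 0:
        raise ValueError("cannot amplitude-encode an all-zero vector")
    return data / norm


basis_state = basis_encode(200, n_bits=8)           # one 8-bit value needs 8 qubits
amp_state = amplitude_encode([3.0, 1.0, 2.0, 4.0])  # four values fit into 2 qubits

print(int(np.log2(basis_state.size)), int(np.log2(amp_state.size)))
```

In this sketch, a state vector of length 2^q corresponds to q qubits, which is why four values occupy only two qubits under amplitude encoding.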
Based on these encoding techniques, various methods for encoding images to quantum states have been explored [27].

B. Quantum State Transformation
Transforming quantum states is a critical step of quantum computing and plays a pivotal role in various computation tasks. After information encoding, a sequence of quantum gates, each acting on one or more qubits, is used to manipulate the input state and convert it into another state through entanglement, rotation, and so on. The transformed quantum state is expected to be suitable for subsequent computation.
A quantum gate is the analog of a logic gate in classical computers, with the difference that a gate for quantum computing must be unitary, which preserves the normalization and reversibility of the quantum system.
In this article, the following elementary quantum gates are involved: RY gate (rotation around the Y-axis), Hadamard gate (rotation by π around the (X + Z)-axis), X gate (rotation by π around the X-axis), U3 gate (rotation with three Euler angles), and CU3 gate (controlled version of the U3 gate). Their matrix representations are shown in Fig. 1.
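For readers without access to Fig. 1, the gates can be written out numerically. The sketch below assumes one widely used parameterization of RY and U3 (half-angle convention); the convention in Fig. 1 may differ by a global phase or by angle scaling, and the function names are illustrative.

```python
import numpy as np


def ry(theta):
    """Rotation around the Y-axis by angle theta (half-angle convention)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def u3(theta, phi, lam):
    """General single-qubit rotation with three Euler angles."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [c, -np.exp(1j * lam) * s],
        [np.exp(1j * phi) * s, np.exp(1j * (phi + lam)) * c],
    ])


def controlled(gate):
    """Controlled version of a single-qubit gate; the first qubit is the control."""
    out = np.eye(4, dtype=complex)
    out[2:, 2:] = gate
    return out


def is_unitary(matrix):
    """Check that the conjugate transpose is the inverse."""
    return np.allclose(matrix.conj().T @ matrix, np.eye(matrix.shape[0]))


HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
CU3 = controlled(u3(0.3, 0.1, 0.2))  # arbitrary illustrative angles

print(is_unitary(CU3))
```

Under this convention, U3 reproduces the other single-qubit gates for specific angles, e.g., `u3(theta, 0, 0)` equals `ry(theta)` and `u3(pi/2, 0, pi)` equals the Hadamard gate.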

C. Measurement
Quantum state measurement takes place at the end of the quantum circuit and converts the information in the quantum state into classical data. The obtained output, e.g., the expectation values, can be treated as the result of the quantum computation. In this work, we use the expectation values as the features extracted from the input image for further classification processing.
Regarding QML, PQCs are commonly used as a hybrid approach, such as in [28], [29], and [30]. Specifically, a PQC is composed of fixed quantum gates whose parameters are trainable. The parameters are optimized during the training process on the classical machine, while the quantum state's transformation and measurement are performed on the quantum machine.
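The division of labor in a PQC can be illustrated with a one-qubit toy circuit: a simulated "quantum" routine returns an expectation value, and a classical loop updates the gate parameter. This is only a sketch under stated assumptions: the circuit (a single RY gate measured in the Z-basis), the parameter-shift gradient rule, the learning rate, and all function names are chosen for illustration and are not taken from the article.

```python
import numpy as np


def expectation_z(theta):
    """Simulated quantum step: apply RY(theta) to |0> and return <Z>."""
    state = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    pauli_z = np.diag([1.0, -1.0])
    return float(state @ pauli_z @ state)


def parameter_shift_grad(theta):
    """Gradient of <Z> with respect to theta from two shifted circuit runs."""
    shift = np.pi / 2
    return 0.5 * (expectation_z(theta + shift) - expectation_z(theta - shift))


# Classical step: plain gradient descent that minimizes <Z>.
theta = 0.3
learning_rate = 0.1
for _ in range(200):
    theta -= learning_rate * parameter_shift_grad(theta)

final_value = expectation_z(theta)
print(theta, final_value)
```

Only `expectation_z` would run on quantum hardware in this picture; the update of `theta` stays on the classical machine, which mirrors the hybrid scheme described above.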

III. RELATED WORK
In recent years, there has been a growing interest in exploring how to incorporate quantum computing into image classification amidst the current limitations of quantum hardware. This section gives a brief overview of relevant studies, starting with those using either quantum machine learning or quantum deep learning for image classification. Then, recent contributions involving quantum computing in the EO domain are discussed.

A. Quantum Machine Learning
Quantum image classication has attracted great attention from researchers in the QML community, and various studies have been conducted.
Rebentrost et al. [31] and Havlíček et al. [32] proposed a quantum support vector machine (QSVM) for classication tasks.Ostaszewski et al. [33] introduced a quantum model using the principal component analysis (PCA) to classify images.Besides integrating SVM and PCA algorithms with quantum computing, Ruan et al. [34] focused on using the K-nearest neighbor (KNN) algorithm for image classication.They presented a method to compute the Hamming distance and utilized it in their proposed QKNN model to realize a good analog for the classical KNN algorithm.Dang et al. [35] employed the quantum minimum search algorithm in their quantum KNN model to speed up the similarity search processing without the negative inuence on the classication accuracy.

B. Quantum Neural Networks
QNNs have also been investigated for image classification and recognition. The CNN, which can automatically extract high-level critical features from images by applying various filters in its sequential structure, plays an important role and shows promising performance. Researchers have also attempted to implement the classical CNN with quantum computing.
Cong et al. [36] presented a QCNN containing successive convolution layers and pooling layers. Their proposed model is structured by combining the multiscale entanglement renormalization ansatz and quantum error correction, and they explicitly illustrated the potential of the proposed model in phase classification for quantum physical systems. Herrmann et al. [37] experimentally implemented the QCNN for recognizing topological quantum phases. Regarding employing this model to classify images, Chen et al. [38] proposed a fractal scaling down dimension reduction algorithm to reduce the image's features and then applied the QCNN model for image classification. Lü et al. [39] used a quantum state preparation model to approximate the input image's quantum state and applied the QCNN model for the classification. The effectiveness of the introduced models was verified in their experiments.
Kerenidis et al. [40] proposed a quantum algorithm for applying deep CNNs, which can achieve an accuracy on the MNIST dataset similar to that of the classical CNN. Wei et al. [41] demonstrated another basic framework of a quantum convolutional neural network, which uses fewer parameters to achieve classification performance comparable to that of the classical CNN for digit recognition tasks. Hur et al. [42] presented fully parameterized QCNNs for classical data classification. When classifying images, they first apply classical algorithms, such as PCA and Autoencoder [43], to preprocess input images and extract preliminary global features. Afterward, their proposed model extracts higher level features from these preliminary features by applying a sequence of convolutional circuits and pooling circuits for classification.
Henderson et al. [44] introduced a new type of transformation layer (quanvolution layers) using a random quantum circuit in their proposed quanvolutional neural networks to obtain meaningful local features for classification purposes. Their experiments show that the proposed QNN model can outperform a purely classical CNN on the MNIST dataset in terms of accuracy and efficiency. However, the random quantum circuit results in an unrepeatable operation. More recently, Matic et al. [45] and Chen et al. [46] independently proposed novel hybrid CNNs, both of which use PQCs instead of a random circuit as the convolutional kernel to get values of the feature maps for different classification tasks. Riaz et al. [47] used a strongly entangled circuit without any additional trainable parameters as a kernel to transform image features for classification.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
Li et al. [20] proposed a quantum deep CNN based on a quantum parameterized circuit for image recognition. Their experiments on the MNIST and GTSRB datasets verified the model's validity. They use computation basis encoding to encode the kernels' values and the image's color information. Thus, to accurately encode a pixel value from 0 to 255, eight qubits are needed. In addition, to achieve the convolutional operation, converting between amplitude encoding and computation basis encoding using quantum phase estimation [48] algorithms is necessary. Thus, their proposed model not only requires more qubits to encode images than the amplitude encoding technique does but also needs extra quantum gates and computation costs.
However, note that most of the previous research focuses on binary classification tasks. To handle multiple categories, Zeng et al. [49] presented a hybrid QNN that applies a ladder-like PQC to encode the input image and transform the features, and a classical dense layer to handle multicategory classification tasks. However, their model extracts features without considering the spatial information of the input image. In addition, their model uses one qubit to encode only one feature. Thus, when classifying large images, the number of qubits needed by their model increases significantly.

C. Applied QML and QNNs in the Remote Sensing Domain
As for applying QML/QNNs to classify remote sensing images, some researchers focused on using quantum annealers for classification [50], [51], [52]. Besides that, studies based on quantum circuits have also been conducted to analyze remote sensing data. Still, many of the related proposals exploit classical algorithms for feature extraction, and quantum circuits are applied only for the final prediction.
For example, Gawron and Lewiński [53] presented a neural network based on a quantum circuit to classify multispectral images for land cover mapping, which uses the classical PCA algorithm to reduce the image's features. Zaidenberg et al. [15] introduced a hybrid model to classify EO data, in which a CNN model is used to extract features from the images, and a quantum circuit is applied for the final classification task. Sebastianelli et al. [54] proposed a circuit-based hybrid QNN to analyze remote sensing images for land use and land cover classification. Their model uses a classical CNN derived from LeNet-5 to extract high-level features from images and applies a quantum layer implemented with a quantum circuit for the final classification. Abdel-Khalek et al. [55] adopted an Inception ResNet to extract features from high-resolution imagery and a QNN for classification. The effectiveness of their model was demonstrated in their experiments.
In addition, Otgonbaatar and Datcu [56] introduced a hybrid model to analyze polarimetric synthetic aperture radar images. Their model performs pixelwise classification, extracting features from the Stokes parameters of each image pixel using a quantum circuit for analysis.

IV. METHODOLOGY
In this article, a hybrid QC-CNN is proposed, aiming to classify images with a supervised deep learning method by exclusively exploiting the amplitude encoding technique. As shown in Fig. 2, the proposed QC-CNN model contains two sequential sections. The quantum section is expected to extract important features from images efficiently, and the classical section performs the final classification accordingly.
Our model consists of four types of layers in sequence: 1) encoding layer; 2) quantum convolution layer; 3) measurement layer; and 4) dense layer. The qubits used in the model can be categorized into four groups according to their encoded information, represented by different colors in Fig. 2: qL qubits (white) for the input image's spatial information, qC qubits (gray) for the image's color information, qK qubits (green) for the kernels applied in the quantum convolution layer, and qR qubits (yellow) for the generated quantum feature maps. The functions of these layers are explained in the following, and the Nomenclature clarifies the definitions of the main notations involved in this section.

A. Encoding Layer
Encoding an image in a quantum state is a crucial step for our model. Various quantum image representation methods have been investigated and developed [27]. Note that the choice of the representation method affects not only the number of qubits and quantum gates needed in the encoding layer but also the structure of the subsequent quantum layers. The proposed model adopts the Flexible Representation of Quantum Images (FRQI) [57] to encode gray-scale images into quantum states. FRQI can represent images using a small number of qubits, as it uses amplitude encoding to encode the color information of all pixels as follows.
Every pixel of an image with 2^n × 2^n pixels is represented by its (x, y)-coordinate. Representing every coordinate by a quantum state |l_{x,y}⟩ requires 2n qubits (qLs) in superposition, which is achieved by applying Hadamard gates. The respective quantum state is

$$\frac{1}{2^n} \sum_{x=0}^{2^n-1} \sum_{y=0}^{2^n-1} |l_{x,y}\rangle$$

where |l_{x,y}⟩ indicates the (x, y)-coordinate of a pixel in binary string form as |x_{n−1}, . . ., x_0 y_{n−1}, . . ., y_0⟩. For the color information of a gray-scale image, we use a single qubit (qC), whose amplitude indicates the pixel value. In particular, for each pixel of the image, one RY(θ)-gate is used for the qC's amplitude transformation, where the gray value in [0, 255] of pixel (x, y) is mapped to θ_{x,y} ∈ [0, π/2], such that the qubit state is

$$|c_{x,y}\rangle = \cos\theta_{x,y}\,|0\rangle + \sin\theta_{x,y}\,|1\rangle.$$

Entanglement allows binding the pixel's location to its color information, stored in the two registers |l_{x,y}⟩ and |c_{x,y}⟩. This is achieved by multicontrolled RY(θ)-gates whose control bits encode the pixel's (x, y)-coordinates, and the final quantum state for image representation is

$$|I\rangle = \frac{1}{2^n} \sum_{x=0}^{2^n-1} \sum_{y=0}^{2^n-1} |l_{x,y}\rangle \otimes |c_{x,y}\rangle$$

which encodes the image with 2^n × 2^n pixel values using 2n + 1 qubits. In contrast, with computation basis encoding, we need 2n + 8 qubits to accurately encode this image. Fig. 3 illustrates one example circuit implementing FRQI on a 2 × 2 image. The dot markers in the circuit represent the control state: white dots for |0⟩ and black dots for |1⟩. In this way, the encoding layer translates a classical image into a quantum state.
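The FRQI encoding described above can be sketched as a direct construction of the state vector, without simulating the individual multicontrolled gates. The function name `frqi_state`, the row-major pixel ordering, the placement of qC as the least significant qubit, and the sample image are assumptions made for this illustration.

```python
import numpy as np


def frqi_state(image):
    """Build the FRQI state vector of a 2^n x 2^n gray-scale image.

    Vector index = pixel_index * 2 + color_bit, i.e., the location register is
    the most significant part and qC is the least significant qubit
    (an ordering assumed for this sketch).
    """
    image = np.asarray(image, dtype=float)
    side = image.shape[0]
    if image.shape != (side, side) or side & (side - 1) != 0:
        raise ValueError("image must be square with a power-of-two side length")
    theta = image.flatten() / 255.0 * (np.pi / 2)  # gray value -> [0, pi/2]
    state = np.zeros(2 * side * side)
    state[0::2] = np.cos(theta)  # amplitude of |l>|0>
    state[1::2] = np.sin(theta)  # amplitude of |l>|1>
    return state / side          # factor 1/2^n from the Hadamard gates


image = np.array([[0, 255], [128, 64]])
state = frqi_state(image)

n = int(np.log2(image.shape[0]))
qubits_frqi = 2 * n + 1   # amplitude encoding of the color information
qubits_basis = 2 * n + 8  # computation basis encoding of 8-bit gray values

# The pixel angles can be read back from the amplitude ratios.
recovered = np.arctan2(state[1::2], state[0::2]) / (np.pi / 2) * 255.0
print(qubits_frqi, qubits_basis, np.round(recovered))
```

The state is normalized because every pixel contributes cos^2 θ + sin^2 θ = 1 before the division by 2^n. On hardware, the angles are not read out directly; the recovery step here only confirms that the encoding is lossless at the state-vector level.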

B. Quantum Convolution Layer
The quantum convolution layer in the QC-CNN model implements the convolutional operation, which plays a critical role in the proposed model by extracting features and generating the corresponding feature maps. In this layer, the kernel size and the convolutional stride are set to the same value to achieve fast dimension reduction of the generated feature maps, and they should be modified according to the input image's size, as in classical CNN models. In this article, the QC-CNN model with 2 × 2 kernels and a convolutional stride of 2 is used as an example for clarity.
Given one kernel to perform the convolutional operation for feature extraction, the kernel will be located at different places on the input image.At each location, a two-step process will happen: 1) perform the elementwise product between the patch of the input image and kernel and 2) sum the output of the products for each patch to obtain the value of the feature map.
To realize this transformation in the quantum circuit, we introduce a qR qubit and apply a series of controlled rotation gates to it. The respective quantum state after the elementwise product can be generally represented as (5), where |r_{x,y}⟩ indicates the weight of the kernel for the corresponding pixel and |f_1⟩ represents the output of the elementwise product process.
Specifically, consider the example illustrated in Fig. 4, in which the convolutional stride is two and the kernel size is 2 × 2. To apply this kernel (different colors indicate different weights) to the given image, the pixels at specific locations should be transformed with the same weight. In Fig. 4, the pixels marked with the same color share the same weight. Thus, to conduct this elementwise product over the input in the quantum convolution layer, we need to identify the pixels with a shared weight in the quantum domain.
As stated in [20], when setting the kernel size to 2 × 2 and the convolutional stride to two, the first qL in |l_x⟩ and |l_y⟩ can be employed to specify pixels for different weights. Since |l_x⟩ and |l_y⟩ can be written as |x_{n−1}, . . ., x_0⟩ and |y_{n−1}, . . ., y_0⟩, respectively, the states with the same |x_0⟩ and |y_0⟩ should apply the same weight. For the case shown in Fig. 4, the 4 × 4 image needs four qLs to encode its spatial information. The states |x_1 x_0 y_1 y_0⟩ with a specific |x_0⟩ and |y_0⟩ should have the same weight (illustrated with the same color). Thus, to specify the locations of the pixels with a shared weight for the convolutional computation between the image and a kernel, only |x_0⟩ and |y_0⟩ need to be considered. As a result, to perform the convolutional operation over the entire input, only four gates are required due to the four possible states of |x_0 y_0⟩, regardless of the input's spatial size. In contrast, classical convolutional operations over the input rely on the sliding window technique, leading to a quadratic increase in computation with the spatial size of the input. A detailed discussion can be found in Section VI-A.
To realize the elementwise product for the convolution operation in the quantum circuit, besides the pixel's location, the pixel's value encoded in the qC (|c⟩) is also important. For this reason, in our model, we apply U3 gates with three controls (i.e., |x_0⟩, |y_0⟩, and |c⟩) on the qR qubit, which rotate the qR with three Euler angles based on the pixel's value and its location. The rotated qR can be described by (6), in which |0⟩_R and |1⟩_R are the basis states of qR, and k^0_{x,y} and k^1_{x,y} are the values of one weight variable of the applied kernel. To specify a 2 × 2 kernel, four weight variables need to be specified.
After the elementwise product, we obtain the quantum state |f_1⟩. To obtain the feature map, we sum the resulting values of the elementwise product for each patch, as illustrated in Fig. 4. The generated feature map |f_2⟩ can be expressed as in (7), in which |l_{x′,y′}⟩ indicates the spatial information of the feature map and |f_{x′,y′}⟩ encodes the value of the feature map, which can be represented as in (8). Furthermore, (8) can be rewritten as (9) based on (3) and (6), in which α, β, and γ are the amplitudes of the obtained quantum states. Note that, when the state of qC is |0⟩_C, the applied U3 gate has no effect on qR. Hence, the amplitude of the quantum state |x_0⟩|y_0⟩|0⟩_C|1⟩_R equals 0, and it is omitted in (9). The unnormalized amplitude of |1⟩_R in (9) can be computed following (10), where k^0_{x_0,y_0} and k^1_{x_0,y_0} indicate the values of the weight variables for the kernel as introduced before, and θ_{x′x_0,y′y_0} refers to the value of the corresponding pixel.
Hence, the convolutional operation is performed in the quantum circuit, and the generated feature map (|f_2⟩) is successfully encoded using |l_{x′,y′}⟩ and |f_{x′,y′}⟩ with the entanglement preserved. This feature map can be treated as the input of the next quantum convolution layer. After several convolution layers, the feature map (|Ψ_m⟩) with the required dimension is generated for further classification.
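The role of the lowest-order coordinate bits can be checked with a purely classical analog. The sketch below compares a sliding-window convolution (stride 2, 2 × 2 kernel) with a version that loops only over the four values of (x_0, y_0) and applies each weight to all pixels sharing those bits at once. It illustrates the weight-sharing pattern only; it does not simulate the controlled rotations of the quantum layer, and the function names and sample data are hypothetical.

```python
import numpy as np


def conv_stride2_sliding(image, kernel):
    """Classical reference: slide a 2 x 2 window with stride 2 over the image."""
    side = image.shape[0]
    out = np.zeros((side // 2, side // 2))
    for i in range(0, side, 2):
        for j in range(0, side, 2):
            out[i // 2, j // 2] = np.sum(image[i:i + 2, j:j + 2] * kernel)
    return out


def conv_stride2_by_low_bits(image, kernel):
    """Group pixels by the lowest bits (x0, y0) of their coordinates.

    Each of the four weights is applied once to every pixel whose coordinates
    end in (x0, y0), independent of the image size.
    """
    side = image.shape[0]
    out = np.zeros((side // 2, side // 2))
    for x0 in (0, 1):
        for y0 in (0, 1):
            out += kernel[x0, y0] * image[x0::2, y0::2]
    return out


rng = np.random.default_rng(0)
image = rng.random((4, 4))
kernel = rng.random((2, 2))

reference = conv_stride2_sliding(image, kernel)
grouped = conv_stride2_by_low_bits(image, kernel)
print(np.allclose(reference, grouped))
```

The grouped version always performs four weighted updates, whereas the number of window positions in the sliding version grows with the image area, which is the contrast drawn in the paragraph above.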
To apply multiple kernels in one quantum convolution layer, qK qubits are used as additional controls for the applied U3 gates, and d qKs can prepare at most 2^d kernels for each layer. Before serving as controls of the U3 gates, these d qKs are first initialized as |0⟩^{⊗d}, and then a Hadamard gate is applied to each qK. Thus, we prepare a quantum state |k⟩ that can be written as |+⟩^{⊗d} to indicate the index of the applied kernels and also of the generated feature maps.
Fig. 5 shows one example of the quantum circuit for one quantum convolution layer with two kernels that processes a 4 × 4 input image. In the end, two generated feature maps of size 2 × 2 are expected accordingly. The angles in the U3 gates in this circuit example indicate the values of the weight variables for the kernels, which are optimized during the training process.
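As a small sketch of the kernel-index register, the state |+⟩^{⊗d} can be built by applying a Hadamard gate to each of d qubits initialized in |0⟩. The function name `kernel_index_state` is an assumption for this illustration.

```python
import numpy as np

HADAMARD = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
ZERO = np.array([1.0, 0.0])


def kernel_index_state(d):
    """Return H|0> tensored d times, i.e., the state |+> on each of d qK qubits."""
    state = np.array([1.0])
    for _ in range(d):
        state = np.kron(state, HADAMARD @ ZERO)
    return state


one_qk = kernel_index_state(1)    # indexes up to 2 kernels
three_qk = kernel_index_state(3)  # indexes up to 8 kernels
print(one_qk.size, three_qk.size)
```

Each of the 2^d basis states carries the same amplitude 2^{-d/2}, so every kernel index is present in the register with equal weight.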

C. Measurement Layer
The measurement layer in the model converts the feature maps from quantum states into classical data. In addition, this layer flattens the obtained feature maps into a 1-D feature vector for the dense layer. For this purpose, the quantum state of interest and the operator for the measurement need to be specified. The expectation values are taken with respect to the measurement operator M on the quantum state |Ψ_m⟩ representing the generated quantum feature maps following

$$E(M) = \langle \Psi_m | M | \Psi_m \rangle. \tag{11}$$

The obtained expectation value, E(M), is treated as one classical feature value for classification.
Since the quantum state embedding the feature maps' information is of most interest in this layer, only the corresponding qLs and qR need to be measured. To obtain all the generated quantum feature maps, the qKs should also be taken into account.
For example, the state |Ψ_m⟩ for k quantum feature maps of size m × m can be represented as in (12), where |l_{x_m,y_m}⟩ indicates the location information of the feature map, |f_{x_m,y_m}⟩ embeds the values of the feature map, and |k⟩ indicates the index of the feature maps. For simplicity, the target quantum state |Ψ_m⟩ can also be rewritten as in (13), in which |i⟩ represents the index of the extracted feature and |f_i⟩ embeds the value of the ith extracted feature. Fig. 6 shows an example of the measurement layer, in which two generated feature maps of size 2 × 2 are expected. Thus, eight features are obtained.
Given the desired quantum state |Ψ_m⟩ in a Hilbert space, we consider three types of measurement operators formed by the Pauli-X, Pauli-Y, and Pauli-Z operators, and we measure the state |Ψ_m⟩ in the X-basis, Y-basis, and Z-basis, respectively. Specifically, we define a set of operators with respect to one orthonormal basis for measurement, and each operator M is Hermitian. For one specific basis formed by states |b⟩, we conduct the measurement of the quantum state |Ψ_m⟩ in the computational basis. For each state of interest |b_i⟩, the operator M_i can be defined as |b_i⟩⟨b_i|. The expectation value E(M_i) according to (11) is the ith obtained feature.
In this way, the feature maps' information is converted from quantum states into classical data, which is then used for the final classification.
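The projector-based measurement can be sketched numerically: for M_i = |b_i⟩⟨b_i|, the expectation value equals the probability of observing |b_i⟩. The two-qubit state below is an arbitrary illustration rather than a feature map produced by the model, and measuring in the X-basis is modeled here by applying Hadamard gates before projecting onto the computational basis, which is a standard equivalence and not a detail taken from the article.

```python
import numpy as np


def expectation(state, operator):
    """Return <state|operator|state> as a real number (operator is Hermitian)."""
    state = np.asarray(state, dtype=complex)
    return float(np.real(np.vdot(state, operator @ state)))


def projector(index, dim):
    """Operator |b_i><b_i| for the computational basis state with this index."""
    op = np.zeros((dim, dim), dtype=complex)
    op[index, index] = 1.0
    return op


# Illustrative two-qubit state with known measurement probabilities.
state = np.sqrt(np.array([0.1, 0.2, 0.3, 0.4]))
features_z = np.array([expectation(state, projector(i, 4)) for i in range(4)])

# X-basis measurement: rotate every qubit with a Hadamard gate, then project.
hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
rotated = np.kron(hadamard, hadamard) @ state
features_x = np.array([expectation(rotated, projector(i, 4)) for i in range(4)])

print(features_z, features_x)
```

Each entry of `features_z` or `features_x` plays the role of one classical feature value that would be passed on to the dense layer.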

D. Dense Layer
With the classical output from the measurement layer, a classical dense layer is utilized for the final classification. The neurons in this layer are deeply connected, and each neuron encodes one expectation value. In this layer, a suitable activation function (e.g., softmax for multicategory classification and sigmoid for binary classification) is applied at the end to achieve the nonlinear transformation and output a probability distribution that indicates the classified category.
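A minimal sketch of this classical section is a single affine map followed by softmax. The feature values, the number of categories, and the randomly initialized weights below are placeholders chosen for illustration; in the model, the weights would be learned during training.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax that returns a probability distribution."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def dense_softmax(features, weights, bias):
    """Dense layer: affine transformation of the features followed by softmax."""
    return softmax(weights @ features + bias)


rng = np.random.default_rng(42)
num_features, num_categories = 8, 3             # e.g., two 2 x 2 feature maps
features = rng.random(num_features)             # stand-in for expectation values
weights = rng.normal(size=(num_categories, num_features))
bias = np.zeros(num_categories)

probabilities = dense_softmax(features, weights, bias)
predicted_category = int(np.argmax(probabilities))
print(probabilities, predicted_category)
```

The predicted category is the index with the largest probability; for binary classification, a single output with a sigmoid activation would take the place of softmax.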

E. Multicategory Classification
To deal with multicategory classification tasks with QML (e.g., classifying the image into g categories), there are generally two common strategies. One is to train g binary classifiers, and the structure is shown in Fig. 7(a). However, this method will inevitably increase the trained parameters and training time significantly. The other method, illustrated in Fig. 7(b), employs g qubits in the model's last layer to indicate the image's category, which requires relatively fewer trainable parameters but increases the number of qubits needed for classification.
To improve the training efficiency and reduce the number of qubits, we propose to use a classical dense layer with a suitable activation function (e.g., softmax for multicategory classification) for the final prediction [see Fig. 7(c)]. Thus, the proposed model is a hybrid model containing a quantum section for the feature extraction and a classical section for the final classification output.

F. Training Process
The proposed model is based on PQCs. The rotation angles of the quantum gates in the quantum convolution layers are regarded as the trainable parameters, which will be optimized in the training process by classical algorithms (e.g., the Adam algorithm [58]), but the quantum state's evolution and measurement are conducted on a quantum computer. More precisely, the training process is composed of the following steps.
1) The training set of the classical images after the normalization process can be identified as $I = \{I_0, \ldots, I_n\}$, and each element indicates one classical training image. The categorical label $y_i$ for an arbitrary image $I_i$ will be transformed to $[y_i^0, y_i^1, \ldots, y_i^{g-1}]$ for a g-category classification problem using the one-hot encoding technique (note that the superscripts here represent indices instead of exponents).
2) For one input image, the model outputs a probability distribution, given as $f(I_i, \theta) = \tilde{y}_i$, where $\tilde{y}_i$ is a vector in $\mathbb{R}^g$ and $\theta$ denotes the trainable parameters in the model.
3) The cross-entropy loss function is used to compare the output against the label, $L(\tilde{y}_i, y_i)$, and the cost value will be averaged over each batch from the training dataset.
4) The trainable parameters $\theta$ will be optimized and updated during the backpropagation of gradients by applying the Adam algorithm [58]. As for gradient calculation for the trainable parameters in the quantum circuit, there are several techniques, for instance, the adjoint method [59].
This training process will be repeated until the parameters are optimized.
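Steps 1) to 3) can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the authors' implementation: the predictions are hypothetical, labels are indexed from 0, and a small constant `eps` is added inside the logarithm purely for numerical safety.

```python
import math

def one_hot(label, g):
    """Encode a category index as a length-g one-hot vector."""
    return [1.0 if j == label else 0.0 for j in range(g)]

def cross_entropy(pred, target, eps=1e-12):
    """Cross-entropy between a predicted distribution and a one-hot label."""
    return -sum(t * math.log(p + eps) for p, t in zip(pred, target))

def batch_loss(preds, labels, g):
    """Average the cross-entropy over one batch."""
    losses = [cross_entropy(p, one_hot(y, g)) for p, y in zip(preds, labels)]
    return sum(losses) / len(losses)

# Hypothetical batch of two images in a 3-category problem.
preds = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
labels = [0, 2]
loss = batch_loss(preds, labels, g=3)
print(loss)
```

For a uniform prediction over g categories the loss equals log(g), which is a convenient sanity check at the start of training.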
V. EXPERIMENTS
To evaluate the performance of our model on EO data classification tasks, we conducted experiments using multiple EO benchmarks and compared our model with different deep learning models.
Data Preparation: In this study, we experimented with five different EO datasets, i.e., Overhead-MNIST [22], So2Sat LCZ42 [23], PatternNet [24], RSI-CB256 [25], and NaSC-TG2 [26]. Due to the limited computation power of current quantum simulators, we reduced the size of the datasets in our experiments by only focusing on a subset of categories for each benchmark. Furthermore, we downscaled the labeled images in all the datasets to a size of 8 × 8 × 1 with different techniques: the Lanczos algorithm [60] and a convolutional autoencoder [61]. Note that the objective of our experiments was to assess our model's effectiveness. To avoid overpowering the autoencoders, we added Gaussian noise as the perturbation to the output of the autoencoders and evaluated our model's performance with the noisy input. The details about the data preparation process for our experiments are provided in the following, and the summarized information can be found in Table I.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

TABLE I
EXPERIMENTAL DATA FOR PERFORMANCE EVALUATION
Overhead-MNIST [22] contains overhead view images (28 × 28) of ten kinds of entities (e.g., "car," "ship," and "plane"). There are 8519 training images and 1065 testing images, which are all gray-scaled. In this study, we used all the 5098 images labeled in six categories ("car," "ship," "plane," "harbor," "helicopter," and "oil gas field") for training and 637 images from these categories in its test dataset for evaluation. From the training images, we randomly selected 800 samples (approximately 15%) to create a validation dataset, and the remaining samples were used for training. To downsize these labeled images, we used the Lanczos algorithm.
So2Sat LCZ42 [23] consists of 17 local climate zone (LCZ) labels of around half a million Sentinel-1 and Sentinel-2 image patches with the size of 32 × 32 in 42 cities. For this dataset, we used the intensity of the refined Lee-filtered VV channel from the Sentinel-1 data as the input. Then, we adopted the Lanczos algorithm to downsize the input patches. Regarding the categories, we focused on three LCZ labels ("compact middle-rise," "large low-rise," and "dense trees") and randomly selected around 5000 labeled patches in the cities of Berlin and Munich from these categories to build a balanced dataset. The sampled images were split into three sets: 70% for training, 15% for validation, and 15% for testing.
PatternNet [24] has high-resolution imagery covering 38 different classes, and there are around 800 samples of size 256 × 256 pixels in each class. For this dataset, we utilized a convolutional autoencoder to reduce the features of the input images to the size of 8 × 8. In the experiment, we focused on three classes ("coastal mansion," "parking lot," and "swimming pool") and used all the provided samples. For evaluation, we randomly separated these samples with a ratio of 70:15:15 as the training, validation, and test datasets.
RSI-CB256 [25] is a global-scale dataset, having more than 24 000 images in 35 categories. In our experiments, we set "dry farm," "mangrove," "residents," "snow mountain," and "storage room" as our target categories and randomly selected 800 RGB-image samples with a size of 256 × 256 for each category. The labeled images were downsized to 8 × 8 using a convolutional autoencoder and were also divided into three sets: 70% for training, 15% for validation, and 15% for testing, with no overlap in between.
NaSC-TG2 [26] is an EO benchmark dataset for natural scene classification, which has around 20 000 samples with the size of 128 × 128 in ten classes. For our experiments, we concentrated on three target classes ("forest," "residential," and "snowberg") and randomly selected 1000 samples with RGB channels for each class to build a balanced dataset. Then, we utilized a convolutional autoencoder to reduce features to 8 × 8 for our experiments. In the end, we randomly separated the prepared samples into the training, validation, and test datasets with the ratio 70%:15%:15%.
Model Preparation: In the experiments, we evaluated our model with two quantum convolution layers, each with two kernels for feature extraction. The main focus of our experiments is comparing our model and the classical CNN model since it serves as the classical counterpart of our approach. Specifically, we selected two CNN models as the competitors. The first CNN model with six kernels (CNN-6) has a similar model structure as our model, and each convolution layer applies two filters. However, compared with this CNN model, our model will extract more features in the measurement layer for classification when measuring the quantum feature maps on multiple bases even though only two kernels are applied in each quantum convolution layer (24 features for the QC-CNN model versus eight features for the CNN-6). Thus, the other CNN model with 14 kernels (CNN-14), which extracts the same number of features for final classification and has a similar number of trainable parameters as our model, was also considered for performance comparison.
In addition, we also included another three CNN-based models and two quantum models for evaluation.
With respect to the selected CNN-based models, Block CNN [62], ResNet [63], and DenseNet [64] have significantly deeper structures and more parameters than our model. To ensure a fair comparison and account for the small input image size of 8 × 8, we simplified these models' structures and applied fewer layers and kernels for image classification. Specifically, the Block CNN model comprises two blocks, a global average pooling layer, and a fully connected layer. Each block contains three convolutional layers with the same kernel size, and every layer has two filters in the first block and four filters in the second block. The ResNet model used in this study consists of two residual blocks, followed by a global average pooling layer and a fully connected layer that produces the final output. The convolutional layers in the residual blocks were configured with two and four filters, respectively, and the kernel size in the model was set to 2 × 2. The DenseNet architecture starts with one convolution layer and then follows two dense blocks separated by one transition layer. Each dense block includes two convolution layers, with two filters being utilized in each layer.
As for the quantum models, QCNN [54] and QNN [44] are also hybrid, but quantum computing plays different roles in classification in these models. We evaluated them with the same input of size 8 × 8. Table II summarizes the pipelines of these models in the experiments. Different from our model, the quantum algorithms in these models encode and process either a small number of high-level features or local patches with low-level features. Specifically, the QCNN [54] model uses a deep learning model to extract high-level features first, and the quantum algorithm encodes and transforms these features, followed by a classical dense layer for classification. QNN [44] uses a quantum circuit to extract local features from every patch of the image with the sliding window method. Then, a classical CNN model is used to perform the final classification. The different ways of using quantum computing in these models result in different requirements for quantum resources.

TABLE II
PIPELINES OF THE QC-CNN MODEL AND OTHER COMPARED QUANTUM MODELS
Quantum Simulation Settings: In our experiment, we used the TFQ platform [21] to develop and train our model. As one of the widely used frameworks for quantum deep learning, it enables the usage of several types of simulators. Interested readers may refer to the work [65] for a detailed comparison among different frameworks.
Regarding the quantum machine used in our study, we adopted a noiseless simulator from TFQ for training, which outputs the analytic results after quantum computing. The reason for selecting this simulator is that the goal of our experiments is to verify the validity of our quantum algorithm, and this noiseless simulator written in C++ is faster than other simulators on the platform. As for performing backpropagation, we applied the adjoint differentiator provided by TFQ. It is compatible with the analytic output and the adopted simulator. In addition, it computes the gradients faster than others. However, note that this differentiation technique cannot currently be easily realized on a real quantum machine.
Experimental Settings: To train the aforementioned models, unless otherwise stated, we set the epoch number as 500 and the batch size as 100, and the cross-entropy loss function was used. We trained our model using the Adam optimizer [58] with a learning rate of 0.03 and the competitors with their default learning rates. Each training was repeated three times, and we calculated the average classification accuracy of the trained models with the lowest validation loss value during the training process. The resulting values, along with their corresponding standard deviations, were used for comparison and discussion purposes.
In total, we carried out four different experiments to evaluate the model's performance: 1) analysis of general classification performance; 2) analysis of quantum gates and measurement operators; 3) analysis of the structure of the QC-CNN model; and 4) analysis of the noise effects on the QC-CNN model's performance. Specifically, the first experiment was designed to assess the classification performance of the QC-CNN model on different datasets, in which all the prepared datasets were used. The subsequent experiments aimed to investigate the properties of the quantum component in our model. To ensure that the powerful machine learning-based approaches for feature reduction, such as autoencoders, will not diminish the impact of the studied quantum properties, we only used the Overhead-MNIST and So2Sat LCZ42 datasets for these three experiments.

A. Analysis of General Classification Performance
We present the classification accuracy achieved by different models in Table III. The table demonstrates that our model outperforms both CNN models (CNN-6 and CNN-14) in terms of test accuracy for all five EO datasets. Furthermore, despite having a simpler structure and fewer trained parameters, our model achieves comparable classification performance to the competitor with the highest test accuracy among the compared CNN-based deep learning models and quantum models for every used dataset (with a performance difference of at most 0.013). In some cases (e.g., Overhead-MNIST), our model demonstrates superior performance among all competitors.
Moreover, as shown in Table III, the difference in classification accuracy between training, validation, and testing sets for our model is relatively small compared to other models, suggesting that our model has less overfitting and higher generalizability than others.
In addition, a one-sided Wilcoxon signed-rank test [66] was performed between our model and the competitors over five EO datasets. The null hypothesis states that the classification performance (the averaged test accuracy) of our model is equal to or worse than that of the competitor over five EO datasets, and the alternative hypothesis suggests that the average test accuracy of our model over five EO datasets is greater than that of the competitor. As shown in Table IV, the p-values comparing our model with two CNN models are below 0.05. This indicates the rejection of the null hypothesis, i.e., in favor of the alternative hypothesis, which suggests that our model outperforms its classical counterparts. As for other competitors with more complex structures and parameters, there is insufficient evidence to support the alternative hypothesis, and its implication aligns with the previous finding based on Table III. It is important to note that, in computer vision, it is not common to use the p-value to examine the significance of the improvement, as a tiny percentage of improvement in, e.g., image classification, would lead to unprecedented practical usage. This is also the situation for EO. Thus, although we carried out this experiment for the sake of completeness, the readers are recommended to make the assessment based on the actual application scenario.
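To make the test concrete, the following sketch computes an exact one-sided Wilcoxon signed-rank p-value by enumerating all sign assignments. It is an illustration only: the accuracy values are hypothetical (they are not taken from Table III), the function name is invented for this example, and whether the authors used the exact distribution or an approximation is not stated in the text.

```python
from itertools import product

def wilcoxon_one_sided(ours, theirs):
    """Exact one-sided test of H1: 'ours' tends to be greater than 'theirs'."""
    diffs = [a - b for a, b in zip(ours, theirs) if a != b]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 every sign pattern is equally likely; count those at least as extreme.
    hits = sum(
        1
        for signs in product((0, 1), repeat=len(diffs))
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, hits / 2 ** len(diffs)

# Hypothetical test accuracies on five datasets (illustration only).
ours = [0.90, 0.80, 0.95, 0.85, 0.92]
theirs = [0.88, 0.75, 0.94, 0.81, 0.89]
w_plus, p_value = wilcoxon_one_sided(ours, theirs)
print(w_plus, p_value)
```

With only five paired observations and no ties, the smallest attainable exact p-value is 1/32 ≈ 0.031, reached only when one model is better on every dataset; the next possible value is 2/32 = 0.0625. This is one reason to read the test alongside the raw accuracies rather than in isolation.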
To conclude, in comparison with the CNN model (our model's classical counterpart), our model could extract critical features with fewer kernels from the input image, and it can achieve better classification performance and higher generalizability. As for the classical deep learning model with a more complex structure (e.g., ResNet), our model can have a similar performance. However, there is no guarantee that it can always outperform these classical models. In addition, it is worth mentioning that the quantum simulator used in the experiments was noiseless, so the experiments were conducted in an ideal condition. The model's performance will be compromised when adopting a noisy device or simulator.

B. Analysis of Quantum Gates and Measurement Operators
There are various types of quantum gates and measurements that can be applied in our model, as introduced in Section IV, for example: 1) rotation gates in the quantum convolution layer (e.g., U3, RX, or RY) and 2) operators in the measurement layer (e.g., Pauli-X, Pauli-Y, or Pauli-Z). To evaluate the effects of different rotation gates and measurements on classification performance, several experiments were conducted. Specifically, we used all the samples from two categories ("Car" and "Plane" in the Overhead-MNIST; "Compact Middle-rise" and "Dense Trees" in the So2Sat LCZ42) in Table I to prepare two balanced datasets. Different types of quantum gates in the quantum convolution layers and the operators in the measurement layer have been tested, and the results are shown in Table V.
1) Rotation Gates in Quantum Convolution Layer: The U3 gate makes a single qubit rotate with three Euler angles, whereas the RX and RY gates let the qubit rotate around the X-axis and the Y-axis, respectively. Thus, using U3 gates can achieve more complex rotation. As shown in Table V, adopting U3 gates can generally achieve higher accuracy for image classification, but note that the model using U3 gates has a threefold increase in the number of trainable parameters compared with the model using RX or RY gates.
In addition, using RX or RY gates in the model can sometimes also achieve comparable performance as using U3 gates. For example, when adopting the X-measurement for the Overhead-MNIST data and the Z-measurement for the So2Sat LCZ42 data, the performance difference regarding classification accuracy when applying different gates in the quantum convolution layer is limited.
2) Operators in the Measurement Layer: In our experiments, we evaluated four measurement strategies, i.e., X-, Y-, Z-, and XYZ-measurement, to obtain the features from the quantum state by applying Pauli-X, Pauli-Y, and Pauli-Z operators. Specifically, for the first three strategies, we measure the quantum state in the X-basis, Y-basis, and Z-basis, respectively. As for the XYZ-measurement, we concatenate the values obtained based on the previous three strategies and use them together in the subsequent classical dense layer.
As can be seen in Table V, the experimental results indicate that the model applying the XYZ-measurement can generally outperform others, but this measurement also extracts more features for the successive dense layer compared with others.
Similarly, the model with the measurement on one basis can have comparable performance to the one using the XYZ-measurement in some cases. For instance, to classify the Overhead-MNIST dataset and the So2Sat LCZ42 dataset, the X-measurement can reach a similar performance to the XYZ-measurement regardless of the adopted rotation gates.
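The relation between the three rotation gates compared in this subsection can be checked numerically. The sketch below assumes the widely used matrix convention for U3(θ, φ, λ); the paper's own definition is given elsewhere and may differ by a phase convention. Under this assumption, RY and RX are special cases of U3 with one free angle, which is why U3 carries three times as many trainable parameters.

```python
import cmath
import math

def u3(theta, phi, lam):
    """Single-qubit U3 gate (assumed convention, see lead-in)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -cmath.exp(1j * lam) * s],
            [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c]]

def rx(theta):
    """Rotation around the X-axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(theta):
    """Rotation around the Y-axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

theta = 0.7  # arbitrary test angle
ry_is_u3 = close(u3(theta, 0.0, 0.0), ry(theta))
rx_is_u3 = close(u3(theta, -math.pi / 2, math.pi / 2), rx(theta))
print(ry_is_u3, rx_is_u3)
```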

C. Analysis of the Structure of the QC-CNN Model
To evaluate the impacts of the model's structure on the classification performance, we experimented with our model having a different number of kernels and quantum convolution layers using two datasets. For the Overhead-MNIST, we used all the samples from the categories "Car," "Ship," and "Plane" to build a balanced dataset. Regarding the So2Sat LCZ42, we used all the data introduced in Table I.

D. Analysis of the Noise Effects on the QC-CNN Model's Performance
To evaluate our model's ability to handle the noise, we tested our model's performance given different types of noise and compared it with its classical counterpart.
For the noise in the data, to avoid the influence of the procedure for dimension reduction, we added the Gaussian noise to the downscaled input images. As for the noise in the model, we involved the noise at the end of the circuit for measured qubits. Specifically, the noisy model will additionally add one gate from Pauli-X, Pauli-Y, and Pauli-Z gates with a certain error rate on each measured qubit. In our experiments, we considered the error rates of 0.01, 0.05, and 0.10.
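The effect of such a model-side error on a single measured qubit can be sketched analytically. The example below assumes that, when an error occurs, one of the three Pauli gates is chosen uniformly at random and that the qubit is read out in the Z-basis; both are assumptions made for illustration, and the starting expectation value is hypothetical. Pauli-X and Pauli-Y flip the sign of the Z expectation value, whereas Pauli-Z leaves it unchanged.

```python
def noisy_z_expectation(z, error_rate):
    """Average <Z> when a random Pauli gate is applied with probability error_rate."""
    # <Z> after each possible error gate: X and Y flip the sign, Z does not.
    after_error = {"X": -z, "Y": -z, "Z": z}
    mean_after_error = sum(after_error.values()) / 3  # uniform choice (assumption)
    return (1 - error_rate) * z + error_rate * mean_after_error

z_ideal = 0.8  # hypothetical noiseless expectation value
for rate in (0.01, 0.05, 0.10):  # error rates considered in the experiments
    print(rate, noisy_z_expectation(z_ideal, rate))
```

Under these assumptions the expectation value shrinks by the factor 1 − 4p/3 for error rate p, so the features keep their sign and relative ordering while their magnitude is reduced.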
Considering the time needed for simulating the noisy quantum model, we simplified the tasks. In this experiment, we focused on three-category classification tasks. For the Overhead-MNIST, we defined "Car," "Ship," and "Plane" as the target categories. As for the So2Sat LCZ42, we used all the categories in Table I. To train the models, we randomly selected 600 images from the training samples in these target categories for each dataset. To evaluate and compare the models' performance, we utilized all the validation and test data from the target categories listed in Table I. The models were trained with the epoch number and the batch size both set to 50.
The experimental results can be found in Table VII. As shown in the table, our model can have better performance compared with the CNN model when dealing with the noise in the data. Regarding the noisy model, the misclassification rate increases with the error rate, which is expected. However, the usage of the trainable classical dense layer increases the resilience of our model against the noise effects.

VI. DISCUSSION
The scalability and efficiency of the model also play an important role in the model's evaluation. The efficiency indicates the speed of the quantum algorithm for classification. As for the scalability analysis, the number of qubits needed to scale the model is discussed.

A. Network Efficiency Analysis
We analyzed our network's efficiency from two perspectives: the number of required quantum gates and the number of trainable parameters for classification. As an example for our analysis, we chose the QC-CNN model that has $m$ sequential quantum convolution layers, and each layer applies $2^k$ kernels with a size of 2 × 2 and a convolutional stride of 2. The efficiency comparison between our model and its classical counterpart for feature extraction can be found in Table VIII.
1) Number of Quantum Gates: The number of quantum gates used in a quantum algorithm is directly related to the number of operations required for computation because a quantum circuit is built up with gates, and each gate represents a specific operation. Our model's gate complexity is discussed individually for each type of layer in the model.
a) Encoding layer: To encode a gray-scale image of size $2^n \times 2^n$ using FRQI, $2n$ H gates on qLs are used to prepare the spatial information. For each pixel, a controlled RY gate with $2n$ controllers is required that can rotate a qubit with arbitrary degrees around the Y-axis based on the states of $2n$ qubits. Thus, in total, this layer requires $2^{2n}$ RY gates controlled by $2n$ qubits and $2n$ H gates for input image encoding.
To reduce the number of gates applied in this layer, several techniques, such as those described in [67] and [68], can be applied. However, it is important to note that this study does not aim to address the issue of encoding images more efficiently, as it falls outside the scope of this work.
b) Quantum convolution layer: The convolutional computation also relies on the rotation gate with multiple controllers. In our model, we apply the U3 gate controlled by $k + 3$ qubits in these layers regardless of the size of the input image, and these controllers are 2 qLs, $k$ qKs, and 1 qR (or the qC for the first convolution layer). In the end, there are $4m \times 2^k$ U3 gates with $k + 3$ controllers. To prepare $2^k$ kernels, $k$ H gates for qK are also required.
c) Measurement layer: To obtain the expectation values in the measurement layer, we have to run the quantum circuit multiple times. After the quantum convolution layers, given the generated feature maps with $2^{2(n-m)+k}$ features, we need $O(2^{2n-2m+k+1})$ runs of the circuit to obtain the required feature values from the quantum state.
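The encoding-layer count in a) can be illustrated with a short sketch of an FRQI-style encoding. It assumes the standard FRQI form, in which each pixel contributes the amplitudes cos(θ) and sin(θ) scaled by 1/2^n, and it assumes the mapping θ = pixel · π/2 for pixel values normalized to [0, 1]; the exact pixel-to-angle conversion used by the authors is not given in this section. The image values and function names are hypothetical.

```python
import math

def frqi_amplitudes(pixels):
    """Return (cos, sin) amplitude pairs for a flat list of 2^(2n) normalized pixels."""
    scale = 1 / math.sqrt(len(pixels))  # equals 1 / 2^n for a 2^n x 2^n image
    amps = []
    for p in pixels:
        theta = p * math.pi / 2  # assumed pixel-to-angle mapping
        amps.append((scale * math.cos(theta), scale * math.sin(theta)))
    return amps

def frqi_gate_counts(n):
    """Gate counts stated for the encoding layer of a 2^n x 2^n gray-scale image."""
    return {"H": 2 * n, "controlled_RY": 2 ** (2 * n)}

image = [0.0, 0.25, 0.5, 1.0]  # hypothetical 2 x 2 image, i.e., n = 1
amps = frqi_amplitudes(image)
norm = sum(a * a + b * b for a, b in amps)
counts = frqi_gate_counts(1)
print(norm, counts)
```

The number of controlled rotations grows with the number of pixels, which is why the encoding layer is treated separately from the convolution layers in the comparison that follows.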
When comparing the number of operations between our model and classical CNN models, we would like to focus on the convolution layers that are the key components responsible for feature extraction in both models. However, it is worth mentioning that, if we consider the encoding layer and the measurement layer, the overall efficiency of our model will be compromised. Still, as introduced before, there are possibilities to speed up the encoding process, and it is beyond the scope of this work. Thus, we mainly discuss and compare convolutional computation in different models.
As studied in [69], a classical convolution layer with the same settings as our QC-CNN model requires $O(2^{k+2n})$ operations to process a gray-scale image of size $2^n \times 2^n$. In contrast, our quantum convolutional layer requires only $2^{k+2}$ U3 gates with $k + 3$ controllers, regardless of the image's spatial size. This suggests that our quantum convolution layer can speed up the convolutional operation for feature extraction, particularly when analyzing large remote sensing images.
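The scaling argument can be made tangible with a small calculation. The sketch below simply evaluates the two expressions quoted above for one convolution layer; the classical figure is an order-of-magnitude count with constants ignored, and the function name is illustrative only.

```python
def conv_layer_cost(n, k):
    """Costs of one layer with 2^k kernels (2 x 2, stride 2) on a 2^n x 2^n image."""
    classical_ops = 2 ** (k + 2 * n)   # O(2^(k+2n)) classical operations
    quantum_u3_gates = 2 ** (k + 2)    # multi-controlled U3 gates, independent of n
    return classical_ops, quantum_u3_gates

# Two kernels (k = 1) on images of growing size.
for n in (3, 5, 8):
    classical, quantum = conv_layer_cost(n, k=1)
    print(f"{2 ** n} x {2 ** n} image: classical ~{classical}, quantum {quantum} U3 gates")
```

For an 8 × 8 image with two kernels, the classical count of 128 matches a direct tally (16 window positions × 4 multiplications × 2 kernels), while the quantum layer uses 8 controlled U3 gates. Note that each of these gates has k + 3 controllers, so the comparison counts gates, not elementary hardware operations.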
2) Number of Trainable Parameters: Table VIII compares the number of trainable parameters of a convolution layer with $2^k$ kernels of size 2 × 2 to process gray-scale images in our QC-CNN model and a CNN model.
As shown in the table, the number of trainable parameters in the quantum convolution layer depends on the utilized quantum gate along with the number of applied kernels.Applying controlled U3 gates can manage more complex rotations, but it also has a threefold increase in the number of trainable parameters compared with using RX or RY gates.
Compared to classical CNN models, the use of U3 gates requires more trainable parameters for a given number of kernels, as shown in Table VIII. However, the experimental results in Table III indicate that our model with only four kernels can outperform the CNN-14 model with 14 kernels, suggesting that the kernels in our model are more efficient than those in classical models. As a result, our model still requires fewer parameters than CNN-14 despite the adoption of U3 gates.
Furthermore, note that incorporating RX or RY gates in our model can sometimes lead to comparable performance as U3 gates, as evidenced by the experimental results presented in Table V. As such, it is possible to further reduce the number of trainable parameters in our model without compromising classification performance. This implies that our model has the potential to extract valuable features with significantly fewer parameters, highlighting its advantage in training efficiency.

B. Scalability Analysis
The requirement of qubit resources for QML models is one of the essential criteria for quantum computing, especially in the NISQ era.Thus, it is important to analyze the qubits needed in the proposed model for the classication task.
Table IX summarizes the number of qubits for each layer of the proposed model to classify gray-scale images. Concerning color images, three qubits are needed to encode the spectral information with the MCQI method [70]. Thus, the number of qubits for the model containing $2^k$ kernels and $m$ successive quantum convolution layers is $2n + k + m + 1$ for the gray-scale images and $2n + k + m + 3$ for the color images. As shown in Table IX, for a gray-scale image with the size of $N \times N$, our model only requires $2\log(N) + 1$ qubits. To prepare $K$ kernels for the quantum convolution layers, $\log(K)$ qubits are sufficient. Thus, the proposed model achieves advantages in terms of information encoding.
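The qubit formulas above can be evaluated directly. The helper below is a sketch with a hypothetical name; it assumes base-2 logarithms and power-of-two image sides and kernel counts. The resulting numbers follow from the formulas and are not figures reported by the authors.

```python
def qubits_needed(image_side, kernels, layers, color=False):
    """Total qubits: 2n + k + m + 1 (gray-scale) or 2n + k + m + 3 (color)."""
    n = image_side.bit_length() - 1  # n = log2(N)
    k = kernels.bit_length() - 1     # k = log2(K)
    if 2 ** n != image_side or 2 ** k != kernels:
        raise ValueError("image_side and kernels must be powers of two")
    return 2 * n + k + layers + (3 if color else 1)

# Setting resembling the experiments: 8 x 8 input, two kernels, two layers.
print(qubits_needed(8, 2, 2))               # gray-scale
print(qubits_needed(8, 2, 2, color=True))   # color
print(qubits_needed(256, 2, 2))             # larger image, same model
```

Because the image side enters only through its logarithm, moving from an 8 × 8 to a 256 × 256 gray-scale input adds ten qubits in this formula, from 10 to 20.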

VII. CONCLUSION AND FUTURE WORK
In this article, a new hybrid QC-CNN is proposed to classify remote sensing images into multiple categories, which can accelerate feature extraction in the quantum domain for classification and achieve better performance than its classical CNN counterpart. Exclusively applying amplitude encoding in our model significantly reduces the requirement on quantum bit resources. In addition, we investigated the impacts of quantum gates, measurements, the model structure, and the noise effects on our model's classification performance. More importantly, our experimental results demonstrate a proof of concept for applying CNN in the quantum domain for image classification. Furthermore, evaluating our approach using simulators on EO benchmarks has provided us with the opportunity to explore the potential of using QNNs for EO data comprehension within the current limitations of quantum machines.
Due to the QC-CNN model's acceleration in the computation procedure and its relatively low requirement on the number of qubits, the proposed model might provide a possibility to tackle the challenges in the remote sensing domain for image classication tasks when more advanced quantum machines are available in the future.
Regardless, future research could continue to explore the following directions: 1) investigate more suitable image encoding techniques for remote sensing images; 2) further study the properties of different gates and measurements for classication; and 3) explore the potential of quantum computing for different challenges in the EO domain, such as incomplete data [71] and noisy-labeled data [72].

Fig. 2. Quantum circuit for QC-CNN model: 1) white for qL qubits (spatial information encoding), gray for qC qubit (color information encoding), green for qK qubits (kernel index encoding), and yellow for qR qubits (feature map information encoding); 2) dot markers in the circuit highlight the involved qubits in the applied quantum gates or the measured qubits in the specific layers; 3) the model contains $m$ convolution layers and each layer involves $2^k$ kernels; and 4) $|\psi_i\rangle$ and $|\psi_m\rangle$ are the outputs from the encoding layer and quantum convolution layers, respectively.

Fig. 3. Circuit example of the encoding layer for one 2 × 2 image: 1) dot markers indicate the controlled state: white dots for |0⟩ and black dots for |1⟩; 2) H represents the Hadamard gate and RY(θ) represents the RY gate with θ degrees; and 3) θ is the converted pixel value, and the subscripts here distinguish pixels in the image.

Fig. 4. Illustration of the convolutional computation with a 2 × 2 kernel procedure: 1) encoding a 4 × 4-sized input image's spatial information needs four qLs, $|x_1 x_0\rangle$ and $|y_1 y_0\rangle$ for the vertical and horizontal dimensions, respectively; 2) color indicates weight variables (i.e., $W_0$ to $W_3$) of the kernel; 3) the quantum states $|x_1 x_0 y_1 y_0\rangle$ with the specific $|x_0\rangle$ and $|y_0\rangle$ compute with the equivalent weight (the same color) for the elementwise product; and 4) the spatial information of the quantum feature map can be represented by $|x_1 y_1\rangle$, and the values of the feature map are represented by $f_0$ to $f_3$.

Fig. 5. Circuit example of the quantum convolution layer: 1) white for qLs, gray for qC, green for qK, and yellow for qR; 2) dot markers in the circuit indicate the controlled state: white dots for |0⟩ and black dots for |1⟩; and 3) H represents the Hadamard gate, and U3 represents the U3 gate.

Fig. 6. Circuit example of the measurement layer: 1) white for qLs, gray for qC, green for qK, and yellow for qR; 2) one convolution layer with two kernels is applied on a 4 × 4 input image; and 3) two feature maps are generated: $|x_1 y_1\rangle$ encodes the location information, $|r\rangle$ encodes the values of the feature map, and $|k\rangle$ shows the index of the created feature maps.

Fig. 7. Strategies for QML to deal with g-category classification tasks, including (a) using multiple binary classifiers, (b) using multiple qubits in the last layer to indicate the prediction, and (c) using a classical dense layer in the end for prediction: 1) input: classical gray-scale input image; 2) encoding: encoding layer; 3) PQC: quantum convolution layer with trainable parameters; 4) measure: measurement layer; 5) dense: classical dense layer with softmax activation function; and 6) output: ID vector for prediction.
$|l_{x,y}\rangle$: Quantum state of the $(x, y)$ coordinate in the input; $|l_x\rangle$ and $|l_y\rangle$ are for the coordinates $x$ and $y$, respectively, which can be written as $|x_{n-1}, \ldots, x_0\rangle$ and $|y_{n-1}, \ldots, y_0\rangle$.
$|c_{x,y}\rangle$: Pixel value at the coordinate $(x, y)$; $|0\rangle_C$ and $|1\rangle_C$ are the basis states of qC.
$|r_{x,y}\rangle$: Weight of the kernel for the pixel at the coordinate $(x, y)$; $|0\rangle_R$ and $|1\rangle_R$ are the basis states of qR.
As reported in Table IV, the p-values comparing our model with two CNN models are below 0.05. This indicates the rejection of the null hypothesis, i.e., in favor of the alternative hypothesis.

TABLE III
CLASSIFICATION PERFORMANCE COMPARISON BETWEEN THE PROPOSED MODEL AND OTHERS

TABLE V
QC-CNN'S ACCURACY WITH DIFFERENT QUANTUM GATES IN THE CONVOLUTION LAYERS AND OPERATORS IN THE MEASUREMENT LAYER

TABLE VI
CLASSIFICATION PERFORMANCE COMPARISON OF OUR MODEL WITH DIFFERENT STRUCTURES

TABLE VII
CLASSIFICATION PERFORMANCE COMPARISON WITH THE NOISE

According to the results shown in Table VI, the model's structure can influence its classification performance, as is the case for the classical CNN. When applying a suitable number of kernels and convolution layers for feature extraction, our model's classification accuracy can be improved.

TABLE IX
DEMANDED QUBITS IN THE QC-CNN MODEL FOR GRAY-SCALE IMAGES

Our model thus has the potential to extract valuable features with significantly fewer parameters, highlighting its advantage in training efficiency.