On Circuit-based Hybrid Quantum Neural Networks for Remote Sensing Imagery Classification

This article aims to investigate how circuit-based hybrid Quantum Convolutional Neural Networks (QCNNs) can be successfully employed as image classifiers in the context of remote sensing. The hybrid QCNNs enrich the classical architecture of CNNs by introducing a quantum layer within a standard neural network. The novel QCNN proposed in this work is applied to Land Use and Land Cover (LULC) classification, chosen as an Earth Observation (EO) use case, and tested on the EuroSAT dataset used as the reference benchmark. The results of the multiclass classification demonstrate the effectiveness of the presented approach, showing that the QCNN outperforms its classical counterparts. Moreover, investigation of various quantum circuits shows that the ones exploiting quantum entanglement achieve the best classification scores. This study underlines the potential of applying quantum computing to an EO case study and provides the theoretical and experimental background for future investigations.


I. INTRODUCTION
EARTH Observation (EO) has consistently leveraged technological and computational advances that help to develop novel techniques to characterize and model the human environment [1], [2], [3]. Many remote sensing missions are currently operational, carrying on board multispectral, hyperspectral, and radar sensors, and the capabilities for transmitting and storing a continuously increasing number of images, nowadays estimated at over 150 terabytes per day [4], have improved. As a result, the amount of data from EO applications has reached such impressive volumes that it is referred to as Big Data. At the same time, computational technologies and analysis methodologies have also progressed to accommodate larger and higher-resolution datasets. Image classification techniques are constantly being improved to keep up with the ever-expanding stream of Big Data, and as a consequence Artificial Intelligence (AI) techniques are becoming increasingly necessary tools [5], [6].
Given the need to expand the processing techniques that deal with this high-resolution Big Data, EO is now looking towards new and innovative computation technologies [7]. This is where Quantum Computing (QC) will play a fundamental role [8]. Today, there are a number of different quantum devices, such as programmable superconducting processors [9], quantum annealers [10], and photonic quantum computers [11]. However, QC still presents some technological limitations, as reported in [12], with special concern for noise and limited error correction. Specific algorithms, namely the Noisy Intermediate-Scale Quantum (NISQ) algorithms, have been designed to tackle these issues [13].
Quantum computers promise to efficiently solve important problems that are intractable on a conventional computer. For instance, in quantum systems, due to the exponentially growing physical dimensions, finding the eigenvalues of certain operators is one such intractable problem, which can be solved by combining a highly reconfigurable photonic quantum processor with a conventional computer [14], [15].
Another example is the Variational Quantum Eigensolver (VQE) algorithm, used to solve optimization problems such as finding the ground state energy of a molecule. The algorithm finds an upper bound to the lowest eigenenergy of a given Hamiltonian [15]. The cost function is the expectation value of the molecular Hamiltonian in a given prepared trial state. The goal of the VQE is to minimize this cost function by varying the parameters θ used to prepare the ansatz state, which is often representative of a molecule. This hybrid algorithm prepares the state and evaluates its energy through quantum circuits, and then varies the parameters classically. By iterating through these classical variations and quantum calculations, a hybrid minimization process is established [14]. This iterative search for minima is analogous to gradient descent.
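The hybrid loop described above can be illustrated with a minimal sketch, simulated classically with NumPy. The toy single-qubit Hamiltonian H = Z + 0.5 X, the one-parameter R_y ansatz, the learning rate and the step count are all assumptions chosen for illustration and are not taken from [14], [15]. The energy evaluation stands in for the quantum step, while the parameter update is the classical step.

```python
import numpy as np

# Toy Hamiltonian (assumption for illustration): H = Z + 0.5 * X
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
H = Z + 0.5 * X

def ansatz(theta):
    """Prepare the trial state |psi(theta)> = R_y(theta)|0>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """Cost function <psi|H|psi>; on hardware this would be estimated by measurement."""
    psi = ansatz(theta)
    return float(psi @ H @ psi)

def vqe(theta=0.1, lr=0.2, steps=200):
    """Classical outer loop: gradient descent on the ansatz parameter."""
    for _ in range(steps):
        # Parameter-shift rule, which is exact for this R_y ansatz
        grad = 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))
        theta -= lr * grad
    return theta, energy(theta)

theta_opt, e_min = vqe()
exact = np.linalg.eigvalsh(H)[0]   # reference value from exact diagonalization
print(f"VQE energy: {e_min:.6f}, exact ground energy: {exact:.6f}")
```

In this toy case the ansatz can reach the true ground state, so the variational upper bound coincides with the exact lowest eigenvalue; in general it only bounds it from above.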
In QC a qubit, or quantum bit, is the basic unit of quantum information, i.e. the quantum version of the classical binary bit. A qubit is one of the simplest quantum systems displaying the peculiarities of quantum mechanics. Indeed, it is a two-state quantum-mechanical system, e.g. an electron in two possible levels (spin up and spin down), or a single photon in one of two possible states (vertical and horizontal polarization). While in a classical system a bit can only be in one state or the other, a qubit can exist in a coherent superposition of both states simultaneously, a property that is fundamental to quantum mechanics. Quantum computers utilize the principles of superposition and entanglement to streamline computation [16], [17], [18]. The state of n qubits is described by 2^n complex amplitudes, one for each basis state. This is an exponential increase with respect to a classical register of n bits, which holds only one of its 2^n configurations at a time. Moreover, quantum systems exist in a high-dimensional space, known as a Hilbert space, whose inherent properties lend themselves to complex linear optimization.
The application of quantum technology to remote sensing has been considered for at least the last 20 years. In [19], an active imaging information transmission technology for satellite-borne quantum remote sensing is proposed, providing solutions and a technical basis for realizing active imaging technology relying on quantum mechanics principles. Another application discussed in the literature is related to interferometric synthetic aperture radars [20], [21]. In the first work, Otgonbaatar and Datcu describe a residue connection problem in the phase unwrapping procedure as a quadratic unconstrained binary optimization problem, which is solved by using the D-Wave quantum annealer. The same authors in [21] present a quantum annealer application for subset feature selection and the classification of hyperspectral images.
The research presented in this article focuses on the possibility of using quantum computers to enhance the performance of Machine Learning (ML) algorithms when applied to Land Use and Land Cover (LULC) classification, chosen as an EO use case. The results of the novel multiclass QCNN classifier demonstrate the effectiveness of the proposed approach, which achieves better results than standard models of comparable complexity and results on par with the best standard models of the state of the art.
It is worth highlighting that only very few works in the current state of the art have addressed the application of Quantum Machine Learning (QML) to remote sensing. For instance, quantum computers and convolutional neural networks (CNNs) are considered together for accelerating geospatial data processing in [22], where quanvolutional layers [23] are used. These layers contain several quanvolutional filters that transform the input data into different output feature maps by using a number of random quantum circuits, in a way analogous to standard convolutional networks. Quantum circuit-based neural network classifiers for multi-spectral land cover classification have been introduced in preliminary proof-of-concept applications as presented in [24], and an ensemble of support vector machines running on the D-Wave quantum annealer has been proposed for remote sensing image classification in [25]. In our preliminary work [26], hybrid quantum-classical neural networks for remote sensing applications are discussed, and a proof-of-concept for binary classification, using multispectral optical data, is reported. Finally, Otgonbaatar et al. [27] proposed a binary classifier based on a very deep convolutional network and a 17-qubit quantum circuit.
In this manuscript, different circuit-based hybrid quantum convolutional neural networks (QCNNs) are discussed, and a remote sensing image classification use case is considered that goes beyond both the simple binary classification presented in [26] and the more complex one presented in [27]. Namely, hybrid networks based on both classical and quantum computing are used, and the performance obtained with different quantum circuits is compared on the classification of remote sensing images.
The main contributions of this work are as follows:
• QC is applied to land-cover classification on the reference benchmark EuroSAT dataset [28] for optical multispectral images, thus going further than initial proofs-of-concept on a few images [24], [25].
• QCNN multiclass classification is tackled, as opposed to the simple binary classification already discussed in [26], and better results are obtained through the quantum-based networks with respect to their fully-classical counterparts.
• A comparative and critical analysis is carried out to assess the performance of different gate-based circuits for hybrid QCNNs, showing the advantages of the architecture with entanglement.
• A structured prediction setting with coarse-to-fine classification has been implemented to further challenge the capacities brought by entanglement.
Moreover, it is worth highlighting that each proposed model has been designed and implemented from scratch. This process also involved adapting the classical and quantum networks to fit the requirements imposed by the dataset used.
It is also worth mentioning that this paper can represent a useful tool for machine learning and remote sensing scientists interested in how quantum circuits and their parameters work when applied to practical EO problems, since it describes the mathematical and physical elements necessary for understanding the quantum approach. The paper is organized as follows. In Sec. II an overview of LULC classification in the field of remote sensing is given, highlighting the main issues and difficulties in LULC tasks for remote sensing interpretation. In Sec. III the applications of machine learning in the domain of QC are introduced, and in Sec. IV the mathematical and physical background to QC is provided. The proposed methodology and the hybrid QCNNs are presented in Sec. V, while the results are reported in Sec. VI. Concluding remarks are given in Sec. VII.

II. LAND USE LAND COVER CLASSIFICATION OVERVIEW
LULC classification using remote-sensing imagery has played an important role in sustaining, monitoring and planning the usage of natural resources for years. LULC classification has reached a crucial scope in the management of land use, the agricultural sector, forest areas and biological resources [29], and it has a direct impact on atmosphere, soil erosion and water, while it is indirectly connected to global environmental problems [30], by helping to deliver up-to-date and large-scale information on surface conditions. A general overview of supervised object-based land-cover image classification techniques is reported in [31], whereas a more comprehensive and recent review of challenges and state-of-the-art techniques for LULC classification is provided by Talukdar et al. [32].
For years, classical techniques mainly based on pixel or object analysis in terms of reflectance or local texture have been used for LULC classification [33], [34]. Yet, they have shown several limitations, since they are strongly affected by data acquisition issues (such as cloud cover, regional fog, and adaptation to new sensors) and by environmental changes, which make it difficult to design a generic classifier suitable for every object or land class everywhere in the world.
Several new methodologies have been developed by researchers to address those issues by building on more robust statistical models, in particular the well-known Deep Learning (DL). Two trends have emerged: object-based image analysis (OBIA) or patch-wise classification, and dense pixel-wise classification.
Generally, patch-wise approaches focus on local neighborhoods which correspond to semantically meaningful objects to build the classifiers. The task is to assign a label to a patch which corresponds to a small region of a complete aerial or satellite image, as in the popular EuroSAT [35] or BigEarthNet [36] benchmarks. Dedicated OBIA methods can then be applied, which look for relevant object borders for example, such as the DOTA baseline, which is based on a Region-CNN [37].
On the contrary, pixel-wise approaches follow the historical remote sensing way of modeling local appearance statistics. In the last decade, the use of (Fully-)Convolutional Networks (FCNs) has proved to be extremely efficient by relying on very large models able to capture the diversity of possible inputs, and thus for a large variety of LULC classes: CNNs and random fields [38], multi-modal multi-scale FCNs [39], ensembles of CNNs [40].
Finally, the new techniques adopted to deal with LULC problems also include strategies based on Capsule networks [41], recurrent networks [42], and Graph Convolutional Networks (GCNs) [43], which have been applied to hyperspectral imagery for instance, as well as Transformers, more recently applied to both patch-wise and pixel-wise classification [44], [45]. Building on this set of powerful tools, new challenges can now be addressed, which include explainable and interpretable classification [46], weakly-supervised classification [47], self-supervised classification, and semi-supervised classification [48].
After Deep Learning, which has proved to be a relevant tool for improving pre-existing classical models, the beginning of the era of quantum computing has brought new ideas to solve LULC classification problems, as new opportunities (the amount of data available) but also new issues (large-scale processing, variety of sensors, very high resolution) have appeared.

III. QUANTUM MACHINE LEARNING
As underlined above, the research presented in this article focuses on demonstrating how the use of quantum computers can help to enhance the performance of ML algorithms when applied to LULC classification.
In this section, a brief review of recent results and open research questions concerning QML is first reported. The benefits of QC for ML applications are explained, by highlighting the general advantages of QML and by presenting some applications. Finally, the open challenges of these approaches and existing systems are discussed.
The need for Quantum Computing. Given the premises of the Introduction concerning the disruptive potential of QC, and the issues discussed in the previous section on the difficulties in LULC tasks for remote sensing interpretation, it is relevant to note that QML has been a topic of growing interest for information science [49], [50], [51], [52] since the 1990s. As already anticipated, with the continuously increasing volume of data requiring classification-related processing tasks, computers have had to adapt to process these larger and more complex sets of information. This is why quantum solutions are gaining attention and being explored. Moreover, for ML applications, quantum computers may provide an added benefit since they can avoid getting stuck at local minima in gradient descent, by quantum tunneling through "hills" [53]. In practice, quantum computers may therefore reach a better solution than classical computers. Moreover, QC provides many other benefits for ML, such as fast linear algebra, quantum sampling, quantum optimization, and quantum artificial neural networks [54]. Despite the still unsolved limitations, quantum resources are expected to provide advantages for learning problems.
Advantages of Quantum Machine Learning. As briefly mentioned at the end of the previous subsection, there are several advantages in applying QC to ML, and some examples are found in the literature. In [55], for instance, the authors introduce and analyse the QCNN as a machine learning-inspired quantum circuit model, and demonstrate its ability to solve important classes of intrinsically quantum many-body problems. They consider two classes of problems where QC offers some advantages: 1) quantum phase recognition, which asks whether a given input quantum state belongs to a particular quantum phase of matter, and 2) quantum error correction (QEC) optimization, where an optimal QEC code is sought for a given, a priori unknown, error model, such as dephasing or potentially correlated depolarization in realistic experimental settings.
Currently, different quantum algorithms that could act as building blocks of ML programs have been developed, sometimes related to hardware and software challenges that are not yet completely solved [50]. Given that ML and AI can play fundamental roles in the quantum domain [52], the main benefits of QML, as already summarized in [56], are the following: 1) improvements in run-time, 2) learning capacity improvements, 3) learning efficiency improvements.
However, there is no shared consensus on how and when QML can be advantageous with respect to its classical counterpart on general classes of problems. For instance, in [57], it is shown how the quality and the amount of data can appreciably affect the performance of classical and QML models, in such a way that the quantum advantage is not always guaranteed. In this regard, this paper adds an important element of discussion with respect to the state of the art, by demonstrating how QML could help when dealing with real remote sensing images for a classification problem where multiple classes are used.
Quantum Machine Learning applications. Currently, several general methods for implementing quantum circuits into ML models can be found in the literature. For instance, in [58] image classification is performed via QML, while in [59] a quantum support vector machine is used for Big Data classification. In [23] quanvolutional neural networks are employed to carry out image recognition, while variational quantum circuits for inductive Grover oracularization are presented in [60]. Lithology interpretation from well logs is discussed in [61], and a quantum variational autoencoder is presented in [62]. Quantum Neural Networks (QNNs) are often presented as hybrid algorithms that leverage quantum nodes throughout the networks [63], [64], [65]. QNNs develop a network of both quantum and classical nodes with some given activation functions, convolutional connections, and weighted edges. Here, the quantum nodes can be represented by single qubits or clusters of qubits. QNNs can also present a more complexly integrated circuit with entanglement, where correlations between quantum nodes can be exploited to speed up computation.
Quantum Machine Learning challenges. Creating complex quantum networks which link together layers of quantum nodes still represents a research challenge. Despite the many possible theoretical applications of quantum computers, significant progress must still be made towards more reliable computation. The QC industry currently finds itself in the Noisy Intermediate-Scale Quantum (NISQ) era, where there is a limit to the number of operations that can be performed on a quantum computer before the information stored becomes useless [13]. Currently, these limitations contribute to the difficulties in scaling up quantum computers. However, the work in progress is not wasted since, as soon as scaling quantum computers becomes viable, they will be able to represent exponentially more information than classical ones. Fortunately, recent events show promising evidence for moving ahead and away from the NISQ era. In particular, by using QCNN models, researchers have been able to create an optimal QEC scheme for a given error model [55], and moreover, many QC companies are also projecting similar timelines for developing their architectures. Some companies are planning to release error-corrected and fault-tolerant commercial quantum computers by 2025 [66], [67].

IV. MATHEMATICAL BACKGROUND ON QC
In this section the basic notions of quantum computing are introduced. Further information can be retrieved in [17], [18].
Qubits are the fundamental units of information held in quantum computers. A physical qubit exists in a superposition of two states, $|0\rangle$ and $|1\rangle$, as shown in Fig. 1, which refers to a hydrogen atom with ground and excited states. The state $|\psi\rangle$ of the qubit describes the probability distribution of the state and is expressed as
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \qquad \alpha, \beta \in \mathbb{C}, \quad |\alpha|^2 + |\beta|^2 = 1. \tag{1}$$
Quantum measurement is an irreversible operation in which information is gained about the state of a single qubit, and superposition is lost. Mathematically speaking, in Eq. (1) $|\psi\rangle$ can be viewed as a vector in a Hilbert space (i.e., a vector space equipped with an inner product operation) where
$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2}$$
Figure 1: Qubit modeled as a hydrogen atom, with electron ground state $|0\rangle$ and first excited state $|1\rangle$.
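As a small numerical illustration of the single-qubit state and its measurement statistics, the sketch below builds a state from two amplitudes and samples simulated measurement outcomes. The amplitude values, the random seed and the number of shots are arbitrary choices made for this example.

```python
import numpy as np

# Computational basis vectors |0> and |1>
ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)

# Illustrative amplitudes (assumption): alpha = 1/sqrt(3), beta = i*sqrt(2/3)
alpha, beta = 1 / np.sqrt(3), 1j * np.sqrt(2 / 3)
psi = alpha * ket0 + beta * ket1

# Measurement probabilities: P(0) = |alpha|^2, P(1) = |beta|^2
probs = np.abs(psi) ** 2
assert np.isclose(probs.sum(), 1.0)   # normalization condition

# Each simulated measurement returns 0 or 1; the superposition itself is not observed
rng = np.random.default_rng(seed=0)
shots = rng.choice([0, 1], size=10_000, p=probs)
print("P(0), P(1):", probs)
print("empirical frequency of outcome 1:", shots.mean())
```

The empirical frequency approaches |β|² as the number of shots grows, which is how amplitudes are estimated in practice from repeated measurements.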
When considering a system of two qubits with states $\alpha_0|0\rangle + \alpha_1|1\rangle$ and $\beta_0|0\rangle + \beta_1|1\rangle$, the state evaluated by means of the tensor product is the superposition given by
$$|\psi\rangle = \alpha_0\beta_0|00\rangle + \alpha_0\beta_1|01\rangle + \alpha_1\beta_0|10\rangle + \alpha_1\beta_1|11\rangle, \tag{3}$$
where $\alpha_i, \beta_j \in \mathbb{C}$ and $\sum_{i,j}|\alpha_i\beta_j|^2 = 1$. The state $|00\rangle$, for instance, is given as $|0\rangle \otimes |0\rangle$, where $\otimes$ is the tensor product. It turns out that a general two-qubit state, i.e. a superposition of the four basis states with arbitrary coefficients, cannot always be factorized into single-qubit states as in Eq. (3). This phenomenon, known as entanglement, has an important consequence in the measurement process. Indeed, considering the Bell state
$$|\Phi^+\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right), \tag{4}$$
if the measurement of the first qubit returns the state $|0\rangle$ (with probability 0.5), then the entangled state collapses to $|00\rangle$. At this point, the second qubit is completely known, as it is in the state $|0\rangle$ as well. This result is true even when the two qubits are separated by a very large (theoretically infinite) distance, leading to the violation of the locality principle of classical mechanics. By using the Schmidt decomposition theorem, it can be shown that a quantum system can have different degrees of entanglement [68]. By exploiting superposition and entanglement, quantum computers can perform operations that are difficult to emulate on a large scale with classical computers, cutting down the computational time and power needed to process information. The qubit state in Eq. (1) can be expressed as a function of two angles $\vartheta$ and $\varphi$, i.e.
$$|\psi\rangle = \cos\frac{\vartheta}{2}|0\rangle + e^{i\varphi}\sin\frac{\vartheta}{2}|1\rangle, \tag{5}$$
and represented as a point sitting on the surface of a unit sphere in three dimensions, named the Bloch sphere, as shown in Fig. 2. With this notation, $\vartheta$ determines the probability of the qubit resulting in $|0\rangle$ or $|1\rangle$, and the angle $\varphi$ describes the relative phase of the qubit. Quantum gates, denoted by $U$ in the following, are basic quantum circuits operating on a small number of qubits. They are the building blocks of quantum circuits, like classical logic gates are for conventional digital circuits. Quantum gates are unitary operators, i.e. $U^\dagger U = U U^\dagger = I$, where the symbol $\dagger$ denotes the conjugate transpose, and $U$ is described as a unitary matrix relative to some basis. Important properties are that 1) $U$ preserves the inner product of the Hilbert space and 2) single-qubit gate operations can also be visualized as rotations of the quantum state vector in the Bloch sphere.
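The Bloch sphere parametrization discussed above can be checked numerically. The following sketch builds a state from the two angles and compares the expectation values of the Pauli operators with the Cartesian coordinates of the corresponding point on the unit sphere. The function names and the chosen angles are illustrative assumptions.

```python
import numpy as np

def bloch_state(theta, phi):
    """State cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def bloch_vector(psi):
    """Expectation values of the Pauli operators X, Y, Z in the state psi."""
    paulis = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    return np.array([np.real(np.vdot(psi, p @ psi)) for p in paulis])

theta, phi = np.pi / 3, np.pi / 4   # illustrative angles
psi = bloch_state(theta, phi)

# Point on the unit sphere with polar angle theta and azimuth phi
expected = np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])
print(bloch_vector(psi), expected)
print("P(0) =", abs(psi[0]) ** 2)   # equals cos^2(theta/2)
```

The polar angle alone fixes the measurement probabilities, while the azimuthal angle only changes the relative phase between the two basis states.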
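The standard quantum gates used in this paper are introduced hereafter:
• Hadamard gate, a single-qubit gate described by the matrix
$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{6}$$
Starting from the single-qubit state $|0\rangle$, the Hadamard gate returns the superposition of the two basis states, namely the so-called plus state $|+\rangle$, i.e.
$$H|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) = |+\rangle. \tag{7}$$
• Rotation gates, $R_x(\theta)$, $R_y(\theta)$, $R_z(\theta)$, i.e. single-qubit gates described by rotation matrices about the $\hat{x}$, $\hat{y}$, $\hat{z}$ axes of the Bloch sphere, respectively. The gate $R_y(\theta)$, which will be used in the following, takes the form
$$R_y(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}. \tag{8}$$
• CNOT gate, which is a two-qubit gate described by the matrix
$$\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \tag{9}$$
and represented in Fig. 3. When the inputs are the basis states $|0\rangle$ and $|1\rangle$, the CNOT gate transforms the state as
$$|a, b\rangle \mapsto |a, a \oplus b\rangle, \qquad a, b \in \{0, 1\},$$
i.e., it flips the second qubit (the target qubit) if and only if the first qubit (the control qubit) is $|1\rangle$.
The combination of Hadamard and CNOT gates is used to create an entangled Bell state as defined in Eq. (4). The corresponding circuit, shown in Fig. 4, is the basic building block of the quantum circuits investigated in this paper, as it introduces entanglement in the circuit, thereby enhancing the computational performance.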
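The gates listed above can be written as NumPy matrices to verify their unitarity and to reproduce the Hadamard plus CNOT construction of the Bell state. This is a minimal classical simulation, not the authors' released code; the variable names and the test angle 0.7 are illustrative. The Schmidt coefficients are obtained from the singular values of the reshaped state vector.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def ry(theta):
    """Rotation by theta about the y axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Every gate is unitary: U^dagger U = I
for gate in (H, ry(0.7), CNOT):
    assert np.allclose(gate.conj().T @ gate, np.eye(gate.shape[0]))

# Hadamard on the control qubit, then CNOT, applied to |00>
ket00 = np.array([1.0, 0.0, 0.0, 0.0])
bell = CNOT @ np.kron(H, I2) @ ket00
print(bell)   # amplitudes of |00>, |01>, |10>, |11>

# Schmidt coefficients: more than one non-zero value signals entanglement
schmidt = np.linalg.svd(bell.reshape(2, 2), compute_uv=False)
print(schmidt)
```

A product state would give a single non-zero Schmidt coefficient, whereas the Bell state gives two equal ones, which corresponds to maximal entanglement for two qubits.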

V. METHODOLOGY
In this section, a selected number of quantum circuits, investigated as potential quantum layers in the proposed hybrid network, are described. Firstly, the integration of the quantum part into the classical architecture is discussed, by presenting the "Data Embedding" operation and showing an example of the interface between classical and quantum layers. At the end of the section, the hybrid QCNN is presented, and the model optimization and inference are discussed. Although the quantum circuits presented in the following are commonly used in QC for data processing and are fundamental units of IBM Qiskit [69], [70], it is worth highlighting that all the code has been written from scratch by the authors and released as open access in a public repository [71].

A. Data Embedding
To create a hybrid QNN, a parametrized quantum circuit is typically used as a hidden layer of the neural network. Yet, unlike in classical network architectures, integrating the quantum part into the classical architecture requires building a higher-dimensional quantum representation of the classical data. In this section, a brief description of how to prepare a quantum state to this end is given.
A feature mapping is first run through a unitary operator applied to a set of N $|0\rangle$ quantum nodes as a method of encoding the classical information in the new N-qubit space.
Figure 5: Interface between classical and quantum layers.
A unitary matrix, needed to encode the information, must be classically derived before applying it to the quantum circuit. Its parameters are determined by the values of the preceding classical nodes at the point of insertion. This operation is referred to as data embedding, where the preceding classical activation is represented through the probability of measuring $|1\rangle$ in the quantum state. Different gate operations can be used to encode a quantum representation of classical information. For instance, Abbas et al. in [51] show how this can be done by first applying a Hadamard gate to put the qubits in a superposition state, and then by applying RZ-gate rotations to the qubits, with angles equal to the feature values of the preceding inputs. Yet, whichever gates are chosen, the interpretation of the prepared state must be self-consistent, which means that the encoding system is considered valid as long as the input operations and the output measurement accurately represent the classical information.
Following the classical encoding, the parametrized quantum circuit is then applied. A parametrized quantum circuit is a quantum circuit where the rotation angles for each gate are specified by the components of a classical input vector. The outputs from the neural network's previous layer are collected and used as the inputs for the parametrized circuit. The measurement statistics of the quantum circuit can then be collected and used as inputs for the following hidden layer. As a demonstrative example, the interface between classical and quantum layers is sketched in Figure 5.
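The classical-to-quantum interface described above can be sketched as follows, simulated with NumPy. The angle map (2 arcsin of the square root of the activation) is an assumption chosen here so that the probability of measuring |1> equals the classical activation; it is one possible self-consistent encoding, not necessarily the one used by the authors. The function names and the input values are hypothetical.

```python
import numpy as np

def embed(activations):
    """Map activations in [0, 1] to R_y angles so that P(|1>) equals the activation."""
    a = np.clip(np.asarray(activations, dtype=float), 0.0, 1.0)
    return 2.0 * np.arcsin(np.sqrt(a))

def ry(theta):
    """Rotation by theta about the y axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def quantum_layer(angles):
    """Apply R_y(angle) to each qubit prepared in |0>; return P(|1>) per qubit."""
    ket0 = np.array([1.0, 0.0])
    return np.array([abs((ry(t) @ ket0)[1]) ** 2 for t in angles])

classical_out = np.array([0.1, 0.5, 0.9, 0.25])   # hypothetical outputs of the previous layer
angles = embed(classical_out)                      # classical values become gate parameters
measured = quantum_layer(angles)                   # statistics passed to the next layer
print(measured)
```

On real hardware the probabilities would be estimated from repeated measurements rather than read from the state vector, so the values handed to the next classical layer would carry sampling noise.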

B. Selected Quantum Circuits for Image Classification
Three types of circuits, selected among the possible quantum circuits and to be used in the proposed hybrid QCNN, are presented. Their structure reflects the adopted implementation with 4 qubits, which represents a more complex architecture with respect to simpler ones where fewer qubits are used [26]. Far from being an exhaustive comparison of all possible quantum configurations, the description of the adopted circuits provides insight into how their gates can influence the final results and help to speed up certain computational processes. To better understand how the entangled qubits, introduced in Sec. IV, can affect the classification performance, it is necessary to clarify that the first circuit has no entanglement, whereas entanglement is introduced in the remaining ones through different gate connections.
No entanglement circuit. In the simple QCNN presented in [26], there is no entanglement and classical nodes are merely replaced by a parametrized quantum node [63]. As seen in Fig. 6, the qubits are first placed in superposition through the application of a Hadamard gate. Next, the quantum nodes undergo $R_y$ gate rotations by the angles $\theta$. This whole process is ultimately representative of a quantum node activation, which simply encodes the sum of the weighted activations from the preceding classical nodes that are mapped into the quantum nodes. If only one qubit is considered, the effect of the Hadamard and rotation gates on the qubit $|0\rangle$ is summarized as
$$R_y(\theta)H|0\rangle = R_y(\theta)|+\rangle = \frac{1}{\sqrt{2}}\left[\left(\cos\frac{\theta}{2} - \sin\frac{\theta}{2}\right)|0\rangle + \left(\sin\frac{\theta}{2} + \cos\frac{\theta}{2}\right)|1\rangle\right]. \tag{10}$$
The overall gate composed of 4 Hadamard and 4 rotation gates can be built by using matrix multiplication for successive gates and the tensor product for parallel gates, hence the final unitary transformation $U$ is
$$U = \bigotimes_{i=1}^{4} R_y(\theta_i)H. \tag{11}$$
The entire circuit returns the state
$$|\psi_{\mathrm{out}}\rangle = U|\psi_{\mathrm{in}}\rangle, \tag{12}$$
which, when considering $|0\rangle$ as inputs, is
$$|\psi_{\mathrm{out}}\rangle = \bigotimes_{i=1}^{4} \frac{1}{\sqrt{2}}\left[\left(\cos\frac{\theta_i}{2} - \sin\frac{\theta_i}{2}\right)|0\rangle + \left(\sin\frac{\theta_i}{2} + \cos\frac{\theta_i}{2}\right)|1\rangle\right]. \tag{13}$$
Bellman Circuit. The Bellman Circuit shown in Fig. 7 leverages a basic system of entanglement to encode classical information into a quantum space. Here the speedup may lie in the fact that the quantum states are first prepared through entanglement (by means of the Hadamard and CNOT gates), leading to correlations between the qubits. Following the entanglement process, the parametrization using angular rotations predefined by classical information once more translates the classical information into a quantum activation.
The qubits are first entangled through the application of a Hadamard gate and then sequential CNOT gates. Following this, the qubits are rotated about the y axis using parameters $\theta$. This is the basis of the activation process. Then the CNOT application process is reversed, but the superposition is never removed. The benefit of this process seems to lie in the variation of the encoding and rotation process, as it is now not just a projection of the classical information into a quantum space, but rather a transformation of this information that exploits the quantum feature space. Considering the four inputs as $|0\rangle$, before entering the rotation gates the state of the 4 qubits is given as
$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|0000\rangle + |1111\rangle\right). \tag{14}$$
The four rotation gates applied to the entangled state correspond to the application of the gate
$$R(\theta_1, \ldots, \theta_4) = R_y(\theta_1) \otimes R_y(\theta_2) \otimes R_y(\theta_3) \otimes R_y(\theta_4), \tag{15}$$
corresponding to a 16 × 16 matrix. Finally, the rotated entangled state passes through three more CNOT gates and then it is measured. Supposing the four rotations are identities (i.e., $\theta_i = 0$, $i = 1, \ldots, 4$), the effect of the three CNOT gates is
$$\frac{1}{\sqrt{2}}\left(|0000\rangle + |1111\rangle\right) \mapsto \frac{1}{\sqrt{2}}\left(|0000\rangle + |1000\rangle\right) = |+\rangle \otimes |000\rangle. \tag{16}$$
Real Amplitudes Circuit. As shown in Fig. 8, each qubit passes through a Hadamard gate and then undergoes a gate rotation with parameter $\theta$ (this value is derived from the result of the preceding classical node). This is the process by which the classical information is turned into quantum information. Then, the qubits are all mutually entangled using CNOT gates. For instance, considering identity rotations, i.e. $R_y(\theta_i) = I$, $i = 0, \ldots, 3$, the state before the CNOT gates is
$$|\Psi_1\rangle = |+\rangle^{\otimes 4} = \frac{1}{4}\sum_{x \in \{0,1\}^4} |x\rangle. \tag{17}$$
After the CNOT gates, one can easily verify that this example state is unchanged, i.e. $|\Psi_1\rangle = |\Psi_2\rangle$ (but in the general case it varies). Finally, the quantum parameters $\theta_i$, $i = 4, \ldots, 7$, are implemented by means of the final four rotations. By using Eq. (15), the final state is
$$|\Psi_3\rangle = \left[R_y(\theta_4) \otimes R_y(\theta_5) \otimes R_y(\theta_6) \otimes R_y(\theta_7)\right]|\Psi_2\rangle = \bigotimes_{i=4}^{7} R_y(\theta_i)|+\rangle. \tag{18}$$
During the validation and testing process, the second set of $\theta$ parameters is used as the "quantum weights" mapping to the following classical fully connected layer.
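The three circuits described in this subsection can be simulated on 4 qubits with a small state-vector sketch. The code below is an illustration under stated assumptions and is not the authors' released implementation: the CNOT gates are assumed to form a linear chain (qubit 0 to 1, 1 to 2, 2 to 3), the Bellman circuit is assumed to apply a single Hadamard gate to the first qubit, and all function names are hypothetical. Qubit 0 is the leftmost one in the ket notation.

```python
import numpy as np
from functools import reduce

N = 4
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def ry(theta):
    """Rotation by theta about the y axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def layer(gates):
    """Tensor product of single-qubit gates acting in parallel (qubit 0 leftmost)."""
    return reduce(np.kron, gates)

def cnot(control, target, n=N):
    """CNOT on n qubits, built as a permutation of the computational basis."""
    dim = 2 ** n
    u = np.zeros((dim, dim))
    for x in range(dim):
        bits = [(x >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control]:
            bits[target] ^= 1
        y = sum(b << (n - 1 - q) for q, b in enumerate(bits))
        u[y, x] = 1.0
    return u

ket0 = np.zeros(2 ** N)
ket0[0] = 1.0                                        # |0000>
chain = [cnot(q, q + 1) for q in range(N - 1)]       # assumed CNOT layout

def no_entanglement(thetas):
    """Hadamard followed by R_y on every qubit."""
    return layer([ry(t) @ H for t in thetas]) @ ket0

def bellman(thetas):
    """Entangle, rotate, then apply the CNOT sequence in reverse order."""
    state = layer([H] + [np.eye(2)] * (N - 1)) @ ket0
    for gate in chain:
        state = gate @ state
    state = layer([ry(t) for t in thetas]) @ state
    for gate in reversed(chain):
        state = gate @ state
    return state

def real_amplitudes(thetas_in, thetas_w):
    """Hadamard and data rotations, CNOT entanglement, then weight rotations."""
    state = layer([ry(t) @ H for t in thetas_in]) @ ket0
    for gate in chain:
        state = gate @ state
    return layer([ry(t) for t in thetas_w]) @ state

zeros = np.zeros(N)
print(np.round(bellman(zeros), 3))                  # non-zero amplitudes on |0000> and |1000>
print(np.round(real_amplitudes(zeros, zeros), 3))   # uniform amplitudes
```

With all rotation angles set to zero, the sketch reproduces the special cases discussed in the text: the Bellman circuit ends in a superposition on the first qubit only, and the uniform superposition of the Real Amplitudes circuit is left unchanged by the CNOT gates.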

C. Hybrid Quantum Neural Network Classifier
Unlike fully quantum AI models, the proposed QCNN classifier is based on recent hybrid QML models and consists of the combination of classical ML and quantum layers [53], [72]. This kind of paradigm [73], [74], mostly used in the computer vision domain, has been transferred and adapted to the remote sensing domain in this paper. Moreover, it is worth highlighting that hybrid solutions are the preferred ones in the current stage of QML, mostly due to technology bottlenecks and limitations [26], [27].
Fig. 9 shows the QCNN structure. The classical part consists of a CNN derived from LeNet-5 [75], in which both the number of convolutional layers and the input dimension were changed to fit the input image size. Moreover, with respect to the original LeNet-5 design, the proposed model contains only two fully connected layers, placed before and after the quantum layer. These two layers are used, respectively, to adapt the input size to the one required by the quantum layer and to map the quantum layer output size to the number of classes imposed by the chosen dataset. In other words, the purpose of these two classical layers is to embed the data from the image space into the capacity of the quantum layer and to make the coexistence of classical and quantum layers possible in the hybrid structure.
Regarding the quantum part, the quantum layer (blue box labeled Quantum Circuit in Fig. 9) aims to benefit from the properties of probabilistic quantum computing. This quantum layer is implemented with one of the circuits described in Sec. V. In the course of this study, several quantum circuits were tested and analyzed to investigate their potential.
For comparison purposes, two classical counterparts of the proposed QCNN classifier have been implemented and tested. In classical CNN classifier 1, the quantum circuit is replaced with a fully connected layer of 16 nodes, matching the quantum circuit output size. In classical CNN classifier 2, the quantum circuit is replaced with a multi-layer perceptron with fully connected layers of 256, 64, 32 and 10 nodes.
The experimental dataset under consideration is "EuroSAT: Land Use and Land Cover Classification with Sentinel-2", a dataset of Sentinel-2 satellite images covering 13 spectral bands and consisting of 10 classes, with a total of 27,000 labeled and geo-referenced images [35]. The dataset has been divided into training and validation sets with an 80-20 split. Sample images of the dataset are shown in Fig. 10.
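As a minimal sketch of the 80-20 division, the snippet below shuffles the 27,000 sample indices and cuts them into training and validation subsets. The text does not say whether the split is random, stratified by class, or seeded, so the shuffling strategy, the seed and the function name are assumptions made for illustration.

```python
import random


def train_val_split(n_samples, train_fraction=0.8, seed=0):
    """Return shuffled index lists for the training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for a repeatable split
    n_train = int(round(n_samples * train_fraction))
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = train_val_split(27_000)
print(len(train_idx), len(val_idx))  # 21600 5400
```

A stratified split (the same 80-20 ratio inside each of the 10 classes) would be a natural alternative if class balance in the validation set matters.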
The following sections report several experiments: 1) experiments on 3 different quantum circuits, 2) experiments on 2 classical deep learning models for comparison with the quantum counterpart, 3) experiments on a coarse quantum classifier and 3 fine-grain quantum classifiers, and 4) an additional experiment, involving the fine-grain classifier, to create a segmentation map.
As highlighted at the beginning of this section, all the proposed models were designed and implemented from scratch. This process also involved adapting the classical and quantum networks to fit the requirements imposed by the dataset used for the experimental analysis. No pre-trained weights were used, and the hyperparameters and loss settings were selected according to the problem requirements.

D. Training and testing
As stated before, both the training and testing procedures have been conducted, whenever possible, under the same hypotheses and with the same settings. All the qubits in Fig. 6, Fig. 7, and Fig. 8 are initialized in the state $|0\rangle$.
Each QCNN classifier, regardless of the circuit used, has been trained for 50 epochs using the Adam optimizer with a learning rate of 0.0002 and the cross-entropy as loss function. The two classical CNNs have been trained in the same way, but they took ∼100 epochs to converge.
The training procedure is summarized in Algorithm 1, which reports the fundamental steps of the process. As for any machine learning model trained with backpropagation, the training phase can be divided into two streams, feed-forward and backward. In the first stream, the input data pass through both the CNN and the Quantum Circuit, and the overall output is compared with the ground truth to calculate the error; in the backward stream, all the model's weights are updated according to the error and its gradient. The testing of the models has been conducted on the validation dataset according to the procedure summarized, for the sake of reproducibility, in Algorithm 2.
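The two training streams can be illustrated with a toy example. In the sketch below a small linear softmax classifier stands in for the full CNN plus quantum circuit, which is an assumption made purely to keep the example self-contained; only the optimizer (Adam), the learning rate (0.0002), the cross-entropy loss and the 50 epochs come from the text. The Adam coefficients are the commonly used defaults and are also assumed, as are all names and the toy data.

```python
import math

LEARNING_RATE = 0.0002  # value reported in the text
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8  # common Adam defaults (assumption)


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, x):
    """Feed-forward stream: class probabilities for one input vector."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def mean_loss(weights, data):
    """Mean cross-entropy between predictions and ground-truth labels."""
    return sum(-math.log(forward(weights, x)[y]) for x, y in data) / len(data)


def gradients(weights, data):
    """Backward stream: gradient of the mean cross-entropy w.r.t. the weights."""
    grads = [[0.0] * len(row) for row in weights]
    for x, y in data:
        probs = forward(weights, x)
        for k, p in enumerate(probs):
            err = p - (1.0 if k == y else 0.0)
            for j, xj in enumerate(x):
                grads[k][j] += err * xj / len(data)
    return grads


def train(data, n_classes, n_features, epochs):
    weights = [[0.0] * n_features for _ in range(n_classes)]
    m = [[0.0] * n_features for _ in range(n_classes)]  # first-moment estimate
    v = [[0.0] * n_features for _ in range(n_classes)]  # second-moment estimate
    history = [mean_loss(weights, data)]
    for step in range(1, epochs + 1):
        grads = gradients(weights, data)
        for k in range(n_classes):
            for j in range(n_features):
                g = grads[k][j]
                m[k][j] = BETA1 * m[k][j] + (1 - BETA1) * g
                v[k][j] = BETA2 * v[k][j] + (1 - BETA2) * g * g
                m_hat = m[k][j] / (1 - BETA1 ** step)  # bias correction
                v_hat = v[k][j] / (1 - BETA2 ** step)
                weights[k][j] -= LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPS)
        history.append(mean_loss(weights, data))
    return weights, history


# Toy two-class data set (made up for illustration only).
toy_data = [([1.0, 0.0], 0), ([1.0, 0.2], 0), ([0.0, 1.0], 1), ([0.1, 1.0], 1)]
weights, history = train(toy_data, n_classes=2, n_features=2, epochs=50)
```

In the hybrid model the same loop applies, with the gradient also propagated through the quantum layer; how that gradient is obtained depends on the quantum framework used and is not covered by this sketch.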

A. EuroSAT dataset classification
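In this section the results of all the proposed models are presented in the form of confusion matrices and tables with classification reports, showing accuracy, precision, recall and F1 score, as defined by equations (16):

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, & \text{Precision} &= \frac{TP}{TP + FP}, \\
\text{Recall} &= \frac{TP}{TP + FN}, & \text{F1} &= 2\,\frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},
\end{aligned} \tag{16}
$$

where $TP$, $TN$, $FP$ and $FN$ are the numbers of True Positive, True Negative, False Positive and False Negative cases, respectively.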
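For a multiclass problem, the per-class scores are usually obtained by treating each class in turn as the positive one (one-vs-rest) and reading TP, FP and FN off the confusion matrix. The sketch below follows that convention, which is an assumption about how the reports were produced; the overall accuracy is taken as the fraction of correctly classified samples. The toy matrix contains made-up numbers and is not a result from this work.

```python
def classification_report(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp  # true class k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp  # other classes predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    return accuracy, per_class


# Toy 3-class confusion matrix (illustrative values only).
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
accuracy, per_class = classification_report(toy_confusion)
```

Returning zero when a denominator vanishes is a design choice for classes that are never predicted, a situation the text reports for the Highway class with the no-entanglement circuit.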
Table I reports the F1 scores for each class, together with the overall accuracy, computed for the three proposed quantum classifiers and the two classical counterparts. Table II and Table III report the precision and recall for each class and for each of the models mentioned above.
The most evident difference among the quantum-based models is the higher performance obtained when circuits with entanglement are used, thanks to their increased computational capabilities. Both entangled circuits also performed better than the two classical counterparts. Among the circuits with entanglement, the Real Amplitudes Circuit reaches the best overall accuracy of 92%, a +10% gain over the second-best approach. In more detail, the model using the no-entanglement circuit fails to recover the Highway class, one of the classes on which all the analyzed classifiers encountered the greatest difficulties. This result highlights that the choice of the quantum circuit is linked not only to the type of application but also to the complexity of the data being used. In fact, this circuit has been successfully applied to digit image classification [70], but its effectiveness is poor on more complex remote sensing images.
Fig. 11 shows the confusion matrices for each model. The Real Amplitudes Circuit-based QCNN shows the best confusion matrix, with nearly perfect scores on the diagonal. It surpasses the performance of all the other quantum-based models and of the classical models, all of which encounter difficulties for specific classes.

B. Coarse-to-fine structured land-cover classification
Classification results shown in Section VI-A, and especially Table I, demonstrate the ability of our hybrid classical-quantum network to perform multi-class EO classification. Even if some state-of-the-art classical networks achieve better performance (as in Helber et al., JSTARS 2019 [35]), it is worth highlighting that the proposed quantum models are far less complex and have very few parameters, as shown in Table IX. Moreover, to further challenge the capacity of our hybrid approach to learn with a limited number of parameters, we propose a structured prediction setting with coarse-to-fine classification, which achieves results on par with the best standard approaches.
Three difficult subsets of images from visually similar classes were created. These clusters have then been used to train three hybrid QCNNs with the Real Amplitudes Circuit, namely the fine-grain classifiers. In this way, the 4 qubits and the entanglement are applied within the selected macro-classes, and their inherent complexity is used to encode finer details than in the overall set-up. The proposed clusters are: 1) Vegetation: Annual Crop, Permanent Crop, Pasture, Forest and Herbaceous Vegetation; 2) Urban: Highway, Industrial and Residential; and 3) Water Bodies: River and Sea Lake.
The overall structure of the coarse-to-fine land-cover classifier is shown in Fig. 12. A first coarse classifier, also based on the Real Amplitudes Circuit, is trained and applied to divide the data into three macro-classes. Then, based on the coarse classifier output, the corresponding fine-grain classifier is applied to obtain the final classification.
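The routing logic of the coarse-to-fine scheme can be sketched as follows. The cluster definitions come from the text, while the function name, the callable interface of the classifiers and the fixed-label stand-ins used in the usage lines are hypothetical; in practice the four callables would be the trained hybrid QCNNs.

```python
CLUSTERS = {
    "Vegetation": ["Annual Crop", "Permanent Crop", "Pasture", "Forest",
                   "Herbaceous Vegetation"],
    "Urban": ["Highway", "Industrial", "Residential"],
    "Water Bodies": ["River", "Sea Lake"],
}


def coarse_to_fine_predict(image, coarse_classifier, fine_classifiers):
    """Predict the macro-class first, then refine with the matching fine classifier."""
    macro = coarse_classifier(image)
    label = fine_classifiers[macro](image)
    if label not in CLUSTERS[macro]:
        raise ValueError(f"{label!r} is not a class of the {macro!r} cluster")
    return macro, label


# Hypothetical stand-ins for the trained models: each returns a fixed label.
coarse = lambda image: "Urban"
fine = {
    "Vegetation": lambda image: "Forest",
    "Urban": lambda image: "Highway",
    "Water Bodies": lambda image: "River",
}
macro, label = coarse_to_fine_predict(None, coarse, fine)
```

One consequence of this design is that an error of the coarse classifier cannot be recovered by the fine-grain stage, so the overall accuracy is bounded by the coarse accuracy.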
Table IV reports the performance of the coarse classifier alone. The proposed model reached an overall accuracy of 98% and an overall F1 score of 98%.
Table V (resp. Table VI and Table VII) reports the performance of the fine-grain classifier for the Vegetation (resp. Urban and Water Bodies) classes. The proposed models reached overall accuracies of 94% to 99% and overall F1 scores of 94% to 99%. These results are consistently better than those obtained for each individual class with the standard classifier (Table I), meaning that, with constant complexity and a slightly reduced dataset, the hybrid QCNN can learn finer details to distinguish similar images.
Table VIII reports the performance of the overall coarse-to-fine classifier. The proposed model reached an overall accuracy of 97% and an overall F1 score of 97%, improving over the standard classifier by +3% and reaching performance on par with Helber et al. [35], where the authors reached an overall accuracy of 98.57% using a model based on ResNet-50. It is worth highlighting that the architecture proposed in this manuscript is far less complex than the one proposed in [35], since ResNet-50 is composed of 50 layers while the proposed one is composed of only 6 layers: 5 classical and 1 quantum. This is an asset for computations in environments with frugal resources. The comparisons are better highlighted in Table IX, which reports the overall accuracy of the classical and quantum models, the size of each model in terms of layers, and the complexity of each model in terms of number of parameters. The table is organized in two parts: the first contains the results of the state-of-the-art models, while the second contains the results for both the classical and quantum models proposed in this work.
Finally, graphical results for the Real Amplitudes Quantum Classifier and for the coarse-to-fine land-cover classifier are reported in Table X and Table VIII, respectively. These tables are structured to show correctly and wrongly predicted classes, with the aim of underlining the performance increase introduced by the coarse-to-fine structured land-cover classification.

C. Semantic segmentation by patch-wise classification
To further demonstrate the efficiency of the proposed approach, the trained fine-grain quantum classifier has finally been applied to unseen Sentinel-2 images from the Onera Satellite Change Detection Dataset (OSCD) [79]. In order to run the classifier on these large images, we used a sliding window of 64 × 64 pixels, to match the size of the EuroSAT data, with a step of 32 pixels. This leads to a patch-wise classification map, or semantic map, and reproduces the experiment of [80] for comparison with state-of-the-art deep learning approaches.
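The sliding-window procedure can be sketched as below. The window size (64) and step (32) come from the text; the handling of image borders (windows that do not fit entirely inside the image are skipped), the function names and the brightness-based toy classifier are assumptions made for illustration and stand in for the trained classifier.

```python
def window_origins(height, width, patch=64, step=32):
    """Top-left corners of all windows that fit entirely inside the image."""
    return [(r, c)
            for r in range(0, height - patch + 1, step)
            for c in range(0, width - patch + 1, step)]


def patchwise_map(image, classify, patch=64, step=32):
    """Classify each window; `image` is a list of pixel rows, `classify` a callable."""
    height, width = len(image), len(image[0])
    labels = {}
    for r, c in window_origins(height, width, patch, step):
        window = [row[c:c + patch] for row in image[r:r + patch]]
        labels[(r, c)] = classify(window)
    return labels


def mean_brightness(window):
    return sum(map(sum, window)) / (len(window) * len(window[0]))


# Toy single-band image of 128 x 160 pixels, bright in its right half.
toy_image = [[1.0 if c >= 80 else 0.0 for c in range(160)] for _ in range(128)]
toy_map = patchwise_map(
    toy_image, lambda w: "bright" if mean_brightness(w) > 0.5 else "dark"
)
```

Because the step is half the window size, neighbouring windows overlap by 50%, so most pixels are covered by several windows; how overlapping predictions are merged into the final map is not specified in the text.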
Figure 13 reports the results on one location from OSCD, the city of Beirut. The maps produced by the quantum classifier have been compared with those of the Wide-ResNet and JEM models presented in [80]. The results are satisfying: the classifier is able to accurately distinguish the urban, vegetation and water body zones across the input image. Moreover, the maps are comparable with those of other state-of-the-art solutions, with even a slight advantage in retrieving residential areas in the highly urbanized area of Beirut.

VII. CONCLUSION
This paper investigates circuit-based hybrid QCNNs for Remote Sensing image classification. Unlike traditional CNN architectures, the chosen QCNN augments the standard neural network with a quantum layer. The proposed method is applied to LULC classification tasks and, through a comparative and critical analysis, the performance of different gate-based circuits has been evaluated; the hybrid QCNN has proven to be effective in terms of multiclass identification and computing efficiency. First, the circuits exploiting quantum entanglement lead to better results by a significant margin with respect to the others. Secondly, the quantum layer has made it possible to reach better results than its classical counterpart. Moreover, all the code and experiments presented in this paper have been collected and made openly available on the GitHub page [71]. This material, along with the background on QC given in this article, will hopefully be a useful tool to help the Geoscience and Remote Sensing community tackle EO problems with this cutting-edge technology.

Table IX: The table shows the model used, the overall accuracy and the number of layers to give an estimate of the complexity. All approaches in the second part of the table are our implementations, described in this article. Other comparisons with classical models can be found in [78].
Regarding the classical component, which is required for data embedding given the current capacity of NISQ devices, straightforward future work will consist in exploring more powerful networks for data encoding (e.g. compressing the image information in such a way that it may be encoded on the quantum layer). Regarding the quantum component, future work will aim at increasing the proportion of quantum processing in the hybrid approach. Indeed, more complex quantum circuits are expected to enhance the learning power of the model. In particular, quantum convolutions could be examined to incorporate spatial information and invariance in the processing.
More fundamentally, understanding the probabilistic mechanisms at work in the quantum layers will be the key to designing better models, developing deep quantum learning, and eventually applying it to many real-life applications.

Figure 13: LULC semantic maps on the never-seen OSCD city of Beirut, compared with the Wide-ResNet and JEM models tested in [80].

Lab's Quantum Computing for Earth Observation (QC4EO) initiative. We thank Pierre Philippe Mathieu, Head of Φ-lab explore office and Ph.D. co-supervisor of Alessandro Sebastianelli, and Giuseppe Borghi, Head of Φ-lab, for their continual support. Moreover, the authors thank Su-yeong Chang for helpful discussions on the mathematics of quantum circuits and Javiera Castillo-Navarro for sharing her expertise for the semantic segmentation experiment.