Variational Methods in Optical Quantum Machine Learning

The computing world is evolving rapidly, with new ground-breaking technologies emerging. Quantum Computing and Quantum Machine Learning have opened up new possibilities, promising greater computational power and problem-solving capabilities while offering a deeper understanding of complex systems. Our research proposes new variational methods built on a deep learning system that uses an optical quantum neural network, applied to Machine Learning models for point classification. As a case study, we considered the binary classification of points belonging to a certain geometric pattern on a plane (the Two-Moons Classification problem). We think it is reasonable to expect benefits from using hybrid deep learning systems (classical + quantum), not just in terms of accelerating computation but also in understanding the underlying phenomena and mechanisms. This may lead to the development of new machine-learning paradigms and a significant advancement in the field of quantum computation. The selected dataset is a set of 2D points forming two interleaved semicircles and is produced by a 2D binary classification generator, which aids in evaluating the performance of particular methods. The two coordinates of each point, $x_{1}$ and $x_{2}$, serve as the features, since the two classes appear as two distinct sets in a two-dimensional representation space. The goal was to create a quantum deep neural network that could recognise and categorise points accurately with the fewest trainable parameters possible.


I. INTRODUCTION
The phases of human understanding have always alternated between awe at the vastness of the phenomena in front of us and joyous mastery of the goals established. Galileo's insight that mathematics is the simple language that allows us to communicate with nature has been confirmed by the gradual development and introduction of a formidable mathematical arsenal into science. Our faith in positivist determinism at the turn of the 20th century was severely undermined by emerging phenomena that eventually gave rise to quantum mechanics. Machine learning techniques are used to process enormous amounts of data [1], [2].

The study of theoretical and practical parallels between particular physical processes and learning systems, such as neural networks, is known as quantum machine learning [3], [4]. Although basic molecular structures can be described by simulating them with automatic calculation tools that require ever-increasing computational power, the observation that deterministic chaos can be generated from models with a seemingly simple apparatus of differential equations, together with this apparent need for ever more computational power, suggests that the time has come for a profound reflection on our methods of scientific inquiry.
The associate editor coordinating the review of this manuscript and approving it for publication was Tao Huang.
Feynman claimed that the key to exponentially reducing the computational complexity of the system at issue, and to adequately controlling the predictive power of the model, would be to describe the world around us, which is fundamentally quantum, through a kind of computation based on quantum mechanics. We could further state that the advent of quantum computing devices would also address the issues associated with the impending construction limit of present-day computers (Amdahl's law), offer the potential for a significant reduction in the energy required by current computing devices thanks to computational reversibility, and support the advancement of nanotechnology. Because of this potential, there is increasing global interest in such devices and significant investment in the necessary research. Quantum computing devices may therefore solve problems that classical hardware cannot process in a practical time, since doing so would require the whole lifetime of the universe. That is made possible by various quirks of quantum mechanics that appear at small scales, such as superposition, entanglement, and interference [5], [6]. Although theoretical research on quantum algorithms points to the prospect of overcoming computational problems that are now insurmountable [7], the technology is still in its infancy and does not yet provide meaningful advantages. The age of Noisy Intermediate-Scale Quantum (NISQ) devices [8] is still in effect. Implementing a gate-based quantum algorithm is complicated by the limited quantum resources and the noise of NISQ devices. Numerous characteristics of circuit design, including depth, width, and noise, should be taken into account to ascertain whether a gate-based algorithm's implementation will function properly on a specific NISQ device [9], [10].
In physics, variational methods are often employed [11], [12], most notably in quantum mechanics [13]. Variational Quantum Algorithms (VQAs), their direct descendants, have emerged as a leading strategy for obtaining a quantum advantage on NISQ devices. VQAs can be regarded as the quantum counterpart of powerful machine learning methods like neural networks. Furthermore, as VQAs use parametrised quantum circuits to operate on the quantum computer and subsequently delegate parameter optimisation to a classical optimiser, they make use of the classical optimisation toolkit. In contrast to quantum algorithms designed for the fault-tolerant era, this approach has the added advantages of mitigating noise and keeping the quantum circuit depth shallow [14]. The goal of this kind of approach is to find the exact, or a reasonably close, set of parameter values that minimises a particular cost (loss) function. The cost depends on the parameters themselves as well as on the input values, which are the non-trainable portion of the scheme. The regulators comprise an output measuring device and a parameter modification circuit. We may argue that the system mimics a traditional physical feedback system (Figure 1), which attempts to minimise a well-defined loss function by optimising its set of parameters.
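The feedback loop described above can be illustrated with a deliberately small sketch. It is not the circuit used in this work: it assumes a single qubit, a single RY rotation with one trainable angle, the expectation value of Z as the cost, and plain gradient descent with a parameter-shift gradient. All function names and hyperparameters are hypothetical.

```python
import math

def expectation_z(theta: float) -> float:
    """Cost: <Z> measured on RY(theta)|0>, which equals cos(theta)."""
    # RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    p0 = math.cos(theta / 2) ** 2  # probability of outcome |0> (eigenvalue +1)
    p1 = math.sin(theta / 2) ** 2  # probability of outcome |1> (eigenvalue -1)
    return p0 - p1

def parameter_shift_gradient(cost, theta: float) -> float:
    """Gradient from two extra cost evaluations at theta +/- pi/2."""
    return 0.5 * (cost(theta + math.pi / 2) - cost(theta - math.pi / 2))

def minimise(cost, theta: float, learning_rate: float = 0.2, steps: int = 200) -> float:
    """Classical optimiser: plain gradient descent on the circuit parameter."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(cost, theta)
    return theta

# Assumed starting angle; the minimum of cos(theta) is at theta = pi.
theta_opt = minimise(expectation_z, theta=0.3)
final_cost = expectation_z(theta_opt)
```

In this toy setting the "quantum part" is the function `expectation_z` and the "classical part" is `minimise`; in a real variational algorithm the former would be replaced by measurements on a device or simulator.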
A more detailed diagram than Figure 1 can be found in Figure 2, where the individual components that make the mentioned transformations possible are highlighted; conceptually, it is possible to divide the system into two parts: a classical information part and a quantum processing part. The latter aims to speed up the calculation and subsequent determination of the final result. In the classical processing part, a conventional computer has the purpose of transforming the input data, adapting them to the subsequent quantum processing, and verifying through a function whether the configuration resulting from the quantum calculation is a minimum-cost result with respect to a particular function, chosen in the initial phase of analysing the problem.
In the quantum processing part, the appropriately adapted input data modulate the physical parameters of the quantum network; subsequently, the measuring devices detect the eigenstates of the collapsed quantum system, on which the expected value is calculated. This value is returned to the classical part for the processing described above.
Our approach is based on using a feedforward neural network (FFNN), usually referred to as a multilayer perceptron, whose layers have been built utilising quantum photonic circuits in this particular instance. Eq.1 is the mathematical function that accurately depicts the transformation provided by an FFNN. The fundamental structure, however, is essentially the same: a horizontally stacked multi-layer arrangement with each layer being composed of an initial linear transformation (an affine transformation) and a nonlinear function called ''activation'' (Eq.2), with $i = 1, \ldots, k$ and $(n_1, n_2, \ldots, n_{k-1}, n_k, n_{k+1}) \in \mathbb{N}^{k+1}$. Moreover:
• $W_i \in \mathbb{R}^{n_{i+1} \times n_i}$ is the matrix for the i-th layer, called the weight matrix, containing $\theta_W$,
• $x_i$ is the i-th input vector,
• $b_i \in \mathbb{R}^{n_{i+1}}$ is the vector for the i-th layer, called the bias vector, containing $\theta_b$,
• $\varphi_i$ is the non-linear activation function for the i-th layer.
Usually, we are dealing with an n-qubit quantum ansatz, whose transformation is represented by the n × n unitary matrix $U$ that depends on the vector of parameters $\theta$, of dimension $m \in \mathbb{N}$ (Eq.1); the loss function we refer to is the expectation value measured on every single qubit channel (Eq.3); the observable is the operator $O_i$, where $Z_i$ is the Pauli-Z operator acting on the i-th qubit (Eq.4); the quantum state on which to operate the measurements will be the state resulting from the ansatz ($\psi$ in Eq.3).
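For orientation, the relations described in words above can be written out as follows. This is a reconstruction from the definitions given in the text, not a copy of the numbered equations; the initial state $|\psi_0\rangle$ and the layer symbol $L_i$ are notational assumptions.

```latex
% One FFNN layer, for i = 1, ..., k
x_{i+1} = \varphi_i\left( W_i\, x_i + b_i \right),
\qquad W_i \in \mathbb{R}^{n_{i+1} \times n_i},
\quad b_i \in \mathbb{R}^{n_{i+1}}

% Whole network as a composition of layers
F_\theta(x_1) = \left( L_k \circ \cdots \circ L_1 \right)(x_1),
\qquad L_i(x) = \varphi_i\left( W_i\, x + b_i \right)

% Variational loss on the i-th channel, with O_i = Z_i
\langle O_i \rangle_\theta
  = \langle \psi(\theta) |\, Z_i \,| \psi(\theta) \rangle,
\qquad |\psi(\theta)\rangle = U(\theta)\, |\psi_0\rangle
```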
The ability of quantum computers to resolve particular problems more quickly than classical ones is widely recognised. Nevertheless, loading data into a quantum computer is not so simple because the information must be encoded in quantum bits. Qubits may represent the data in various ways, making a wide range of information encodings possible [15]. There are several ways to embed data; the most popular ones are Basis Encoding [16], [17], Amplitude Encoding [18], [19], [20], and Angle Encoding [21], [22].
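As a rough illustration of the three encodings named above, the following sketch computes the classical description of the encoded states. The function names are hypothetical, and the angle encoding assumes one qubit per feature prepared with an RY rotation, which is only one of several conventions in use.

```python
import math

def basis_encoding(value: int, n_qubits: int) -> list:
    """Bits of the computational basis state that stores the integer `value`."""
    if not 0 <= value < 2 ** n_qubits:
        raise ValueError("value does not fit in the given number of qubits")
    return [int(bit) for bit in format(value, f"0{n_qubits}b")]

def amplitude_encoding(features: list) -> list:
    """Normalise the feature vector so it can serve as state amplitudes.

    A real register needs a length that is a power of two; padding with
    zeros is left to the caller in this sketch.
    """
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return [x / norm for x in features]

def angle_encoding(features: list) -> list:
    """One qubit per feature: RY(x)|0> = (cos(x/2), sin(x/2))."""
    return [(math.cos(x / 2), math.sin(x / 2)) for x in features]

bits = basis_encoding(5, n_qubits=3)
amplitudes = amplitude_encoding([3.0, 4.0])
qubits = angle_encoding([math.pi, 0.0])
```

Basis encoding spends one qubit per bit, amplitude encoding packs a vector of length $2^n$ into n qubits, and angle encoding spends one qubit per feature.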
However, an alternative approach may be used to achieve the same results: an optical-quantum layer-based FFNN [23]. Thus, the following transformation (Eq.6) can be ensured by using the Singular Value Decomposition (SVD, Eq.7), where … is the dimension of the feature space and x is the generic feature entering the system. Data can be encoded in position eigenstates.
The Singular Value Decomposition theorem, which guarantees the factorisation of W into three matrices, two orthogonal and one positive diagonal, may be used to decompose the matrix itself. That ensures we can utilise specific quantum gates that mimic the behaviour determined by the corresponding matrices. Regarding the physical implementation, squeezer gates may be utilised to obtain the diagonal transformation, while interferometers (a combination of beamsplitters and rotation gates) can be employed to achieve the orthogonal ones. The addition of the bias vector, b, is then obtained by appending position displacement gates. Appending a Kerr gate as the final circuit block is a possible way to produce a non-linear transformation; this is the most common choice. Figure 3 shows the circuit diagram of a single layer for a single q-mode input, while Figures 4 and 5 represent the situations with two q-modes and four q-modes, respectively. Our research intends to investigate several facets of quantum computation's applicability to machine learning. Specifically, on a straightforward classification task, we compared a basic OQ-FFNN (Optical Quantum FFNN) with dense layers against an otherwise identical network made up only of photonic quantum components.
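The SVD-based construction described above can be summarised schematically as follows. The symbols $O_1$, $O_2$, $\Sigma$, $\mathcal{U}_1$, $\mathcal{U}_2$, $\mathcal{S}$, $\mathcal{D}$ and $\Phi$ are notational assumptions introduced here for illustration; the mapping between the singular values and the squeezing parameters depends on the convention adopted and is not fixed by this sketch.

```latex
% SVD of the weight matrix: O_1, O_2 orthogonal, \Sigma positive diagonal
W = O_2 \, \Sigma \, O_1

% Classical layer and one possible gate-level counterpart
\varphi\left( W x + b \right)
\;\longleftrightarrow\;
\Phi \circ \mathcal{D}(b) \circ \mathcal{U}_2 \circ \mathcal{S}(r) \circ \mathcal{U}_1

% \mathcal{U}_1, \mathcal{U}_2 : interferometers (beamsplitters + rotations) realising O_1, O_2
% \mathcal{S}(r)               : squeezers realising the diagonal factor \Sigma
% \mathcal{D}(b)               : position displacements adding the bias b
% \Phi                         : non-Gaussian gate (e.g. Kerr) acting as the activation
```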
The aim of this work is the development of a quantum deep neural network able to recognise and classify points accurately with the fewest trainable parameters possible.
Furthermore, this investigation intends to continue a study theme initiated in one of our previous papers, in which the identical classification issue was solved using a conventional quantum network [24].

II. RELATED WORKS
Since 2010, there has been a growing interest in applying machine learning techniques [25] to quantum computing [26], [27], [28]. Early attempts to efficiently simulate the quantum world were successful only insofar as small physical systems [29], [30] (with few particles) were considered; the use of high-performance computers was necessary due to the large dimensions involved [31], [32], [33], [34], [35]. The introduction of virtual and augmented reality to better explain the concepts of quantum mechanics is interesting from a didactic point of view [36]. Quantum machine learning [37], [38], [39] integrates quantum algorithms into machine learning programmes. The phrase is most frequently used to describe machine learning algorithms that analyse classical data and are executed on a quantum computer. Quantum machine learning uses qubits, quantum processes, or specialised quantum systems to speed up computation [40].
A wide range of concepts with varying degrees of similarity to conventional neural networks are included in current QNN proposals [41], [42], [43]. The challenge of integrating the linear and non-linear parts and the unitary framework of quantum mechanics is at the heart of quantum neural network theory. Quantum neural networks are only one form of the most recent type of machine learning models implemented on quantum computers. They broadly use quantum phenomena like superposition, entanglement, and interference to exploit possible benefits such as quicker training and processing [44], [45], [46].
It has recently been proposed that the representation of quantum information need not be binary or discrete. Instead, it is also possible to leverage the innately ''continuous'' quantum features of matter, which naturally leads to encoding the information in continuous variables (CV). The position and momentum of a particle are typical examples [47], [48], [49], [50]. The learning process requires a quantum circuit with a universal layer structure, able to prepare any CV state with no more than polynomial complexity, so that arbitrary transformations can be performed [51], [52]. Therefore, the architecture to be selected must be composed of layers, with parameterised Gaussian and non-Gaussian gates present in each layer [53], [54], [55]. The non-Gaussian gates provide the model with both nonlinearity and universality [56], [57], [58]. Photonic quantum machine learning is proving effective in both traditional classification and regression problems, which is a notable finding [59], [60], [61], [62]. More importantly, however, it is rapidly improving our knowledge of quantum processes themselves [63], [64].

III. THE SYSTEM ARCHITECTURE
We tested three different types of OQFFNN, whose general structure is identical; they differ only in the implementation of the quantum networks. All our models consist of a series of quantum layers ($L_i$) followed by a measurement apparatus whose outputs go into a classical dense layer that enables the binary categorisation of the input items (Figure 6). The Number Operator ($\hat{n}_i = \hat{a}_i^{\dagger} \hat{a}_i$, that is, the sequential action of the annihilation and creation operators) has been selected as the observable in our experiments, and the measured quantity will be its average value, $\langle \hat{n}_i \rangle$, for each i-th q-mode.
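To make the measured quantity concrete, the sketch below evaluates the average photon number for the simplest case, a coherent state of amplitude modulus ρ, whose photon-number distribution is Poissonian with mean ρ². It illustrates the observable only and does not model the quantum layers; the Fock-space cutoff and the function names are assumptions.

```python
import math

def coherent_photon_distribution(rho: float, cutoff: int) -> list:
    """P(n) = exp(-rho^2) * rho^(2n) / n! for n = 0, ..., cutoff - 1."""
    mean = rho ** 2
    probabilities = []
    p = math.exp(-mean)  # P(0)
    for n in range(cutoff):
        probabilities.append(p)
        p *= mean / (n + 1)  # recurrence P(n+1) = P(n) * mean / (n+1)
    return probabilities

def mean_photon_number(probabilities: list) -> float:
    """Average of the number operator over a photon-number distribution."""
    return sum(n * p for n, p in enumerate(probabilities))

# Assumed values: unit amplitude and a cutoff large enough to make truncation negligible.
probs = coherent_photon_distribution(rho=1.0, cutoff=30)
n_mean = mean_photon_number(probs)
```

For a simulator working in a truncated Fock basis, the same weighted sum over the photon-number probabilities gives the value passed on to the dense layer.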
The chosen dataset is a set of 2D points, so it can be represented as a matrix whose shape is (number_of_features, 2).

A. IMPLEMENTED QUANTUM NETWORKS
We can now examine the three varieties of quantum layers we implemented.

1) FIRST KIND OF NETWORK
The dataset features are entered into the network as values for the two parameters $(\rho, \phi)$ of a classical coherent state $|\alpha\rangle$, with $\alpha \in \mathbb{C}$ and $\alpha = \rho\, e^{i\phi}$. In order to allow a direct correspondence between the coordinates of the point and the geometric meaning of the parameters of the coherent state given in input, the pair of coordinates $(x_1, x_2)$ that identify a point of the plane in an orthogonal Cartesian reference are transformed into polar coordinates $(\rho_1, \theta_1)$ before being transferred as input into the network. The quantum layer (Figure 7) is composed of a Rotational Gate (R), a Squeezing Gate (S), another Rotational Gate (R), a Displacement Gate (D) and a Kerr Gate (K).
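The Cartesian-to-polar step described above amounts to reading the point $(x_1, x_2)$ as a complex number. A minimal sketch, with a hypothetical helper name, could be:

```python
import cmath
import math

def to_coherent_amplitude(x1: float, x2: float):
    """Map a Cartesian point to (rho, phi) and the amplitude alpha = rho * exp(i*phi)."""
    rho, phi = cmath.polar(complex(x1, x2))  # modulus and phase in (-pi, pi]
    alpha = cmath.rect(rho, phi)             # back to the complex amplitude
    return rho, phi, alpha

# Example point on the vertical axis (values chosen for illustration only).
rho, phi, alpha = to_coherent_amplitude(0.0, 2.0)
```

The pair `(rho, phi)` is what would be passed as the parameters of the input coherent state.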

2) SECOND KIND OF NETWORK
The features of the input dataset are represented in the network as the phases ϕ of an equal number of coherent states, whose amplitude ρ is arbitrarily set to 1, to improve the numerical simulation's efficiency without straying too far from the typical values of a physical implementation.The quantum layer (Figure 8) is composed of two BeamSplitters (BS), four Rotational Gates (R), two Squeezing Gates (S), two Displacement Gates (D) and two Kerr Gates (K).

3) THIRD KIND OF NETWORK
The dataset features are entered into the network as values for the two parameters (ρ, ϕ) of a two-mode squeezing gate, whose inputs are Vacuum States.The quantum layer (Figure 9) is composed of four Rotational Gates (R), two single Squeezing Gates (S), two Displacement Gates (D) and a Cross-Kerr Gate (K).

IV. DATA EXTRACTION AND PROCESSING
The two-moons dataset, which consists of two interleaving semicircles of 2D points and is a typical benchmark for clustering and classification techniques, was chosen for the classification task.
The number of samples in the dataset was divided in such a way as to ensure the following quotas: 75% for the training set, 15% for the validation set and the remaining 10% for the test set (Figure 10).
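The generation and partition of the data can be sketched as follows. This is not the generator used in this work: the construction of the two semicircles follows the common two-moons recipe (as in, e.g., scikit-learn's make_moons), and the number of samples, the noise level and the random seed are assumptions chosen for illustration.

```python
import math
import random

def make_two_moons(n_samples: int, noise: float = 0.1, seed: int = 0) -> list:
    """Return shuffled (x1, x2, label) tuples forming two interleaved semicircles."""
    rng = random.Random(seed)
    n_upper = n_samples // 2
    n_lower = n_samples - n_upper
    points = []
    for k in range(n_upper):  # upper moon, label 0
        t = math.pi * k / max(n_upper - 1, 1)
        points.append((math.cos(t) + rng.gauss(0, noise),
                       math.sin(t) + rng.gauss(0, noise), 0))
    for k in range(n_lower):  # lower moon, shifted and flipped, label 1
        t = math.pi * k / max(n_lower - 1, 1)
        points.append((1 - math.cos(t) + rng.gauss(0, noise),
                       0.5 - math.sin(t) + rng.gauss(0, noise), 1))
    rng.shuffle(points)
    return points

def split_dataset(samples: list, train: float = 0.75, val: float = 0.15):
    """Split into training, validation and test sets (the test set takes the remainder)."""
    n_train = round(train * len(samples))
    n_val = round(val * len(samples))
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

data = make_two_moons(1000)  # assumed dataset size
train_set, val_set, test_set = split_dataset(data)
```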
Moreover, the features have been given as points of a plane in polar coordinates (Figure 11) to improve the operation of particular gates whose parameters operate on complex values (in C set).The kind of input in the Cartesian or Polar form shall be defined for each model in section V.

V. DISCUSSION OF RESULTS
All the models to be discussed have been implemented using the well-known open-source Python library PennyLane (https://pennylane.ai/), with the StrawberryFields (https://strawberryfields.ai/) backend for simulation, both by Xanadu [65], [66]. Both can be interfaced with the most popular Deep Learning frameworks, such as Keras, TensorFlow and PyTorch.
All tests and simulations were performed using the specialised open-source library for Deep Learning, Keras (https://keras.io/). Every model has a traditional dense layer as the last layer (Figure 6), which serves as a classifier. Its input is the average number of photons processed by the quantum network, and its output is a float value between 0 and 1. The sigmoid function is utilised as its activation function. All the models were compiled using:
• optimizer: SGD (Stochastic Gradient Descent), with a learning rate equal to 0.01
• loss function: binary cross-entropy
• metric: accuracy

1) MODEL 1
The simulations were conducted initially using a simple structure, consisting of a single quantum layer, for a total of 9 parameters (Figure 12). Indeed, the number of parameters to be trained for the first model is given by the sum of 2 (parameters for the final Dense layer) and the product between the number of quantum layers to be used and 7 (the sum of the parameters of the single quantum gates making up the layer; see Appendix A for details). Figures 13 and 14 show the metrics (LOSS: loss function on training set, ACCURACY: accuracy on training set, VAL_LOSS: loss function on validation set, VAL_ACCURACY: accuracy on validation set) for Model n.1, trained for 50 epochs. The outcomes are excellent. As shown in Figures 13 and 14, the search algorithm has found a solution that enables the model to correctly classify all of the data, both the samples of the training set and of the validation set, as early as the 46th epoch. Specifically, Figure 14 represents the attempts of the …
131398 VOLUME 11, 2023. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Other simulations were run using more quantum layers, but none of them showed a discernible improvement over the single-layer experiment. Specifically, we conducted three simulations using 2, 3, and 4 layers, respectively.
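The parameter count stated above for the first model (2 parameters for the dense layer plus 7 per quantum layer) can be checked with a few lines. The helper name is hypothetical.

```python
def trainable_parameters(quantum_layers: int, per_layer: int, dense: int) -> int:
    """Dense-layer parameters plus per-layer gate parameters times the depth."""
    if quantum_layers < 1:
        raise ValueError("at least one quantum layer is required")
    return dense + quantum_layers * per_layer

# Model 1: 7 gate parameters per quantum layer, 2 parameters in the dense layer.
model_1_counts = {depth: trainable_parameters(depth, per_layer=7, dense=2)
                  for depth in (1, 2, 3, 4)}
```

With one layer this gives the 9 parameters quoted in the text; the deeper variants with 2, 3 and 4 layers correspond to 16, 23 and 30 parameters. The same helper reproduces the single-layer totals reported for the other models: 19 for Model 2 (3 + 16) and 16 for Model 3 (3 + 13).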

2) MODEL 2
The number of parameters to be trained for Model n.2 is given by the sum of 3 (parameters for the final Dense layer) and the product between the number of quantum layers to be used and 16 (the sum of the parameters of the single quantum gates making up the layer). Following the reasoning shown in section I (Introduction), it was decided to utilise phaseless beamsplitters in this simulation so that each entry of the representation matrix has a real value.
5. Coefficient k is in mV$^{-2}$.
The simulations were conducted initially using a simple structure, consisting of a single quantum layer, for a total of 19 parameters (Figure 16). Starting from this model, it is feasible to increase the classification accuracy and moderate the oscillations during the solution search phase. The phase angle parameters are added to the individual BeamSplitters, taking the reflectivity values from real to complex. The number of trainable parameters for the quantum layer then becomes 18, and the final total is 21.
Figures 20 and 21 show the usual metrics for Model n.2, with phase angles, trained for 50 epochs. The values of the best parameters determined after network training are displayed in Table 3.
We obtained the following values for the dense layer's parameters: $w = (-8.04, 7.08)^T$ and $b = -0.18$. The test set's scores (Figure 22) are likewise quite good, with an accuracy of 97.33%.
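To show how the final dense layer turns the two measured photon numbers into a class, the sketch below applies the reported weights and bias followed by the sigmoid. The photon-number inputs and the 0.5 decision threshold are assumptions chosen for illustration, not values taken from the experiments.

```python
import math

def dense_sigmoid(mean_photons, weights, bias: float) -> float:
    """Single-unit dense layer with sigmoid activation: sigma(w . n + b)."""
    z = sum(w * n for w, n in zip(weights, mean_photons)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true: int, p: float) -> float:
    """Loss for one sample; p is clipped to avoid log(0)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

W = (-8.04, 7.08)  # dense-layer weights reported for Model 2 with phase angles
B = -0.18          # reported bias

# Hypothetical mean photon numbers on the two q-modes.
p_class_1 = dense_sigmoid((0.2, 0.9), W, B)
label = int(p_class_1 >= 0.5)  # assumed decision threshold
```

Because the two weights have opposite signs, the classifier effectively responds to the difference between the photon numbers on the two modes.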
It was noticed that the model's performance did not meaningfully improve with the addition of more quantum layers to the network.

3) MODEL 3
A hybrid of the first two, the third kind of layer (Figure 9) contains two q-modes but does not employ beamsplitters. The interaction between states is achieved by using a Two-Mode Squeezing Gate, which serves the function of suitably encoding and modulating the input information, while the Cross-Kerr Gate is found in the final step of the Quantum Layer. They both carry out nonlinear transformations commonly utilised to generate entanglement in quantum models with continuous variables.
The simulations were conducted initially using a simple structure, consisting of a single quantum layer, for a total of 16 parameters (Figure 23). Further investigations showed that increasing the number of layers leads to a partial but meaningful improvement in performance. Furthermore, replacing the Cross-Kerr layer with a Two-Mode Squeezing Gate yields an excellent result after 50 epochs of training (Figure 27), reaching 100% accuracy on the test set with only 17 trainable parameters.
8. Coefficient k is in mV$^{-2}$.
So, the research presented in this chapter aimed to delve deeper into the feasibility of using quantum computers for machine learning purposes in the NISQ era by attempting to train a simple network consisting of photonic quantum gates. The study examined various essential aspects of a statistical classifier, such as the convergence of the proposed model, the accuracy of the result with respect to a standard metric, the minimisation of the number of trainable parameters (i.e., the number of quantum gates required in the implemented circuit), and the scalability of the solution. The results obtained from this research and other similar recent studies [67] suggest that quantum computation can be used to advantage in the field of Machine Learning. In a previous work [24], the solution was a Fermionic Quantum Machine Learning model that leveraged two different libraries to simulate the proposed network, which comprised rotation gates and CNOT gates. It produced consistent results as the noise level on the dataset changed. Only twenty epochs were needed to effectively train the network, which achieved a 100% recognition rate for samples belonging to the Validation Set when the noise level was low. Moreover, it achieved an 85% recognition rate for very high noise levels. The progress of the Loss and Accuracy curves of the model suggested that it was robust to noise and immune to the drop-out phenomenon. Notably, a similar network for classical Machine Learning (e.g., a simple FFNN) needed 354 trained weights (although relatively few for networks of this type) versus the 54 of the proposed solution. The degree of scalability was excellent, as the size of the dataset was not tied to the network structure.
The most notable disadvantage during simulations was the high run-time, which is nonetheless typical of hybrid machine learning solutions (classical + quantum). In contrast, in this research, the primary constituents of the quantum circuit are linear and non-linear optical components. The elements of the dataset modulate the incoming photons, transforming them into specifically prepared quantum states. Here, the goal, besides the model's convergence and classification ability, was to limit the number of trainable parameters while retaining an accuracy of more than 80%. The three models showed comparable training patterns, which supports their validity and efficacy. This investigation attempted to compare the behaviours emerging from several circuit implementations of the kernel method on a not overly complicated dataset. The results are excellent for all the models investigated, demonstrating the circuit structures' effectiveness.
The second model receives as input two coherent states, the phase of which is modulated by the input data. The confusion matrix and the ROC curve highlight a good model response, with a recognition accuracy on the test data of 91%.
The third model has a very similar structure to the second. Still, the interaction between the two photons in the final stage reduces the performance to 88%, which remains a high value considering the low number of trained parameters (equal to 13). An excellent improvement is achieved by replacing the Cross-Kerr stage with a Two-Mode Squeezing Gate: the results obtained are comparable to those of Model 1.
The limit of the first model is its inability to be scalable if taken in its original version, as its input is a coherent state that depends solely on two parameters.Therefore, datasets with many features cannot be directly encoded in the network.The second and third models are more scalable than the first one.In these models, each input photon contains information about every feature in the dataset.However, minimising the number of gates in the structure is essential to ensure high performance.That is because other stages that stack horizontally can significantly degrade the model's performance.
Research to be carried out shortly will concern the capacity to categorise increasingly complicated and structured data sets while keeping the number of gates needed to accomplish the task suitably low.

VI. CONCLUSION
Variational circuits play a crucial role in the field of hybrid algorithms, serving as a link between classical and quantum computing.These circuits leverage a parameterised design that can be adjusted to achieve a specific outcome, like tuning a quantum circuit to simulate particular input-output relationships based on training data.
However, despite the apparent simplicity of this concept, the actual execution of variational circuits is complex. One of the most significant challenges is selecting a structure, or ''ansatz'', that is efficient in terms of depth, width, and number of parameters while remaining robust and versatile. This challenge is similar to a fundamental problem in classical machine learning, which involves creating simple yet effective models. Nevertheless, the challenge continues beyond model selection. Training these quantum models differs from traditional approaches, as they are based on physical quantum algorithms rather than mathematical formulas. That raises questions about whether we can improve upon traditional numerical optimisation techniques. In the context of quantum systems, choosing optimal parameterisation techniques is crucial to achieving high-quality results. It remains an open question whether classical iterative training strategies and existing parameterisation methods are adequate, or whether new techniques should be devised to suit the unique characteristics of quantum systems. In essence, variational algorithms are not just another tool; they may hold the key to a whole new dimension in machine learning. They could pave the way for a broader paradigm in which we use physical devices as machine learning models, guided by our classical computers during training.
Transferring these investigations from quantum computer simulators to real quantum computers is desirable once the simple tests necessary to validate the model have been completed. Despite the excellent performance and the hardware and software optimisations achieved so far for the enormous number of calculations to be performed, simulators remain unable to exploit the quantum peculiarities of matter, which allow a parallelism in the calculation that is not classically attainable.
Future investigations will focus on this front.That will provide insight into the problems arising from the physical implementation and the feasibility of the proposed solutions.

FIGURE 2. General scheme for quantum variational method circuit.

FIGURE 3. Circuital block diagram for a single layer, single q-mode, related to an optical-quantum layer-based FFNN.

FIGURE 4. Circuital block diagram for a single layer, two q-modes, related to an optical-quantum layer-based FFNN.

FIGURE 5. Circuital block diagram for a single layer, four q-modes, related to an optical-quantum layer-based FFNN.

FIGURE 6. Common structure for the proposed OQFFNN (m is the input dimension).

FIGURE 7. First type of layer.

FIGURE 8. Second type of layer.

FIGURE 9. Third type of layer.
Figures 17 and 18 show the metrics (LOSS: loss function on training set, ACCURACY: accuracy on training set, VAL_LOSS: loss function on validation set, VAL_ACCURACY: accuracy on validation set) for Model n.2, trained for 50 epochs. The outcomes are satisfactory. The optimal parameters' values are displayed in Table 2, while for the dense layer, they are $w = (-7.64, 6.90)^T$ and $b = -0.33$. The results provided in Figure 19 indicate what happens when the model is applied to the test set (150 samples), confirming the model's good performance (Test Accuracy: 96.67%).

FIGURE 10. Dataset: Distribution of pattern points on the plane.

FIGURE 11. Dataset: Distribution of pattern points on the plane in polar form.
Of the 16 trainable parameters of the single-layer Model n.3, 13 belong to the quantum layer. Figures 24 and 25 show the metrics (LOSS: loss function on training set, ACCURACY: accuracy on training set, VAL_LOSS: loss function on validation set, VAL_ACCURACY: accuracy on validation set) for Model n.3, trained for 50 epochs. The outcomes are satisfactory. The optimal parameters' values (footnote 8) are displayed in Table 4, while for the dense layer, they are $w = (-6.13, -7.14)^T$ and $b = 7.43$. The results provided in Figure 26 indicate what happens when the model is applied to the test set (150 samples), confirming the model's good performance (Test Accuracy: 89.33%).

FIGURE 13. Loss and validation loss for model 1.

FIGURE 17. Loss and validation loss for model 2.

FIGURE 18. Accuracy and validation accuracy for model 2.

FIGURE 20. Loss and validation loss for model 2 with phases.

FIGURE 21. Accuracy and validation accuracy for model 2 with phases.

FIGURE 23. Number of training parameters.

FIGURE 24. Loss and validation loss for model 3.

FIGURE 25. Accuracy and validation accuracy for model 3.

FIGURE 27. (a) An interesting variant for Model 3; (b) Loss and validation loss; (c) Accuracy and validation accuracy.

TABLE 1. Optimised parameters' set for the quantum network.

TABLE 2. Optimised parameters' set for the quantum network (model without phase angles for beamsplitters).

TABLE 3. Optimised parameters' set for the quantum network (model with phase angles for beamsplitters).

TABLE 4. Optimised parameters' set for the quantum network.