Application of Quantum Natural Language Processing for Language Translation

In this paper, we develop a compositional vector-based semantics of positive transitive sentences using quantum natural language processing (Q-NLP) to compare the parametrized quantum circuits of two synonymous simple sentences in English and Persian. We propose a protocol based on quantum long short-term memory (Q-LSTM) for Q-NLP to perform various tasks in general, and specifically to translate a sentence from English to Persian. We then generalize our method to use quantum circuits of sentences as input for the Q-LSTM cell, which enables us to translate sentences between different languages. Our work paves the way toward quantum neural machine translation, which may demonstrate a quadratic speedup and converge faster or reach better accuracy than classical methods.


I. INTRODUCTION
Machine translation (MT) [1], the automated translation of natural languages by computers, was proposed by Warren Weaver in 1949 [2]. For roughly four decades thereafter, rule-based machine translation (RBMT), built on dictionaries and grammars, dominated. From the 1980s to the 2000s, statistical machine translation (SMT) achieved better performance than RBMT and took over the field. SMT derives translations from statistical models estimated from the analysis of bilingual text corpora [3]. In 2003, a language model based on neural networks was proposed [4]; it handled the data-sparsity problem of traditional SMT models better and laid the foundation for subsequent neural architectures for machine translation, such as the Convolutional Neural Network (CNN) and then the Recurrent Neural Network (RNN) used as a decoder to transform a state vector into the target language [5]. This led to the birth of Neural Machine Translation (NMT), a method that uses deep neural networks to map natural
language. NMT's nonlinear mapping differs from the linear SMT models and describes semantic equivalence through the state vectors that connect the encoder and decoder. NMT predicts a sequence of numbers when given a sequence of numbers. In the case of translation, each word of the input sentence (e.g., in English) is encoded as a number, and the network produces a sequence of numbers representing the translated target sentence (e.g., in Persian). Vaswani et al. introduced the basic transformer encoder-decoder architecture [6]. Martin Popel et al. presented a deep-learning system, CUBBITT [7], which outperformed a professional translation agency (English-to-Czech and Czech-to-English news translation, judged on preservation of text meaning) in a context-aware blind evaluation by human judges. In general, the success of NMT depends on the quantity and quality of the training sentence pairs in the source and target languages. An NMT architecture consists of embedding layers, a classification layer, an encoder network and a decoder network. In the CUBBITT machine translation system, the input sentence is converted to a numerical representation, encoded into a deep representation by a six-layer encoder, and decoded by a six-layer decoder into the translation in the target language.
Natural language processing (NLP) [8]-[13] is a subfield of linguistics and artificial intelligence concerned with language interactions between computers and humans, e.g., programming computers to analyze large volumes of natural-language data. Via NLP, a computer can understand the meanings and concepts of texts in documents, recognize speech, and generate natural language. NLP was first proposed in 1950 by Alan Turing [14], in what is now called the Turing test, as a criterion of intelligence for the automated interpretation and generation of natural language. Regarding a Turing-style test for CUBBITT, most participants struggled to distinguish CUBBITT translations from human translations [7]. A group of researchers at OpenAI has also developed the Generative Pre-trained Transformer 3 (GPT-3) language model [15], to date the largest non-sparse language model, with more parameters and higher accuracy than previous models and a capacity ten times larger than that of Microsoft's Turing-NLG.
Recent advances in quantum computation and information have opened new windows in different technologies with broad applications [16]-[21]. For example, recent quantum approaches to NLP have been developed that may reach quantum advantages over their classical counterparts in the future [22], [23]. Protocols for quantum natural language processing (QNLP) have two aspects, semantics and syntax, and both are handled within one mathematical framework. Compact closed categories are used to provide semantics for quantum protocols [24]. The use of quantum maps to describe meaning in natural language was initiated by Bob Coecke [25], who introduced a diagrammatic language to describe processes and their compositions [26]. The diagrammatic language of non-commutative categorical quantum logic represents reduction diagrams for sentences and allows one to compare the grammatical structures of sentences in different languages. Sadrzadeh has used pregroups to provide an algebraic analysis of Persian sentences [27]. Pregroups are used to encode the grammar of languages: one fixes a set of basic grammatical roles and a partial ordering between them, and then freely generates a pregroup of these types [25]. The category of finite-dimensional vector spaces and pregroups are monoidal categories. Models of the semantics of positive and negative transitive sentences are given in ref. [25]. Moreover, Frobenius algebras are used to model the semantics of subject and object relative pronouns [28]. Brian Tyrrell [29] has used vector-space distributional compositional categorical models of meaning to compare the meanings of sentences in Irish and in English. Here, we use vector-based models of semantic composition to model the semantics of positive transitive sentences in Persian.
Our goal is to present an algorithm that can be implemented on quantum hardware; this work is a theoretical proposal. First we need to convert classical information to quantum information and vice versa; to go from classical to quantum we employ various encoding quantum circuits.
Parametrized quantum circuits offer a concrete way to implement algorithms and even demonstrate quantum supremacy in the noisy intermediate-scale quantum (NISQ) era. According to [22], the DisCoCat diagram is simplified to another diagram and turned into a quantum circuit, which can be compiled on NISQ devices. The grammatical quantum circuits are spanned by a set of parameters θ. The meanings of the words, and hence of the whole sentence, are encoded in the created semantic space. Finally, we rewrite the diagram as a bipartite graph to turn it into a quantum circuit; ZX-calculus, like a translator, turns a linguistic diagram into a quantum circuit. Following [30], we consider both the grammar and the meaning of a grammatical sentence in Persian and turn its DisCoCat diagram into quantum-circuit form. In section V we propose an algorithm to translate sentences between different languages. In this algorithm we use a special kind of recurrent neural network, the quantum long short-term memory (QLSTM) model. A recurrent neural network (RNN) can be thought of as multiple copies of the same network, each passing a message to a successor; plain RNNs, however, fail to learn long-range dependencies in the data. This problem is solved by the long short-term memory (LSTM). In [31], a hybrid quantum-classical model of the LSTM (QLSTM) is proposed, with variational quantum circuits (VQCs) as the building blocks of the framework. Finally, we present an algorithm that translates a sentence into its successor. This approach is applied to convert short English sentences into the corresponding Persian sentences; the QLSTM encoder and decoder are used for the sequence-to-sequence modelling in this task.

II. PRELIMINARIES
In this section, we provide some background material that will be used throughout this paper. See references [28] and [25] for more details.
Definition 1: A category C consists of:
• a collection of objects A, B, C, . . .;
• for each pair of objects A, B, a collection of morphisms f : A → B;
• for each pair of morphisms f : A → B and g : B → C, a composite morphism g ◦ f : A → C, with composition associative;
• for each object A, an identity morphism 1_A : A → A satisfying f ◦ 1_A = f and 1_B ◦ f = f for every f : A → B.

Definition 2: A monoidal category is a category C equipped with:
• a functor ⊗ : C × C → C, called the tensor product, together with a natural isomorphism (A ⊗ B) ⊗ C ≅ A ⊗ (B ⊗ C);
• a unit object I with natural isomorphisms I ⊗ A ≅ A ≅ A ⊗ I for every object A.
Monoidal categories are used to encode the semantics and syntax of sentences in different languages.
Definition 3: A symmetric monoidal category is a monoidal category C whose tensor product is symmetric: there is a natural isomorphism η with components η_{A,B} : A ⊗ B ≅ B ⊗ A for all objects A and B.
Graphical language is a high-level language for reasoning about quantum processes, with applications in many areas such as QNLP and the modelling of quantum circuits.
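In the category of finite-dimensional vector spaces used later (FVect), the tensor product is realized on vectors by the Kronecker product and the symmetry isomorphism by a swap of factors; a small numerical sketch:

```python
import numpy as np

# The monoidal tensor of FVect on vectors is the Kronecker product;
# the symmetry isomorphism V ⊗ W ≅ W ⊗ V swaps the two factors.
v = np.array([1.0, 2.0])         # a vector in V (dim 2)
w = np.array([3.0, 4.0, 5.0])    # a vector in W (dim 3)

vw = np.kron(v, w)               # v ⊗ w in V ⊗ W
# On product states the swap is: reshape to a dim(V) x dim(W) matrix,
# transpose, and flatten back.
wv = vw.reshape(2, 3).T.reshape(-1)
```

Here `wv` equals `np.kron(w, v)`, witnessing V ⊗ W ≅ W ⊗ V on this state.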

A. GRAPHICAL LANGUAGE FOR MONOIDAL CATEGORY
According to [25], morphisms are depicted by boxes with input and output wires; for example, morphisms f : A → B and g : B → C are depicted as boxes whose wires carry the labels A, B and B, C, and composition stacks the boxes. States of an object A are morphisms I → A, and effects are morphisms A → I.
Definition 4: A compact closed category is a monoidal category in which each object A has a left adjoint A^l and a right adjoint A^r, together with morphisms
η^l : I → A ⊗ A^l,  η^r : I → A^r ⊗ A,  ε^l : A^l ⊗ A → I,  ε^r : A ⊗ A^r → I,
such that
(1_A ⊗ ε^l) ◦ (η^l ⊗ 1_A) = 1_A  and  (ε^r ⊗ 1_A) ◦ (1_A ⊗ η^r) = 1_A.
The above equations are called yanking equations. In the graphical language the η maps are depicted by caps and the ε maps by cups [25]; a yanking equation results in a straight wire.
Definition 5: As defined in [25], a partially ordered non-commutative monoid P is called a pregroup, to which we refer as Preg, when each element p ∈ P has both a left adjoint p^l ∈ P and a right adjoint p^r ∈ P. A partially ordered monoid is a set (P, ·, 1, ≤, (−)^l, (−)^r) with a partial order relation ≤ on P and a binary operation − · − : P × P → P that preserves the partial order. The multiplication has the unit 1, that is, p = 1 · p = p · 1. Explicitly, the adjoints satisfy the axioms
p^l · p ≤ 1 ≤ p · p^l  and  p · p^r ≤ 1 ≤ p^r · p.
We refer to the above axioms as reductions.
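The reductions above can be checked mechanically. A minimal sketch in Python, where the integer encoding of adjoints is our own bookkeeping (0 for a base type, -1 for a left adjoint, +1 for a right adjoint), and any adjacent pair (p, k)(p, k+1) contracts, covering both p^l p ≤ 1 and p p^r ≤ 1:

```python
# Minimal pregroup-reduction checker. A type is a list of (base, order)
# pairs; order 0 is the base type, -1 its left adjoint, +1 its right adjoint.
# The contraction rule (p, k)(p, k+1) <= 1 covers p^l p <= 1 and p p^r <= 1.
def reduces_to(types, target):
    """Lazily cancel adjacent contractible pairs with a stack."""
    stack = []
    for t in types:
        if stack and stack[-1][0] == t[0] and t[1] - stack[-1][1] == 1:
            stack.pop()          # contract the adjacent pair
        else:
            stack.append(t)
    return stack == [target]

# 'Sara ketab ra mikharad': n n (n^r o) (o^r n^r s)
sentence = [("n", 0), ("n", 0), ("n", 1), ("o", 0),
            ("o", 1), ("n", 1), ("s", 0)]
```

`reduces_to(sentence, ("s", 0))` returns True, confirming the sentence type reduces to s.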

B. PREG AND FVEC AS COMPACT CLOSED CATEGORIES
Preg is a compact closed category: morphisms are reductions and the operation '·' is the monoidal tensor. As mentioned in [25], the category Preg can be used to encode the grammatical structure of a sentence in a language; objects and morphisms are grammatical types and grammatical reductions, respectively, and the operation '·' is the juxtaposition of types. According to [28], let FVect be the category of finite-dimensional vector spaces over the field of reals R. FVect is a monoidal category in which the vector spaces, linear maps and tensor product are the objects, morphisms and monoidal tensor, respectively. In this category the tensor product is commutative, i.e. V ⊗ W ≅ W ⊗ V, and hence V^l ≅ V^r ≅ V*, where V^l, V^r and V* are the left adjoint, right adjoint and dual space of V. We consider a fixed basis, so we have an inner product. Consider the monoidal functor F : Preg → FVect, which assigns the basic types to vector spaces, F(n) = N, F(s) = S, F(o) = O, and juxtaposition to the tensor product, F(p · q) = F(p) ⊗ F(q). Monoidal functors preserve the compact structure; for more details see [28].

III. POSITIVE TRANSITIVE SENTENCE
The simple declarative Persian sentence with a transitive verb has the following structure: subject + object + objective sign + transitive verb. For example, 'Sara ketab ra mikharad' is the Persian sentence for 'Sara buys the book'. In Table 1, we present the equivalent expressions in English and Persian. In this sentence, 'Sara' is the subject, 'ketab' is the direct object, 'ra' is the objective sign and 'mikharad' is the transitive verb in the present tense, see [27].

A. VECTOR SPACE INTERPRETATION
Vector spaces and pregroups are used to assign meanings to words and grammatical structure to sentences in a language. Reductions and types are interpreted as linear maps and vector spaces via the monoidal functor F from Preg to FVect. In this paper we present one example from Persian, the positive transitive sentence, for which we fix the following basic types:
n: noun
s: declarative statement
o: object
According to [25], if the juxtaposition of the types of the words in a sentence reduces to the basic type s, the sentence is called grammatical. We use an arrow → for ≤ and drop the '·' between juxtaposed types. According to [27], the example sentence 'Sara ketab ra mikharad' has the following type assignment:
Sara ketab ra mikharad
n    n    (n^r o)  (o^r n^r s)
which is grammatical because of the following reduction:
n n (n^r o)(o^r n^r s) → n o (o^r n^r s) → n (n^r s) → s.
For the above sentence this reduction is depicted diagrammatically by cups connecting each type to its adjoint. A positive sentence with a transitive verb in Persian thus has the pregroup type n n (n^r o)(o^r n^r s). The interpretation of a transitive verb is computed as F(o^r n^r s) = O ⊗ N ⊗ S, so, taking O = N, the meaning vector of a Persian transitive verb is a vector in N ⊗ N ⊗ S. Applying F to the pregroup reduction gives the distributional meaning of 'Sara ketab ra mikharad':
−→Sara ketab ra mikharad = F(red)( −→Sara ⊗ −→ketab ⊗ −→ra ⊗ −→mikharad ),
where −→ra is the vector corresponding to the meaning of 'ra'. We set −→ra to act as the identity on the object wire, and in this case the reduction contracts the subject and object vectors against the verb. Consider the vector in the tensor space which represents the verb:
−→mikharad = Σ_{ij} c_{ij} −→w_i ⊗ −→v_j ⊗ −→s,
where each −→w_i is a meaning vector of an object and each −→v_j a meaning vector of a subject. Then we have:
−→Sara ketab ra mikharad = Σ_{ij} c_{ij} ⟨−→ketab | −→w_i⟩ ⟨−→Sara | −→v_j⟩ −→s.
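The final contraction can be sketched numerically; the dimensions and verb weights below are toy values for illustration only:

```python
import numpy as np

# Toy noun space N (dim 2) and sentence space S (dim 2). The verb tensor
# lives in N ⊗ N ⊗ S, with the first factor contracted against the object
# and the second against the subject; its weights are placeholders.
dim_n, dim_s = 2, 2
verb = np.random.default_rng(0).random((dim_n, dim_n, dim_s))
sara = np.array([1.0, 0.0])    # subject vector
ketab = np.array([0.0, 1.0])   # object vector

# F applied to the reduction contracts the noun wires:
# meaning_k = sum_ij ketab_i * sara_j * verb[i, j, k]
meaning = np.einsum("i,j,ijk->k", ketab, sara, verb)
```

The result `meaning` is a vector in the sentence space S.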

B. TRUTH THEORETIC MEANING AND CONCRETE INSTANTIATION
According to [28], we let N be the vector space spanned by a set of individuals {−→n_i} and S the one-dimensional space spanned by the unit vector −→1. The unit vector and the zero vector represent truth value 1 and truth value 0, respectively. A transitive verb Ψ ∈ N ⊗ N ⊗ S is represented as
Ψ = Σ_{kl} c_{kl} −→n_k ⊗ −→n_l ⊗ −→1, with c_{kl} ∈ {0, 1},
where k and l range over the sets of basis vectors representing the respective common nouns. The truth-theoretic meaning of a transitive sentence is computed as
−→sub verb obj = Σ_{kl} c_{kl} ⟨−→sub | −→n_k⟩ ⟨−→obj | −→n_l⟩ −→1.
For the concrete instantiation in the model of Grefenstette and Sadrzadeh [33], the vectors are obtained from corpora and the scalar weights of noun vectors are not necessarily 1 or 0. For any word vector −→word = Σ_i c^word_i −→n_i, the scalar weight c^word_i is the number of times the word has appeared in that context, where the −→n_i are context basis vectors. The sum of the tensor products of the objects and subjects of the verb throughout a corpus represents the meaning vector of the verb:
−→verb = Σ_m −→obj_m ⊗ −→sub_m.
The meaning vector of the transitive sentence is then decomposed into a point-wise multiplication of two vectors:
−→sub verb obj = −→verb ⊙ ( −→obj ⊗ −→sub ),
where ⊙ is the point-wise multiplication and ( −→obj ⊗ −→sub ) is the Kronecker product of the object and subject vectors of the verb. The meaning vector of the transitive sentence in English is computed in the same way. Thus, for synonymous sentences over the same corpus of English and Persian we expect
−→Sara ketab ra mikharad ≈ −→Sara buys the book.
The procedure for learning the weights of the subject, object and verb matrices is presented in [33]. In order to compare the meanings of two sentences in English and Persian, we compute the cosine of the angle between their meaning vectors. Our goal in classifying synonymous sentences in two different languages is to translate the sentence from one language to the other.
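The concrete instantiation can be sketched as follows; the corpus occurrences are invented toy data, and the point-wise decomposition is the one given above:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two meaning vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The verb vector is the sum of obj ⊗ sub over its corpus occurrences
# (toy context vectors, for illustration).
occurrences = [(np.array([1.0, 0.0]), np.array([0.0, 1.0])),
               (np.array([1.0, 1.0]), np.array([1.0, 0.0]))]
verb = sum(np.kron(obj, sub) for obj, sub in occurrences)

def sentence_meaning(obj, sub, verb):
    # point-wise product of the verb vector with the Kronecker product obj ⊗ sub
    return verb * np.kron(obj, sub)

# Synonymous Persian and English sentences built from the same corpus
# vectors yield identical meaning vectors, hence cosine similarity 1.
m_persian = sentence_meaning(np.array([1.0, 0.0]), np.array([0.0, 1.0]), verb)
m_english = sentence_meaning(np.array([1.0, 0.0]), np.array([0.0, 1.0]), verb)
```

The cosine between `m_persian` and `m_english` is the comparison used for the classification task.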

IV. DIAGRAMS REWRITING AND QUANTUM CIRCUITS
As mentioned in the previous sections, a sentence in a corpus is parsed according to its grammatical structure. According to [22], we simplify the DisCoCat diagram to another diagram and turn it into a quantum circuit, which can be compiled on NISQ devices. Two methods are presented for this purpose: the bigraph method and the snake-removal method. Both operate on the symmetric version of the pregroup grammar. We consider the grammatical sentence from Fig. 1 and obtain the diagram in Fig. 2, where the boxes denote tensors whose order is the number of their wires. We use the bigraph method to turn the diagram in Fig. 2 into a bipartite graph: words at odd distance from the root word are transposed into effects, and we obtain the diagram in Fig. 3. Transposition turns states into effects, see [32]. According to [22], we consider the CNOT+U(3) family of unitary qubit ansatze: layers of CNOT gates between adjacent qubits, interleaved with layers of single-qubit rotations in Z and X, form the unitary quantum circuits. Let P̃ be the symmetric version of the pregroup grammar P. Consider the monoidal functor from P̃ to fHilb, in which word states are mapped to state ansatze, obtained by applying the unitary ansatze to the Pauli Z |0⟩ state; word effects are mapped to effect ansatze, obtained by transposing the state ansatze in the computational basis; and wire crossings are mapped to swaps. Now consider the diagram in Fig. 3: if each wire is mapped to a qubit, the circuit has about four CNOTs. In ZX-calculus [34], single-qubit white and black dots denote rotations in Pauli Z and Pauli X, and a CNOT gate is a black dot and a white dot connected by a horizontal line, so we obtain the corresponding circuits. One can use the bigraph algorithm to form quantum circuits for the semantic side of the meaning. In the pregroup type of the sentence 'Sara ketab ra mikharad' we set o = n. For the atomic types n and s we consider two qubits and one qubit, respectively.
The number of qubits for each type t is the sum of the numbers of qubits associated to the atomic types in t. For example, the transitive verb 'mikharad' has five qubits. Each word in the sentence has its own quantum circuit, and the quantum circuit of the whole sentence is obtained by composing them; the reduction diagram of the English sentence 'Sara buys the book' and the quantum circuits of its words likewise yield the quantum circuit of the whole English sentence. The two sentences 'Sara ketab ra mikharad' and 'Sara buys the book' have the same meaning but are grammatically different. By training these circuits on quantum hardware, we obtain a model to translate from English to Persian and vice versa. According to [30], we present grammar+meaning as a quantum circuit for the above two sentences. Consider the states |ψ_{n_s}⟩ and |ψ_{n_o}⟩ corresponding to the subject and the object, respectively, and a transitive verb as a map η_tv that takes |ψ_{n_s}⟩ ∈ C^2 and |ψ_{n_o}⟩ ∈ C^2 and produces |ψ_{n_s·n_o·tv}⟩ ∈ C^{2k}. So |ψ_mikharad⟩ ∈ C^2 ⊗ C^2 ⊗ C^{2k}. Because the quantum model relies on the tensor product, an exponential blow-up occurs for the meaning spaces of words. To avoid this obstacle, in experiments we decrease the dimension of the spaces in which the meanings of transitive verbs live: for the transitive verb, instead of a state |ψ_mikharad⟩ ∈ C^2 ⊗ C^2 ⊗ C^{2k} in the large space, we consider a state |ψ*_mikharad⟩ ∈ C^2 ⊗ C^2 in a smaller space. Then we copy each of the wires and bundle two of them together to make up the thick wire; thus 'mikharad' is obtained. For more details see [30]. Now we enter 'Sara' and 'ketab' into the picture, pull some spiders out, and by using the Choi-Jamiolkowski correspondence we obtain the circuit in Fig. 5. The circuit in Fig. 4 requires 4 qubits and has two CNOT gates in parallel, while the circuit in Fig. 5 requires 3 qubits and has sequential CNOT gates.
Indeed, the use of the Choi-Jamiolkowski correspondence has reduced the number of qubits but has increased the depth of the CNOT gates. As mentioned in [30], ion-trap hardware has fewer qubits but performs better at greater circuit depth. In ZX-calculus, via the Euler decomposition, any one-qubit unitary gate can be represented as a sequence of Z and X rotations. Each verb is represented by a unitary gate U with its own values of α, β and γ, and we obtain the corresponding circuits. By taking the singular value decomposition of the verb we obtain the circuit of Fig. 6, where the state P encodes the diagonal of the matrix. We represent all noun states by gates and obtain the circuit of Fig. 8. As mentioned in III-A, for the sentence 'Sara buys the book' we similarly obtain the DisCoCat diagram in Fig. 7; indeed, we ignore 'the' and 'ra' in the English and Persian positive transitive sentences, respectively. Therefore, according to [30], the parametrised quantum circuit of the diagram in Fig. 7 is as in Fig. 9. The quantum circuits in Fig. 8 and Fig. 9 are suitable for training on quantum hardware. By providing a set of quantum circuits of sentences in English and Persian and training on quantum hardware, we can perform the classification task. This approach may offer a speedup and an advantage over classical counterparts for NLP. In the next section we provide an algorithm for translating sentences between different languages.
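A minimal state-vector sketch of a rotation-plus-CNOT ansatz of the kind compiled from these diagrams; the two-qubit layout and the angle values are simplifications for illustration, not the circuits of Figs. 8 and 9 themselves:

```python
import numpy as np

# Single-qubit Z and X rotations and the CNOT gate, as dense matrices.
def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

def rx(t):
    c, s = np.cos(t / 2), -1j * np.sin(t / 2)
    return np.array([[c, s], [s, c]])

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

def ansatz(params):
    """One CNOT+U(3)-style layer on two qubits applied to |00>."""
    a, b, c, d = params
    layer = np.kron(rz(a) @ rx(b), rz(c) @ rx(d))
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0                      # |00>
    return CNOT @ layer @ psi0

psi = ansatz([0.3, 1.2, -0.7, 0.5])    # placeholder angles to be trained
```

Since every gate is unitary, the output state stays normalized, which is the sanity check one would run before training the angles.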

V. ALGORITHM OF USING QUANTUM LONG SHORT-TERM MEMORY FOR QUANTUM NATURAL LANGUAGE PROCESSING
In this section we propose an algorithm to translate sentences between different languages. In this algorithm we use the quantum long short-term memory (QLSTM) model. The long short-term memory is a special kind of recurrent neural network (RNN) that can learn longer-range sequential dependencies in the data. Here, we use parametrized quantum circuits (PQCs) of sentences as quantum input for the QLSTM cell. This model encodes features of the quantum circuit of a sentence into a learned hidden vector state and then decodes that vector into its successor sentence. A hybrid quantum LSTM model for translation is capable of modeling sequential data. We use the method presented in [35] to map a DisCoCat diagram into a quantum circuit; sentences with different grammatical structures are mapped to different quantum circuits. This method is simpler than the methods presented in section IV. In the DisCoCat diagram of the sentence 'Sara ketab ra mikharad' in Fig. 2, we bend down all nouns of the sentence and obtain the diagram of Fig. 10; we then have the quantum circuit of Fig. 11. We choose one qubit for every wire of type n and s, and replace all word states (effects) with parametrised quantum states (effects), so the verb is a state on three qubits. For the verb, so-called IQP-based states are used: an IQP layer consists of an H gate on every qubit composed with two controlled Z-rotation gates connecting adjacent qubits. The words 'Sara' and 'ketab' are replaced with quantum effects; note that each cup corresponds to a Bell effect. Similarly, for the DisCoCat diagram of the sentence 'Sara buys the book' we have the diagram of Fig. 12 and obtain the quantum circuit of Fig. 13. Consider a language model translating a sentence: the encoder is the part of the network which reads the sentence to be translated, and the decoder is the part which translates the sentence into the desired language.
The first step of our encoding scheme is to transform sentences into PQCs. We choose features of the circuit of each sentence as the input vector of a quantum LSTM cell. Quantum circuits of sentences are trained on a quantum processor. At each time step t, the aforementioned input is composed of a set of features characterizing both the circuit and the processor. In particular, with respect to the circuit, the following information can be considered:
• an integer value representing the number of qubits composing the circuit;
• an integer value representing the total number of CNOT gates in the circuit;
• an integer value representing the total number of Z-rotation gates in the circuit;
• an integer value representing the number of quantum effects in the circuit;
• a matrix of integer values whose item [i, j] contains the number of CNOT gates with control qubit i and target qubit j.
The task we present can be thought of as translating a sentence into its successor. We propose an algorithm that uses the quantum LSTM for quantum natural language processing in order to perform various tasks, such as predicting and translating sentences. The algorithm for translating a sentence from English to Persian is pictured diagrammatically in Fig. 14. The encoder-decoder model of the quantum LSTM (QLSTM) encodes a feature vector of the quantum circuit of the sentence into a learned hidden vector state and then decodes that into its successor sentence. The QLSTM cell of the algorithm can be considered as shown in Fig. 15, [31]. In the QLSTM cell there are six VQCs. For VQC 1 to VQC 4, consider the concatenation v_t of the hidden state h_{t−1} from the previous time step and the current input vector x_t; we choose the features of the quantum circuit of each sentence as the input vector x_t. The measured values of the VQCs go through nonlinear activation functions.
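The feature vector enumerated above can be sketched as follows; the circuit-description format (a dictionary of gate tuples) is our own invention for illustration:

```python
# Build the per-circuit feature vector described in the text: qubit count,
# CNOT count, Z-rotation count, effect count, and the flattened matrix of
# CNOT counts per (control, target) pair.
def circuit_features(circuit):
    n_qubits = circuit["qubits"]
    gates = circuit["gates"]                     # list of (name, qubit list)
    n_cnot = sum(1 for name, _ in gates if name == "CNOT")
    n_rz = sum(1 for name, _ in gates if name == "RZ")
    cnot_matrix = [[0] * n_qubits for _ in range(n_qubits)]
    for name, qubits in gates:
        if name == "CNOT":
            control, target = qubits
            cnot_matrix[control][target] += 1
    flat = [c for row in cnot_matrix for c in row]
    return [n_qubits, n_cnot, n_rz, circuit["effects"]] + flat

circ = {"qubits": 3, "effects": 2,
        "gates": [("RZ", [0]), ("CNOT", [0, 1]), ("CNOT", [1, 2]), ("RZ", [2])]}
features = circuit_features(circ)
```

The resulting list is what would be fed as x_t to the QLSTM cell at each time step.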
A mathematical formulation of the QLSTM cell is as follows:
f_t = σ(VQC_1(v_t))
i_t = σ(VQC_2(v_t))
C̃_t = tanh(VQC_3(v_t))
c_t = f_t * c_{t−1} + i_t * C̃_t
o_t = σ(VQC_4(v_t))
h_t = VQC_5(o_t * tanh(c_t))
y_t = VQC_6(o_t * tanh(c_t))
The forget vector f_t has values in [0, 1] through the sigmoid function σ; * is the element-wise multiplication. f_t * c_{t−1} determines whether to keep or forget the elements of the cell state c_{t−1} from the previous step. The output of VQC_2 goes through the sigmoid function to determine which values will be added to the cell state. The output of VQC_3 goes through the tanh function to generate the candidate cell state C̃_t, and i_t * C̃_t is used to update the cell state. The output of VQC_4 goes through the sigmoid function and determines which values in the cell state c_t are relevant to the output. o_t * tanh(c_t) is processed with VQC_5 to get the hidden state h_t, and with VQC_6 to get the output y_t. A generic VQC architecture is shown in Fig. 16; there, the x_i are elements of the vector v_t, and the numbers of qubits and of measurements are determined by the number of x_i. The three rotation angles α_i, β_i and γ_i are not fixed; they are updated in the iterative optimization process [31].
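For intuition, the cell equations can be exercised with a purely classical surrogate in which each VQC_k is replaced by a small linear map (a stand-in assumption for illustration; in [31] these blocks are variational circuits followed by measurement):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n_h, n_x = 4, 3
# Stand-ins for VQC_1..VQC_4 (acting on v_t) and VQC_5, VQC_6 (acting on
# o_t * tanh(c_t)); in the quantum model these are trainable circuits.
W = [rng.standard_normal((n_h, n_h + n_x)) * 0.1 for _ in range(4)]
W5 = rng.standard_normal((n_h, n_h)) * 0.1
W6 = rng.standard_normal((n_h, n_h)) * 0.1

def qlstm_step(x_t, h_prev, c_prev):
    v = np.concatenate([h_prev, x_t])   # v_t = [h_{t-1}; x_t]
    f = sigmoid(W[0] @ v)               # forget gate   (VQC_1)
    i = sigmoid(W[1] @ v)               # input gate    (VQC_2)
    c_tilde = np.tanh(W[2] @ v)         # candidate     (VQC_3)
    c = f * c_prev + i * c_tilde        # cell-state update
    o = sigmoid(W[3] @ v)               # output gate   (VQC_4)
    h = W5 @ (o * np.tanh(c))           # hidden state  (VQC_5)
    y = W6 @ (o * np.tanh(c))           # output        (VQC_6)
    return h, c, y

h, c, y = qlstm_step(np.ones(n_x), np.zeros(n_h), np.zeros(n_h))
```

Swapping the linear maps for measured VQC outputs recovers the hybrid model of [31] without changing the surrounding recurrence.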
In the algorithm presented, one may face some problems. The first three sentences in Table 2 have the same structure but different meanings, with quantum circuits like the one in Fig. 13. Each sentence may have different values of α, β, γ, δ and ξ; to feed this feature to the algorithm, one can enter these angles as real numbers in the input vectors. In Table 2, the last two sentences have the same meaning but different structures; for each language, one has to perform the meaning-classification task in QNLP. As mentioned in III-A, the meaning of sentences can be reduced to linear-algebraic formulae. For example, the meaning vector of our transitive sentence is obtained by applying the linear map f that encodes the grammatical structure to the tensor product of the word vectors. Learning a semantic vector for each word amounts to learning its basis weights from the corpus. This setting offers geometric means to reason about semantic similarity, e.g. via the cosine measure. Grefenstette et al. [36] give similarity calculations for some sentence pairs via the inner product between the meaning vectors of the sentences. In Table 3, one may translate the first three sentences of Table 2 into Ukrainian and Spanish; one can consider grammatical sentences of any language and determine the types of the words in the sentences. One can verify that the structures of these sentences are similar to English, with similar circuits. Moreover, one may consider different parameters for synonymous sentences in different languages.
The method used to convert the DisCoCat diagrams into quantum circuits for our particular, simple examples can be generalized to a recipe. Pregroups have been used to analyze the syntax of many languages, from English and French to Polish and beyond; for more references see [37], [38]. The proposed algorithm can be used for the translation of an English sentence into another language. In general, the recipe can work for all languages in the world, not only for English and Persian.

A. QUANTUM ADVANTAGE
It has been shown that under certain conditions quantum algorithms for compositional QNLP demonstrate a quadratic speedup over classical methods [39]. For example, an immediate advantage for quantum implementations of the DisCoCat diagrams [25] is gained by storing meaning vectors in quantum systems: an n-qubit system has 2^n degrees of freedom, so an N-dimensional classical vector can be stored in log_2 N qubits. Consider a corpus whose word-meaning space is given by a basis of the 2000 most common words. In the worst case one obtains dramatic improvements: one transitive verb requires 8 × 10^9 classical bits but only 33 qubits in QNLP, and 10k transitive verbs require 8 × 10^13 classical bits but only 47 qubits. Regarding the complexity comparisons for different closest-vector algorithms, there is a quadratic improvement in the scaling with the number M of training vectors of dimension N: the complexity in the classical regime is O(NM), while the quantum algorithm achieves a quadratic improvement in M [39]. In the QLSTM protocol, it has been shown that for certain testing cases QLSTM converges faster or reaches a better accuracy than its classical counterpart, which paves the way toward the implementation of sequence modeling on NISQ devices [31].
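The qubit counts quoted above follow from simple arithmetic, which can be checked directly:

```python
import math

# A 2000-word basis gives a transitive-verb space of dimension 2000^3,
# i.e. 8 x 10^9 classical numbers, but only ceil(log2(2000^3)) qubits.
dim_verb = 2000 ** 3
qubits_one_verb = math.ceil(math.log2(dim_verb))
# 10k transitive verbs: 10^4 * 8 x 10^9 = 8 x 10^13 classical numbers.
qubits_10k_verbs = math.ceil(math.log2(10_000 * dim_verb))
```

This reproduces the 33- and 47-qubit figures quoted in the text.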

VI. RESULTS AND FUTURE DIRECTIONS
First of all, we highlight what has been accomplished in this study; then we give explanations and directions for future work.
• Developing a compositional vector-based semantics of a transitive sentence for a non-English language within a categorical framework.
• Comparing quantum circuits of two synonymous sentences in English and Persian.
• A quantum algorithm for language translation via quantum long short-term memory.
• The feasibility of a generalized algorithm for language translation for other languages.
In this paper, we have suggested a protocol for translating simple sentences from English to Persian via quantum natural language processing. First, we extended the compact categorical semantics to analyse the meanings of positive transitive sentences in the Persian language. It remains necessary to introduce linear maps representing the meaning of negative transitive sentences and of grammatically more complex sentences in Persian; the demonstration of DisCoCat diagrams and quantum circuits of complicated sentences in Persian, or in any non-English language, is left to future work. Here, the two sentences 'Sara ketab ra mikharad' (in Persian) and 'Sara buys the book' (in English) are instantiated as parametrized quantum circuits. The meanings of the two sentences are the same, but the obtained quantum circuits look different. These circuits need to be compiled correctly, so it is necessary to introduce a test measurement at the end of the circuits that gives almost identical results for the meanings of synonymous sentences in different languages. One may use the compiler t|ket⟩ to this aim, run the circuits on the IBMQ and analyze the results. Finally, we propose an algorithm that uses quantum long short-term memory for quantum natural language processing in order to translate the sentence above from English to Persian.
Here, we have proposed an algorithm in which one first converts sentences to circuits and then to vectors. As a limitation, it does not convert individual words to circuits and is therefore not applicable at the word level, but it is well suited to texts composed of sentences. On the other hand, for implementation we should propose an algorithm that offers improvements in the quality of the results, particularly for more complex sentences, in terms of memory and computational requirements. As a future research prospect, one can perform different tests on the IBMQ, for example the meaning-classification task for sentences in Persian and the execution of the translation algorithm. Future directions include the implementation of Q-NLP tasks, such as sentence similarity, for any non-English language, and the use of real-world data with a pregroup parser. The bidirectional encoder-decoder model, which predicts sentences in English from the preceding and subsequent text, is another interesting research direction. Moreover, one can investigate the uncertainty of the proposed model based on what is widely discussed in the literature [13], [40], [41].

AUTHOR CONTRIBUTIONS
MA, VS and XZ proposed the idea and consulted SSM and MZM about it; all authors contributed to the development and completion of the idea. MA did the main part of the calculations, and all authors participated in the discussions, the structuring of the paper, and the writing of the manuscript.