Semantic Comprehension of Questions in Q&A System for Chinese Language Based on Semantic Element Combination

Question understanding is an extremely important part of a question and answer (Q&A) system and is the basis of subsequent information retrieval and answer extraction. To improve the semantic understanding of questions, we propose a question understanding method for the Chinese language based on the combination of semantic elements. First, the method uses lexicon and rule recognition to extract the semantic elements of a question and recognizes its functions according to predefined patterns. Second, it combines the semantic elements according to the dependency analysis tree structure of the question and the function type. Finally, a normalized semantic expression is generated. We use user questions extracted from Baidu-Zhidao as the test set. The results show that the accuracy of the proposed method is 97.8%, so the method can effectively understand the semantics of Chinese questions.


I. INTRODUCTION
Nowadays, the focus of competition in the information society is no longer how much information one has, but how to obtain the needed information as quickly and accurately as possible. A question answering (Q&A) system is an advanced form of information retrieval that answers user questions posed in natural language accurately and concisely. From the perspective of the system framework, a Q&A system is generally composed of three main parts: question comprehension, information retrieval and answer extraction [1]. Question comprehension, as the first part, is the premise and basis of the whole Q&A system and has a crucial influence on its performance. A wrong understanding of the question leads to a wrong answer, and more than 78.1% of the errors in Q&A answers are caused by misunderstanding of the question [2]. Therefore, how to effectively and correctly understand the semantic meaning of questions and transform them from fuzzy natural language into clear logical language is the key and breakthrough point of current research on Q&A systems for the Chinese language. Compared with languages written in the Latin alphabet, the semantic understanding of Chinese questions is more difficult. First of all, in languages such as English, spaces serve as natural separators, while in Chinese, owing to a tradition inherited from ancient Chinese, there is no separation between words. In addition, the boundary between words and phrases in Chinese cannot be read directly from the sentence structure. For example, '' ''(''There are many plane in the airport'') is a typical Chinese sentence, and the phrase '' '' (''plane'') consists of two independent characters, each of which can combine with many other characters to give completely different meanings. In English, by contrast, the single word ''plane'' is normally unambiguous.
Secondly, there is no direct correspondence between parts of speech and the grammatical functions that words perform in Chinese. Chinese sentences are therefore much more varied, and analysis methods designed for English sentences can fail when applied directly to Chinese. Thirdly, corpus resources for the Chinese language are scarce and not well constructed for Q&A research, so it is difficult to find a suitable question set for experiments. Therefore, studying the semantic understanding of Chinese questions is both challenging and urgent.
The associate editor coordinating the review of this manuscript and approving it for publication was Wajahat Ali Khan.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Aiming at the semantic understanding of questions in Q&A systems for the Chinese language, this paper focuses on the transformation of natural language questions into normalized logical expressions. Starting from the dependency analysis structure of questions, we put forward a question understanding method based on the combination of semantic elements, as shown in Figure 1. Because comparable prior work is lacking, this paper takes natural language questions as input and produces combined semantic expressions as output for in-depth analysis. First, we identify the concepts, entities, relationships, attributes, attribute values and functional categories that appear in the question (pre-processing and recognition). Then, according to the dependency analysis structure and predefined combination rules, the identified semantic elements are combined. Finally, a structured semantic expression of the question is generated. Figure 1 illustrates the process with '' 747 A380 '' (''Which is faster, the Boeing 747 or the Airbus A380?'') as an example. The main work can be divided into the following three aspects: 1) we collect and analyze a large number of user questions in relevant fields, summarize their dependency syntactic structures and language characteristics, and divide the lexical structure of the questions into eight complex functions and one basic type; 2) we design a variety of complex attribute recognition rules to handle the complex attributes in user questions, which current research neglects; 3) we put forward the method of semantic element combination to understand question semantics, taking semantic elements as the basic unit, designing basic rules and function rules to combine these units, and generating logical expressions that capture the user's query intention.

II. RELATED WORK
At present, research on the semantic comprehension of questions in Q&A systems for the Chinese language mainly falls into four types: methods based on keyword matching, on templates, on semantic relationship extraction, and on semantic analysis.
The keyword matching method [3], [4] centers on the construction of a keyword set. First, the keyword phrases in the question are extracted. Then, a dictionary-based or statistical method is used to expand the keyword phrases, and a series of retrieval expressions centered on the expanded keyword set is generated to express the semantics of the question. Simple keyword matching understands the question only at a superficial level and can hardly capture the user's query intention in depth.
The template-based method [5]-[8] centers on pattern templates. The question is first matched against the pattern templates in a pre-built template library, and the semantic information of the question is then extracted directly from the matched template. This method involves no complex semantic analysis and obtains the user's query intention directly. However, it requires a sufficient number of templates to be constructed in advance.
The method based on semantic relationship extraction [9], [10] centers on relations and extracts multiple semantic relation triples. By associating the nodes in the triples that point to the same entity, the combined semantic relation triples represent the question semantics. This method does not need question templates to be constructed in advance and can handle a wide range of questions, but its ability to handle complex questions is still relatively weak, and effective reasoning is difficult to achieve. Its accuracy also needs to be improved further.
The method based on semantic analysis [11]-[13] focuses on normalized semantic expressions. The semantics of the question are analyzed by a pre-designed semantic analysis method, and the question is transformed into a normalized semantic expression. Because this method adopts a logically normalized semantic representation as the analytical form of the question, the logical relationships are clear, which supports reasoning and the solving of complex questions. Traditional semantic analysis [11] relies on manually annotated logical word lists, so supervised machine learning can only be carried out on a small scale, and samples not seen during supervised learning are difficult to handle. To extend the application scope, a semi-supervised model is proposed to extend the semantic analyzer [12]. This method first trains a supervised semantic analyzer with a predefined logical vocabulary. The scope of the analyzer is then widened by expanding its vocabulary, but it still relies on manually annotated logical vocabulary. Therefore, a new semantic analysis method is proposed to get rid of manual annotation [13], [14]. Multiple candidate semantic expressions are obtained by using the DCS language. Feature and loss functions are designed, existing question-answer pairs are used to train the model, and the semantic expression that yields the correct answer is then selected. Clearly, its accuracy depends on the quantity and quality of the question-answer pairs.
In this paper, a question comprehension method based on semantic element combination is proposed for the Chinese language; it belongs to the methods based on semantic analysis. The method can deeply understand the semantics of complex user questions by means of logical expressions, and it does not depend on a large number of manually constructed sentence templates. A normalized semantic expression of the question is generated directly through semantic element analysis and semantic element combination. It avoids the limitation that a logical vocabulary places on the application scope of the system, and it does not depend on question-answer pairs, which are difficult to collect, for training a model, so it can effectively solve the semantic understanding of complex questions in the Q&A system.

III. OVERVIEW OF METHODS
The set of Chinese natural language questions $Q = \{q_i \mid q_i \in Q\}$ is given in the Q&A system. The method in this paper accepts any input $q_i$ and, after understanding the question, outputs the corresponding semantic expression $s_i$, where $S = \{s_i \mid s_i \in S\}$ is the set of semantic expressions of the questions.
The question comprehension method based on semantic element combination can be divided into the following three stages, as shown in Figure 2.
(1) Pre-processing: word segmentation, correction, part-of-speech tagging, deletion of stop words, dependency analysis.
(2) Semantic element recognition: semantic elements in the question are extracted and labeled based on lexicon recognition and rules.
(3) Semantic element combination: first, the types of functions contained in the question are identified; then, according to the basic combination rules and the function rules, the semantic elements are recombined to generate the semantic expression of the question.

IV. QUESTION COMPREHENSION METHOD BASED ON SEMANTIC ELEMENT COMBINATION
A. PRE-PROCESSING
The question must be pre-processed before its semantics are analyzed. Pre-processing mainly includes the following parts: word segmentation, part-of-speech tagging, correction, deletion of stop words, and dependency analysis.

1) WORD SEGMENTATION, PART OF SPEECH TAGGING
Word segmentation and part-of-speech tagging are the first stage of question comprehension and the basis of all subsequent processing. A Chinese question is a continuous sequence of characters with no natural separator between words. To understand the semantics of user questions correctly, the boundaries between words must be identified accurately; this is word segmentation. Part-of-speech tagging assigns an appropriate part of speech, such as noun, verb or adjective, to each word in the question. In essence, it is a typical sequence labeling problem.

2) CORRECTION
Not all questions asked by users are in a normalized form. For example, questions may be interspersed with Pinyin, wrong characters, mixed Chinese and English, or Arabic numerals mixed with Chinese text. Such non-canonical forms may lead to errors in semantic element identification, so correcting the user question is an important step. Correction normalizes the wrong forms into the correct ones and thereby removes the misunderstandings caused by errors of form.

3) DELETION OF STOP WORDS
Stop words are auxiliary words, modal words and some pronouns that give no practical help in understanding a sentence; they appear frequently in the questions put to a Q&A system. If these function words are not filtered out, both the quality and the efficiency of question comprehension suffer. There are two ways to filter stop words: a part-of-speech approach and a lexicon-based approach. In practice, most function words can be filtered out by the first approach; where it fails, the second approach is used.
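The two filtering approaches can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: the part-of-speech tag names, the stop-word list and the example tokens are all assumptions chosen for the sketch.

```python
from typing import List, Tuple

# Hypothetical POS tags treated as function words (assumed tag names:
# "u" = auxiliary, "e" = exclamation, "wp" = punctuation).
FUNCTION_POS = {"u", "e", "wp"}

# Hypothetical stop-word list used as the fallback lexicon.
STOP_WORDS = {"请问", "一下"}


def remove_stop_words(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Filter (word, pos) pairs: first by part of speech, then by lexicon."""
    kept = []
    for word, pos in tagged:
        if pos in FUNCTION_POS:   # first approach: part of speech
            continue
        if word in STOP_WORDS:    # second approach: stop-word lexicon
            continue
        kept.append((word, pos))
    return kept


# Illustrative input: "请问 波音747 的 速度 是 多少 呢"
question = [("请问", "v"), ("波音747", "nz"), ("的", "u"), ("速度", "n"),
            ("是", "v"), ("多少", "r"), ("呢", "u")]
filtered = remove_stop_words(question)
```

Here the particles are removed by their tag, while the polite opener, which a tagger may label as a verb, is caught only by the lexicon.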

4) DEPENDENCY ANALYSIS
The purpose of dependency analysis is to reveal the syntactic structure of a sentence by analyzing the dependencies among its components. The element that governs the other components of a sentence is the central verb; every other component is directly or indirectly subordinate to it. Dependency analysis exposes the semantic modification relations among sentence components and captures long-distance collocation information between them.
In this paper, the LTP dependency analysis tool is adopted, and the complex attributes and semantic elements of composite questions are identified with the help of the dependency analysis tree. Taking '' 747 A380 '' (''Which is faster, the Boeing 747 or the Airbus A380?'') as an example, LTP dependency analysis is conducted, and the results are shown in Figure 3.

B. SEMANTIC ELEMENT RECOGNITION
The purpose of semantic element recognition is to extract the semantic elements in the question and mark their types, which supports the subsequent combination of semantic elements. Two methods are commonly used. The first is to construct a relevant thesaurus for recognition. Its advantage is that recognition accuracy is very high as long as the thesaurus is comprehensive enough; its drawback is that a thesaurus cannot be expanded indefinitely. The second is to define rules and algorithms for recognition. Its advantage is generality: it can identify complex semantic elements with high coverage and a wide range of types; its drawback is limited accuracy. In this paper, semantic elements are identified by combining the thesaurus with rules. Semantic elements are first identified by the thesaurus, and words that the thesaurus fails to identify are then identified by rules. This combination secures both the accuracy and the coverage of recognition.

1) SEMANTIC ELEMENT DEFINITION
In this paper, the data types of the knowledge base include entity (E), concept (C), attribute (A), relation (R) and attribute value (V) [15], [16]. Among them, the entity (E) is the most basic element in the knowledge base, such as '' ''.
Based on the data types of the above knowledge base, and depending on the type of attribute value, $V_d$ and $V_o$ additionally denote the numeric attribute value and the object attribute value, with $V = \{V_d, V_o\}$. We then define the following two categories of semantic element:
Unary semantic element: e and V, where e ∈ {E, C}.
Binary semantic element: p, where p ∈ {A, R}.
The main difference between A and R is: If the knowledge base triples are represented as <subject, 'relation', object>, the unary semantic elements are the set {subject, object}, while the binary semantic elements are the set {'relation'} ('relation' is a generalized concept that includes attributes and relationships).
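The type system above can be written down as a small data structure. The following is only a sketch under assumed names (`ElemType`, `SemanticElement`, `split_triple` are hypothetical and do not come from this paper); it shows how a <subject, 'relation', object> triple separates into unary and binary semantic elements.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, Tuple


class ElemType(Enum):
    E = "entity"
    C = "concept"
    A = "attribute"
    R = "relation"
    VD = "numeric attribute value"
    VO = "object attribute value"


# Unary elements: e in {E, C} and V in {V_d, V_o}; binary elements: p in {A, R}.
UNARY = {ElemType.E, ElemType.C, ElemType.VD, ElemType.VO}
BINARY = {ElemType.A, ElemType.R}


@dataclass(frozen=True)
class SemanticElement:
    text: str
    kind: ElemType

    @property
    def arity(self) -> int:
        return 1 if self.kind in UNARY else 2


def split_triple(
    triple: Sequence[SemanticElement],
) -> Tuple[List[SemanticElement], List[SemanticElement]]:
    """Separate a <subject, relation, object> triple by element arity."""
    unary = [x for x in triple if x.arity == 1]
    binary = [x for x in triple if x.arity == 2]
    return unary, binary


# Illustrative triple: <Boeing 747, manufacturer, Boeing>
triple = (
    SemanticElement("Boeing 747", ElemType.E),
    SemanticElement("manufacturer", ElemType.R),
    SemanticElement("Boeing", ElemType.E),
)
unary, binary = split_triple(triple)
```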

2) THESAURUS RECOGNIZES SEMANTIC ELEMENT
Lexical recognition is the first step of semantic element recognition. A separate lexicon is built for each type of semantic element to be recognized. The thesaurus used in this paper includes the following parts:

a: DOMAIN THESAURUS
This is the most important part of the lexicon. It is extracted from the elements of the knowledge base and comprises an entity library, a concept library, a relation library, an attribute library and an attribute value library, which store the entities, concepts, relations, attributes and attribute values respectively. Note that the attribute value library mainly holds attribute values of the object type.

b: GENERAL WORD THESAURUS
To bridge the difference between user vocabulary and knowledge base vocabulary, a general word thesaurus is used, mainly to improve the accuracy of the word segmentation stage and to ensure that subsequent word recognition is feasible. This paper adopts the general word thesaurus of the LTP word segmentation program of Harbin Institute of Technology.

c: SYNONYM THESAURUS
This part is a synonym extension of the domain thesaurus. In practice, users often express their questions with synonyms, so the domain thesaurus must be expanded with synonyms to improve the coverage and accuracy of recognition.

d: USER THESAURUS
Some users ask questions with acronyms, abbreviations or aliases; for example, the nickname of '' '' (''Boeing'') in Chinese is '' '' ('' Jumbo jet airplane''). Therefore, the relevant acronyms, abbreviations and aliases must be collected in advance, and a user thesaurus corresponding to the domain thesaurus is established as an extension.
The main steps of identifying semantic elements with the lexicon can be summarized as follows: receive the pre-processed question, then identify and mark {e, p, E, C, A, R, V, $V_d$, $V_o$} based on the established lexicon. A semantic element can be marked with a subcategory and its parent category at the same time; for example, '' 747'' (''Boeing 747'') is marked as <e, E>.
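The marking step, together with the fallback from thesaurus to rules, can be sketched as below. The lexicon entries are hypothetical, and the single regular-expression rule for numeric values is only a stand-in for the rule set of this paper; the point of the sketch is the order of the two lookups and the <parent, subcategory> mark.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical domain lexicon: surface form -> subcategory mark.
LEXICON: Dict[str, str] = {"波音747": "E", "客机": "C", "速度": "A", "制造商": "R"}

# Parent category of each subcategory: e for {E, C}, p for {A, R}, V for {Vd, Vo}.
PARENT: Dict[str, str] = {"E": "e", "C": "e", "A": "p", "R": "p", "Vd": "V", "Vo": "V"}


def rule_numeric_value(token: str) -> Optional[str]:
    """Stand-in rule: digits followed by an optional unit are marked Vd."""
    return "Vd" if re.fullmatch(r"\d+(\.\d+)?[^\d\s]*", token) else None


def recognize(tokens: List[str]) -> List[Tuple[str, Tuple[str, str]]]:
    """Mark each token as <parent, subcategory>; lexicon first, rules second."""
    marked = []
    for token in tokens:
        sub = LEXICON.get(token) or rule_numeric_value(token)
        if sub is None:          # not a semantic element
            continue
        marked.append((token, (PARENT[sub], sub)))
    return marked


marks = recognize(["波音747", "速度", "900公里", "是"])
```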

3) RULES IDENTIFICATION FOR SEMANTIC ELEMENT
Although thesaurus recognition covers many semantic elements, some complex semantic elements are difficult to identify, such as '' '' (''How far''), whose corresponding attribute A is '' '' (''Distance''). Because of their complex structure, these semantic elements are hard for the thesaurus to recognize; recognizing them requires part of speech, predefined text structures and text patterns. In this paper, the following rules are formulated to supplement thesaurus recognition of semantic elements:
Recognition of Entity E, Concept C, Relation R:
Entity identification and relationship extraction are key technologies in information extraction. Named entity recognition based on conditional random fields (CRF) and relationship extraction based on rules and heuristic algorithms are adopted to identify E, C and R.

Recognition of Attribute Value V :
The attribute value V to be recognized consists of the object-type attribute value $V_o$ and the numeric-type attribute value $V_d$. Lexicon recognition can effectively identify most object-type attribute values $V_o$; rules must be defined for the values it does not cover. Therefore, the following heuristic rules are defined to supplement thesaurus recognition of the attribute value V:

b: RECOGNITION BY PATTERN
A word marked as a numeric attribute value $V_d$ must contain some numeric data, but the numeric content is sometimes not obvious or direct. The value $V_d$ appears in questions in a special pattern that combines a keyword pattern with the dependency analysis structure. So, the following pattern is defined for $V_d$ recognition: Keyword

Recognition of Attribute Type A:
Complex attribute recognition is one of the most difficult parts of semantic element recognition. Most existing studies handle attributes by simple word recognition only. The experimental results of this paper show that 77.08% of the errors in semantic expressions are caused by failures to recognize complex attributes during question comprehension, which indicates the importance of complex attribute recognition. A large number of questions were collected from Baidu, and the following complex attribute recognition rules were formulated through statistical analysis:

c: IDENTIFY ATTRIBUTES BY E/C AND PART OF SPEECH RULES
Most attributes are closely related to an entity E or a concept C, and attributes rarely appear on their own. Therefore, the following recognition pattern can be defined by combining part of speech with the recognized E/C:
$(E/C/\ )\ \mathrm{word}_{(n)} + (\mathrm{len}(\mathrm{word}_{(n)}) > 1)$
where ''+'' represents a combination of conditions, ''()'' means the item is required, and ''/'' means ''or''; $\mathrm{word}_{(n)}$ denotes a noun. The pattern requires two conditions to be met: 1) the specific lexical structure; 2) the length of $\mathrm{word}_{(n)}$ in Chinese characters is at least two.
'' (Length). Sometimes, however, the implicit attribute in the query block repeats an attribute that has already appeared, for example, '' E_A '' (''What is the length_A of the passenger plane?''). In that case, the recognized attribute should be ignored. The complex attribute recognition of the query block consists of three parts:

Part 1: query Block Confirmation:
Simply put, the query block is the semantic block of the question that contains the query word. To extract the query block and determine its coverage, this paper proposes a backtracking method based on the dependency analysis tree, which consists of two steps. The first step is to locate the query words in the dependency analysis tree; the following query words were obtained by statistics: Query words without real meaning:
The second step starts from the query word and backtracks up or down the tree to the first E, C, R or A, where backtracking stops. The nodes passed during backtracking form the query block.
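The backtracking step can be sketched as a breadth-first search over the dependency tree treated as an undirected graph. This is an illustration under stated assumptions: the tree is given as a hypothetical `heads` map (token index to head index, `None` for the root), the stopping element itself is excluded from the block, and the example parse is hand-written rather than produced by LTP.

```python
from collections import deque
from typing import Dict, List, Optional

SEMANTIC_TYPES = {"E", "C", "R", "A"}


def query_block(heads: Dict[int, Optional[int]],
                types: Dict[int, str],
                query: int) -> List[int]:
    """Return the token indices passed between the query word and the
    nearest E/C/R/A node (the semantic element itself is excluded)."""
    # Build an undirected adjacency list so the search can go up or down.
    adj: Dict[int, List[int]] = {n: [] for n in heads}
    for child, head in heads.items():
        if head is not None:
            adj[child].append(head)
            adj[head].append(child)

    prev: Dict[int, Optional[int]] = {query: None}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if node != query and types.get(node) in SEMANTIC_TYPES:
            # Walk back from the element's predecessor to the query word.
            path, cur = [], prev[node]
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return sorted(path)
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return [query]  # no semantic element reachable


# Hand-written parse of "波音747 能 飞 多 远": 飞 is the root, 多 is the query word.
heads = {0: 2, 1: 2, 2: None, 3: 4, 4: 2}
types = {0: "E"}
block = query_block(heads, types, query=3)  # indices of 飞, 多, 远
```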

Part 2: Complex Attributes Recognition:
The identification of complex attributes is based on the obtained query block. First, the implicit attribute is extracted from the query block. Then, whether the verb is included '' (''manufacture address''). The complex attribute recognition process is as follows. In the first step, the attribute of the query block is expanded by looking it up in the query block-attribute dictionary. Some query block-attribute dictionary entries collected in this paper are shown in Table 2.
The second step is to determine whether the query block contains a verb. If a verb exists, the query block is extended with the attribute obtained above, with the verb placed first and the attribute last. For example, the query block '' (V)'' (''How long to complete (v)'') → the attribute '' '' (''the finish time''), the query block '' ''(''manufacture address'').
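The two steps can be sketched as follows. The dictionary entries here are hypothetical examples and are not the entries of Table 2; the part-of-speech tag "v" for verbs is likewise an assumption.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical query block -> implicit attribute dictionary.
IMPLICIT_ATTR: Dict[str, str] = {
    "多久": "时间",  # how long  -> time
    "哪里": "地址",  # where     -> address
    "多远": "距离",  # how far   -> distance
}


def complex_attribute(block: List[Tuple[str, str]]) -> Optional[str]:
    """Derive a complex attribute from a query block of (word, pos) pairs."""
    # Step 1: expand the query word into its implicit attribute.
    attr = next((IMPLICIT_ATTR[w] for w, _ in block if w in IMPLICIT_ATTR), None)
    if attr is None:
        return None
    # Step 2: if the block contains a verb, put the verb first, attribute last.
    verb = next((w for w, pos in block if pos == "v"), None)
    return verb + attr if verb else attr


finish_time = complex_attribute([("完成", "v"), ("多久", "r")])  # verb + "time"
distance = complex_attribute([("多远", "r")])                     # attribute only
```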

Part 3: Attribute Repeat Judgment:
The attribute A' obtained by the above complex attribute recognition may duplicate an attribute A that already appears in the question, so a repeat judgment is necessary. The judgment is based on the dependency analysis tree: the identified attribute A' is compared with the attribute A nearest to it in the dependency tree structure. If the two are the same, A' is ignored. If not, A' is returned and the nodes of the query block in the dependency analysis tree are replaced with A'.
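A minimal sketch of the repeat judgment follows. Two points are assumptions made for illustration: "nearest" is taken to mean the smallest number of dependency edges, and two attributes count as repeated when their strings are equal (a synonym check could be substituted). The parses are hand-written.

```python
from collections import deque
from typing import Dict, List, Optional


def nearest_attribute(heads: Dict[int, Optional[int]], types: Dict[int, str],
                      words: List[str], start: int) -> Optional[str]:
    """Breadth-first search from `start` for the closest node marked A."""
    adj: Dict[int, List[int]] = {n: [] for n in heads}
    for child, head in heads.items():
        if head is not None:
            adj[child].append(head)
            adj[head].append(child)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node != start and types.get(node) == "A":
            return words[node]
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def resolve_attribute(candidate: str, heads: Dict[int, Optional[int]],
                      types: Dict[int, str], words: List[str],
                      start: int) -> Optional[str]:
    """Return A' unless it repeats the nearest existing attribute A."""
    existing = nearest_attribute(heads, types, words, start)
    return None if existing == candidate else candidate


# "客机 的 长度 是 多长": the explicit attribute 长度 already covers A' = 长度.
words1 = ["客机", "的", "长度", "是", "多长"]
heads1 = {0: 2, 1: 0, 2: 3, 3: None, 4: 3}
repeated = resolve_attribute("长度", heads1, {0: "C", 2: "A"}, words1, start=4)

# "客机 有 多长": no explicit attribute, so A' = 长度 is kept.
words2 = ["客机", "有", "多长"]
heads2 = {0: 1, 1: None, 2: 1}
kept = resolve_attribute("长度", heads2, {0: "C"}, words2, start=2)
```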

C. SEMANTIC ELEMENT COMBINATION
The purpose of semantic element combination is to combine the identified semantic elements by certain rules into a standardized semantic expression that represents the question semantics. To ensure the accuracy of the combination, a combination method based on the dependency analysis tree is proposed in this paper. The method uses the dependency relationships between semantic elements and the structure of the dependency analysis tree to identify and replace function units. The dependency analysis tree is then simplified so that only the nodes of function units and semantic elements remain. Finally, following the depth of the tree, the method combines the nodes layer by layer from the leaf nodes according to the rules (if a leaf node is an attribute or a relationship, the method may skip some nodes to connect it to an entity or a concept) to obtain the semantic expression of the question. The combinations of semantic elements fall into two categories: basic combinations and function combinations. The basic combinations are executed first and the function combinations afterwards. Through these two types of combination, the semantic expression of the question can be generated accurately.

1) BASIC COMBINATION
Basic combination is the foundation of semantic element combination. It mainly performs simple concatenation and combination of the unary and binary semantic elements that are not function elements, and it involves no complex structural combination. According to the combination type, the following basic combination rules are formulated in this paper: : giant aircraft) means the category of entity ''e?'' has both attribute '' '' (''passenger plane'') and attribute '' '' (''giant aircraft''). The activation conditions of the ''∩'' combination rule are as follows:
1. The entities ''e1 e2 e3'' are connected to the same entity ''e'' in the dependency analysis tree, such as e1-e2-e3, and the semantic expression is e1∩e2∩e3.
3. The structure in the dependency tree is C1-C2.

d: CLASS INCLUSION
The class inclusion rule combines C and E and has the highest priority among the basic combinations. However, only a C and an E that meet the activation conditions can be combined; the result is C[E], which denotes the E of category C. The activation rules are as follows:
1. The dependency tree structure is E → C and the dependency relationship is ATT;
2. E and C are adjacent to each other in the user question;
3. The dependency tree structure is E-[ ]-C question word, where ''[]'' means that the contents in brackets are optional. ( : have/has; : be)

e: OTHER SPECIAL STRUCTURES
Some special dependency tree structures need to be modified and/or combined by special semantic rules, mainly the following three types:
1. E-A1-A2 is modified to A1-E-A2.
2. The structure e-R1-R2 is combined by the special rule e.R1.R2.
3. The structure e-R-A is combined by the special rule e.R.A.
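The output forms of these basic combinations can be illustrated with a few string-building helpers. This sketch only formats the results named in the text (C[E], e1∩e2∩e3, e.R1.R2, e.R.A and the E-A1-A2 reordering); the helper names are hypothetical, and the activation conditions are assumed to have been checked beforehand.

```python
from typing import List


def class_inclusion(concept: str, entity: str) -> str:
    """C[E]: the entity E of category C."""
    return f"{concept}[{entity}]"


def intersect(*elements: str) -> str:
    """e1∩e2∩e3: elements attached to the same entity."""
    return "∩".join(elements)


def chain(entity: str, *predicates: str) -> str:
    """e.R1.R2 or e.R.A: predicates applied to an entity in order."""
    return ".".join((entity, *predicates))


def reorder_e_a1_a2(nodes: List[str]) -> List[str]:
    """Rewrite the structure E-A1-A2 as A1-E-A2."""
    e, a1, a2 = nodes
    return [a1, e, a2]


expr = chain(class_inclusion("C", "E"), "R", "A")
```

In this usage, a class inclusion result is used as the entity of a chain, which reflects the statement that class inclusion has the highest priority among the basic combinations.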
2) FUNCTION COMBINATION
Function combination is a higher-order operation of semantic element combination. Based on the basic combination and predefined function combination rules, it uses the dependency analysis tree structure to generate semantic expressions with function bodies in order to understand complex question semantics. For example, the semantic expression of '' '' (''What is the fastest speed  ?'' (''How much faster is the Boeing 747 compared to the Airbus A380?''). The former focuses on the faster entity, corresponding to the function CompareSelect(), while the latter is concerned with how much faster it is, corresponding to the function CompareValue(). Therefore, the query block type must be confirmed before function recognition. In combination with the keyword set, this paper defines the query block types as follows:

3). Function recognition and semantic element combination
The difficulty of function combination lies in function recognition. To identify the functions in questions accurately, this paper proposes pattern-based function recognition, whose patterns are characterized by the types and numbers of semantic elements in the question, such as the type of query block, the keyword set, the structure of the dependency tree and the dependency relationships; rules of semantic element combination are then formulated for each function.
To obtain good function combination results, this paper analyzes and summarizes a large number of user questions. Based on the features of query block type and dependency relationship, a set of accurate recognition patterns covering most users' question expressions is established for each function. The following symbols are used to describe the pattern structure: "[]" means optional, "()" means required, and "/" means ''or''. "+" represents a combination of multiple conditions. "-" represents a bidirectional dependency connection, while ''→'' and ''←'' represent one-way dependency connections. "NumE" represents the number of entities E in the question. Taking the comparative functions (which belong to the comparison class) as an example, the function recognition patterns and semantic element combination rules established in this paper are as follows: CompareBase  Figure 4 based on the dependency tree structure. Attributes are generally associated with an entity E by a basic combination once they are joined to the entity. Therefore, the position of attribute A is not marked in Figure 4; only the entity structure associated with the function is marked.
CompareSelect
( : 'which/which one'; : 'more', i.e. the comparative degree)
Positioning to replace: Key_cmp/query block is replaced by CompareSelect().
Combination rules: the combination rules are the same as those of CompareBase().

CompareValue(E.A)

Since there is no suitable Chinese test set comparable to QALD, this paper collects question sets from Baidu Zhidao in order to test the question comprehension method. First, an entity set and an entity-relation set were constructed from the knowledge base and used to retrieve questions from Baidu Zhidao. Then, the retrieved questions were used to search Baidu Zhidao again to collect synonymous questions. Finally, irrelevant and duplicate questions were eliminated. A total of 8,732 questions were collected, and 6,215 questions remained after removing questions with similar structures. Of these, 1,000 questions were randomly selected as the final test set, and the remaining 5,215 questions were used as the auxiliary analysis set.

For the experiments, the word segmenter in the language technology platform (LTP) [17] of Harbin Institute of Technology does not support additional thesauri to assist word segmentation, so the Jieba word segmentation tool is adopted in this paper, and all the thesauri used for semantic element recognition are added to it. Jieba uses dynamic programming to search for the maximum-probability path, that is, the most probable segmentation based on word frequency. For out-of-vocabulary words, it uses an HMM model based on the word-forming ability of Chinese characters. LTP 3.4.0 is adopted for part-of-speech tagging and dependency analysis. It uses a support vector machine (SVM) as the basic classifier for part-of-speech tagging, which gives relatively high labeling accuracy. In addition, to address data sparsity, especially for the out-of-vocabulary words identified in the word segmentation stage, it introduces features based on the radicals of Chinese characters, which generalize well. The dependency analysis adopts a decoding algorithm based on beam search and a two-stage syntactic analysis method based on punctuation, which greatly improves the efficiency of syntactic analysis without loss of accuracy. The synonym thesaurus is an extension of the thesaurus forest of the Social Computing and Information Retrieval Research Center of Harbin Institute of Technology.
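The maximum-probability path idea can be illustrated with a small dynamic program. This is a simplified sketch and not Jieba's code: the word frequencies are hypothetical, unknown single characters receive a count of 1 so that a path always exists, and the HMM step for out-of-vocabulary words is omitted.

```python
import math
from typing import Dict, List, Tuple


def segment(text: str, freq: Dict[str, int]) -> List[str]:
    """Split `text` into the word sequence with the highest total
    log-probability under a unigram model built from `freq`."""
    total = sum(freq.values())
    n = len(text)
    max_len = max(map(len, freq))
    # best[i] = (log-probability of the best split of text[i:], end of first word)
    best: List[Tuple[float, int]] = [(0.0, n)] * (n + 1)
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = text[i:j]
            # Unknown single characters get count 1; unknown longer strings are skipped.
            count = freq.get(word, 1 if j == i + 1 else 0)
            if count:
                candidates.append((math.log(count / total) + best[j][0], j))
        best[i] = max(candidates)
    # Follow the stored end indices from the start of the text.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words


# Hypothetical frequencies; adding "波音747" mimics adding a thesaurus entry.
freq = {"波音": 50, "747": 20, "波音747": 30, "的": 500, "速度": 80}
tokens = segment("波音747的速度", freq)
```

With the domain entry present, the single token "波音747" outscores the two-token split, which is the effect the added thesauri are meant to have on segmentation.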

B. EVALUATION STANDARD
First, the semantic expression Fm of each test question is written manually. Then the semantic expression Fc is generated by the method in this paper, and Fm and Fc are compared as strings. If they are exactly the same, Fc is considered correct; otherwise, Fc is judged manually by 7 people, and it is considered correct only when more than half of them agree with it. The experimental results are evaluated by the accuracy rate P, calculated as follows:

$$P = \frac{\text{number of correct semantic expressions}}{\text{total number of test questions}} \tag{1}$$
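The evaluation procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: the names `is_correct` and `accuracy`, the record layout, and the sample expressions and votes are hypothetical and are not taken from the paper's data.

```python
def is_correct(fm, fc, votes=None):
    """Exact string match is correct; otherwise require a strict majority of judges."""
    if fm == fc:
        return True
    if not votes:
        return False
    # votes is a list of booleans, one per judge (7 judges in the paper's setup)
    return sum(votes) > len(votes) / 2

def accuracy(records):
    """Accuracy rate P: correct semantic expressions divided by all test questions."""
    correct = sum(is_correct(fm, fc, votes) for fm, fc, votes in records)
    return correct / len(records)

# Hypothetical records: (manual expression Fm, generated expression Fc, judge votes)
records = [
    ("expr_a", "expr_a", None),                        # exact match
    ("expr_b", "expr_b_variant", [True] * 4 + [False] * 3),  # 4 of 7 agree: correct
    ("expr_c", "expr_c_wrong", [True] * 3 + [False] * 4),    # 3 of 7 agree: wrong
    ("expr_d", "expr_d", None),                        # exact match
]
print(accuracy(records))  # 0.75
```

The manual vote is only consulted when the strings differ, so judges are needed for a small share of the test questions.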

C. RESULTS AND ANALYSIS
Experimental analysis was conducted on the 1,000 test questions, and four groups of experiments were set up for comparison according to the method of semantic element recognition. The first group (1#) used only the thesaurus to identify semantic elements. The second group (2#) used thesaurus
To further analyze the causes of errors, the 1,000 test questions were counted by function category, as shown in Figure 5 (the basic combination class does not contain any function types). The results of the fourth group's method are shown in Figure 6.
The experiments show that the fourth method can completely and accurately understand questions of the concept class, basic combination class, and calculation class. For the comparison class, reason class, and choice class, the results are slightly worse. The reasons are as follows:
1) Dependency analysis errors cause semantic composition errors. Questions of the concept class, basic combination class, and calculation class do not involve complex dependency structures, so they are not affected. The other functions rely more heavily on the dependency structure and are therefore more affected by such errors.
2) Complex functions are misrecognized, for example the comparison functions, whose identification rules are complex.
3) The user's questions are too complicated. Some user questions may contain multiple sub-questions and complex referential relationships, leading to errors in semantic element recognition.
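A per-category breakdown of the kind used in this analysis can be computed as sketched below. The function name `accuracy_by_category` and the toy results are assumptions for illustration; the values are invented and do not reproduce Figure 5 or Figure 6.

```python
from collections import defaultdict

def accuracy_by_category(results):
    """Group (category, is_correct) pairs and return the accuracy of each category."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, ok in results:
        totals[category] += 1
        correct[category] += int(ok)
    return {category: correct[category] / totals[category] for category in totals}

# Invented toy results, one entry per test question.
toy_results = [
    ("concept", True), ("concept", True),
    ("comparison", True), ("comparison", False),
    ("reason", True),
]
print(accuracy_by_category(toy_results))
# {'concept': 1.0, 'comparison': 0.5, 'reason': 1.0}
```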

VI. CONCLUSION
This paper proposes a question understanding method based on semantic element combination. Through semantic element analysis and semantic element combination, the semantics of a question are understood and its normalized semantic expression is generated directly. The method avoids the limitation that a logical vocabulary places on the system's scope of application, and it does not depend on question-answer pairs, which are difficult to collect, to train a model. User questions extracted from Baidu Zhidao are used as the test set. The results show that the method achieves high accuracy in semantic understanding. It eases the difficulty of question semantic analysis in Q&A systems and can also understand and reason about complex questions effectively. The accuracy over all question types on the test set reached 97.8%; the accuracy for non-function types reached 100% and that for function types reached 96.8%. This paper aims to offer researchers a method that may promote the in-depth development of question semantic understanding and Q&A systems. It also lays the foundation for the next step: converting the logical expression into a machine-executable query language (such as SPARQL) and executing the corresponding handler to make more intelligent use of knowledge data.
Although the method in this paper can understand the vast majority of questions, some complex questions are still misunderstood. Careful analysis of the factors affecting the experimental results shows that most of these errors are related to function patterns and referential relationships.
Since 2013, he has been with the Science and Technology on Near-Surface Detection Laboratory, Wuxi. His research interests include signal processing, signal detection technology, and intelligence algorithms.
Dr. Ding is an Editor of the Journal of Ordnance Equipment Engineering.
KAI DU, photograph and biography not available at the time of publication.
XIAONAN ZHANG, photograph and biography not available at the time of publication. VOLUME 8, 2020