TBLC-rAttention: A Deep Neural Network Model for Recognizing the Emotional Tendency of Chinese Medical Comment

In this paper, a hybrid deep neural network model, TBLC-rAttention, is proposed for Chinese text emotion recognition and is used to identify the emotional tendency of Chinese medical reviews. The model comprises the following steps: acquiring and preprocessing the Chinese corpus; mapping the preprocessed text into word vectors; using a Bi-directional Long Short-Term Memory network (Bi-LSTM) with an attention mechanism to acquire the contextual semantic features of the text; using a Convolutional Neural Network (CNN) to obtain local semantic features on the basis of the contextual semantic features; and feeding the final feature vectors into the classification layer to complete the emotion recognition and classification of the Chinese medical reviews. In this experiment, the corpus consists of comments on the 999 cold medicine collected from a large e-commerce platform. The corpus is divided into three categories: high praise, medium praise and bad review. Classical machine learning models (SVM, NB) and neural network models (CNN, LSTM, Bi-LSTM, BiLSTM-Attention and RCNN) serve as comparison benchmarks to assess the classification performance of the TBLC-rAttention model. All results were obtained after the training and test accuracies had stabilized over 1000 training cycles. The results show that TBLC-rAttention obtains better text features than the reference models, and its text classification accuracy reaches 99%. In conclusion, the TBLC-rAttention model can identify semantic feature information to the greatest extent. In addition, this study also completes the numerical quantification of the predicted results.


I. INTRODUCTION
In recent years, the Internet has become an important way for people to obtain information in their daily lives. Statistics released by the China Internet Network Information Center (CNNIC) show that by June 2019, China's Internet penetration rate had reached 61.2%, with 854 million Internet users and 660 million online shoppers. The textual data generated on the network is increasing at an alarming rate, so how to effectively organize and utilize these data has become a problem of growing concern. The text classification technique of natural language processing (NLP) is an effective solution [1]; it is the process of training classifiers on text data and then classifying new text with the trained classifiers. (The associate editor coordinating the review of this manuscript and approving it for publication was Dong Shen.)
Text emotion recognition, also known as sentiment analysis [2], refers to judging the opinions expressed in a text through specific methods according to its meaning and emotional information, and then dividing it into different categories of emotion [3]. It can therefore be regarded as a classification task, which is one of the core research directions in the NLP field and the core content of network public opinion identification. Although traditional text emotion recognition methods have achieved good results in the acquisition, analysis and guidance of network public opinion, problems such as long processing time, low accuracy, poor timeliness and unstable performance appear when processing massive network public opinion data with complex emotional characteristics [4]. The emergence of deep learning provides an effective way to solve these problems, as it can better capture the hidden features of data and complete the task of emotion classification. (VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/)
Recently, the amount of medical data has been increasing exponentially, and a mass of information is contained in online comments related to medicine. However, few studies have applied deep learning-based text emotion recognition technology to medical reviews. In this paper, an emotion recognition scheme, the hybrid neural network (TBLC-rAttention) model, based on deep learning is proposed for Chinese text and verified on a Chinese corpus. The corpus consists of consumers' comments posted on e-commerce platforms after purchasing and using the 999 cold medicine. This study utilizes the model to identify consumers' emotional tendency and aims to further promote the application of NLP technology in the Chinese medical field [5]-[7].
The structure of this paper is as follows: Section 1 is the introduction. Section 2 reviews related work, briefly introducing the current typical methods of emotion recognition and their shortcomings, and pointing out the advantages of deep learning methods. Section 3 introduces in detail the Chinese medical text emotion recognition model proposed in this paper. Section 4 is the experimental part. Section 5 presents and discusses the experimental results. Section 6 describes the conclusions and future work.

II. RELATED WORK
The opinions expressed by netizens contain their own subjective emotional tendencies, which affect the trend of online public opinion and thus affect people's lives [8]-[11]. Text emotion recognition technology provides the opportunity to mine hidden information from different angles. Consumers decide whether to buy a drug based on others' online comments, and drug developers can further improve drugs by analyzing consumer reviews of drug efficacy [12]-[14]. This indicates that the analysis of the emotional tendency of network comments plays an important role in public opinion guidance, marketing strategy formulation, government opinion surveys and other aspects.
Generally, there are two mainstream ways to mine text emotion: machine learning-based methods [15]-[17] and dictionary-based methods [18], [19]. The dictionary-based method analyzes text emotion by matching against an established dictionary to obtain the emotional polarity. This method has relatively high accuracy, but it relies heavily on emotional dictionaries and requires considerable effort for dictionary construction.
Moreover, it only calculates the positive or negative polarity values of words [20], so it not only fails to take the context into account, but is also too simple to handle the strength of emotions. The machine learning-based method mainly extracts text features and classifies emotions using classification algorithms. It first converts the text into digital information that computers can process, then uses artificial intelligence algorithms to train on and categorize text emotion, achieving better results than the dictionary-based approach. However, with the rapid increase of network data, this method exhibits problems such as high-dimensional and sparse text representations, weak feature expression ability and degraded classification performance [21], mainly because its performance is highly dependent on the extraction of classification features. Moreover, the massive amount of text and the emerging vocabulary on the Internet make it more difficult for traditional machine learning-based methods to fully extract accurate text features. With the development of artificial intelligence technology, deep learning provides a feasible way to solve these problems.
Deep learning was first proposed by Hinton and other scholars in 2006 [22] and originated from the study of neural networks. Traditional neural network learning algorithms are generally composed of a single input layer, a hidden layer, an output layer and other shallow network structures, so the features obtained after learning are single-layered. Deep learning adopts neural network learning algorithms that usually include one input layer, several hidden layers and one output layer. It performs highly nonlinear operations and can thus be regarded as a simulation of the nerve connections in the human brain. By transforming the original features layer by layer, a new feature space and hierarchical features are obtained, and classification is then realized. Therefore, a deep neural network can better capture the hidden features of a corpus, eliminate useless information while retaining information useful for the classification task, and finally achieve better model performance. So far, deep learning has been widely studied in academia and applied to image classification, speech recognition and other directions [23]-[28]. It has also been applied to NLP tasks such as automatic translation [29]-[31].
Many text emotion recognition methods based on neural network models have emerged in recent years because of the rapid development of deep learning [32], [33]. The Recurrent Neural Network (RNN) is good at capturing text information in time-series data [34]. However, it is time-consuming in processing data because of the complex network structure caused by its recursive loop property. In addition, RNN suffers from the problems of gradient explosion and gradient vanishing [35]. Schuster et al. proposed a variant of RNN, the Bi-directional Long Short-Term Memory network (Bi-LSTM) [36], which can not only process longer sequence information, but also better capture context through its bi-directional structure. Although Bi-LSTM alleviates gradient explosion and gradient vanishing, it further increases the computational cost. The Convolutional Neural Network (CNN) [37] is time-saving thanks to sparse connections and parameter sharing. However, long sequence information cannot be captured because of the fixed convolution kernel. Later, Kalchbrenner et al. used wide convolution instead of narrow convolution and adopted k-max pooling to overcome the sequence length limitation [38]. In short, CNN and RNN show their respective advantages and disadvantages, arising from their network structures, in different tasks. Therefore, researchers put forward the TextRCNN model [39], which combines the advantages of CNN and RNN. Although the accuracy increased by only 1%, the design idea of the model is worth learning from. The attention mechanism can distinguish the different importance of each word in a sentence during categorization. Its introduction strengthens the ability of neural networks to mine data information, which opened its application in the NLP field [40]-[43].
People are paying more and more attention to health; a previous study found that more than 81 percent of Internet users in the United States search for health information online [2], which indicates that mining the hidden information in the comments on these social network sites has become an urgent problem. However, NLP technology has not been widely applied in the Chinese medical field. One reason is that most research results on emotion recognition in the medical field have been written in English. The second reason is that although medical text contains a lot of valuable information, it is more difficult to handle. Medical text contains a large amount of unstructured free text and image information. Text entered by doctors themselves may contain spelling mistakes, and abbreviations of medical terms and idioms used by doctors in different regions can prevent the obtained information from being effectively recognized by computers. Therefore, deep learning and NLP may play an important role in the analysis and mining of medical texts. This study uses a hybrid deep neural network based on supervised learning to realize text emotion recognition in the field of Chinese medicine.

III. METHOD
Using existing information to better understand the corpus and find effective features is the core problem in text emotion recognition. This study proposes a hybrid deep neural network model, TBLC-rAttention, for Chinese text emotion recognition (Fig. 1). The model combines Bi-LSTM and CNN with an attention mechanism. In a gradual way, it first obtains contextual semantic features, then extracts local semantic features, while assigning r different weights to the feature information at each moment, thus obtaining a better text feature representation and further improving the accuracy of text classification.
The model consists of seven parts: (1) Input layer: obtaining text data. (2) Preprocessing layer: word segmentation and removal of irrelevant data. (3) Embedding layer: mapping the preprocessed text into word vectors. (4) Bi-LSTM layer: extracting the global contextual semantic features. (5) Attention layer: weighting the features at each moment. (6) CNN layer: extracting the local semantic features. (7) Output layer: classifying the final feature vectors.
Fig. 2 shows the process of Chinese text emotion recognition using the TBLC-rAttention model, and the steps are summarized as follows: 1) Obtaining the text data. 2) Preprocessing the text data. 3) Mapping the preprocessed text data into word vectors. 4) Building the hybrid neural network text emotion recognition model. 5) Establishing the cost function and training the text emotion recognition model using the stochastic gradient descent method. 6) Calculating the emotional polarity value of the text. 7) Validating the effect of the model on the validation set. 8) Applying the TBLC-rAttention model.

A. PREPROCESSING LAYER
The obtained text data is preprocessed as follows: 1) Cleaning the text data, including deleting irrelevant and duplicate data and processing outliers and missing data such as HTML tags, punctuation marks and special emoticons, so that information irrelevant to classification is preliminarily filtered out. 2) Categorizing and labeling the text data.
3) Using jieba to segment words, with the user dictionary loaded. Some medical terms are added to the dictionary, as shown in Table 1. To improve the accuracy of word segmentation, the fastText neology-finding algorithm is used to identify unregistered words and add them to the dictionary. The emotion polarity of each such word is then determined by the emotion extremum calculation algorithm and added to the semantic emotion dictionary. In addition, some common stop words are removed in this process, as shown in Table 2. To make the model suitable for long text data, the TextRank algorithm is used to remove irrelevant data from the segmented text. 4) Dividing the preprocessed data into three parts: training set, test set and validation set.
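The cleaning and stop-word steps above can be sketched in plain Python. This is a minimal illustration only: the real pipeline uses jieba segmentation with a loaded user dictionary, the full stop-word list of Table 2, and the fastText and TextRank steps; the tiny STOP_WORDS set and the whitespace split below are stand-ins.

```python
import re

# Illustrative stop words only; the experiment loads a much larger list (Table 2)
STOP_WORDS = {"的", "了", "是", "和"}

def clean_text(raw: str) -> str:
    """Strip HTML tags and punctuation, keeping word characters and Chinese."""
    no_html = re.sub(r"<[^>]+>", "", raw)                    # remove HTML page tags
    return re.sub(r"[^\w\u4e00-\u9fff]+", " ", no_html).strip()

def remove_stop_words(tokens):
    """Drop common stop words after segmentation."""
    return [t for t in tokens if t not in STOP_WORDS]

cleaned = clean_text("<p>药效 很 好 的 !</p>")
# jieba would segment here; a whitespace split stands in for illustration
tokens = remove_stop_words(cleaned.split())
```

In the real pipeline the `cleaned.split()` call would be replaced by `jieba.cut(cleaned)` with the medical user dictionary of Table 1 loaded.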

B. EMBEDDING LAYER
The premise of using NLP to process text is to transform text data into a vectorized form that can be recognized and processed by a computer [44]. The text vectorization mapping process is shown in Fig. 3. Through the word embedding matrix E_W, the batch of labeled text data is mapped to a three-dimensional word vector matrix M, which contains two parts: comment content D and label content L, where L = {High praise, Medium praise, Bad review}. The word embedding matrix can be obtained by the Word2Vec or NNLM method. Then, a text containing n words, D_j = {x_1, x_2, . . . , x_n}, can be expressed as x_i = E_W^T b_xi, where M ∈ R^(batch×n×d) and E_W ∈ R^(V_W×d) denotes the word embedding matrix, V_W denotes the dictionary size, batch is the number of text data read in each batch, and d denotes the word vector dimension. Each word in the word embedding matrix E_W has a unique index vector b_xi for retrieving its corresponding word vector; b_xi is a one-hot binary vector whose values are 0 or 1, with all positions except position x_i equal to zero. These vectors represent the most primitive information in the corpus and have a significant impact on the subsequent steps.
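The index-to-vector lookup of the embedding layer can be sketched with NumPy. The matrix E_W below is random for illustration; in the paper it would come from Word2Vec or NNLM, and the dimensions follow Section IV (V_W = 5000, d = 100, maximum sequence length n = 50).

```python
import numpy as np

rng = np.random.default_rng(0)
V_W, d, n = 5000, 100, 50             # dictionary size, vector dim, max sequence length
E_W = rng.standard_normal((V_W, d))   # word embedding matrix (Word2Vec/NNLM in the paper)

def embed(indices, n=n):
    """Map a word-index sequence to its n-by-d matrix, zero-padded to length n.

    Retrieving row x_i of E_W is equivalent to multiplying by the one-hot
    index vector b_xi described in the text."""
    M_j = np.zeros((n, E_W.shape[1]))
    for row, idx in enumerate(indices[:n]):
        M_j[row] = E_W[idx]
    return M_j

D_j = embed([12, 7, 301])             # a comment containing three words
```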

C. BILSTM LAYER
LSTM can overcome the shortcomings of RNN [34], and its unit structure is shown in Fig. 4. LSTM adjusts information through its gate structure and stores historical information through its storage unit. It is mainly composed of four parts: the input gate i_t, the forget gate f_t, the output gate o_t and the candidate gate g_t. Using LSTM to extract semantic features, the hidden state h_t in a single direction at time step t is updated as follows:

i_t = σ(W_i e_t + U_i h_(t−1) + b_i)
f_t = σ(W_f e_t + U_f h_(t−1) + b_f)
o_t = σ(W_o e_t + U_o h_(t−1) + b_o)
g_t = tanh(W_g e_t + U_g h_(t−1) + b_g)
c_t = f_t ⊙ c_(t−1) + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)

where σ and tanh represent activation functions and ⊙ denotes element-wise multiplication. Each part influences the data at the next moment. Using the current word vector e_t and the hidden state h_(t−1) of the previous moment as inputs, the unit structure determines whether to use these inputs in the current state, whether to forget part of the previously stored data, and whether to output the newly generated state. Thus, the current unit state c_t is determined from the unit state c_(t−1) of the previous moment and the data currently generated by the unit.
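A single LSTM time step, as described above, can be sketched in NumPy. The stacked-weight layout and the tiny dimensions are illustrative assumptions; a real implementation would use the framework's LSTM cell.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(e_t, h_prev, c_prev, W, U, b):
    """One LSTM time step: input, forget, output and candidate gates."""
    u = h_prev.shape[0]
    z = W @ e_t + U @ h_prev + b      # stacked pre-activations, shape (4u,)
    i = sigmoid(z[0*u:1*u])           # input gate  i_t
    f = sigmoid(z[1*u:2*u])           # forget gate f_t
    o = sigmoid(z[2*u:3*u])           # output gate o_t
    g = np.tanh(z[3*u:4*u])           # candidate   g_t
    c = f * c_prev + i * g            # new cell state  c_t
    h = o * np.tanh(c)                # new hidden state h_t
    return h, c

rng = np.random.default_rng(1)
d = u = 4                             # toy dimensions for illustration
W = rng.standard_normal((4*u, d))
U = rng.standard_normal((4*u, u))
b = np.zeros(4*u)
h, c = lstm_step(rng.standard_normal(d), np.zeros(u), np.zeros(u), W, U, b)
```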
Nevertheless, a one-way LSTM network only considers past temporal information while ignoring future context. The Bi-LSTM network extends the one-way LSTM with a two-layer network structure (forward and reverse) and guarantees that both past and future information are taken into account along the time series [35]. Therefore, the global semantic features of the text can be fully captured. As shown in Fig. 1, C_b0 and C_f0 represent the forward and reverse initial unit states, respectively, and C_bn and C_fn store the forward and reverse final unit state information, respectively.
When the word vector is input into the forward LSTM network, the forward hidden-layer semantic feature →h_i is obtained. Similarly, the backward hidden-layer semantic feature ←h_i is obtained when the word vector is input into the reverse LSTM network. The context embedded representation h_i at the i-th moment of the Bi-LSTM network is the concatenation of the forward output →h_i and the reverse output ←h_i. The global semantic features H are obtained by splicing the semantic states of all time steps, as shown in formula (10):

h_i = →h_i ⊕ ←h_i, H = (h_1, h_2, . . . , h_n)   (10)

where H ∈ R^(batch×n×2d), ⊕ denotes concatenation, and n denotes the number of time steps, which is equal to the maximum sequence length of the text.

D. ATTENTION MECHANISM
The attention mechanism is a resource allocation scheme that simulates the attention characteristics of the human brain by paying more attention to important information. Introducing the attention mechanism in NLP can highlight the impact of the input on the output [17]. In this paper, the attention mechanism is introduced after the Bi-LSTM network to generate weighted global semantic features V with an attention probability distribution a, so as to highlight the influence of different features of the input global semantic features on the text categories. The attention mechanism model used in this paper is shown in Fig. 5.
m_i = tanh(W_a1 h_i + b_a), a_i = softmax(W_a2 m_i), V_i = a_i h_i, V = V_1 ⊕ V_2 ⊕ . . . ⊕ V_n

where a ∈ R^(batch×r*n×2d), m_i ∈ m, m_j ∈ m, and r denotes the number of attention schemes for each context. W_a1 ∈ R^(d×n) is the global attention weight matrix, b_a is the global attention bias matrix, W_a2 ∈ R^(r*n×d) denotes the different attention scheme matrices for each context, and a_i is the global semantic feature attention probability distribution at the i-th moment. The larger the value of m_i, the more important the global semantic features of that moment are. After obtaining a_i and h_i at each moment, V_i is calculated by multiplying a_i and h_i. Then, V is obtained by splicing all V_i.
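The attention computation can be sketched in NumPy in the style of structured self-attention with r schemes per text. The shapes here (projection size d_a, toy batch of one text) are simplified assumptions rather than the paper's exact dimensions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(H, W_a1, W_a2, b_a):
    """r attention schemes over the Bi-LSTM states H of shape (n, 2u).

    Returns the attention probability distribution a (r, n) and the
    weighted global semantic features V (r, 2u)."""
    m = np.tanh(W_a1 @ H.T + b_a)      # alignment scores, shape (d_a, n)
    a = softmax(W_a2 @ m, axis=-1)     # one distribution over moments per scheme
    V = a @ H                          # weight and sum the hidden states
    return a, V

rng = np.random.default_rng(2)
n, two_u, d_a, r = 50, 200, 64, 3      # seq length, 2*units, projection size, r schemes
H = rng.standard_normal((n, two_u))
a, V = attention(H,
                 rng.standard_normal((d_a, two_u)),
                 rng.standard_normal((r, d_a)),
                 np.zeros((d_a, 1)))
```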

E. CNN LAYER
By inputting V into the CNN, the local features can be extracted. The wide CNN model used in text categorization is shown in Fig. 6. Each convolution generates a new feature ĉ_i through a fixed-size window:

ĉ_i = f(W_vi · V_(i:i+h−1) + b_vi)

After convolution, the j-th text contains the local and global semantic features C_j = [ĉ_1, ĉ_2, . . . , ĉ_(r×n−h+1)], where C_j ∈ R^((r×n−h+1)×2d), W_vi ∈ R^(2d×h) denotes the convolution kernel used in the convolution operation, and h and 2d denote the height and width of the convolution kernel window, respectively. V_(i:i+h−1) denotes the weighted global semantic feature values of rows i through i+h−1. b_vi denotes the bias. Then, max pooling is used to obtain the final feature representation c̃_j of each text. After obtaining all c̃_j, the final text feature vector representation C of the batch of text data is obtained.
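The narrow 1-D convolution with max-over-time pooling over V can be sketched as follows. The nonlinearity f is omitted for brevity and the kernel count is an illustrative choice.

```python
import numpy as np

def conv_and_pool(V, kernels, h=5):
    """Slide each (h, w) kernel over the rows of V (L, w), producing one
    feature map per kernel, then keep its maximum (max-over-time pooling)."""
    L, w = V.shape
    feats = []
    for W_v, b_v in kernels:
        c_hat = np.array([np.sum(V[i:i + h] * W_v) + b_v
                          for i in range(L - h + 1)])   # feature map, length L-h+1
        feats.append(c_hat.max())                       # pooled feature c̃
    return np.array(feats)

rng = np.random.default_rng(3)
L, w, h = 150, 200, 5                 # r*n rows, kernel width 2d, kernel height 5
V = rng.standard_normal((L, w))
kernels = [(rng.standard_normal((h, w)), 0.0) for _ in range(8)]  # 8 toy kernels
c = conv_and_pool(V, kernels)
```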

F. OUTPUT LAYER
Finally, C is classified into the categories of emotional tendency by the Softmax classifier. Taking C as the input of the classification layer, the classification layer connects the best feature C_d, obtained from C by dropout, to the Softmax classifier and calculates the output vector p(y):

p(y) = softmax(C_d W_c + b_c)

where p(y) ∈ R^(batch×classes), and W_c ∈ R^(2u×classes) and b_c denote the weight parameters and bias terms of the Softmax classifier, respectively. classes represents the number of text categories, and C_d ∈ R^(batch×2u) is the best feature generated from C through dropout. The Softmax classifier calculates the probability vector p(y) of the text data belonging to every category. It is a vector whose dimension is the number of categories; each dimension is a number in the range of 0 to 1, representing the probability that the text belongs to the corresponding category. The category corresponding to the maximum probability y is then selected as the predictive output of the text classification.
Through the classification of classifier layer, the whole model achieves the task of categorizing text data.
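The Softmax classification step can be sketched in NumPy. The label set follows the paper; the weights are random placeholders and dropout is omitted.

```python
import numpy as np

LABELS = ("High praise", "Medium praise", "Bad review")

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(C_d, W_c, b_c):
    """Softmax layer: probability vector p(y) over classes, plus the
    category with maximum probability as the predicted output."""
    p = softmax(C_d @ W_c + b_c)       # each entry in (0, 1), sums to 1
    return p, LABELS[int(np.argmax(p))]

rng = np.random.default_rng(4)
two_u = 200                            # dimension of the best feature C_d
p, label = classify(rng.standard_normal(two_u),
                    rng.standard_normal((two_u, 3)),
                    np.zeros(3))
```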

G. THE DIGITAL REPRESENTATION OF THE PREDICTIVE RESULTS
In order to measure the text classification results numerically, this study used a dictionary-based unsupervised method to calculate the emotion value of the text classification results. The method first uses various dictionaries to identify the emotional words in the text, and then calculates the overall emotional value of the text according to judgment rules. The emotion calculation process is shown in Fig. 7.

1) BASIC EMOTION DICTIONARY CONSTRUCTION
Emotional words refer to words that can express an emotional tendency. They are the basis of the emotion calculation process and include nouns, verbs, adjectives, adverbs and idioms. This study used the NTUSD [17], the Chinese emotion vocabulary ontology database, and the HowNet emotion dictionary (www.keenage.com) as the basic emotion dictionaries, and combined their positive and negative emotion words. Then, we filtered out emotion words whose polarity is not obvious, emotion words with ambiguous emotional tendency, and repeated emotion words. Finally, the basic emotion dictionaries of positive and negative emotional tendency were obtained. The specific construction process of the dictionary is shown in Fig. 8.

2) EMOTION DICTIONARY EXPANSION
The extension methods for an emotion dictionary mainly include manual construction and automatic construction. Since manual construction consumes more manpower, material resources and time, the automatic construction method is adopted in this study. Unregistered words, i.e., those recognized by the fastText neology-finding algorithm during preprocessing, are used to expand the emotion dictionary; the process is shown in Fig. 9.
The vector representations of the post-segmentation corpus and the basic emotion dictionary are obtained using Google's open-source tool word2vec. Then, the emotional tendency of an unregistered word is identified by calculating the semantic similarity distance between the corpus and the basic emotion dictionary. After that, the unregistered word is classified into the nearest basic emotion dictionary (positive or negative) by the polarity judgment algorithm, so as to automatically expand the basic emotion dictionary. The semantic similarity distance can be measured by the cosine of two word vectors: the larger the cosine value, the more similar the semantics of the two words. The cosine of two n-dimensional vectors a = (x_1, x_2, . . . , x_n) and b = (y_1, y_2, . . . , y_n) is calculated as follows:

cos(a, b) = (Σ_(i=1)^n x_i y_i) / (sqrt(Σ_(i=1)^n x_i^2) · sqrt(Σ_(i=1)^n y_i^2))

The polarity judgment algorithm is described as follows: (a) word2vec is used to train the preprocessed corpus and the basic emotion dictionary respectively, and the corresponding vectorized representation files are obtained.
(b) If the word w in the corpus can be found in the basic emotion dictionary, go directly to (f). Otherwise, go to (c).
(c) Getting the n words closest to the unregistered word w in the corpus vectorization representation file by word2vec.
(d) Determining the emotion polarity of w according to the similarity between these n words and the positive and negative basic emotion dictionaries.
(e) Storing w into the corresponding emotional dictionary (positive or negative).
(f) Analyzing the emotion polarity of the next unregistered word.
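Steps (a)-(f) reduce to a cosine-similarity comparison against the two dictionaries, which can be sketched as follows. The one-element dictionaries and 3-dimensional vectors are toy stand-ins for the word2vec representation files.

```python
import numpy as np

def cosine(a, b):
    """cos(a, b) = a·b / (|a||b|); larger means more similar semantics."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def judge_polarity(w_vec, pos_vecs, neg_vecs):
    """Assign an unregistered word to the nearer basic emotion dictionary,
    using its best cosine similarity to each side."""
    pos = max(cosine(w_vec, v) for v in pos_vecs)
    neg = max(cosine(w_vec, v) for v in neg_vecs)
    return "positive" if pos >= neg else "negative"

pos_vecs = [np.array([1.0, 0.2, 0.0])]     # toy positive-dictionary vectors
neg_vecs = [np.array([-1.0, 0.1, 0.0])]    # toy negative-dictionary vectors
side = judge_polarity(np.array([0.9, 0.3, 0.1]), pos_vecs, neg_vecs)
```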

3) CONSTRUCTION OF OTHER DICTIONARIES
The dictionary-based text emotion calculation method fails to consider the function of other categories of words, such as negative words, so it is necessary to construct additional dictionaries. This study constructed these dictionaries mainly through HowNet and manual work. Taking negative words as an example, we search for the semantic elements that contain a negative meaning, such as {neg | }, {deny | }, {BeUnable | } and {unable | }, extract the concepts containing these negative semantic elements, and collect them to obtain the final negative word dictionary. The final information on each dictionary in this study is shown in Table 3. Then, the emotional value of the corpus is calculated based on these constructed dictionaries.

4) TEXT EMOTION VALUE CALCULATION a: CALCULATING THE EMOTIONAL VALUE OF EACH WORD IN THE SENTENCE
In the emotion value analysis, subjective sentences must be distinguished from objective sentences. A subjective sentence is a sentence with a certain emotional tendency or emotional color, whereas an objective sentence is one without any emotional color, which needs to be filtered out. Each sentence, after word segmentation, is matched against the emotion dictionary; if an emotion word appears in the sentence, the sentence is judged to be subjective. Then, emotion values are assigned to each phrase in the subjective sentence. The weights of the emotion values in the various dictionaries are shown in Table 4. In addition, if a sentiment word is a dynamic sentiment word, it is necessary to judge whether there is a collocation word nearby. If so, the polarity of the sentiment word needs to be inverted, and the word is treated as a new sentiment word for the polarity calculation of the text. At this point, the emotional value of each word in each sentence is obtained.

b: CALCULATING THE AVERAGE EMOTIONAL VALUE OF EACH SENTENCE IN THE TEXT
The weighted sum method is used to calculate the average emotional value of each sentence. After preprocessing, each sentence is represented as S = {W_1, W_2, . . . , W_z, . . . , W_n}.
The average emotional value of each sentence can be expressed as:

EO(S) = (−1)^Neg × Σ_(z=1)^n ( Π_(j=1)^m mod_j ) × EO(W_z)   (24)

where Neg is the number of negative words in the sentence, n is the number of sentiment words in the sentence, EO(W_z) represents the emotional value of each word, m is the number of modifiers of the z-th emotional word, and mod_j is the weight of the corresponding j-th modifying word. In order to eliminate the effect of text length on the scores, the sum of the sentence scores is divided by the total number of sentences L_S, giving the final emotional score of the whole review text. In this paper, the threshold T is set to 0. When Score(D) is greater than T, the text is positive; when it is less than T, it is negative; otherwise, it is neutral. The formula is as follows:

Score(D) = ( Σ_S EO(S) ) / L_S, with the text judged positive if Score(D) > T, negative if Score(D) < T, and neutral otherwise.
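Formula (24) and the thresholding rule can be sketched in plain Python. The word emotion values, modifier weights and the example sentences below are invented for illustration.

```python
def sentence_score(word_values, modifier_weights, neg_count):
    """Weighted sum of word emotion values EO(W_z); each negative word
    in the sentence flips the overall polarity, as in formula (24)."""
    s = sum(m * v for m, v in zip(modifier_weights, word_values))
    return ((-1) ** neg_count) * s

def text_score(sentence_scores, T=0.0):
    """Average over the L_S sentences to remove the effect of text length,
    then threshold at T: positive / negative / neutral."""
    score = sum(sentence_scores) / len(sentence_scores)
    if score > T:
        return score, "positive"
    if score < T:
        return score, "negative"
    return score, "neutral"

# Toy review: first sentence has one negation and a degree-adverb weight 1.3
s1 = sentence_score([1.0], [1.3], neg_count=1)   # polarity flipped to -1.3
score, polarity = text_score([s1, 2.0])
```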

IV. EXPERIMENTS

A. EXPERIMENTAL CORPUS
In this experiment, crawler technology was used to collect the corpus data, namely users' comments on a large e-commerce platform after purchasing and using the 999 cold medicine. The experimental corpus is shown in Table 5. Fig. 10 presents the sentence length distribution of the corpus, and Fig. 11 shows some key information of the corpus in the form of word clouds.

B. EXPERIMENTAL SETTINGS
The experiment was implemented in Python under the TensorFlow deep learning framework. The experimental environment configuration is shown in Table 6. The word vector dimension represents the size of the text features: the more features, the more accurately words can be distinguished from one another. We set the word vector size to 100. As shown in Fig. 10, the length of the text sequences is mostly within 50 words, so the maximum length of the text sequence is set to 50. Three types of comment text data are used in this paper: high praise, medium praise and bad review. The total number of Chinese characters exceeds 80,000, while only 3500 are commonly used, including 1000 secondary common characters. In order to preserve the content of the text data as much as possible, the dictionary size is set to 5000.
One layer of Bi-LSTM was used, and the numbers of forward and reverse neurons were both set to 100. After processing by the Bi-LSTM layer, the feature dimension of the corpus increases from the initial 100 to 200. Therefore, the width and height of the convolution kernel are set to 200 and 5, respectively. Different attention schemes can learn sentence representations with different emphases; in this experiment, three attention schemes were used for each text. To avoid over-fitting during model training, dropout was used with a value of 0.5. The learning rate was set to 1e-3. In addition, to monitor the training process, the training results were saved every 10 cycles and displayed every 100 cycles. Table 7 summarizes the experimental parameters.
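For reference, the hyperparameters stated above can be collected in a plain configuration dictionary. The key names are our own; the values are those reported in this section and Table 7.

```python
# Hyperparameters as reported in Section IV-B (key names are illustrative)
HPARAMS = {
    "word_vector_dim": 100,     # size of each word vector
    "max_sequence_length": 50,  # from the length distribution in Fig. 10
    "dictionary_size": 5000,    # covers commonly used Chinese characters
    "bilstm_units": 100,        # per direction, so features grow from 100 to 200
    "conv_kernel": (5, 200),    # (height, width) of the convolution kernel
    "attention_schemes": 3,     # r attention schemes per text
    "dropout": 0.5,             # to avoid over-fitting
    "learning_rate": 1e-3,
    "num_classes": 3,           # high praise, medium praise, bad review
}
```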

C. MODEL TRAINING
In the training process of deep learning network models, the network parameters are adjusted according to the characteristics of the network input so as to make the predicted label as consistent as possible with the input label. Suppose the network input is (x_1, y_1), (x_2, y_2), . . . , (x_i, y_i), . . . , (x_n, y_n), where x_i is the input text and y_i is the corresponding label, i = 1, 2, . . . , n. W_i is the weight connecting the neurons of the i-th layer to those of the (i−1)-th layer, and b_i is the bias of the i-th layer neurons. W_i and b_i are the parameters updated during training, and the training process is divided into two stages: the forward propagation stage and the back propagation stage. In the forward propagation stage, the predicted output y_out is calculated from x_i, W_i and b_i of the current network. In the back propagation stage, the error between the predicted output y_out and y_i is calculated, and the cost function E(W, b) is computed from this error. Then, the cost function is optimized to update the parameters W and b, usually by the gradient descent method:

W_i ← W_i − η ∂E(W, b)/∂W_i, b_i ← b_i − η ∂E(W, b)/∂b_i
where η is the learning rate parameter used to control the strength of the error propagation. During model training, the cross-entropy J(θ) between the predicted results and the real text categories is used as the cost function, and the stochastic gradient descent method with the Adam optimizer is used to optimize the model:

J(θ) = E(D, L; θ)/N + (λ||θ||^2)/2, θ ← θ − α ∇_θ J(θ)

where E(D, L; θ)/N is the mean error between the actual output and the predicted output, λ is a scale factor, and (λ||θ||^2)/2 is the squared norm of the model parameters, used to control the impact of overfitting. θ is the current parameter set of the TBLC-rAttention model, α denotes the learning rate, N is the size of the training sample, D is the training sample, and L is the real class label (High praise, Medium praise, Bad review) corresponding to sample D, with L_i ∈ L. y is the predicted classification result of the Softmax classifier, and p(L_i) denotes the correct classification result. The cross-entropy expresses how difficult it is for y to represent p(L_i): the smaller the cross-entropy, the more similar p(L_i) and y are.
The Adam method is used to minimize J(θ) while training the model. It mainly uses the first-order and second-order moment estimates of the gradient to dynamically adjust each parameter of the model. After Adam's bias correction, every iteration has a learning step within a certain range, which makes the parameter changes more stable. The training process of the TBLC-rAttention network is shown in Fig. 12.
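The regularized cross-entropy cost J(θ) described above can be sketched in NumPy. The λ value and the toy predictions are illustrative; Adam itself is left to the framework.

```python
import numpy as np

def cost(y_pred, y_true, theta, lam=1e-4):
    """Cross-entropy between predicted distributions y and true labels p(L_i),
    averaged over the N samples, plus the L2 penalty (lam/2)*||theta||^2."""
    N = y_true.shape[0]
    ce = -np.sum(y_true * np.log(y_pred + 1e-12)) / N   # mean cross-entropy
    return ce + 0.5 * lam * np.sum(theta ** 2)          # plus regularization

# Two toy samples over the three classes; rows sum to 1
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
good = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])   # close to the labels
bad = np.array([[0.2, 0.4, 0.4], [0.5, 0.3, 0.2]])      # far from the labels
theta = np.zeros(10)                                     # toy parameter vector
```

As expected, predictions closer to the true labels yield a smaller cost, which is what the optimizer drives down during training.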

D. EVALUATION INDICATORS
In order to facilitate the evaluation of the model proposed in this paper, the commonly used evaluation indicators for text categorization, Precision (P), Recall (R), and F1, are adopted. The calculation formulas are as follows:

P = TP / (TP + FP),  R = TP / (TP + FN),  F1 = 2PR / (P + R),

where TP means true positive, TN means true negative, FN means false negative, and FP means false positive. The relationship is shown in Table 8.
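The three indicators above can be computed directly from confusion-matrix counts; the counts in the usage line are illustrative, not from the paper's experiments.

```python
# Evaluation indicators from confusion-matrix counts.
def precision_recall_f1(tp, fp, fn):
    p = tp / (tp + fp)            # Precision: fraction of predicted positives that are correct
    r = tp / (tp + fn)            # Recall: fraction of actual positives that are found
    f1 = 2 * p * r / (p + r)      # F1: harmonic mean of P and R
    return p, r, f1

p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)  # illustrative counts
```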

V. RESULTS AND DISCUSSION
The performance of a new model can generally be evaluated from the accuracy of its results and the complexity of the model. In this study, when the training reaches about 2000 cycles, the accuracy and loss gradually stabilize, at around 99.00% and 0.01, respectively. In order to assess the performance of the TBLC-rAttention model, the results of five other categorization models (CNN, LSTM, Bi-LSTM, BiLSTM-Attention, and RCNN) are used as comparison benchmarks. Fig. 13 shows the accuracy and loss curves over the whole model training process. All the results were obtained when the training accuracy and test accuracy were stable after 1000 cycles of repeated calculation. The code and data of this study are available at: ''https://github.com/xingyu068/tblc-ration''.
In Table 9, as shown in experiments 1-16, the accuracy of the traditional machine learning methods, SVM and Naive Bayes (NB), only reaches about 80% and 76%, respectively. When SVM and NB are combined with feature extraction, the combination ''SVM (or NB) + sentiment word + negative word + adverbs + part of speech + conditional word'' achieves the highest accuracy, 89% and 85%, respectively. Meanwhile, the classification accuracy gradually improves with the addition of ''sentiment word'', ''negative word'', ''part of speech'', ''adverbs'', and ''conditional word'', among which ''negative word'' and ''adverbs'' have the greatest influence. However, the addition of ''?'' and ''!'' actually reduces the classification accuracy, which indicates that such symbols cannot be used as features for traditional machine learning methods; only the best feature combination can guarantee accuracy while reducing the cost of computation. The combination of Word2Vec and SVM (or NB) also achieves a good classification effect, which indicates that the word vectors obtained by Word2Vec contain rich semantic information and can even be used directly as features for emotion classification. Moreover, the result of SVM is better than that of NB.
The results of LSTM and Bi-LSTM (experiments 18 and 19) show that the accuracy of Bi-LSTM is about 3% higher than that of LSTM. This is because the Bi-LSTM network extends the one-way LSTM network with a two-layer forward and reverse network structure, which can fully extract text context information. The accuracy is therefore improved, but the space complexity is twice that of LSTM. Compared with the Bi-LSTM model (experiment 19), the accuracy of the BiLSTM-Attention model (experiment 20) is improved by about 4%, which indicates that the attention mechanism can effectively identify the feature information that has the greatest impact on classification. Due to parameter sharing, the accuracy of CNN (experiment 17) is not very high, but the training time is greatly reduced. As shown in experiments 17-21, RCNN absorbs the advantages of RNN (and its variants) and CNN, and its classification effect is better than that of RNN (and its variants) or CNN alone, close to the result of the BiLSTM-Attention model.
Compared with the deep learning methods (experiments 17-22), the traditional machine learning methods have a lower classification effect. Their classification accuracy largely depends on the features extracted from the training corpus, and it is difficult for them to identify emotional information in complex sentence patterns because they ignore the features of Chinese sentence structure. It is worth noting that the TBLC-rAttention model has the highest accuracy among all the models in this experiment. It introduces rAttention into RCNN and uses the TextRank algorithm in the preprocessing stage. As a result, the accuracy of the model reaches 99% and the loss value is as low as 0.035.
Model complexity can be analyzed in terms of time complexity and space complexity. In general, the number of basic repeated operations of an algorithm is a function f(n) of the input size n, so the time complexity of the algorithm can be written as T(n) = O(f(n)). T(n) is directly proportional to f(n); as n increases, the smaller f(n) is, the lower the time complexity of the algorithm and the higher its efficiency. To determine T(n), first calculate f(n), and then obtain the order of magnitude ξ(n) of f(n) by ignoring the constants and the coefficients of the lower and highest powers. As shown in Table 10, the time complexity of TBLC-rAttention is of linear order. This is also verified in Fig. 16 by the parameter ''time'', the change in time spent during the training of TBLC-rAttention, which is almost linear.
Space complexity measures the amount of storage space temporarily occupied by an algorithm during its operation, denoted S(n). If the number of temporary variables is independent of the data size, the space complexity remains S(1) as long as the number of defined variables does not change, regardless of the size of the data. In general, if the algorithm involves neither dynamically allocated space nor recursive operations, the space complexity is S(1). In this study, all the defined variables are fixed and temporary variables are rarely used, so the space complexity of TBLC-rAttention is S(1). Overall, the space complexity of TBLC-rAttention is not high, and it shows good classification accuracy, so it has high practical value. (Fig. 16 shows partial parameter data from the training process of the TBLC-rAttention model.) Table 11 shows the validation results of the TBLC-rAttention model, which categorizes the comment data with almost 100% accuracy. Of note, the e-commerce system defaults to praise when consumers do not comment, but the model treats such data as a medium review, which is more objective. In addition, when the predicted digitalized emotion value is positive, the review is high praise; when it is negative, the review is bad; and when it is equal to 0, the review is medium praise. The emotion value can not only further confirm the accuracy of the classification but also realize a digital measurement of the text emotion.
The above experimental results indicate that the proposed model is effective: compared with the other models, it achieves the best results. This also shows that the combination of Bi-LSTM and CNN can mine text semantic features more effectively and demonstrates the necessity of introducing the attention mechanism. However, although TBLC-rAttention improves the accuracy, it also increases the computational complexity and results in a slight increase in computation time. In conclusion, the method proposed in this study has the following characteristics: (1) It can effectively mine the hidden information in drug comments and help staff analyze the problems existing in each link from medicine sale to use. For example, managers can analyze the emotional tendencies of reviewers to devise improvement strategies for products, while drug developers can analyze users' evaluations of a drug to understand its therapeutic effect and further improve it.
(2) It proposes an automatic emotion dictionary extension method based on medical domain knowledge. First, vector representations are computed for the segmented corpus and the basic emotion dictionary. Then, the semantic similarity distance between the corpus and the basic emotion dictionary is calculated to judge the emotional tendency of unregistered words in the corpus. Lastly, a polarity judgment algorithm is used to classify the unregistered words into the basic emotion dictionary. The automatically expanded emotion dictionary is applied in the preprocessing stage to verify its effect on the performance of TBLC-rAttention. The experiments show that the expanded dictionary can better identify new words in the medical field, further improve the effect of Chinese word segmentation in this domain, and improve the recognition accuracy of the model.
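The similarity-based polarity judgment described above can be sketched as follows. This is a hedged illustration of the idea: an unregistered word's vector is compared against positive and negative seed words from the basic emotion dictionary via cosine similarity. All vectors here are tiny toy values, not the paper's trained Word2Vec embeddings, and the exact polarity rule is an assumption.

```python
import math

def cosine(a, b):
    # cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def judge_polarity(word_vec, pos_seeds, neg_seeds):
    # assign the polarity of the nearest seed group (assumed decision rule)
    pos = max(cosine(word_vec, v) for v in pos_seeds)
    neg = max(cosine(word_vec, v) for v in neg_seeds)
    return "positive" if pos >= neg else "negative"

pos_seeds = [[0.9, 0.1], [0.8, 0.2]]   # toy vectors for positive seed words
neg_seeds = [[0.1, 0.9], [0.2, 0.8]]   # toy vectors for negative seed words
label = judge_polarity([0.7, 0.3], pos_seeds, neg_seeds)
```

A real expansion pipeline would use the corpus-trained word vectors and add the word, with the judged polarity, to the basic dictionary.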
(3) The TBLC-rAttention deep neural network model is designed for Chinese medical text emotion recognition based on deep learning technology. It fully draws on the advantages of Bi-LSTM, the attention mechanism, and CNN. Firstly, Bi-LSTM is used to extract the global semantic features of the text, with the rAttention mechanism introduced in the process. Secondly, CNN is applied to extract the local semantic features, further mining the text information to obtain the final semantic feature vectors. Thirdly, r attention schemes are used instead of the traditional single attention scheme when calculating the weighted global semantic features; different attention schemes learn sentence representations with different emphases, so more valuable feature information can be extracted. Fourthly, the TextRank algorithm is used in the preprocessing stage so that the model can also handle long text data and minimize the impact of interfering information on classification. Finally, the classification task for Chinese medical reviews is completed using the obtained semantic feature vectors. In this way, the model can mine semantic feature information to the greatest extent while effectively solving the problems of traditional methods, such as high dimensionality, high sparsity, weak feature expression ability, and inapplicability to large data sets. In addition, the model is implemented with Google's TensorFlow framework and trained to adjust its parameters. Experiments show that the model has good convergence speed and accuracy.
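The r-attention-scheme idea can be sketched numerically: each scheme scores the Bi-LSTM hidden states with its own score vector, softmax-normalizes the scores into weights, and forms a weighted sum, yielding one sentence representation per scheme. The hidden states and score vectors below are toy values, and a dot-product scoring function is an assumption; it is not the paper's exact rAttention formula.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(hidden, score_vec):
    # score each hidden state against one attention scheme's score vector,
    # then return the attention-weighted sum of the hidden states
    scores = [sum(h * w for h, w in zip(state, score_vec)) for state in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    return [sum(w * state[d] for w, state in zip(weights, hidden))
            for d in range(dim)]

hidden = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # toy Bi-LSTM hidden states
score_vecs = [[1.0, 0.0], [0.0, 1.0]]            # r = 2 attention schemes
reps = [attend(hidden, s) for s in score_vecs]   # one representation per scheme
```

Each scheme emphasizes different time steps, so the r representations carry complementary views of the sentence; downstream, such representations would be fed to the CNN layer.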
(4) A dictionary-based unsupervised method is used to calculate the emotional value of the results predicted by the TBLC-rAttention model. First, various dictionaries are used to identify the sentiment words in the text, and then the overall emotional value of the text is calculated according to judgment rules. On the one hand, this realizes a digital measurement of the predicted results; on the other hand, the sign of the calculated value can verify the predicted results, while its absolute value indicates the emotional intensity.
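A minimal sketch of such a dictionary-based scoring rule is shown below: sentiment words contribute +1 or -1, and an immediately preceding negative word flips the sign, so the sign of the total matches the class (positive = high praise, 0 = medium, negative = bad review). The word lists and the single-token negation rule are illustrative assumptions, not the study's actual dictionaries or judgment rules.

```python
# Illustrative word lists (assumed; the study uses domain emotion dictionaries).
POSITIVE = {"good", "fast", "trustworthy"}
NEGATIVE = {"slow", "bad"}
NEGATORS = {"not"}

def emotion_value(tokens):
    score = 0
    for i, tok in enumerate(tokens):
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity          # a preceding negator flips the polarity
        score += polarity
    return score  # > 0: high praise, < 0: bad review, == 0: medium praise

value = emotion_value("delivery is fast and not bad".split())
```

Here "fast" contributes +1 and "not bad" contributes +1 after the negation flip, so the review scores as positive; the magnitude of the score serves as the emotional intensity.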
(5) The model is easy to use. Deep learning can handle the problem of text understanding without requiring knowledge of the language's syntactic or semantic structure. After training is completed, the model can directly analyze and infer the high-level goals, thereby solving the problems of inaccurate word segmentation and inadequate understanding of semantic features caused by the large number of professional terms in the medical field, and improving the classification effect of the model. (6) The TBLC-rAttention model overcomes the difficulty of scarce information in short phrase data. For example, consider ''Logistics is very fast, cure cold, very responsible, bought two bags at one time, 999 is trustworthy.'' Without learning from context, it is difficult to judge what ''999'' means in ''999 is trustworthy'' from the literal meaning of the text alone. The phrase is too short to contain obvious information words that can serve as subject cues, and a machine would most likely interpret the key word ''999'' as a simple number. The TBLC-rAttention model can link ''999'' to disease after learning from the context of the text, in which words such as ''cure'', ''wind-cold'', and ''cold'' appear.

VI. CONCLUSION
In this paper, a hybrid deep neural network model, TBLC-rAttention, aimed at Chinese text emotion recognition, is proposed. The model can be used for knowledge extraction, modeling, classification, and mining of medical texts. In this study, the performance of TBLC-rAttention is evaluated and verified on the collected data set. Although the training time is slightly longer, it is within an acceptable range.
Deep learning is an effective technique for analyzing and processing medical texts. Although this study has made some achievements in this respect, the classification and mining of medical text with models of better performance and higher intelligence remains a challenging field. Thus, in the future, our major work includes: (1) carrying out better preprocessing of the corpus data, such as reducing noise to a greater extent, segmenting words more accurately, and recognizing medical terms better; it is also necessary to further optimize the efficiency and accuracy of dictionary-based medical term recognition. (2) Trying other algorithms and models, especially newly proposed methods, and carrying out effective fusion and improvement so as to further improve the accuracy of the model and its time efficiency; for example, more convolution kernels can be designed in the convolutional neural network model for parallel convolution calculation. (3) Increasing the diversity of the corpus data, for instance by using as many kinds of drug comment data as possible, or even text data from other fields, so that the model becomes more versatile and can be used for text categorization tasks on long or short text in arbitrary fields.
QIBING JIN received the Ph.D. degree in control theory and engineering from Northeastern University, Shenyang, Liaoning, China, in 1999.
He joined the Beijing University of Chemical Technology, Beijing, China, in 2002. He is currently a Full Professor with the College of Information Science and Technology and the Director of the Institute of Automation, Beijing University of Chemical Technology. He has rich experience in control engineering, and his many research results have been applied in petroleum and chemical industry. In recent years, he was awarded several prizes for science and technology progress. His main research interests include advanced control, intelligent instrument, system identification, and control theory.
XINGRONG XUE received the bachelor's degree in building electricity and intelligence from the Shandong University of Architecture, in 2017, where he is currently pursuing the master's degree.
He was admitted to the Institute of Automation, Beijing University of Chemical Technology, Beijing, China, in 2017. His research directions are pattern recognition and machine learning, including natural language processing, data mining, and knowledge graph.