Exploring E-Commerce Product Experience Based on Fusion Sentiment Analysis Method

With the speedy development of e-commerce, a growing number of customers tend to share their subjective perceptions of the product or service on the Internet. This phenomenon makes the commercial value of online reviews increasingly prominent. In this context, how to gain insights into consumers’ perceptions and attitudes from massive comments has become a hot-button topic. Addressing this requirement, this paper developed a fusion sentiment analysis method combining textual analysis techniques with machine learning algorithms, aiming to mine online product experience. The method mainly consists of three steps. Firstly, inspired by the sensitivity of sentiment dictionary to emotional information, we utilize the dictionary to extract sentiment features. Afterward, the SVM algorithm is adopted to identify sentiment polarities of reviews. Based on this, sentiment topics are extracted from reviews through the LDA model. Furthermore, to avoid the omission of emotional information, the dictionary is extended based on semantic similarity. Meanwhile, in this research, the fact that words in reviews have unequal sentiment contribution, which has been neglected in existing studies, is taken into account. Specifically, we introduce the weighting method to measure the sentiment contribution. Finally, the investigation of consumers’ reading experiences of online books on Amazon has verified the feasibility and validity of the method. The results demonstrate that the method accurately determines reviews’ emotional tendencies and captures elements affecting reading experiences from reviews. Overall, the research provides an effective way to mine online product experience and track customers’ demands, thereby strongly supporting future product improvement and marketing strategy optimization.


I. INTRODUCTION
Sentiment Analysis, also called Opinion Mining, is a Natural Language Processing (NLP) technique used for identifying, extracting and analyzing the subjective information in texts [1], [2]. With the advent of the information era, the Internet has infiltrated every aspect of people's daily routines. The network platform has become a progressively important medium for people to share opinions and make comments [3], [4], [5], [6]. In recent years, there has been an explosion of valuable reviews about public figures, major events and commercial products on the Internet. Online reviews transmit rich sentiment information including individuals' views, attitudes or emotional tendencies such as joy, anger, sadness, criticism and praise, which enable potential users to ascertain public opinions on certain people, events or products. For these reasons, it is of crucial importance to identify and analyze the subjective information that expresses personal ideas and sentiments from social network platforms in aspects of public opinion investigation, policy development and business decision-making [7], [8], [9].
The associate editor coordinating the review of this manuscript and approving it for publication was Arianna Dulizia.
A boom in e-commerce has been created by the rapid development of the Internet and the transformation of people's shopping modes, which has contributed to the emergence of Amazon, eBay and numerous other e-commerce platforms [10], [11]. In the meantime, a growing number of customers are sharing their subjective perceptions of products or services on these platforms through online comments. The reviews not only reflect the performance and quality of online goods or services, but also display consumers' shopping experiences in an authentic and comprehensive manner. As a result, online reviews have been regarded as a valid information source for both consumers and merchants [12], [13], [14]. Especially for new and untried products, the reviews offer consumers valuable references for product selection, which is of great significance for reducing purchasing risks. Besides, the reviews help merchants understand consumer attitudes, such as motivations and satisfaction, and thus develop products that meet the expectations of consumers. Clearly, it is crucial for business success to gain insights into product experience and to grasp consumers' practical demands from the reviews in a timely manner.
In this context, how to identify and obtain consumers' perceptions and attitudes from massive casual comments has become a hot-button topic. Addressing this requirement, the present research proposes a fusion sentiment analysis method, aiming to mine consumers' experiences of online products and track users' demands. The potential application values of the research are mainly embodied as follows. In one respect, it contributes to customers' comprehensive assessment on product qualities and rational consumption. More importantly, it provides an effective approach for merchants to understand consumers' practical demands, which makes it possible for future product improvements and marketing strategy optimization.
Methodologically, we combine textual analysis techniques with machine learning algorithms. Specifically, the method mainly consists of the following steps. Firstly, the sentiment dictionary is used to extract sentiment features, based on which the Support Vector Machines (SVM) algorithm is adopted to identify sentiment polarities of review texts. Subsequently, sentiment topics are extracted from reviews with distinct sentiment polarities respectively through the Latent Dirichlet Allocation (LDA) model. The contribution of our method mainly lies in taking full advantage of the sensitivity of the sentiment dictionary to emotional information and the strong generalization of machine learning algorithms. Importantly, the method overcomes two demerits: the susceptibility to human interference in feature extraction of machine learning-based methods, and the weak adaptability in cross-domain application of dictionary-based methods. In the meantime, the dictionary is extended based on semantic similarity to avoid the omission of emotional information. Apart from that, the fact that words in reviews have unequal sentiment contribution, which has been neglected in existing studies, is taken into account in the present research. The weighting method is introduced in the process of sentiment feature extraction, that is, the weight is used to measure sentiment contribution.
The rest of the paper is organized as follows. Section 2 introduces the related work of the research. Section 3 presents the principles of the proposed method. We display the experimental process and analyze the experimental results in Section 4. Finally, Section 5 concludes with implications, limitations as well as future lines of the research.

II. RELATED WORK

A. SENTIMENT ANALYSIS IN BUSINESS DOMAIN
In recent years, e-commerce has flourished and penetrated into almost all areas of our lives, particularly in consumption behavior. Various e-commerce products with high qualities and low prices have been attracting an increasing number of consumers. However, online shoppers are inclined to know more about the product through online comments, for it is not practical for them to get direct experience with the product. In this context, purchasing decisions are often strongly influenced by online reviews [15]. People would prefer to trust word-of-mouth reviews of products from other consumers rather than trust marketing advertisements, as the other consumers are considered as disinterested parties [16]. This phenomenon makes the commercial value of online reviews increasingly prominent. In fact, the research on excavating the value of massive online reviews has drawn extensive attention from the academic community.
At present, many researchers have reached a general consensus that sentiment analysis is an efficacious approach to exploring the commercial value of online reviews [17], [18], [19], [20]. Sentiment analysis has been applied to many fields of contemporary organizational business activities, including enterprise management, business decision-making, etc. In terms of the value derived from the application of sentiment analysis in contemporary corporations, it is worth noting the research of Saura et al. [21], who explored major opportunities and obstacles for remote work during the COVID-19 pandemic. As they reported, their study attempted to answer ''what are the main opportunities and challenges of remote work'' by extracting relevant emotional topics from user-generated content (UGC) on Twitter. Their research offers practical guidance for helping companies cope with the challenges brought by the epidemic. As for business decision-making, Kauffmann et al. [22] discussed the positive role of business decisions in creating sustainable competitive advantage. He et al. [23] presented a framework to demonstrate how to make use of media data to obtain valuable knowledge for business decisions.
Additionally, in aspects of products and services, as highlighted by Gonçalves et al. [24], the value of sentiment analysis lies in monitoring the reputation of products or services through review analysis, providing analytical perspectives for responding to market opinions. For instance, Wang et al. [25] explored the demanding themes and feature extension words of distinct car models by means of extracting the demand preference topic of new energy automobile customers. Their study aims to help consumers understand other users' feedback, and to help car enterprises objectively capture consumers' practical demands. Similarly, Arora et al. [26] carried out a comprehensive survey into useful insights from online platforms regarding the performance of popular smartphone brands, battery life as well as screen quality. The survey results indicated the great potential of sentiment analysis of online reviews in assessing consumers' opinions of popular brands of portable devices. Correspondingly, Almjawel et al. [27] conducted a sentiment analysis of online book reviews collected from Amazon to assist users in making efficient purchasing decisions. With respect to services, Chang et al. [28] explored the service quality of hotels from the standpoint of consumers. By analyzing the attitudes and emotions conveyed in customer comments, they found that professional services and clean rooms have a positive effect on improving customer satisfaction. This is consistent with Freitas's emphasis on evaluating service quality from the perspective of customers [29].
VOLUME 10, 2022
Actually, as the potential of sentiment analysis in creating business competitive advantages is increasingly prominent, a growing number of scholars have launched research on the application of sentiment analysis in the business field. In all, sentiment analysis is capable of tracking people's online emotional information at a low cost, which makes it a powerful tool for observing commercial activities.

B. SENTIMENT ANALYSIS METHODS
Currently, sentiment analysis has emerged as one of the most popular research areas in NLP, with two primary types of approaches: the dictionary-based approach and the machine learning-based approach [30]. The dictionary-based approach identifies emotional words in a given text utilizing a pre-established sentiment dictionary, and calculates the emotional value of all recognized words to determine the overall sentiment tendency. Early attempts to undertake sentiment analysis using a dictionary can be traced back to the 1960s, when Philip et al. [31] established a sentiment lexicon and classified words of the text into positive and negative polarities. The sensitivity of the sentiment dictionary in identifying sentiment information, together with its easy application and accessibility, has contributed to the extensive application of the dictionary-based approach. However, the performance of the dictionary-based approach is largely determined by how comprehensively the sentiment dictionary is constructed. Considering that it is a great challenge to construct a sentiment dictionary manually, researchers have attempted to use data processing methods to assist the dictionary construction. For instance, Esuli and Sebastiani described a semi-automatic method for constructing a sentiment dictionary and built the lexical resource SentiWordNet [32]. Ohana and Tierney applied SentiWordNet to the sentiment classification of movie comments and demonstrated that it outperformed manually constructed dictionaries [33]. In fact, researchers have been working on the automatic construction of sentiment dictionaries. Rao et al. [34] developed a method to construct a word-level dictionary automatically by adopting three pruning strategies (maximum, average and minimum) and applied it to the investigation of emotional reflections on social affairs.
As they claimed, the pruning strategies adopted effectively refine the sentiment dictionary and improve sentiment analysis performance. The construction of sentiment dictionaries has thus evolved from initial manual construction to automatic construction. It is worth noting that sentiment terms in the dictionary may have different meanings in different disciplines and contexts, which poses challenges for the dictionary-based approach in cross-domain applications. Focusing on this problem, scholars have put forward the idea of constructing domain-dependent dictionaries. For instance, Ahmed et al. [35] developed a model to build a domain sentiment dictionary based on a weakly supervised neural network. The model trained the dictionary on the basis of unlabeled weakly supervised data, and mapped the words into different clusters in light of their emotional tendencies in the target domain. Accordingly, the adaptability of the sentiment dictionary was effectively enhanced. Despite this, the construction of domain-dependent dictionaries still fails to completely overcome the challenge. In addition, in the present age of information explosion, human knowledge is being constantly enriched and updated. Some new sentiment words and expressions cannot be included in the dictionary in time, putting emotional information at risk of being omitted.
The machine learning-based approach utilizes feature functions to extract sentiment features from labeled corpora, based on which a model is trained to achieve automatic sentiment analysis of a given text. Due to the powerful data processing capacity of the machine learning algorithm, this approach has a strong generalization ability and is suitable for cross-domain sentiment analysis, thus attracting the attention of numerous researchers. For instance, combining the sentiment information and domain relevance of words, Liu et al. [36] constructed a cross-domain sentiment analysis model based on continuous bag-of-words. Wang et al. [37] discussed the scheme of building a dimensional sentiment analysis model using CNN and LSTM. Relevant studies have shown that the performance of the machine learning-based approach is dependent on the sentiment feature extraction quality to a large degree [38]. Under such circumstances, how to improve the feature extraction quality has become one of the major concerns for many researchers. For instance, Tripathy et al. [39] designed an integrated sentiment analysis model and attempted to apply the approaches of unigram, bigram, trigram and their combinations to feature extraction, aiming to increase the adaptability of the model. Similarly, integrating feature extraction and classification techniques, Kaur et al. [40] proposed a sentiment analysis approach for Twitter data. They adopted the n-gram algorithm for emotion features extraction and KNN for sentiment classification. Besides, Bounabi et al. [41] conducted a text classification study, in which TF-IDF was used to select text features. Singh et al. [42] proposed a methodology for sentiment prediction optimization using four different machine learning algorithms as classifiers. 
They adopted three feature selection methods, namely Document Frequency (DF), Mutual Information (MI) and Information Gain (IG) to evaluate the sentiment classification performance of different classifiers. Presently, the frequently-employed sentiment feature extraction algorithms mainly include n-gram (unigrams, bigrams, trigrams), TF-IDF, MI, IG, etc. Nevertheless, the dependence on artificial design and susceptibility to human interference of the preceding algorithms make it difficult to ensure the quality of feature extraction when dealing with complicated sentiment texts.
Presently, as highlighted by Liu, et al. [53], one of the major challenges confronted by researchers in this area is how to combine the sentiment dictionary with machine learning. Given the above, a fusion sentiment analysis method is proposed in the paper. The method integrates merits of sentiment dictionary in emotional information identification and the strong generalization of machine learning algorithms. In order to avoid the omission of emotional information, the dictionary is extended based on semantic similarity. Meanwhile, considering the unequal contribution of sentiment words in reviews, the weighting method is introduced to measure sentiment contribution. These strategies provide powerful support for sentiment feature extraction when dealing with complex texts. Additionally, by means of the strong generalization ability of the SVM algorithm, the proposed method has good adaptability in cross-domain sentiment analysis.

III. PRINCIPLES OF THE METHOD
The procedure of the proposed method is mainly composed of the following subprocesses, namely, review texts collection, text preprocessing, sentiment feature extraction and sentiment analysis. The detailed research framework is displayed in Figure 1.

A. TEXT PREPROCESSING
To begin with, a large number of online review texts are collected from Internet platforms such as Amazon, eBay, etc. Afterward, the obtained reviews are standardized into sets of word arrays through text preprocessing to facilitate the subsequent procedures of text analysis. Text preprocessing comprises two major phases, namely text preparation and text representation.

1) TEXT PREPARATION
Text preparation involves multiple steps including tokenization, text cleaning (case folding and stop word removal), stemming and lemmatization. With regard to English review texts, all words are required to be converted into lowercased ones, except for specialized expressions and proper nouns that ought to remain capitalized. Besides, punctuation marks and stop words (typically functional words such as articles, prepositions, conjunction, etc.), normally regarded as noise in emotion analysis, are supposed to be cleaned in review texts, for they scarcely manifest sentiment orientations. Meanwhile, due to the rich morphology of English words, stemming and lemmatization are essential steps to standardize lexical forms in text preprocessing. Stemming is to obtain the word stem by removing prefixes and suffixes, while lemmatization transforms the inflected or variant forms of words into the basic one. For example, in English, verbs in different tenses such as ''likes'', ''liked'' and ''liking'' demonstrate the identical sentiment tendency, which should be reduced to the stem ''like''. For English ''be'' verbs, words like ''am'', ''is'', ''are'' and ''been'' are required to be lemmatized into the original form ''be''.
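The text preparation steps described above can be illustrated with a minimal sketch. The stop-word list, the lemma table and the suffix-stripping rule below are toy assumptions introduced only for illustration; they are not the resources used in the paper, and a real pipeline would rely on an established stemmer or lemmatizer. For simplicity, the sketch folds all words to lowercase and does not preserve proper nouns.

```python
import re

# Toy resources (assumptions for illustration only).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "it", "this"}
LEMMAS = {"am": "be", "is": "be", "are": "be", "been": "be"}
VOWELS = set("aeiou")
NO_E_ENDINGS = VOWELS | set("wxy")


def toy_stem(word):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if (word.endswith(suffix) and not word.endswith("ss")
                and len(word) - len(suffix) >= 3):
            stem = word[: -len(suffix)]
            # Restore a final "e" for short consonant-vowel-consonant stems
            # such as "lik" -> "like".
            if (suffix != "s" and len(stem) == 3
                    and stem[0] not in VOWELS
                    and stem[1] in VOWELS
                    and stem[2] not in NO_E_ENDINGS):
                stem += "e"
            return stem
    return word


def preprocess(text):
    """Tokenize, lowercase, drop punctuation and stop words, then normalize forms."""
    tokens = re.findall(r"[a-z]+", text.lower())         # tokenization + case folding
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    return [LEMMAS.get(t) or toy_stem(t) for t in tokens]  # lemmatize, else stem


print(preprocess("She likes it, liked it, and is liking the book!"))
```

In this sketch, ''likes'', ''liked'' and ''liking'' are all reduced to ''like'', and ''is'' is lemmatized to ''be'', mirroring the examples given in the text.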

2) TEXT REPRESENTATION
Review texts, as a type of unstructured information, cannot be directly processed and analyzed via conventional data tools. Text representation transforms unstructured information into mathematically computable structured data. Currently, two primary types of text representation are one-hot representation and distributed representation. In one-hot representation, an $M$-bit state register is used to encode $M$ states, with each bit stored in a separate flip-flop and only one bit true at any time. Following this rule, one-hot representation encodes an individual word into a vector representing a certain feature [43], [44]. Nonetheless, it assumes that each word in the text has a separate identity, and thus fails to represent the sequence and correlation between words. Moreover, when the number of features is particularly large, it generates massive, redundant sparse matrices, resulting in the curse of dimensionality.
In 2013, Mikolov et al. [45] developed the word embedding model Word2vec, which is based on a shallow neural network. Being one of the distributed representations, Word2vec allows a model to be trained from a massive corpus of texts in an unsupervised way that embeds words in a low-dimensional vector space. Regarded as a local context window method, Word2vec captures co-occurrence information one window at a time, which takes the semantic relationship between words into account for vector representation. Also, the low-dimensional word vectors output by Word2vec effectively avoid the curse of dimensionality. In 2016, an improved version of Word2vec, namely FastText, was proposed by Joulin et al. [46]. FastText is an open-source library that captures semantic similarities and generates better word embeddings for different words in a given text. It supports high-speed training of word vectors and can be trained on more than 1 billion words within 10 minutes, further advancing the efficiency of model training. As a consequence, we use FastText to obtain vector representations for words in the proposed method.

B. DICTIONARY EXTENSION
With the advancement and transformation of society, an increasing number of new Internet words have emerged in the reviews. Nevertheless, many of these newly developed words, despite conveying people's emotional tendencies as well, are not included in the pre-established sentiment dictionary. Extracting sentiment features with an existing dictionary will inevitably omit these new words, thus failing to obtain the complete sentiment information in the reviews. As a result, it is necessary to extend the sentiment dictionary to enlarge its coverage.
To begin with, the top $N$ positive and negative sentiment terms are selected from the dictionary to construct a reference word set $C$ for keywords extraction. Cosine similarity is a measure of similarity between two vectors, and the closer the cosine value is to 1, the higher the similarity between the vectors. Hence, the semantic similarity between words $d_1$ and $d_2$ can be effectively evaluated by the cosine similarity value $CS$:

$$CS(d_1, d_2) = \frac{v_1 \cdot v_2}{\|v_1\| \, \|v_2\|}, \tag{1}$$
where $v$ denotes the word vector. Afterward, cosine similarity values between the words in the review texts and those in the set $C$ are obtained by traversal, and words with a similarity value greater than the threshold $t$ are added to the sentiment dictionary. The detailed process of sentiment dictionary extension is shown in Figure 2. Algorithm 1 shows the procedure for sentiment dictionary extension according to semantic similarity.

Algorithm 1 Sentiment dictionary extension
Input: review words d, sentiment dictionary, reference sentiment words set C (2N words), threshold t
Output: set W of new sentiment words
    for j = 1 → length(d) do
        if d(j) ∉ sentiment dictionary then
            append d(j) to set
        end if
    end for
    for p = 1 → length(set) do
        for q = 1 → 2N do
            CS(q) ← semantic similarity between set(p) and C(q)
        end for
        CS_max ← max(CS)
        if CS_max > t then
            append set(p) to W
        end if
    end for
    return W

C. TEXT VECTORIZATION
Following the earlier-mentioned text processing procedures, including text preparation, text representation and sentiment feature extraction, a review text can be converted into a vector sequence of $m \times n$ (where $m$ denotes the number of words in the comment and $n$ denotes the dimension of the word vector). Nevertheless, the unequal number of words in different review texts results in unequal dimensions, which makes it necessary to unify the dimensions of text vectors. Usually, a review text can be expressed as a vector with an identical dimension of $1 \times n$ by calculating the average word vector of all words. However, this is not consistent with practice, as the average word vector implies an equal sentiment contribution of each word. In order to address the problem, a weighted calculation method of the text vector $\bar{v}$ based on the contribution of sentiment words is proposed in the paper, as given by formula (2). The weight $w_i$ of the $i$-th word is computed by formula (3), where $s$ denotes the sentiment value of the word obtained from the dictionary.
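The dictionary extension of Algorithm 1 and the weighted text vector can be sketched together as follows. This is a minimal illustration under stated assumptions: the two-dimensional word vectors are made up, and the weight used here (proportional to the absolute sentiment value plus a small constant `eps`, normalized to sum to 1) is a hypothetical choice and not necessarily the paper's formula (3). The function names are likewise introduced only for this example.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine similarity CS between two word vectors, as in formula (1)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0


def extend_dictionary(review_words, dictionary, reference_set, vectors, threshold):
    """Return review words missing from the dictionary whose maximum
    similarity to any reference word exceeds the threshold t."""
    candidates = [w for w in dict.fromkeys(review_words)   # de-duplicate, keep order
                  if w not in dictionary and w in vectors]
    new_words = []
    for w in candidates:
        cs_max = max(cosine_similarity(vectors[w], vectors[c]) for c in reference_set)
        if cs_max > threshold:
            new_words.append(w)
    return new_words


def weighted_text_vector(words, vectors, sentiment, eps=0.1):
    """Weighted average of word vectors.

    ASSUMPTION: weight_i is proportional to |s_i| + eps, so that sentiment
    words dominate while neutral words still contribute a little.
    """
    known = [w for w in words if w in vectors]
    raw = [abs(sentiment.get(w, 0.0)) + eps for w in known]
    total = sum(raw)
    dim = len(next(iter(vectors.values())))
    text_vector = [0.0] * dim
    for w, r in zip(known, raw):
        for j, x in enumerate(vectors[w]):
            text_vector[j] += (r / total) * x
    return text_vector


# Toy data (illustrative only).
vectors = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "awesome": [0.9, 0.1],
           "awful": [-0.9, 0.2], "book": [0.0, 1.0]}
dictionary = {"good": 1.0, "bad": -1.0}
new_words = extend_dictionary(["awesome", "book", "good", "awful"],
                              dictionary, ["good", "bad"], vectors, threshold=0.8)
print(new_words)
print(weighted_text_vector(["good", "book"], vectors, dictionary))
```

With these toy vectors, ''awesome'' and ''awful'' are added because each is close to a reference word, whereas ''book'' is not. In the weighted vector, the sentiment word ''good'' contributes far more than the neutral word ''book''.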

D. DIMENSIONALITY REDUCTION
Through text vectorization, a review text can be represented as a vector of $1 \times 300$. Nevertheless, the feature redundancy existing in the 300-dimensional text vector puts the training of the sentiment analysis model at risk of overfitting. Meanwhile, the higher the dimension of the text vector, the more complex the calculation. Consequently, it is an indispensable part of text processing to project the text vector to a lower-dimensional vector space while retaining the valuable semantic properties of review texts. Principal Component Analysis (PCA), being an unsupervised learning method, can effectively extract the ''main components'' of data to achieve dimensionality reduction without manual intervention [47], [48]. In addition, PCA ranks the components by importance in the process of dimensionality reduction and places the important components first, so that the information of the original data is retained to the greatest extent. Meanwhile, PCA ensures the orthogonality between components, thus effectively eliminating the mutual influence between the original data components.
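Assuming that there are $n$ comments in the review texts set $X = \{X_1, X_2, \cdots, X_n\}$ and each comment has $m$-dimensional features, i.e.,

$$X_i = (x_{i1}, x_{i2}, \cdots, x_{im}), \quad i = 1, 2, \cdots, n.$$

The average value $\bar{x}$ of each feature can be calculated by the formula (5):

$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \quad j = 1, 2, \cdots, m. \tag{5}$$

The covariance matrix $C$ of $n$ comments is as follows:

$$C = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{x})^{T} (X_i - \bar{x}).$$

The eigenvalue $\lambda$ and eigenvector $u$ of the covariance matrix $C$ are solved by eigen decomposition:

$$C u = \lambda u. \tag{8}$$

Each eigenvalue $\lambda$ corresponds to an eigenvector $u$. The larger the $\lambda$, the higher its importance. Consequently, a descending sequence is created by arranging $\lambda$ from high to low:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m.$$

Then, the eigenvectors corresponding to the first $k$ eigenvalues are taken to form the projection matrix $U_k = (u_1, u_2, \cdots, u_k)$, and each comment is projected as

$$Y_i = (X_i - \bar{x}) \, U_k, \tag{10}$$

so that the dimensionality of $X_i$ is reduced from $m$ to $k$.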
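The PCA steps above (centering, covariance matrix, eigen decomposition, projection) can be sketched in plain Python. This is only an illustrative sketch: the eigenpairs are approximated by power iteration with deflation, which is a choice made here to keep the example dependency-free and is not stated in the paper, and the covariance matrix is normalized by $n$ as an assumption. The function name `pca` and the toy data are hypothetical.

```python
import math


def pca(X, k, iters=1000):
    """Project the rows of X onto the k leading principal components.

    Returns the projected data Y (n x k) and the k leading eigenvalues.
    """
    n, m = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(m)]
    Xc = [[row[j] - mean[j] for j in range(m)] for row in X]   # centered data
    # Covariance matrix (m x m); normalization by n is an assumption.
    C = [[sum(r[a] * r[b] for r in Xc) / n for b in range(m)] for a in range(m)]

    eigenvalues, eigenvectors = [], []
    for _ in range(k):
        u = [1.0 + 0.1 * j for j in range(m)]                  # arbitrary start vector
        scale = math.sqrt(sum(x * x for x in u))
        u = [x / scale for x in u]
        for _ in range(iters):                                 # power iteration
            w = [sum(C[a][b] * u[b] for b in range(m)) for a in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            u = [x / norm for x in w]
        # Rayleigh quotient u^T C u gives the eigenvalue estimate.
        lam = sum(u[a] * sum(C[a][b] * u[b] for b in range(m)) for a in range(m))
        eigenvalues.append(lam)
        eigenvectors.append(u)
        # Deflation: remove the component just found before the next pass.
        C = [[C[a][b] - lam * u[a] * u[b] for b in range(m)] for a in range(m)]

    Y = [[sum(r[j] * u[j] for j in range(m)) for u in eigenvectors] for r in Xc]
    return Y, eigenvalues


# Toy data: almost all variance lies along one direction.
X_toy = [[1.0, 2.0, 0.1], [2.0, 4.1, 0.0], [3.0, 5.9, 0.1],
         [4.0, 8.0, 0.0], [5.0, 10.1, 0.1]]
Y_toy, lams = pca(X_toy, k=2)
print(lams)
```

Each 3-dimensional row is reduced to 2 dimensions, and the first eigenvalue accounts for nearly all of the variance in this toy data set.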

E. SENTIMENT CLASSIFICATION
The purpose of sentiment classification is to automatically identify sentiment polarities of review texts, thereby investigating the consumer experience of online commodities in terms of the ''positive'' and ''negative'' respectively. As a binary classification method, Support Vector Machines (SVM) is extensively applied in statistical classification and regression analysis [49]. Possessing the merits of small generalization errors and low computational cost, SVM is particularly applicable to the classification of small samples with complex features [50].
In light of SVM principles, firstly, the text vector is mapped into a space, where a maximum-margin hyperplane is established. Then, two parallel hyperplanes are placed on either side of it, and the distance between them is maximized. The separating hyperplane can be defined as:

$$a^{T} X + b = 0, \tag{11}$$

where $a = (a_1; a_2; \cdots; a_\rho)$ denotes the normal vector, which defines the hyperplane's direction, $\rho$ denotes the number of features, $X$ denotes the text vector, and $b$ denotes the displacement, which determines the distance from the origin to the hyperplane. Assuming that $P(x_1, x_2, \cdots, x_\rho)$ is a point in the space, the distance $d$ from the point to the hyperplane can be calculated by the formula (12):

$$d = \frac{|a^{T} P + b|}{\|a\|}. \tag{12}$$

According to formula (12), the formula (13) can be obtained:

$$\begin{cases} \dfrac{a^{T} X_i + b}{\|a\|} \ge d, & y_i = +1, \\[2mm] \dfrac{a^{T} X_i + b}{\|a\|} \le -d, & y_i = -1, \end{cases} \tag{13}$$

where $y_i$ denotes the label of the point $X_i$. After rescaling $a$ and $b$ by $\|a\| d$, the formula (13) can be simplified as

$$\begin{cases} a^{T} X_i + b \ge 1, & y_i = +1, \\ a^{T} X_i + b \le -1, & y_i = -1. \end{cases} \tag{14}$$

Combining the above formulas, the formula (15) can be obtained:

$$y_i \left( a^{T} X_i + b \right) \ge 1. \tag{15}$$

The distance between the parallel hyperplanes is $2 / \|a\|$, which illustrates that maximizing the distance is equivalent to minimizing $\|a\|$. Thus, the optimal classification problem can be transformed into a constrained quadratic programming problem, i.e.,

$$\min_{a, b} \ \frac{1}{2} \|a\|^{2} \quad \text{s.t.} \quad y_i \left( a^{T} X_i + b \right) \ge 1, \quad i = 1, 2, \cdots, n. \tag{16}$$

By solving formula (16), the optimal hyperplane parameters $a$ and $b$ can be obtained to construct the sentiment classification model.
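To make the optimization concrete, the following sketch trains a linear SVM on toy two-dimensional data. It is an assumption-laden illustration rather than the solver used in the paper: instead of solving the quadratic program of formula (16) directly, it minimizes the closely related soft-margin objective (an L2 penalty on $a$ plus the hinge loss) by subgradient descent. The function names, the learning rate `lr`, the regularization strength `lam` and the data are all hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on lam/2 * ||a||^2 + hinge loss (soft-margin SVM)."""
    a = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            margin = y_i * (sum(a_j * x_j for a_j, x_j in zip(a, x_i)) + b)
            if margin < 1:
                # Margin constraint violated: move the hyperplane toward
                # classifying x_i correctly, while shrinking a.
                a = [a_j - lr * (lam * a_j - y_i * x_j) for a_j, x_j in zip(a, x_i)]
                b += lr * y_i
            else:
                # Constraint satisfied: only the penalty on ||a|| acts.
                a = [a_j - lr * lam * a_j for a_j in a]
    return a, b


def predict(a, b, x):
    """Return +1 or -1 according to the side of the hyperplane a^T x + b = 0."""
    return 1 if sum(a_j * x_j for a_j, x_j in zip(a, x)) + b >= 0 else -1


# Toy, linearly separable data standing in for reduced text vectors.
X_train = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0],
           [-2.0, -2.0], [-3.0, -2.5], [-2.0, -3.0]]
y_train = [1, 1, 1, -1, -1, -1]
a_hat, b_hat = train_linear_svm(X_train, y_train)
print([predict(a_hat, b_hat, x) for x in X_train])
```

On separable data such as this, the learned hyperplane classifies all training points correctly. In practice, a library implementation of SVM would normally be preferred over a hand-written optimizer.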

F. SENTIMENT TOPIC EXTRACTION
Sentiment topic extraction is a process to determine the themes of reviews by identifying related sentiment words. As different customers have varied requirements and concerns, online reviews contain rich emotional information and cover diverse sentiment topics. With the help of sentiment topic extraction, we are able to fully mine the useful information hidden in the customers' reviews and gain clearer insight into their practical demands. In addition, by making use of the topics extracted from the reviews, we can comprehensively understand the customers' major concerns on certain product and dig out important factors influencing the product experience.
As a document-level statistical model, Latent Dirichlet Allocation (LDA) has the advantages of being interpretable and easily predicted, and it has been widely applied in topic modeling [51]. The LDA model posits that each document is composed of diverse topics and that the generation of each word is ascribed to one of the topics in a document. It can not only effectively explore the underlying emotional topics within review texts, but also extract the deep semantic relationship between vocabulary and comment documents. The LDA model contains a three-layer structure of documents ($X$), topics ($Z$) and words ($W$). Suppose that there are $q$ emotional topics $Z_k$ $(k = 1, 2, \cdots, q)$ distributed in the review texts set $X = \{X_1, X_2, \cdots, X_n\}$, and $p$ words $X_i = \{W_1, W_2, \cdots, W_p\}$ within each comment of $X$. Let $\alpha$ and $\beta$ be the prior parameters of the Dirichlet function, and $\theta$ be the multinomial distribution parameter of sentiment topics in comments, which obeys the Dirichlet prior distribution with hyperparameter $\alpha$. Similarly, let $\varphi$ be the multinomial distribution parameter of words in the emotional topic, which obeys the Dirichlet prior distribution with hyperparameter $\beta$. The LDA model is shown in Figure 3. As mentioned previously, the LDA model posits that each comment is randomly mixed by a certain proportion of various topics, where the mixing proportion obeys a multinomial distribution:

$$Z \mid X_i \sim \mathrm{Multinomial}(\theta_i), \quad \theta_i \sim \mathrm{Dirichlet}(\alpha). \tag{17}$$

Each topic is formed by mixing each word in proportion, where the mixing proportion obeys a multinomial distribution as well:

$$W \mid Z_k \sim \mathrm{Multinomial}(\varphi_k), \quad \varphi_k \sim \mathrm{Dirichlet}(\beta). \tag{18}$$

The generation probability of the word $W_j$ in the condition of review $X_i$ is as follows:

$$P(W_j \mid X_i) = \sum_{k=1}^{q} P(W_j \mid Z_k) \, P(Z_k \mid X_i). \tag{19}$$

Therefore, the generation probability of each word can be obtained as:

$$P(W_j \mid X_i) = \sum_{k=1}^{q} \varphi_{k,j} \, \theta_{i,k}. \tag{20}$$

By optimizing the distribution of documents $X_i$ and words $W_j$ in the LDA model, the estimates of $\theta$ and $\varphi$ are updated, thus obtaining emotional topics and feature-related words.
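How $\theta$ and $\varphi$ can be estimated is illustrated by the following sketch. It uses collapsed Gibbs sampling, which is one common inference procedure for LDA and is an assumption here, since the text does not state which solver is used. The function name `lda_gibbs`, the hyperparameter values and the toy documents are hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Estimate document-topic (theta) and topic-word (phi) distributions
    with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    n_dk = [[0] * num_topics for _ in docs]        # topic counts per document
    n_kw = [[0] * V for _ in range(num_topics)]    # word counts per topic
    n_k = [0] * num_topics                         # total tokens per topic
    z = []                                         # topic assignment of each token

    # Random initial assignment of topics to tokens.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the current token from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                           for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    theta = [[(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha)
              for k in range(num_topics)] for d, doc in enumerate(docs)]
    phi = [[(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
           for k in range(num_topics)]
    return theta, phi, vocab


# Toy preprocessed reviews (illustrative only).
toy_docs = [["plot", "story", "plot"], ["paper", "print", "paper"],
            ["story", "plot"], ["print", "paper"]]
theta, phi, vocab = lda_gibbs(toy_docs, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:2]
    print(k, [w for _, w in top])
```

Each row of `theta` is the topic mixture of one review and each row of `phi` is the word distribution of one topic. The highest-probability words in a row of `phi` serve as the feature-related words of that topic.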

IV. EXPERIMENTAL PROCESS AND RESULTS ANALYSIS

A. DATA SOURCE
In order to assess the performance of the proposed method, an investigation of the e-commerce product experience is conducted in the research. The experimental data are derived from online reviews of best-selling English editions of the Chinese classic Tao Te Ching on Amazon. As the largest online bookstore in the world, Amazon enables customers to make comments in the product review section in terms of book content, printing quality, reading experiences, etc. Meanwhile, it divides all reviews into ''positive reviews'' and ''critical reviews'' in accordance with readers' star ratings, which provides sufficient data with explicit sentiment polarities for the investigation. Based on the information above, a total of 5480 reviews of the best-selling Tao Te Ching books have been collected as a data source, with 4678 positive reviews and 802 critical reviews. Subsequently, 3742 positive comments and 642 negative comments are randomly chosen as the training set, and the remaining 936 positive comments and 160 negative comments are used as the testing set. Samples of the review texts are shown in Table 1.

B. SENTIMENT CLASSIFICATION RESULTS
BosonNLP is an open-source sentiment dictionary constructed automatically from millions of emotionally labeled texts from network platforms such as microblogs, news sites and forums [52]. With broad coverage of non-normative text, the dictionary includes many Internet terms and informal abbreviations, which makes it well suited to social media content and word-of-mouth reviews across industries. Consequently, BosonNLP is adopted as the benchmark dictionary for sentiment analysis in this paper. To prepare the data for sentiment analysis, the procedures described in Section III are carried out in sequence: text preprocessing, sentiment dictionary extension, sentiment feature extraction, text vectorization and dimensionality reduction.
In the dimensionality reduction step, the contribution rate, which measures how much of the sentiment information in the original data is retained in the dimensionality-reduced data, is used to determine the number of reduced dimensions. The contribution rates for different dimensions are calculated using formula (21) and shown in Figure 4. In this research, the contribution rate threshold is set to 98%, which reduces the dimension of the text vector from 300 to 181. In formula (21), CR denotes the contribution rate of the first k components of the dimensionality-reduced data, and the eigenvalue λ is obtained by formula (8).
To evaluate the sentiment classification performance of the proposed method objectively, VADER-based, BosonNLP-based, KNN-based, SVM-based and BERT-based methods are used for comparison in the experiment. Precision, Recall and F-score are used as the performance indexes.
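The contribution-rate criterion used for dimensionality reduction above can be sketched as follows, assuming the usual cumulative form: the sum of the first k eigenvalues divided by the sum of all eigenvalues. This is an assumption about formula (21), not a reproduction of it. The function name and the eigenvalues are invented for illustration.

```python
def components_for_contribution(eigenvalues, threshold=0.98):
    """Return the smallest k whose leading eigenvalues reach `threshold`
    of the total, together with the contribution rate actually achieved."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k, running / total
    return len(ordered), 1.0


# Hypothetical eigenvalues; cumulative rates are 0.50, 0.80, 0.95, 0.99, 1.00.
eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]

k_90, cr_90 = components_for_contribution(eigenvalues, threshold=0.90)
k_98, cr_98 = components_for_contribution(eigenvalues, threshold=0.98)
```

Raising the threshold keeps more components. In the paper's setting, a 98% threshold is what leads from 300 to 181 dimensions.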
The sentiment classification has the following four possible outcomes:
True Positives (TP): the number of instances in which the classifier correctly predicts positive reviews as positive.
False Positives (FP): the number of instances in which the classifier incorrectly predicts negative reviews as positive.
True Negatives (TN): the number of instances in which the classifier correctly predicts negative reviews as negative.
False Negatives (FN): the number of instances in which the classifier incorrectly predicts positive reviews as negative.
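The four outcomes can be counted directly from true and predicted labels. The sketch below does so and then derives Precision, Recall and F-score in their usual forms, taking the F-score to be the harmonic mean (F1), which is an assumption about the paper's definition. The labels and function names are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive="pos"):
    """Count TP, FP, TN and FN for a binary sentiment classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def precision_recall_f(tp, fp, fn):
    """Precision, Recall and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Made-up labels for six reviews.
y_true = ["pos", "pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "pos", "neg", "pos", "neg", "pos"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall, f_score = precision_recall_f(tp, fp, fn)
```

Note that TN does not enter any of the three indexes, so on a dataset dominated by positive reviews they mainly describe how well the positive class is handled.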
Precision is the proportion of truly positive samples among all samples predicted as positive, which is calculated as:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall, also called sensitivity, measures the percentage of actual positives that are correctly predicted as positive, which is calculated as:

$$\text{Recall} = \frac{TP}{TP + FN}$$

F-score captures the properties of both Precision and Recall and thus evaluates the accuracy and coverage of the prediction results together. It is calculated as:

$$F\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Table 2 shows the results of text sentiment classification using the different methods. The proposed method outperforms the other methods in terms of Precision, Recall and F-score. Currently, one of the major goals of sentiment lexicon construction is to enhance cross-domain adaptability. However, this causes a lexicon to perform differently on texts from different domains. As Hutto et al. reported [54], the VADER lexicon shows clear performance differences across social media texts, product reviews, movie reviews, etc. Evidently, achieving stable performance with dictionary-based methods remains challenging. In addition, BERT builds a multilayer bidirectional encoder network on the Transformer and uses a "pre-training + fine-tuning" mode for sentiment analysis [55]. Nevertheless, it ignores the unequal contribution of sentiment features, so certain important sentiment features risk being drowned out. In comparison, the overall performance of our method is more stable.

C. READING EXPERIENCE ANALYSIS
According to the statistics of the collected comment data, positive reviews account for 85%, which indicates that the majority of readers have an overall appreciation of the Tao Te Ching books available on Amazon. Yet 15% of the readers hold negative attitudes for a variety of reasons. To understand customers' reading experiences comprehensively and to explore the elements that influence their positive and negative attitudes, sentiment topics are extracted from the positive and negative reviews separately. Because perplexity measures the quality of a probabilistic model, with a lower perplexity suggesting a better fit, the optimal number of topics can be selected by evaluating the perplexity of LDA models fitted with varying numbers of topics. The perplexity of topic extraction for different numbers of topics is therefore calculated using formula (25):

$$\text{Perplexity}(X) = \exp\left\{-\frac{\sum_{d=1}^{n} \log p(W_d)}{\sum_{d=1}^{n} N_d}\right\} \tag{25}$$

where $X$ denotes the set of review texts, $n$ denotes the number of review texts and $N_d$ the number of words in each review text, $W_d$ denotes the words in the review text, and $p(W_d)$ denotes the probability of $W_d$ in the review text. The results are shown in Figure 5, which indicates that, for both positive and negative review texts, perplexity is lowest when the number of topics is 3. Accordingly, three topics are extracted from the positive and the negative review texts respectively, as shown in Figure 6 and Figure 8. In addition, keyword networks are constructed from the co-occurrence frequencies of keywords in the reviews, as shown in Figure 7 and Figure 9.

As shown in Figure 6, three topics are identified from positive reviews: comprehensive evaluations, reading inspirations and recommendation degrees. The first topic dimension reflects readers' general positive impressions of Tao [...] thought", "helpful insight", "spiritual philosophy", "inner peace", "teach way", "subtle wisdom", "seek wisdom", "live taoism", etc.
Accordingly, readers' discussion of this topic centers on the Taoist philosophical thought and wisdom conveyed by the Tao Te Ching books, which provide beneficial enlightenment for people's daily life, study and work. The third topic dimension concerns readers' degree of recommendation of the books, with salient terms such as "five", "star", "four", "highly", "recommend" and "valuable". "Five stars" is the full score in Amazon's star rating and indicates the highest degree of recommendation.
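The perplexity-based choice of the number of topics described earlier in this section can be sketched as follows, assuming the common definition of perplexity as the exponential of the negative total log-likelihood divided by the total word count. The per-document log-likelihoods would come from a fitted LDA model. Here they are replaced by a uniform toy model, and the perplexity scores per topic number are invented placeholders, not values from Figure 5.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """exp(-sum_d log p(W_d) / sum_d N_d) over a set of review texts."""
    return math.exp(-sum(doc_log_likelihoods) / sum(doc_lengths))


def best_topic_count(perplexity_by_k):
    """Pick the number of topics with the lowest perplexity."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# Sanity check: a model that assigns every word probability 1/V
# has perplexity exactly V, regardless of document lengths.
V = 50
lengths = [12, 7, 30]
log_likelihoods = [n * math.log(1 / V) for n in lengths]
uniform_perplexity = perplexity(log_likelihoods, lengths)

# Hypothetical perplexities for models fitted with 2 to 5 topics.
scores = {2: 412.0, 3: 371.5, 4: 389.2, 5: 405.8}
chosen = best_topic_count(scores)
```

The uniform check gives an intuition for the scale: a perplexity of V means the model is, on average, as uncertain as a uniform guess among V words.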
To sum up, the reading experience analysis shows that readers' concerns regarding online Tao Te Ching books focus on translation, book content, writing style, book quality and practical value, which constitute the major factors affecting their reading experiences. Elements such as informative content, beautiful writing and good practicability largely contribute to readers' positive attitudes towards the books, while poor printing, packaging, translation fidelity and readability generally result in negative emotions.
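The keyword networks referred to in this section (Figures 7 and 9) are built from keyword co-occurrence frequencies. A minimal sketch of the counting step is given below, assuming that two keywords co-occur when they appear in the same review. The paper does not state the co-occurrence window, so this choice, the function name and the toy reviews are all assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tokenized_reviews, keywords):
    """Count, for each keyword pair, the reviews in which both appear."""
    keywords = set(keywords)
    edges = Counter()
    for tokens in tokenized_reviews:
        present = sorted(keywords.intersection(tokens))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Toy tokenized reviews, not taken from the dataset.
toy_reviews = [
    ["highly", "recommend", "translation"],
    ["recommend", "translation", "clear"],
    ["highly", "recommend"],
]
edges = cooccurrence_edges(toy_reviews, ["highly", "recommend", "translation"])
```

Each counted pair can then serve as a weighted edge between two keyword nodes when drawing the network.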

V. CONCLUSION
The rapid development of the Internet has accelerated the growth of e-commerce, and the commercial value of online reviews has become increasingly prominent. Against this background, sentiment analysis has become a promising tool for mining the value of online reviews in commercial activities. In this paper, we proposed a fusion sentiment analysis method for e-commerce product experience analysis that combines the strength of the sentiment dictionary in identifying emotional information with the strong generalization ability of machine learning algorithms. Inspired by the sensitivity of the sentiment dictionary to emotional information, our method uses the dictionary to extract sentiment features. On this basis, the sentiment polarities of review texts are identified by the SVM algorithm, and sentiment topics are extracted separately from reviews of each polarity through the LDA model. In the present age of information explosion, new sentiment words and expressions that have not yet been included in the dictionary put emotional information at risk of being omitted. Accordingly, we extend the dictionary based on semantic similarity to enlarge its coverage. Meanwhile, considering the unequal contribution of sentiment words in reviews, we introduce a weighting method to measure the sentiment contribution. The comparison experiment shows that the proposed method outperforms the other methods in terms of Precision, Recall and F-score. In addition, the feasibility and validity of the proposed method have been verified through the investigation of consumers' reading experiences of online books on Amazon. The results demonstrate that the method can accurately determine reviews' emotional tendencies and identify customers' major concerns from reviews with different sentiment polarities.
The theoretical implications of this research are mainly manifested in two aspects. First, given that the method shows good performance in sentiment classification and topic extraction, future research can use it as a foundation for investigating consumers' emotional tendencies and major concerns regarding other products or services on e-commerce platforms. Second, the research combines textual analysis techniques with machine learning algorithms to identify and analyze sentiment polarities and sentiment topics in a large number of online reviews, thus achieving effective quantitative analysis of the sentiment information within the reviews. The study enriches research on value mining of online comments and on the application of sentiment analysis in the business domain.
As for the practical implications, the research shows that the proposed method makes it possible to capture, in an objective manner, the elements affecting the reading experience from a large number of reviews. The findings help customers assess online books comprehensively and guide them towards efficient purchasing decisions. More significantly, this research not only provides valuable insights for authors in book design and content optimization, but also helps publishers track readers' demands and develop reasonable marketing policies. The method can also be applied to experience analysis of other e-commerce products, so as to support the optimization of products and services. Overall, this paper illustrates the practical significance and application value of the fusion sentiment analysis method in the business field.
As commodity attributes become more diverse, the sentiment information contained in consumers' reviews has become increasingly abundant. The present research explores the online product experience from only two aspects, positive and negative, without implementing multi-dimensional sentiment analysis. Future research will conduct fine-grained sentiment analysis based on diverse affective aspects (such as surprise, like, anger and disgust) to investigate user experiences of online products or services more comprehensively.