A Multi-Criteria Collaborative Filtering Approach Using Deep Learning and Dempster-Shafer Theory for Hotel Recommendations

This paper addresses the problem of multi-criteria recommendation in the hotel industry. The main focus is to analyze user preferences from different aspects based on multi-criteria ratings and develop a new multi-criteria collaborative filtering method for hotel recommendations. Particularly, the proposed recommendation system integrates matrix factorization into a deep learning model to predict the multi-criteria ratings, and then the evidential reasoning approach is adopted to model the uncertainty of those ratings represented as mass functions in Dempster-Shafer theory of evidence. Finally, Dempster’s rule of combination is utilized to aggregate those multi-criteria ratings to obtain the overall rating for recommendation. Extensive experiments conducted on a real-world dataset demonstrate the effectiveness and efficiency of the proposed method compared with other multi-criteria collaborative filtering methods.


I. INTRODUCTION
Nowadays, personalized recommendation plays a vital role in most online service platforms, such as e-commerce and new media websites, by providing users with suggestions about information that might be of interest to them based on their preferences or historical data. For example, recommendation services have been incorporated into online market platforms such as e-commerce websites [1], music/video/movie online stores [2]-[4], and tourism [5]. In the tourism domain, travelers often spend time searching online travel websites for hotels that match their own requirements; for example, a business traveler will prioritize the price and location of the hotel, while a tourist may put convenience and cleanliness first. Moreover, a traveler can only select one hotel at a time, and it is quite inconvenient to change the choice if wrongly made. Therefore, a good hotel RS is especially helpful for saving travelers' time and reducing advertising costs for hotel owners. (The associate editor coordinating the review of this manuscript and approving it for publication was Yin Zhang.)
Generally, an RS aims to suggest items that users might have an interest in. RS methods can be categorized into three groups: collaborative filtering approaches recommend items based on user-item historical interactions such as ratings or feedback; content-based approaches extract the mutual information between users and items to make recommendations and are well suited to video, audio, and text data; and hybrid approaches integrate multiple techniques [6]. Most RSs rely on the rating history of users on items, which supports learning user preferences, item characteristics, and additional correlation information between users and/or items [7], [8]. Then, the unknown rating of a user on an item can be predicted using such information. This approach underlies user-based and item-based collaborative filtering methods, which take the average rating from the k-nearest neighbors of a user/item to predict the unknown rating [9]. On the other hand, matrix factorization or singular value decomposition (SVD) approaches can learn the preferences of each user over items by optimizing embedding matrices for all users and items [10], [11]. However, finding optimal values for the embedding matrices is time-consuming compared to item-based and user-based collaborative filtering methods. Recently, deep learning has emerged as a powerful technique for handling collaborative filtering problems because it supports nonlinear transformation, representation learning, sequence modeling, and flexibility; such models are end-to-end differentiable and provide suitable inductive biases for RS [6].
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
As for the RS problem in the tourism domain, the collaborative filtering method is well suited to hotel recommendation. However, a single-criterion rating cannot capture all aspects of the user experience. Tripadvisor, perhaps the most well-known website for hotel browsing and booking, uses a multi-criteria rating system to represent users' ratings of hotels [12]. These criteria include the overall rating and seven sub-criteria ratings: Check in/front desk, Cleanliness, Service, Location, Room, Value, and Business Service. Thus, a multi-criteria RS should be developed to deal with such multi-criteria ratings. Several attempts have been made at developing multi-criteria RSs, and most of them are based on conventional methods for the single-criterion CF problem [13]. The overall rating is computed by aggregating the sub-criteria ratings, for which basic techniques such as the average [14] or a weighted sum [15] can be used. However, these basic aggregation methods do not take into account the uncertainty of sub-criteria ratings. As mentioned, users may have a critical bias in some criteria, and in that case the overall rating may obscure these facts.
In this study, we aim at developing a new collaborative filtering method that combines a deep neural network (DNN) model with evidential reasoning based on Dempster-Shafer theory of evidence for multi-criteria RS. In particular, we first design a DNN model that integrates a matrix factorization technique with a multilayer perceptron for predicting criteria ratings, each of which is considered as a piece of evidence for making prediction of the overall rating. As predicted criteria ratings are inherently associated with uncertainty, we propose a new method based on Dempster-Shafer theory of evidence for modeling criteria rating predictions and develop a discounting-and-combination scheme for multi-criteria rating aggregation to obtain the overall rating. A real-world multi-criteria rating dataset extracted from the Tripadvisor platform is used for experiments to demonstrate the effectiveness and applicability of the proposed method.
The rest of this paper is organized as follows. Section II briefly recalls some related work and Section III provides preliminaries on matrix factorization, multilayer perceptron with ReLU units and Dempster-Shafer theory of evidence that form the basis for the development of the proposed recommendation method detailed in Section IV. Section V describes the experimental results and analysis.
Finally, Section VI wraps up the paper with conclusions and future work.

II. RELATED WORK
The collaborative filtering problem has a long history alongside the development of the Internet. Neighborhood-based methods use the similarity between users and/or items to predict the degree of preference [16], [17]. Besides that, model-based approaches also yield significant results by learning the correlation between users and items; common model-based approaches are Latent Semantic Analysis [18], Support Vector Machines [19], Bayesian Clustering [20], and Singular Value Decomposition (SVD) [11], [21]. Among these models, SVD is one of the most widely used techniques, and it has also had a positive influence on deep learning based approaches.
In terms of related work on multi-criteria RS, Jannach et al. [22] used basic techniques such as item-based or user-based collaborative filtering to predict the sub-criteria scores, with the overall rating then obtained by an aggregation function. Moreover, several deep learning based approaches for handling the multi-criteria RS problem have also been investigated in [23], [24]. Recently, Hong and Jung [25] proposed a multi-criteria tensor model that considers not only user preferences and multiple ratings but also incorporates cultural factors into the recommendation process. Note that in this model the overall rating is treated the same as the individual criteria ratings, and tensor factorization is applied to predict unobserved user preferences without using an aggregation operator.
An aggregation technique is required for combining the multi-criteria ratings; such aggregation can be as simple as an average [14] or a weighted sum [15]. For more complex aggregation, Nilashi et al. [5] proposed using an adaptive network-based fuzzy inference system (ANFIS) to learn decision rules for predicting the overall rating. Moreover, Nassar et al. [23] introduced the first DNN model for solving this problem. More recently, Shambour [26] employed deep autoencoders to exploit hidden relations between users with regard to multi-criteria preferences, with the arithmetic mean used for aggregation to compute the overall rating prediction, while Sinha and Dhanalakshmi [27] adopted Social Spider Optimization (SSO) to seek the optimal weights of the DNN models of multi-criteria recommender systems.
Different from the aforementioned work, in this paper we first design a DNN model for predicting individual multi-criteria ratings, which are considered as pieces of evidence supporting the overall rating prediction. We then propose to model these criteria rating predictions by means of so-called mass functions in Dempster-Shafer theory of evidence. This approach allows us not only to represent the uncertainty inherently associated with criteria rating predictions but also to develop a flexible framework for multi-criteria rating aggregation that takes the criteria weights into account.

III. PRELIMINARIES
In this section we first introduce the problem of multi-criteria collaborative filtering and then recall basic concepts used in the development of the proposed method.

A. MULTI-CRITERIA COLLABORATIVE FILTERING PROBLEM
Let U, I, C be the sets of users, items, and criteria, with cardinalities N_U, N_I, and N_C, respectively. Let R be a dataset consisting of the known multi-criteria ratings r_{uic}, each representing the rating of user u on item i under criterion c, for u ∈ U, i ∈ I, and c ∈ C. The rating values r_{uic} are normalized; for example, Tripadvisor uses a five-point Likert scale {1, 2, 3, 4, 5} representing {Terrible, Poor, Average, Very_Good, Excellent}, respectively. As illustrated in Figure 1, the dataset R can be viewed as a 3rd-order tensor, which is usually sparse because in practice each user rates only a few items. The collaborative filtering problem aims at predicting the criteria ratings r̂_{uic} that are unknown for certain users, items, and criteria, as well as their overall ratings r̂_{ui0}.
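To make the data representation concrete, the sparse 3rd-order rating tensor can be sketched as a dictionary keyed by (user, item, criterion). The sizes, indices, and ratings below are illustrative toy values, not the paper's dataset.

```python
# A minimal sketch of the multi-criteria rating dataset R as a sparse
# 3rd-order tensor, stored as a dict keyed by (user, item, criterion).
N_U, N_I, N_C = 4, 3, 4  # toy sizes; the real dataset is far larger

R = {
    (0, 1, 0): 5,  # user 0 rates item 1 on criterion 0 (e.g. Value)
    (0, 1, 1): 4,
    (2, 0, 0): 3,
    (2, 0, 3): 2,
}

# Sparsity: fraction of (user, item, criterion) cells with no rating.
sparsity = 1 - len(R) / (N_U * N_I * N_C)
print(f"sparsity = {sparsity:.4f}")
```

On the real dataset this same computation yields the 99.9526% sparsity reported in Section V.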

B. MATRIX FACTORIZATION FOR COLLABORATIVE FILTERING PROBLEM
Matrix factorization is the most widely used technique for the collaborative filtering problem; it decomposes the known rating information into user and item matrices of the same latent dimension. Singular value decomposition (SVD) is the most basic approach for factorizing the single-criterion collaborative filtering problem [10], [11]. In particular, users and items are mapped into f-dimensional latent factor spaces, such that a predicted rating is the inner product in that space, where f is the embedding size. Note that, for single-criterion collaborative filtering, each user can give only a single rating to an item; let r_{ui} and r̂_{ui} be the known and predicted ratings, respectively. Let q_i ∈ R^f be the embedding vector of item i and p_u ∈ R^f the embedding vector of user u; then the predicted rating of user u on item i is the inner product r̂_{ui} = p_u^T q_i. The matrices p ∈ R^{N_U×f} and q ∈ R^{N_I×f} can be learned by a machine learning technique to minimize the loss with respect to the known ratings in the dataset R. Formally, the objective function can be defined as:

min_{p,q} Σ_{(u,i)∈R} (r_{ui} − p_u^T q_i)² + λ(‖p_u‖² + ‖q_i‖²),

where λ is the parameter controlling the sparsity of the embedding matrices. This objective can then be optimized by Stochastic Gradient Descent (SGD), the Adam optimizer, or Alternating Least Squares (ALS) [10]. SVD can be further extended by adding a bias for each user and item, which captures individual variations. Let b_u and b_i denote the bias variables for user u and item i, respectively; the new objective function with bias information is defined as:

min_{p,q,b} Σ_{(u,i)∈R} (r_{ui} − μ − b_u − b_i − p_u^T q_i)² + λ(‖p_u‖² + ‖q_i‖² + b_u² + b_i²),

with μ being the average rating over all known ratings.
In this research, the SVD approach is adopted as a layer in our deep learning model.
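The biased objective above can be minimized with per-rating SGD updates. The following is a minimal sketch on hypothetical toy data; the hyperparameters (f, lr, lam) are illustrative, not the paper's settings.

```python
import numpy as np

# Biased matrix factorization trained with SGD on toy ratings (u, i, r_ui).
rng = np.random.default_rng(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 2.0)]
N_U, N_I, f, lr, lam = 3, 2, 4, 0.01, 0.02

p = 0.1 * rng.standard_normal((N_U, f))   # user embeddings
q = 0.1 * rng.standard_normal((N_I, f))   # item embeddings
b_u, b_i = np.zeros(N_U), np.zeros(N_I)   # user/item biases
mu = np.mean([r for _, _, r in ratings])  # global average rating

for _ in range(200):
    for u, i, r in ratings:
        pred = mu + b_u[u] + b_i[i] + p[u] @ q[i]
        e = r - pred                       # prediction error
        b_u[u] += lr * (e - lam * b_u[u])  # gradient steps with
        b_i[i] += lr * (e - lam * b_i[i])  # L2 regularization
        p[u], q[i] = (p[u] + lr * (e * q[i] - lam * p[u]),
                      q[i] + lr * (e * p[u] - lam * q[i]))
```

After training, mu + b_u[u] + b_i[i] + p[u] @ q[i] approximates the known ratings; the same decomposition serves as the embedding layer in the proposed model.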

C. MULTILAYER PERCEPTRON (MLP)
The MLP is a basic feed-forward DNN with multiple hidden layers of threshold activation functions between the input layer and the output layer. MLPs can be regarded as universal approximators, represented by stacked layers of nonlinear activation functions [6]. Because the activation functions are not necessarily strictly binary classifiers, MLPs are suitable for predicting multi-criteria Likert-scale scores [6].
In this paper, a customized MLP to estimate the preferences of users and items is proposed to handle multi-criteria RS in the travel domain.

D. RECTIFIED LINEAR UNIT (ReLU)
In the context of DNNs, a ReLU is a simple neuron unit whose activation function returns the positive part of its input [28], [29]:

ReLU(x) = max(0, x),

where x is the input value. This function is also known as the ramp function. ReLU is widely used in DNNs for feature extraction, computer vision, and speech recognition, with high effectiveness compared to basic activation functions such as the logistic sigmoid and hyperbolic tangent [29].
Our DNN model defines multiple hidden layers of ReLU to increase the accuracy of the SVD layer.
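The ramp function and a single ReLU hidden layer can be sketched in a few lines; the weights below are fixed toy values, not learned parameters.

```python
import numpy as np

def relu(x):
    """Ramp function: elementwise positive part of the input."""
    return np.maximum(0.0, x)

x = np.array([1.0, -1.0, 2.0])
assert relu(x).tolist() == [1.0, 0.0, 2.0]

# One hidden layer: h = ReLU(W x + b), illustrative shapes only.
W = np.array([[1.0, -1.0, 0.5], [0.0, 2.0, -1.0]])
b = np.array([0.1, -0.2])
h = relu(W @ x + b)
print(h)  # non-negative activations
```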

E. DEMPSTER-SHAFER THEORY (DST)
DST, also called evidence theory or the theory of belief functions, provides a general framework for modeling and reasoning with uncertain and incomplete information, as well as a powerful tool for combining evidence from multiple sources [30], [31]. In the context of DST, a frame of discernment X is defined as a finite set of mutually exclusive and exhaustive hypotheses, e.g., the set of all possible answers to a given question. A piece of evidence regarding the true answer to the question is represented by a so-called mass function m : 2^X → [0, 1] satisfying:

m(∅) = 0 and Σ_{A∈2^X} m(A) = 1,

where 2^X is the power set of X. For A ∈ 2^X, the quantity m(A) is interpreted as the measure of belief exactly allocated to the hypothesis ''the true answer is in A''. Two evidential operations that play an important role in evidential reasoning are the discounting operation and Dempster's rule of combination [30]. The discounting operation is used when a source of evidence represented by a mass function m is known to be reliable with probability α. Then we can discount the mass function m at a discount rate of (1 − α), resulting in a new mass function m^α defined by

m^α(A) = α · m(A), for A ≠ X,
m^α(X) = α · m(X) + (1 − α).

When two distinct sources of information provide two corresponding pieces of evidence on the same frame X, represented by two mass functions m_1 and m_2, we can use Dempster's rule of combination to generate the combined mass function, denoted m_⊕ = (m_1 ⊕ m_2) (also called the orthogonal sum of m_1 and m_2), which is defined, for any A ∈ 2^X \ {∅}, as

m_⊕(A) = (1 / (1 − m_⊕(∅))) Σ_{B∩C=A} m_1(B) m_2(C),

where

m_⊕(∅) = Σ_{B∩C=∅} m_1(B) m_2(C)

is the combined mass assigned to the empty set before normalization. Clearly, Dempster's rule of combination is applicable only when m_⊕(∅) < 1.
For decision making, a mass function m encoding the available evidence must be transformed into a so-called pignistic distribution function [32], denoted Bp_m : X → [0, 1], which is defined as

Bp_m(x) = Σ_{A⊆X, x∈A} m(A) / |A|.

In this study, we apply DST to model the uncertainty associated with individual criteria ratings predicted by a DNN model and to aggregate these criteria ratings into the overall rating. Recall that the set of rating grades used in hotel rating consists of five levels {1, 2, 3, 4, 5} representing {Terrible, Poor, Average, Very_Good, Excellent}, respectively. The use of DST for the multi-criteria collaborative filtering problem proposed in this paper is basically similar to the evidential reasoning approach for multiple attribute decision-making under uncertainty studied in [33]-[35].
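The three DST operations recalled above can be sketched directly, representing a mass function as a dict mapping frozensets of hypotheses to masses. The frame and the example mass values are illustrative.

```python
X = frozenset({1, 2, 3, 4, 5})  # frame of discernment: rating grades

def discount(m, alpha):
    """Discount m at rate (1 - alpha): scale masses, move the rest to X."""
    d = {A: alpha * v for A, v in m.items()}
    d[X] = d.get(X, 0.0) + (1.0 - alpha)
    return d

def combine(m1, m2):
    """Dempster's rule: conjunctive combination with normalization."""
    raw = {}
    for A, v1 in m1.items():
        for B, v2 in m2.items():
            C = A & B
            raw[C] = raw.get(C, 0.0) + v1 * v2
    k = raw.pop(frozenset(), 0.0)  # conflict mass on the empty set
    return {A: v / (1.0 - k) for A, v in raw.items()}

def pignistic(m):
    """BetP: spread each mass equally over the elements of its focal set."""
    bp = {x: 0.0 for x in X}
    for A, v in m.items():
        for x in A:
            bp[x] += v / len(A)
    return bp

m1 = {frozenset({4}): 0.7, X: 0.3}
m2 = {frozenset({4}): 0.5, frozenset({5}): 0.3, X: 0.2}
m12 = combine(discount(m1, 0.9), m2)
```

Here m1 is first discounted at rate 0.1 (reliability α = 0.9) before combination, exactly the discounting-and-combination scheme used in Section IV.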

IV. THE PROPOSED RECOMMENDATION METHOD
In this section we describe the proposed recommendation method, which consists of two main phases, training and prediction: (1) in the first phase, a DNN model for predicting individual criteria ratings is trained, and the optimal parameters for DST modeling of the uncertainty associated with predicted criteria ratings are also determined using the collected data; (2) in the second phase, the trained DNN model is first used to predict unknown criteria ratings, which then serve as sources of evidence, represented via DST modeling, for predicting the overall rating. The outline of the proposed method is graphically illustrated in Figure 2.

A. MULTI-CRITERIA DNN MODEL
The proposed DNN model starts with an embedding layer in which each user and item is mapped to an f-dimensional vector using the SVD method. Accordingly, the model uses a 2nd-order tensor p ∈ R^{N_U×f} for embedding users and a 2nd-order tensor q ∈ R^{N_I×f} for embedding items. However, instead of using the inner product of p_u and q_i to predict the rating of user u on item i, p_u and q_i are concatenated into a single vector before being forwarded to the next layer, similar to the approach taken by Nassar et al. [23]. The flowchart of the proposed DNN model is depicted in Figure 3.
The DNN model then continues with a series of layers of ReLU neurons. Because the output of the SVD layer has 2 × f neurons, the first rectified linear layer receives 2 × f signals. In this study, the number of hidden layers is set between 3 and 6 to optimize performance on the Tripadvisor dataset.
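A forward-pass sketch of this architecture follows: embedding lookup, concatenation, a stack of ReLU layers, and N_C linear outputs. All sizes and the (randomly initialized, untrained) weights are illustrative; a real model would be trained end to end, e.g. with SGD or Adam.

```python
import numpy as np

rng = np.random.default_rng(0)
N_U, N_I, f, N_C = 100, 50, 8, 4

P = rng.standard_normal((N_U, f))  # user embedding table
Q = rng.standard_normal((N_I, f))  # item embedding table

def forward(u, i, hidden=(32, 16)):
    """Untrained forward pass: concat embeddings -> ReLU stack -> N_C outputs."""
    x = np.concatenate([P[u], Q[i]])         # 2*f input signals
    for h in hidden:                         # hidden ReLU layers
        W = rng.standard_normal((h, x.size)) / np.sqrt(x.size)
        x = np.maximum(0.0, W @ x)
    W_out = rng.standard_normal((N_C, x.size)) / np.sqrt(x.size)
    return W_out @ x                         # one score per criterion

print(forward(3, 7).shape)
```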
Note that the number of neurons in the output layer of the proposed DNN model must match the number of criteria, which is 4 for our testing dataset. For optimizing the proposed DNN model, basic optimizers such as Stochastic Gradient Descent (SGD) or Adam can be applied [36].

B. EVIDENCE MODELING FOR CRITERIA RATING PREDICTIONS
Each predicted criterion rating is modeled as a piece of evidence represented by a mass function, where the masses are determined based on a probability density function (PDF) estimated using the training data.
Particularly, let r̂_{uic} denote the predicted rating of user u on item i under criterion c. Then, the mass function m_{r̂_{uic}} representing this prediction is defined over the singleton rating grades x ∈ {1, 2, 3, 4, 5} as

m_{r̂_{uic}}({x}) ∝ PDF(r̂_{uic}, σ, x),   (10)

where PDF(r̂_{uic}, σ, x) is the Gaussian probability density function:

PDF(r̂_{uic}, σ, x) = (1 / (σ√(2π))) exp(−(x − r̂_{uic})² / (2σ²)),   (11)

and the masses are normalized such that Σ_x m_{r̂_{uic}}({x}) = 1.   (12)

Figure 4 graphically illustrates two specific cases of evidence modeling. It is worth noting that the softening parameter σ used in the Gaussian PDF is also estimated using the training data. In other words, it can be learned from the known criteria ratings r_{uic} and their overall ratings in the training dataset R. More particularly, the optimal value of σ is determined by minimizing the root mean squared error (RMSE):

σ* = argmin_σ RMSE(R̂_σ, R),   (13)

where R̂_{σ*} is the set of all predicted overall ratings, each obtained by DST aggregation of the individual criteria ratings generated with softening parameter σ*.
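The evidence-modeling step (10)-(12) can be sketched as follows: a predicted criterion rating is turned into a mass function over the five singleton grades via a Gaussian PDF centered at the prediction, then normalized. The σ default of 1.35 is the value tuned on the Tripadvisor data in Section V.

```python
import numpy as np

grades = np.array([1, 2, 3, 4, 5])  # rating grades of the frame

def mass_from_prediction(r_hat, sigma=1.35):
    """Mass function on singleton grades from a predicted rating r_hat."""
    pdf = np.exp(-0.5 * ((grades - r_hat) / sigma) ** 2) / (
        sigma * np.sqrt(2 * np.pi))
    m = pdf / pdf.sum()  # normalize so the masses sum to 1
    return dict(zip(grades.tolist(), m.tolist()))

m = mass_from_prediction(4.2)
print(m)
```

A prediction of 4.2 yields the largest mass on grade 4, with neighboring grades receiving smaller masses that shrink as σ decreases.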

C. EVIDENCE COMBINATION FOR OVERALL RATING PREDICTION
In the previous step, each predicted criterion rating r̂_{uic} is represented by a mass function m_{r̂_{uic}} defined by (10)-(12). We thus have N_C mass functions representing N_C pieces of evidence from the criteria predictions r̂_{uic}, for c = 1, ..., N_C. Considering further that each individual criterion c is associated with a weight w_c representing its relative importance in contributing to the overall rating, we combine these N_C mass functions m_{r̂_{uic}}, taking their relative importance into account, to generate a combined mass function for predicting the overall rating. This can be done within the framework of DST by making use of discounting and Dempster's rule of combination. Formally, the overall mass function, denoted m_{r̂_{ui}}, generated by combining the m_{r̂_{uic}}'s while taking their weights w_c into account, is defined by:

m_{r̂_{ui}} = m^{w_1}_{r̂_{ui1}} ⊕ m^{w_2}_{r̂_{ui2}} ⊕ ... ⊕ m^{w_{N_C}}_{r̂_{uiN_C}},   (14)

where m^{w_c}_{r̂_{uic}} is the discounted mass function as defined in (5)-(6), and ⊕ is Dempster's combination operator as defined by (7) above. Specifically, for two criteria c_1 and c_2 with weights w_1 and w_2, their mass functions are first discounted by (5)-(6) and then combined via (7), with the normalizing factor K being the conflict mass assigned to the empty set before normalization. The mass assigned to the whole set H by the combined mass function is called the ignorance score, which essentially reflects the ignorance regarding the prediction of the overall rating resulting from multi-criteria rating combination. Intuitively, the higher the weight of a criterion rating, the less ignorance it contributes to the combination. This observation motivated us to determine criteria weights from the data. In particular, the weight w_c of a criterion c is defined as the normalized inverse of the difference between the criterion's average rating and the overall average rating:

w_c = (1 / |r̄_c − r̄|) / Σ_{c'=1}^{N_C} (1 / |r̄_{c'} − r̄|),

where r̄_c is the average rating score of criterion c and r̄ is the average overall rating.
Practically, the interpretation of this weighting method is that the larger the difference between a criterion's average rating and the overall average rating, the less important that criterion's rating is in contributing to the overall rating. It is of interest to note that the weighting technique studied in [15] can also be used to determine the criteria weights. Finally, the aggregated mass function m_{r̂_{ui}} of (14) is considered an uncertain assessment of the overall rating of user u on item i, denoted r̂_{ui}. To make a prediction for the overall rating r̂_{ui}, this mass function is transformed into its corresponding pignistic distribution function Bp_{ui} via (9), and the overall rating r̂_{ui} is then obtained as the expectation:

r̂_{ui} = Σ_{l∈H} l · Bp_{ui}(l),   (19)

for rating grades l ∈ H. As stated, the proposed recommendation method follows a multi-criteria-rating-prediction-and-combination-based approach that allows the uncertainty associated with multi-criteria rating predictions to be incorporated into the combination by means of DST to generate the overall rating prediction. This makes our model different from previously developed models, which are mostly based on the aggregation-function-based approach introduced in [14].
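The overall-rating step can be sketched end to end: data-driven criteria weights, weighted discounting-and-combination, and the pignistic expectation. The average-rating gaps reuse the values reported in Section V (0.596, 0.512, 0.492, 0.509); the four criteria mass functions are toy values standing in for the DNN-derived evidence.

```python
import numpy as np

gaps = np.array([0.596, 0.512, 0.492, 0.509])  # |avg_c - avg_overall|
w = (1.0 / gaps) / (1.0 / gaps).sum()          # criteria weights, sum to 1

X = frozenset({1, 2, 3, 4, 5})                 # frame H of rating grades

def discount(m, alpha):
    d = {A: alpha * v for A, v in m.items()}
    d[X] = d.get(X, 0.0) + (1.0 - alpha)       # move (1-alpha) to the frame
    return d

def combine(m1, m2):
    raw = {}
    for A, v1 in m1.items():
        for B, v2 in m2.items():
            raw[A & B] = raw.get(A & B, 0.0) + v1 * v2
    k = raw.pop(frozenset(), 0.0)              # conflict mass
    return {A: v / (1.0 - k) for A, v in raw.items()}

# Toy evidence for the four criteria (Value, Location, Cleanliness, Service).
criteria_masses = [
    {frozenset({4}): 0.8, X: 0.2},
    {frozenset({4}): 0.6, frozenset({5}): 0.2, X: 0.2},
    {frozenset({3}): 0.5, frozenset({4}): 0.3, X: 0.2},
    {frozenset({4}): 0.7, X: 0.3},
]

m = discount(criteria_masses[0], w[0])
for mc, wc in zip(criteria_masses[1:], w[1:]):
    m = combine(m, discount(mc, wc))

# Pignistic expectation gives the predicted overall rating.
betp = {l: sum(v / len(A) for A, v in m.items() if l in A) for l in X}
r_hat = sum(l * p for l, p in betp.items())
print(f"overall rating = {r_hat:.3f}, ignorance = {m[X]:.3f}")
```

Note how m[X], the ignorance score, stays large here because each criterion is heavily discounted by its small weight, mirroring the analysis in Section V.E.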

D. DNN_DST ALGORITHM
The main steps of the proposed method described above are summarized in Algorithm 1 below.

Algorithm 1 DNN_DST Algorithm
1: Build the multi-criteria DNN model depicted in Figure 3, which takes user u and item i as inputs and returns N_C outputs.
2: Train the DNN model using the known multi-criteria ratings r_{uic} in the dataset R.
3: Learn the optimal softening parameter σ for evidence modeling using (13).
4: Represent the N_C individual criteria ratings as N_C pieces of evidence as detailed in Section IV.B.
5: Combine the N_C pieces of evidence using (14) and make the prediction of the overall rating using (19).
6: return Trained multi-criteria DNN model, σ.
In the following section, we present experiments conducted on a real-world dataset to demonstrate the applicability and efficiency of the proposed multi-criteria recommendation method.

V. EXPERIMENTAL RESULTS AND ANALYSIS

A. DATA SET
A real-world dataset of 533,430 multi-criteria reviews from 291,793 users of 8,297 hotels in Vietnam, extracted from Tripadvisor.com, is used for the experiments. After eliminating isolated users and hotels, the testing dataset comprises 274,572 multi-criteria ratings from 84,579 users of 6,854 hotels, with a sparsity of 99.9526%. The collected criteria are the value rating (27.507%), location rating (27.478%), cleanliness rating (27.341%), service rating (7.511%), and the overall rating (100%). A small portion of the testing dataset is shown in Table 1. So as not to degrade the processing speed of the tensor operations, missing ratings are filled with the average rating of the corresponding row in the dataset.

B. EVALUATION METRICS
To evaluate the prediction performance of the proposed method on the Tripadvisor dataset, we divide the data into training and testing sets using 5-fold cross validation. Let R and R* denote the training and testing datasets, respectively. A recommendation system produces the set R̂ of predicted ratings for the same user-item pairs as in the testing dataset R*. To measure effectiveness, the following basic loss functions and the coefficient of determination are adopted:

• Mean absolute error (MAE), which measures the average magnitude of the errors in a set of forecasts:

MAE = (1/|R*|) Σ_{(u,i)∈R*} |r_{ui} − r̂_{ui}|

• Root mean squared error (RMSE):

RMSE = √( (1/|R*|) Σ_{(u,i)∈R*} (r_{ui} − r̂_{ui})² )

• Coefficient of determination (CoD), which shows the proportion of the variance in the dependent variable that is predictable from the independent variable(s):

CoD = 1 − Σ_{(u,i)∈R*} (r_{ui} − r̂_{ui})² / Σ_{(u,i)∈R*} (r_{ui} − r̄)²,

where r̄ is the mean rating of R*. In terms of complexity analysis, the processing time, including training and testing time, of the proposed and compared methods is also measured.
Note that, for MAE, RMSE, and computation time, lower values mean better performance.
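The three metrics can be computed directly from arrays of true and predicted overall ratings; the values below are illustrative toy numbers, not experimental results.

```python
import numpy as np

r_true = np.array([5.0, 3.0, 4.0, 2.0, 4.0])  # toy ground-truth ratings
r_pred = np.array([4.5, 3.2, 3.8, 2.5, 4.1])  # toy predictions

mae = np.abs(r_true - r_pred).mean()
rmse = np.sqrt(((r_true - r_pred) ** 2).mean())
cod = 1.0 - ((r_true - r_pred) ** 2).sum() / (
    (r_true - r_true.mean()) ** 2).sum()
print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  CoD={cod:.3f}")
```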

C. COMPETITORS
To demonstrate the efficiency and advantages of the proposed method, we conducted experiments using not only conventional approaches such as CF_user [16] and CF_item [17] but also advanced models such as SVD [11], which is based on matrix factorization techniques; SVD++ [37], which extends the SVD model by considering implicit information to further improve prediction accuracy; and RNN4Rec [38], a recurrent neural network (RNN) based approach for session-based recommendations. As these models were developed for the single-criterion CF problem, to apply them to the multi-criteria CF problem we deploy them for criteria rating prediction and use the arithmetic mean to predict the overall rating, as done in [26]. This strategy basically follows the aggregation-function-based approach presented in [14]. The multi-criteria versions of SVD, SVD++ and RNN4Rec are denoted MSVD, MSVD++ and MRNN4Rec, respectively. Python implementations of the aforementioned models can be found in the LibRecommender package in the PyPI repository.
In terms of multi-criteria CF approaches, state-of-the-art methods such as HOSVD_ANFIS [5], [39] and DNN_DNN [23] are implemented for the comparative study. HOSVD_ANFIS provides a multi-criteria recommendation model that not only improves recommendation quality and predictive accuracy but is also able to handle the scalability and sparsity problems in multi-criteria collaborative filtering. Basically, this model integrates HOSVD [37] for dimensionality reduction and cosine-based clustering with ANFIS for inducing fuzzy rules to predict the overall ratings. The DNN_DNN method builds two deep neural networks, one for predicting criteria ratings and the other for learning an aggregation function to be used for the overall rating prediction. In addition, we have also implemented other combinations of the multi-criteria rating prediction and aggregation approaches mentioned above. The first combination, denoted DNN_ANFIS, integrates a DNN model for predicting multi-criteria scores with an ANFIS model for the overall score prediction; the second, denoted HOSVD_DNN, uses HOSVD to train the prediction model for multi-criteria ratings and a DNN for the overall score aggregation. These DNN models are trained using the LibRecommender package and the TensorFlow 2 library.
All experiments were conducted on a high-end computer with an Intel Xeon G-6240M 2.6GHz (18 cores x4) CPU. For the proposed method, we use the PyTorch library to train the DNN model, an evidential reasoning package to analyze the uncertainty, and the pyds package to aggregate the criteria ratings. Figure 5 demonstrates the effectiveness of different optimizers that can be used to optimize the DNN model with different learning rates. As the results show, different optimizers have different optimal learning rate values for the testing dataset. In particular, Root Mean Square Propagation (RMSprop) and Stochastic Gradient Descent (SGD) achieve the lowest MAE scores; because SGD tends to take less computation time than RMSprop, we recommend using SGD to optimize the proposed method. For the rest of the experiments in this paper, the SGD optimizer is used by default.

D. PERFORMANCE ANALYSIS
We also conducted an experiment to see how the selection of the softening parameter σ affects the performance of the proposed method. Figure 6 depicts the relationship between the softening parameter σ and performance in terms of RMSE. From this experiment, the optimal σ values for the Tripadvisor dataset lie between 1.3 and 1.4, and we set σ = 1.35 for the following experiments.

E. IGNORANCE ANALYSIS
Additionally, we conducted an experiment to analyze the ignorance regarding the prediction of the overall ratings resulting from multi-criteria rating aggregation. Figure 7 shows the ignorance scores over the criteria weights involved in criteria rating combination for the testing dataset. Specifically, the ignorance scores resulting from the evidence combination of all pairs of criteria are shown.
The figure shows that the first criterion (the Value criterion, with weight w_1) contributes strongly to the ignorance score in criteria rating combination. This can be intuitively explained by the fact that most travelers prepay the hotel fee long before the date they start staying at the hotel. Thus, they may be satisfied with the price long before they provide ratings, which are usually given during or right after their stay. Consequently, Location, Cleanliness, and Service become more important and sensitive. The numerical results support this: Value has the largest difference from the overall rating value, 0.596 on average, while the other criteria, Location, Cleanliness, and Service, have differences of 0.512, 0.492, and 0.509, respectively. A second observation from the figure is that Location (w_2), Cleanliness (w_3), and Service (w_4) are similar in their contributions to the ignorance score in criteria rating combination.

F. PERFORMANCE COMPARISON
The proposed method (DNN_DST) was first compared with conventional CF approaches, namely the user-based (CF_user) and item-based (CF_item) methods. Figure 8 shows that DNN_DST outperforms these two conventional methods in terms of recommendation accuracy. However, as DNN_DST needs time to train the DNN model, it takes significantly more time than finding the nearest neighbors.
Then, as previously described, a comprehensive experiment was conducted to evaluate the performance of the proposed DNN_DST method by comparing it with recent state-of-the-art methods for the multi-criteria CF problem, namely MSVD, MSVD++, MRNN4Rec, DNN_DNN, HOSVD_DNN, HOSVD_ANFIS, and DNN_ANFIS. Figure 9 shows the experimental results for the evaluation metrics MAE, RMSE, CoD, and computation time according to the four hyperparameters used for tuning. Table 2 summarizes the average scores of the compared methods over the different tuning parameters. As the results indicate, MSVD is the fastest but also the most inaccurate approach because of its simplicity. Besides that, the DNN_DNN method, which is somewhat sophisticated in using two separate DNN models, shows relatively good results in both accuracy and computation time. Furthermore, DNN_ANFIS and HOSVD_ANFIS tend to be the slowest methods, as significant additional time is required for training the ANFIS network, while their results are no better than those of the proposed approach or the other DNN-based methods.
Notably, the results show that in most cases the proposed DNN_DST method achieves the best performance, with the lowest MAE and RMSE and the highest CoD among the compared methods, and it is also the second fastest method.

VI. CONCLUSION
In this study, we proposed a new approach named DNN_DST that integrates a DNN model with Dempster-Shafer theory of evidence for the multi-criteria CF problem. Essentially, the proposed approach first designs a DNN model that incorporates an SVD technique as its first layer for predicting multi-criteria ratings, and then adopts the evidential reasoning approach to model those ratings as pieces of evidence by means of mass functions, which are aggregated to predict the overall rating. Interestingly, by modeling the criteria ratings predicted by the DNN model as pieces of evidence, we are able to analyze the uncertainty inherently associated with these predictions and incorporate it into the multi-criteria rating aggregation. We also proposed a data-driven method for determining criteria weights, which represent the relative importance of criteria ratings in predicting the overall rating. Experiments conducted on a real-world dataset showed that the proposed recommendation method outperforms other state-of-the-art methods in terms of MAE, RMSE, and CoD scores in most testing cases, while also showing comparable efficiency to existing CF techniques.
As for future work, we plan to explore the following two aspects to further improve the proposed DNN_DST approach. First, we will investigate how to design a unified architecture for DNN_DST that allows multiple parameters to be optimized simultaneously so as to significantly reduce training time while maintaining the recommendation performance of the proposed model. Second, we will further investigate methods for estimating the softening parameter σ and for determining criteria weights, and evaluate the impact of these parameters on modeling the uncertainty associated with criteria rating predictions and the ignorance resulting from multi-criteria rating aggregation. Moreover, it is also interesting to investigate how the proposed approach can be extended to sequential recommendation models [40], [41] and graphical recommendation models [42]. Finally, we will explore the applicability of the proposed approach in real-world systems.

QUANG-HUNG LE received the Ph.D. degree in computer science from the University of Engineering and Technology, Vietnam National University, Hanoi, in 2016. He is currently a Lecturer with the Department of Information Technology, Quy Nhon University, Vietnam. His research interests include natural language processing, machine learning, artificial intelligence, and recommender systems. He has published a number of conference and journal papers in these areas.

VAN-NAM HUYNH (Member, IEEE) received the Ph.D. degree in mathematics from the Vietnam Academy of Science and Technology, in 1999. He is currently a Professor at the School of Knowledge Science, Japan Advanced Institute of Science and Technology (JAIST). His current research interests include machine learning, data mining, AI reasoning, argumentation, multi-agent systems, decision analysis and management science, and kansei information processing and applications.
He also serves as an Area Editor of the International Journal of Approximate Reasoning, the Editor-in-Chief of the International Journal of Knowledge and Systems Science, and an Editorial Board Member of the Array journal.