Dynamic Collaborative Filtering Based on User Preference Drift and Topic Evolution

Recommender systems are efficient tools for online applications; they exploit historical user ratings on items to recommend items to users. This paper aims to enhance dynamic collaborative filtering (CF) for recommender systems under volatile conditions in which both users' preferences and item properties change over time. Moreover, existing collaborative filtering models mainly address data sparsity by adding side information to improve performance. We propose a model that captures user preference dynamics in the rating matrix by using a joint decomposition method to extract users' latent transition patterns, and that combines these latent factors with the associated topic evolution of review texts via topic modeling in the dynamic environment. We evaluate the accuracy on real datasets, and the experimental results show that the model yields a significant improvement over state-of-the-art dynamic CF models.


I. INTRODUCTION
The use of recommendation systems (RSs) has become increasingly widespread in the field of online applications. While browsing, users can be provided with suggestions based on the types of movies, books, or music that they search for. Well-known companies, including Amazon, LinkedIn, and Netflix, employ systems that make product recommendations matching the buyer's interests to induce customers to make further purchases [1]. Using previous ratings, recommendation systems aim to find and understand users' preferences and tastes by analyzing online information. Most recommendation models for predicting ratings are based on collaborative filtering (CF), where a user's preferences are predicted based on the relationships between users and items [6], [7], [45], [46].
In fact, users' preferences may drift over time, which tends to frequently redefine their tastes [4], [63]. Changes in users' preferences can originate from substantial reasons, such as personality shifts, or from transient and circumstantial ones, such as seasonal changes in item popularity. Disregarding these temporal drifts when modeling users' preferences can result in unhelpful recommendations. For example, in the movie domain, a user can change his or her rating behavior or preference for genre over language, or start favoring drama over comedy. Additionally, besides rating information, users can provide feedback in the form of tags or written product reviews. Such user-generated content can serve as a useful source of explanatory information that may increase the user's understanding of the underlying criteria and mechanisms that led to the results. Many e-commerce leaders such as Amazon and Netflix have made recommender systems a crucial part of their services, helping to enhance user satisfaction and loyalty. User preference dynamics have a significant impact on RSs. Therefore, accurately estimating users' interests dynamically is a key challenge in this research domain, and it has attracted considerable attention. (The associate editor coordinating the review of this manuscript and approving it for publication was Pasquale De Meo.)
Much research has been carried out to consider temporal dynamics, in which users' preferences drift over time. For example, [25] used drift and decay factors to track changes in the global behavior of users and items to perform rating predictions but did not use personal behavior. [21] provided a novel algorithm to compute time weights for different items by assigning decreasing weights to historical data. This approach deemphasizes the importance of historical data; however, historical data might help facilitate other users' recommendations. [24] proposed a model for capturing a customer's purchasing interest in product recommendation. The model focused on the current stage and proposed a clustering-based method and a multimodal graph ranking method to obtain the recommended item list. Additionally, [37] proposed a method for detecting a user's time-variant pattern in order to improve the performance of collaborative filtering. Further, [59] considered dynamic user preference and recommendation in social media systems, using intrinsic interests and temporal context-aware mixture models. However, most of these approaches only consider the temporal dynamics between users and items within a relatively short time period, which cannot explicitly capture the exact interests of users. Another issue with CF approaches is data sparsity. The accuracy of missing-rating prediction in the user-item interaction is drastically reduced when the data are limited, and sparse data lead to potentially untrustworthy predictions [2], [35]. (This work is licensed under a Creative Commons Attribution 4.0 License; see https://creativecommons.org/licenses/by/4.0/.)
Although CF models that can effectively predict temporal rating data exist, their capability of tracking the dynamics of user preferences is limited on sparse data, and few dynamic CF models both consider drift in users' preferences and handle data sparsity [53], [60].
In recent years, review texts, which provide additional useful information that can be interpreted as user interests, have contributed to improving prediction accuracy [13]. Reviews are written by users to express their opinions, explaining why they give an item a particular rating and which features of the item contribute most to that rating. To solve the rating sparsity issue, review topics can help enrich the available ratings: topic modeling automatically mines the underlying structure hidden in the unstructured texts. For example, [34] considered the temporal changes in user and item latent features together with the associated review texts in a single learning stage. Additionally, [36] proposed hidden factors as topics (HFT) to exploit review topics and latent factors to enhance ratings, and [38] showed that review texts and topic profiles indeed correlate with ratings. They used review topics to weight ratings, but they did not address review topic evolution. To consider the dynamics of user preferences, we assume that review topics change dynamically over time, and we focus on discovering hidden review topics based on the underlying temporal patterns guiding the topic evolution.
A few approaches have handled user preference drift together with other subproblems, such as data sparsity [29], [30], [42], [55]. Recently, OCF-DR [29] used a neighborhood factor to track the drift of users' preferences. TCMF [42] combined multimodal information with scores to reduce data sparsity, but it did not uncover the hidden topic factors in reviews that explain the score at each time step.
Motivated by this idea, it is interesting to learn how RSs can analyze text reviews and ratings to solve the data sparsity problem based on user preference dynamics and achieve higher accuracy. The contributions of this work are summarized as follows:
• We propose a novel model to capture changes in users' preferences in the rating matrix and adapt the influence of review texts to improve accuracy under data sparsity as time evolves.
• We exploit users' opinions in review texts to find hidden topic evolutions that can explain why a user assigns a given score, using a topic modeling technique at different time steps. We believe that review texts play an important role in users' decisions.
• We evaluate the accuracy on seven widely used datasets to demonstrate the effectiveness of the proposed model, and the experimental results indicate that the model leads to a significant improvement compared with the state-of-the-art dynamic CF models.
The remainder of this paper is organized as follows. Section II provides a brief discussion of related work. Section III presents the proposed recommendation model and how to address an optimization problem regarding the models. Then, Section IV provides the empirical experimental results of the model compared with the state-of-the-art approaches using well-known datasets. Finally, conclusions, discussions and future work are given in Section V.

II. RELATED WORK
In this section, we review several popular RS approaches, including (1) dynamic CF and (2) topic modeling based on contents.

A. DYNAMIC COLLABORATIVE FILTERING
The existing CF RS models are classified into memory-based (neighborhood-based) and model-based approaches [46]. Memory-based CF approaches are based on the observation that similar users have similar rating behavior and that users provide similar ratings to similar items. Recently, [6] presented a novel similarity measure for RSs known as IPWR (improved PCC weighted with rating preference behavior). It takes the user's rating preference behavior (RPB) toward an item into account to improve the standard Pearson correlation coefficient similarity measure, and it outperforms state-of-the-art similarity measure methods. Conversely, model-based CF approaches use the collection of ratings to learn a model, which is then used to make rating predictions. Recently, [7] proposed a model that considers users' preference similarity with social trust (i.e., both explicit and implicit trust) and creates a more complete unified rating profile. They define function significance by considering users' positive and negative opinions to enhance the performance of CF-based RSs when only a limited set of ratings is available. However, these models ignore user preference dynamics.
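To make the memory-based baseline concrete, the following is a minimal sketch of the standard Pearson correlation coefficient similarity that IPWR builds upon; the rating vectors are illustrative, not taken from the paper's data.

```python
import numpy as np

def pearson_sim(ru, rv):
    """Pearson correlation between two users' ratings on co-rated items,
    the standard memory-based CF similarity measure."""
    ru, rv = np.asarray(ru, float), np.asarray(rv, float)
    du, dv = ru - ru.mean(), rv - rv.mean()
    denom = np.sqrt((du ** 2).sum()) * np.sqrt((dv ** 2).sum())
    return float((du * dv).sum() / denom) if denom else 0.0

# Two users whose ratings differ only by a constant offset correlate fully.
s = pearson_sim([5, 3, 4], [4, 2, 3])  # -> 1.0
```

The mean-centering makes the measure invariant to each user's rating offset, which is exactly the per-user bias that rating-preference-behavior extensions such as IPWR then refine further.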
Dynamic RSs have become a new trend in the research field [1], [10], [44], and traditional CF models do not concentrate on changing users' preferences over time; they assume that users' preferences are stationary [4]. Hence, the challenging problem with the approach taken by most researchers is that they mainly ignore the drifting preference. To consider the temporal dynamics, time information has been introduced into CF approaches to model users' dynamically evolving preferences and interests [4], [21], [25].
Since data generally change dynamically over time, there are memory-based models that use time-aware approaches, such as time-weighted CF [21], which improves quality by weighting rating scores. DTDM [47] exploits clustering and a time impact factor matrix to monitor the degree of user interest drift in each class based on the nearest neighbors' time factor coefficient matrix. Another approach [15] considers the user's response and the dynamic interests of social users and proposes a corresponding similarity measurement. However, the main drawback of these techniques is that they require loading a large amount of data into memory.
In contrast, model-based dynamic CF approaches can respond to user requests instantly [4], [18], [25], [37], [56]. For example, approaches based on probabilistic matrix factorization, such as time singular value decomposition (TimeSVD++) [25], add time changes to the neighborhood model and combine them with latent factors to capture users' preferences. Moreover, there are models based on tensor factorization [43], [57], such as Bayesian probabilistic tensor factorization (BPTF) [57], in which ratings are represented as triples (user, item, time) that are organized into a three-dimensional tensor. These models learn the global evolution of latent features and predict ratings using the inner product of the latent factor vectors, but their disadvantage is that they are not sensitive enough to capture local changes in preferences.
However, it is widely known that CF approaches suffer from a data sparsity problem [39], [41], [50]. Recently, a few temporal dynamic models have considered this problem. Temporal collective matrix factorization (TCMF) [42] considers both temporal dynamics and multimodal information to address data sparsity, but it exploits only the implicit feedback of user comments and ignores the hidden meaning of the additional information. Moreover, TMRevCo [55] considers temporal dynamics and side information. This approach uses a more appropriate item correlation measure in the cofactor and associates the item factors of the cofactor with those of MF, but it focuses on item correlation dimensions, which differs from our work.

B. TOPIC MODELING BASED ON CONTENTS
Topic modeling has been applied to interpret contents in unstructured texts. It can be used to automatically extract valuable hidden elements. Conventional topic modeling techniques such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) are widely used to infer latent topical structures from documents [9], [14], [20], [23].
In contrast, nonnegative matrix factorization [27], a topic mining method distinct from the probabilistic perspective above, models the underlying components as coordinate axes, and each document corresponds to a unique point in the latent topic space.
As mentioned in the previous section, a sparse dataset leads to a lack of commonly rated items, and items cannot be selected to make effective predictions. Hence, content is useful information for improving performance. Recently, topic-model-driven recommendation has been demonstrated in many studies [3], [8], [11], [52]. Collaborative topic regression (CTR) [52] utilizes LDA to discover the hidden latent structure of scientific articles. It incorporates the latent topic spaces with traditional collaborative filtering to recommend scientific articles to users of an online community. However, its main assumption is that each item corresponds to only one stationary text, regardless of the different users who interact with the item; this does not account for items having multiple reviews. Moreover, Shivam Bansal et al. [8] proposed a latent semantic-based approach using topic modeling, which automatically provides the best suggestions for the most relevant user-job mapping according to the skills and preferences of a user.
By incorporating users' review information, some works exploit the meanings of reviews to help the system grasp what the user is most interested in and which aspects of the item contribute most to the rating, thereby overcoming the data sparsity problem [13], [32], [51], [62]. For example, the HFT model [36] combines ratings with review texts for product recommendations. It relates hidden factors in product ratings to hidden topics in product reviews, and an exponential transformation is adopted to guarantee that the topic probability distribution of each review is associated with the user or item topic factor. The aspect-aware latent factor model (ALFM) [16] exploits reviews to analyze each user's attention to different aspects of the target item and then integrates the attention weights into the matrix factorization. By associating latent factors with aspects, the aspect weights are integrated with latent factors for rating prediction.
These methods discover static topics; they ignore the fact that the topics a user prefers can change between past and present, although some models do utilize topic modeling to identify user interests that change over time [19]. In recent years, topic evolution has come to refer to modeling topics that change over time [5]. This research helps to overcome information overload and to represent complex and dynamic phenomena [58]. The earliest work on topic evolution proposed topic detection and tracking (TDT) [5]. This approach mainly focused on the topic evolution of text streams, which includes topic content evolution and topic strength evolution. Moreover, [49] presented a time-based collective factorization model for topic discovery and monitoring in news; it discovers and tracks hidden topics, relies on the underlying temporal patterns guiding the topic evolution, and is able to connect topics between different time slots. On the other hand, some works [12], [17], [54], [64] have explored deep neural networks to obtain an in-depth understanding of textual item content and achieved impressive effectiveness by generating more accurate item latent models. In dual-regularized matrix factorization (DRMF) [54], a multilayered neural network that stacks a convolutional neural network and a gated recurrent neural network is applied to generate independent distributed representations of the contents of users and items. Moreover, in deep cooperative neural networks (DeepCoNN) [64], reviews are first processed by two CNNs to learn representations of users and items, which are then concatenated and passed into a regression layer for rating prediction. A limitation of this model is that its performance decreases greatly when reviews are unavailable in the testing phase.
For review information, both topic modeling and deep neural networks are used to extract features representing items, and a regularization strategy is mostly used to combine them with matrix decomposition. However, deep learning is well known to operate like a black box: it requires a massive amount of data to fully support its rich parameterization, and providing explainable predictions remains a substantially challenging task [33], [40]. Meanwhile, topic modeling can provide word co-occurrence relations to compensate for information loss. Hence, in this paper we focus on discovering hidden review topics by using topic modeling to guide the topic evolution.
In summary, the state-of-the-art strategies either do not capture users' preference shifts or do not analyze the topic evolution of user feedback reviews; none performs both of these tasks simultaneously.

III. PRELIMINARIES
In this section, we will present some preliminaries and the formal definition of a CF recommendation, and then we will present the motivation for our work.
We establish some terminology for the task of dynamic CF: filling in an incomplete rating matrix whose ratings are associated with timestamps. For each user and item, we are given a set of data points, each represented by the five-tuple (user, item, rating, reviews, timestamp). Specifically, we denote the set of users as U and the set of items as V. The preference ratings are represented by a matrix with entries R_ij, and we exploit the additional information from the collection of review texts D = {d_1, d_2, ..., d_n} to find the hidden factors of items by using the topic distribution and to link them together. Nonnegative matrix factorization (NMF) is a powerful matrix decomposition technique for domains in which the data are inherently nonnegative and parts-based decompositions are desired [28]. In general, NMF seeks a nonnegative matrix W ∈ R^{m×d} and a nonnegative matrix H ∈ R^{d×n}, where d is the number of latent factors, such that D ≈ WH. For squared loss, NMF finds a low-rank approximation to a data matrix D by minimizing the Frobenius norm ‖D − WH‖_F under nonnegativity and scaling constraints on the factors W and H.
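The NMF approximation described above can be sketched in a few lines of numpy using the classic Lee-Seung multiplicative updates; the matrix sizes and random data here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonnegative data matrix D (m x n), e.g. a term-document matrix.
m, n, d = 6, 8, 2
D = rng.random((m, n))

# Random nonnegative initialization of the factors W (m x d) and H (d x n).
W = rng.random((m, d))
H = rng.random((d, n))

eps = 1e-9
for _ in range(200):
    # Lee-Seung multiplicative updates minimize ||D - WH||_F^2
    # while keeping W and H entrywise nonnegative.
    H *= (W.T @ D) / (W.T @ W @ H + eps)
    W *= (D @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(D - W @ H)
```

Because each factor is only ever multiplied by a nonnegative ratio, nonnegativity is preserved automatically, which is what makes the parts-based interpretation of W and H possible.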
Capturing users' preferences is a main task in the prediction process. In recent years, the nonnegative constraint of NMF in the latent factor space has been used to model a user's preferences [18], [31]. The model simplifies the rating by characterizing the features of both users and items to extract latent factors from a rating matrix. Specifically, given matrices U ∈ R^{i×k} and V ∈ R^{k×j}, where k is the number of latent factors, the predicted rating is R̂_ij ≈ (UV)_ij. The goal is to find entrywise nonnegative matrices U and V by minimizing the objective function

min_{U,V ≥ 0} ‖R − UV‖²_F,

where ‖·‖_F is the Frobenius norm. Our model is formally defined as follows:
Definition 1 (User-Item Rating): The user-item interaction matrix R ∈ R^{i×j} is factorized into the user factor matrix U ∈ R^{i×k} and the item factor matrix V ∈ R^{k×j} (k is the dimension of the latent factor) at each time t. The predicted ratings R̂^t ≈ U^t V^t are estimated at the present time period t and the previous time period t − 1 by minimizing the objective function

min_{U^t, V^t ≥ 0} ‖R^t − U^t V^t‖²_F,

where ‖·‖_F is the Frobenius norm.
Definition 2 (User Factor Transition): A transition matrix Z_i maps user i's latent factor from the previous time period to the present one, U_i^t ≈ Z_i U_i^{t−1}, to learn the dynamics of users' preferences from time period t − 1 to t.
Definition 3 (User-Item Reviews): A review comes with an overall rating R_ij that indicates the overall satisfaction of user i with item j. Given the text reviews, D_ij is a collection of reviews of the item set V written by the set of users U. Review topics are the aspects of an item that a user discusses in a review. To learn the topics, the item factors V_{j,k} are linked with the topic distributions θ^{i,j}_{d,k} by using topic modeling and a softmax transformation [36]:

θ^{i,j}_{d,k} = exp(κ V_{j,k}) / Σ_{k'} exp(κ V_{j,k'}),

where θ^{i,j}_{d,k} is the kth dimension of the topic distribution of the review d written by user i for item j, and κ is a parameter that controls the 'peakedness' of the transformation.
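The HFT-style softmax transformation linking an item factor to a topic distribution, and the role of κ, can be sketched as follows; the factor vector v_j is a made-up example.

```python
import numpy as np

def topic_distribution(v_j, kappa=1.0):
    """Map an item's latent factor v_j to a topic distribution via the
    exponential (softmax) transformation; kappa controls peakedness."""
    z = np.exp(kappa * (v_j - v_j.max()))  # shift for numerical stability
    return z / z.sum()

v_j = np.array([0.2, 1.5, 0.1, 0.8])
theta_low = topic_distribution(v_j, kappa=1.0)
theta_high = topic_distribution(v_j, kappa=10.0)  # much sharper distribution
```

A larger κ concentrates almost all probability mass on the dominant latent dimension, while κ close to zero flattens the distribution toward uniform; both outputs always sum to one.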
Definition 4 (Review Topic Evolution): As time passes, the topic of review collection evolves. The topic models find the topic distribution at different time steps to capture the topic evolution over the time span.

IV. PROPOSED METHOD
In this section, we describe the proposed method, shown in Figure 1. Given a set of historical ratings and reviews, the first task is to capture the latent user factor drift between consecutive time points to track the user transition factor. To consider both the user preference dynamics and topic evolution, we also learn the hidden topic evolution on reviews via topic modeling and transform the topic distributions of reviews and latent factors to improve the prediction performance.

A. CAPTURING THE USER FACTOR DRIFT
Motivated by the fact that the user latent vector U_i changes at each time step, we consider a model for the dynamics of the user latent factors in a time series. We assume that user interests evolve smoothly within one time period, and we learn the relationship between U_i^t and U_i^{t−1} from the previous time t − 1 to the present time t.
The conventional NMF assumes that the data and the latent factors are static. Clearly, this assumption is not appropriate for temporally drifting data. To capture the user factor drift, we apply a state transition matrix Z_i^(t) to represent the mapping of the user latent factor between time steps with linear dynamical systems (LDS) [22], [48], which is an effective way to reveal the relationship between user latent factors at each time step:

U_i^(t) ≈ Z_i^(t) U_i^(t−1).

The above discussion motivates the following minimization problem, which is optimized at every time step:

min_{Z_i} ‖U_i^(t) − Z_i U_i^(t−1)‖²_F,

where the transition matrix Z_i ∈ R^{D×D} for user i models the transition of user i's preferences in the latent space from the previous time to the next. Accordingly, for each time step t, given R^(t) and U^(t−1), we join the above two decompositions: we extend NMF for tracking the drift of the user latent factor, as in (2), together with the state transition concept, as in (5), into a unified model over the user factor matrix and item factor matrix, which avoids overfitting and captures both the ratings and the preference drift in the period, as shown by the following joint minimization problem:

min_{U^(t), V^(t), Z ≥ 0} ‖R^(t) − U^(t) V^(t)‖²_F + λ Σ_i ‖U_i^(t) − Z_i U_i^(t−1)‖²_F,

where ‖·‖²_F denotes the squared Frobenius norm. The first term of the cost function minimizes the error between the observed data and the recovered data. The second term minimizes the error of estimating the transition matrix over the t − 1 transitions, capturing the user factor drift.
For a clearer understanding, we explain the user factor drift based on the temporal dependence between the latent user vectors. For example, in a two-dimensional latent space, the transition matrix

Z_i = [0 1; 1 0]

represents an alternating pattern between the two latent dimensions. In a higher-dimensional latent space, several such patterns can occur at the same time. At the current time t, we learn the latent user vector U_i^t and latent item vector V_j^t for each user i and each item j, as well as the transition matrix Z_i ∈ R^{K×K}, from all the ratings collected up to time t.
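The alternating two-dimensional example above can be reproduced numerically: given a hypothetical sequence of one user's latent factors, a least-squares fit of the linear dynamics recovers the swap matrix exactly.

```python
import numpy as np

# Hypothetical sequence of one user's 2-d latent factors over time steps,
# exhibiting the alternating pattern described in the text.
U_hist = np.array([[0.9, 0.1],
                   [0.1, 0.9],
                   [0.9, 0.1],
                   [0.1, 0.9]])

# Fit a transition matrix Z minimizing ||U[t] - Z U[t-1]||^2 over all
# consecutive pairs, i.e. a least-squares estimate of the linear dynamics.
X = U_hist[:-1]          # factors at t-1 (one row per transition)
Y = U_hist[1:]           # factors at t
Zt, *_ = np.linalg.lstsq(X, Y, rcond=None)
Z = Zt.T                 # so that u_t ≈ Z @ u_{t-1}
```

For this data the fit is exact: Z comes out as [[0, 1], [1, 0]], the alternating pattern. With noisy real factors the same least-squares step gives the closest linear approximation of the drift.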

B. LEARNING THE TOPIC EVOLUTION OF REVIEWS
Reviews can address the rating sparsity problem because the discovered review topics can be used to enhance the ratings in CF. Given a corpus D that contains reviews of users toward items, {d_{i,j} | d_{i,j} ∈ D, i ∈ U, j ∈ V}, we assume that a set of latent topics (i.e., K topics) covers all the topics that users discuss in the reviews. We consider the review topic evolution in each time period T = {t, t−1, t−2, ..., t−n}.

1) LEARNING BY LATENT DIRICHLET ALLOCATION
LDA is a generative probabilistic model for a given text collection. The basic idea of LDA-based topic models is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words [9]. Topic models are applied in many models to incorporate explicit/implicit data [8], [20]. Essentially, there is an assumption that K latent topics are hidden in the given N-document corpus, and each topic is represented as a multinomial distribution over the M words in the vocabulary extracted from the data collection. Each document is generated by sampling a mixture of these topics and then sampling words from that mixture. More precisely, the generative process for each document in the text archive is as follows:
1. For the nth (n = 1, 2, ..., N) document d in the N-document corpus, choose θ_n ∼ Dirichlet(α);
2. For each word w_{n,m} in document d:
(a) Choose a topic assignment z_{n,m} ∼ Multinomial(θ_n);
(b) Find the corresponding topic distribution ϕ_{z_{n,m}} ∼ Dirichlet(β);
(c) Sample a word w_{n,m} ∼ Multinomial(ϕ_{z_{n,m}}).
Finally, the hidden unobserved random variables, the topic-word distributions ϕ and the document-topic distributions θ, can be learned through Gibbs sampling or the variational expectation-maximization (EM) algorithm by maximizing the probability P(D|α, β).
Two parameters need to be inferred in this model: one is the distribution θ of document-topic, and the other is the distribution ϕ of topic-word.
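The generative process above can be sketched directly in numpy by sampling θ, z, and w in sequence; the corpus sizes and the values of α and β here are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

K, M, N_docs, doc_len = 3, 5, 4, 20   # topics, vocab size, docs, words/doc
alpha, beta = 0.5, 0.5                # symmetric Dirichlet hyperparameters

# Topic-word distributions phi_k ~ Dirichlet(beta), one row per topic.
phi = rng.dirichlet([beta] * M, size=K)

corpus = []
for _ in range(N_docs):
    theta = rng.dirichlet([alpha] * K)            # document-topic mixture
    z = rng.choice(K, size=doc_len, p=theta)      # topic assignment per word
    words = [rng.choice(M, p=phi[k]) for k in z]  # word drawn from its topic
    corpus.append(words)
```

Inference (Gibbs sampling or variational EM) runs this process in reverse: given only the observed words, it recovers the most likely θ and ϕ.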

2) LEARNING BY NONNEGATIVE MATRIX FACTORIZATION
NMF is a linear algebra method that embeds the original high-dimensional data into a low-dimensional semantic space of nonnegative hidden structures, which are viewed as coordinate axes in the transformed space with a geometric perspective [28]. The goodness of this formalization is evaluated by the squared loss between the original term-document vector d_n and the linear combination Σ_{k=1}^{K} θ_{nk} β_{kn}, leading to the following problem:

min_{θ, β ≥ 0} Σ_{n=1}^{N} ‖d_n − Σ_{k=1}^{K} θ_{nk} β_k‖² + ν ‖θθ^T − I_K‖²_F + λ Σ_{n=1}^{N} ‖θ_n‖_1,

where each d_n of the N reviews in the given corpus is assumed to be a linear combination of K factors, ‖·‖²_F denotes the squared Frobenius norm, ν and λ are two nonnegative hyperparameters, and I_K is the K × K identity matrix. The larger ν is, the closer θ is to orthogonal, while the larger λ is, the sparser θ_n will be.
Topic evolution indicates that the same topic shows dynamism and differences over time, which is reflected in two aspects. First, the topic intensity changes over time. We use the distribution of topic k at time t to define the intensity of the topic. We analyze the topics from Mar. 2001 to Aug. 2012; the changes in the intensities of the topics are calculated at different times, divided into seven time slices each year. The intensity of a topic is given by

δ_k^t = (1 / |D_t|) Σ_{d ∈ D_t} θ_{d,k},

where δ_k^t is the intensity of topic k at time t, D_t is the set of documents in the time slice (|D_t| its size), and θ_{d,k} is the weight of topic k in document d.
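The intensity computation is just a per-topic average of the document-topic weights within a time slice; a tiny sketch with made-up distributions for K = 3 topics:

```python
import numpy as np

# Hypothetical document-topic distributions theta (one row per review)
# for a single time slice t, with K = 3 topics; each row sums to 1.
theta_t = np.array([[0.7, 0.2, 0.1],
                    [0.5, 0.4, 0.1],
                    [0.6, 0.1, 0.3]])

# Topic intensity delta_k^t: the mean weight of topic k over the |D_t|
# reviews in the slice.
delta_t = theta_t.mean(axis=0)
```

Comparing delta_t across consecutive slices then shows which topics are strengthening or fading, which is the first aspect of topic evolution described above.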
Through this model, we can obtain the topic distributions of users and reviews in time slice t. We let the items' topic factors fully influence the topics of their related reviews. An exponential transformation is adopted to guarantee that the topic probability distribution of each review is associated with the item topic factor, as shown in Equation 4. In addition, the output of the topic model is the topic distribution, a K-dimensional vector for each review. The proposed model utilizes an integrated factor formed by hidden topics and latent factors to model users' rating behaviors. While the topic factor is jointly learned from both review texts and rating information, the latent factor is learned mainly from rating information. After training the model, each word is represented as a K-dimensional hidden topic vector whose elements are the probabilities of the word appearing in the corresponding topics. We sum up the values of all words in a review, per dimension, to construct the representation of each review.
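The per-review representation described in the last sentence can be sketched as follows; the words and their topic vectors are hypothetical examples, not values from the trained model.

```python
import numpy as np

K = 3
# Hypothetical K-dimensional topic vectors for words in the vocabulary:
# element k is the probability of the word appearing under topic k.
word_topic = {
    "engine": np.array([0.8, 0.1, 0.1]),
    "oil":    np.array([0.7, 0.2, 0.1]),
    "price":  np.array([0.1, 0.1, 0.8]),
}

def review_vector(tokens):
    """Sum the topic vectors of all words in a review, per dimension,
    to build the review's K-dimensional representation."""
    return np.sum([word_topic[w] for w in tokens], axis=0)

vec = review_vector(["engine", "oil", "price"])  # -> [1.6, 0.4, 1.0]
```

The resulting vector concentrates mass on the topics the review actually discusses, so two reviews about the same aspects end up close in the K-dimensional space even if they share few exact words.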
An example of the output and our feature vector is illustrated in Figure 2 for the Automotive dataset. The figure shows the top 5 words associated with 3 random topics. Note that LDA is a probabilistic mixture-of-mixtures model for grouped data. The observed words within the reviews are the result of probabilistically choosing words from a specific topic, where the topic is itself drawn from a document-specific multinomial that has a global Dirichlet prior. This means that words can belong to various topics to various degrees. In topic models, each review is a probability distribution over a series of topics, where each topic is a conditional probability distribution over words. The topics in text reviews evolve over time, and it is of interest to model the dynamics of the underlying topics.

C. JOINT USER FACTOR TRANSITION AND TOPIC EVOLUTION
Our model combines the idea of user factor transition for rating prediction with topic modeling to uncover latent topic factors in review texts. In particular, we correlate the topic factors with the corresponding latent factors of both users and items. Specifically, as demonstrated in the previous section, we learn a topic distribution θ_{d,ij} for each review d_{ij}; this records the extent to which each of the K topics is discussed by user i for item j.
min_{U^(t), V^(t), Z^(t) ≥ 0} ‖R^(t) − U^(t) V^(t)‖²_F + Σ_i ‖U_i^(t) − Z_i^(t) U_i^(t−1)‖²_F + λ ‖Z^(t) − I‖²_F + α ‖θ‖_1,

where ‖·‖²_F denotes the squared Frobenius norm and ‖·‖_1 stands for the L1 norm. The first term of the cost function minimizes the errors between the observed data and the data recovered by the NMF over the t time steps. The second term minimizes the errors of estimating the transition matrix over the t − 1 transitions to capture the user factor drift. The temporal regularization λ‖Z^(t) − I‖²_F controls the extent to which we bias the decomposition toward U^(t−1). Thus, the parameter λ ∈ (0, ∞) balances present and past information; it quantifies the extent to which the model is past oriented (λ → ∞) or present oriented (λ → 0).
The objective of our model is to learn the optimal U and V for accurately modeling users' ratings while simultaneously obtaining the most likely topics according to the reviews, under the constraint of the transformation function. Thus, we reach the objective function of our model, which uses NMF to uncover the hidden topics. We can find a local minimum of the objective function using multiplicative updates (MU), as introduced by [14]. Applying the Karush-Kuhn-Tucker (KKT) [26] first-order conditions to our problem, each factor is updated by multiplying it elementwise (denoted ⊙) by the ratio of the negative part to the positive part of the corresponding gradient of the loss function L in Equation (11). Substituting the corresponding gradients into Equation (14) yields the update equations; for example, substituting Eq. (15) into Eq. (14) gives the MU rule for V. Recall that our goal is to simultaneously optimize the parameters associated with the ratings (U and V) and the parameters associated with the topics (θ and β). The topic distribution θ is fit by V. As presented above, U and V are fit by MU in Equations (19)-(21), while β is updated through Equation (8). Therefore, we design a procedure that alternates between these two steps, following Algorithm 1.
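The KKT-derived multiplicative-update pattern can be illustrated on a simplified version of the rating part of the objective. This sketch drops the per-user transition matrix Z and uses a plain temporal regularizer ‖U − U^(t−1)‖²_F instead, an assumption made purely for brevity; the data and λ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, k, lam, eps = 5, 7, 2, 0.5, 1e-9

R = rng.random((n_users, n_items))   # ratings at time t
U_prev = rng.random((n_users, k))    # user factors learned at time t-1

U = rng.random((n_users, k))
V = rng.random((k, n_items))

for _ in range(300):
    # MU rules for min ||R - UV||_F^2 + lam * ||U - U_prev||_F^2:
    # each factor is scaled elementwise by the ratio of the negative part
    # to the positive part of its gradient, which preserves nonnegativity
    # (the KKT-based construction described in the text).
    U *= (R @ V.T + lam * U_prev) / (U @ V @ V.T + lam * U + eps)
    V *= (U.T @ R) / (U.T @ U @ V + eps)
```

The λ-weighted term pulls the new user factors toward the previous time step, so the fitted U trades off reconstructing the current ratings against staying close to the past, mirroring the past/present balance played by λ in the full model.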

Algorithm 1 The Proposed Algorithm
Input: matrices R^(t), R^(t−1) for rating interactions, TF-IDF matrix D^(t) for the review collection, regularization parameters {λ, α1, α2, α3}, factor size k, iterMax, subIterMax
Output: predicted data R̂^(t)
1 Randomly initialize nonnegative U^(t) ∈ R^{i×k}, V^(t) ∈ R^{k×j}, Z^(t) ∈ R^{k×k};
2 while (iter < iterMax) or (J^(all) not converged) do
3 Update V^(t) according to iterative Eq. (19)

Accordingly, the generative processes for a review in the corpus D differ between the NMF-based and LDA-based methods. The generative process of the LDA-based method for a review can be described by the following conditional distribution, where we multiply over all reviews in the dataset and all words in each review. The two terms in the product are the likelihood of the particular topics (θ^{u,v}_{d,k}) and the likelihood of particular words for each topic (β_{kn}). The sub-process iterates through all reviews d and all word positions n and updates their topic assignments: each word is assigned a topic (an integer between 1 and K) with probability proportional to θ^{u,v}_{d,k} β_{kn}. Alternatively, the NMF-based process reconstructs the words' occurrence counts in each review; the reconstruction loss can be formulated as the squared loss between the original term-document vector d_n and the linear combination Σ_{k=1}^{K} θ^{u,v}_{d,k} β_{kn}. However, in both the proposed NMF-based and LDA-based variants, the topic proportions θ_{d,k} are not sampled from a Dirichlet distribution but are instead determined by the value of V_{j,k} taken during the previous step.
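The per-word topic-assignment step described above can be sketched as follows; `theta` (the review's topic proportions) and `beta` (the topic-word matrix) are hypothetical names, and this is a generic illustration of sampling with probability proportional to θ_{d,k} β_{kn}, not the paper's exact sampler:

```python
import random

def assign_topics(word_ids, theta, beta, seed=0):
    """For each word in a review, sample a topic index k with
    probability proportional to theta[k] * beta[k][word]."""
    rng = random.Random(seed)
    K = len(theta)
    assignments = []
    for w in word_ids:
        weights = [theta[k] * beta[k][w] for k in range(K)]
        assignments.append(rng.choices(range(K), weights=weights)[0])
    return assignments
```

Repeating this pass over all reviews and word positions is one sweep of the sub-process; in the proposed model, `theta` would be supplied by V rather than drawn from a Dirichlet prior.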
For example, we examine the rating prediction results, computed as R̂^(t) = U^(t) V^(t), for a sample user (user id = 550) in Table 2. In summary, our temporal model captures the preference dynamics well and thus improves performance.

V. EXPERIMENTS
In this section, the recommender system of the proposed model is evaluated. The experimental data preparation and results are presented and discussed.

A. DATASETS
We use seven datasets from Amazon 1 which are collected by Stanford University [36]. Statistics of the datasets are shown in Table 3. These datasets are a collection of user ratings with a corpus of text reviews and timestamps corresponding to various types of items that are available on public websites. The numerical rating scores of all reviews lie in the range of 1 to 5 and can only be integers. For example, the Baby dataset consists of 122,150 ratings and 98,630 reviews from 19,526 users on 7,060 items. We consider all interactions in the datasets from 1998 to 2014, even for users or items that have only one review. Figure 4 shows the rating drift based on the average ratings per sliding time window over time. An abrupt shift in ratings occurred for the Baby dataset in 2003, with a sudden jump from approximately 1.98 to greater than 2.91. The second significant temporal effect is that the ratings in the Video game dataset tend to increase over time.
1 Datasets: http://jmcauley.ucsd.edu/data/amazon
Additionally, we describe how the training and test sets are constructed and how the performance of the model is measured before reporting the results. First, we sort the data chronologically and construct training and testing subsets with a sliding time window: we split each dataset into T time periods of 10 months each, use the first nine months (1st-9th) of each period as the training set, and use the last month (10th) as the test set. For example, Automotive spans 90 months, which we split into T = 9 periods; we therefore have nine test months, the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th and 90th. For each test month, we average the results over all users and report the mean values. The sizes of the training and test sets are shown in Table 4.
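The chronological split above can be sketched as follows; `events` as a list of (month, record) pairs and the function name are illustrative, not from the paper:

```python
def sliding_window_splits(events, period=10):
    """Chronological train/test splits: for each consecutive block of
    `period` months, months 1..period-1 train and month `period` tests."""
    events = sorted(events, key=lambda e: e[0])
    months = sorted({m for m, _ in events})
    splits = []
    for start in range(0, len(months) - period + 1, period):
        window = months[start:start + period]
        train = [r for m, r in events if m in window[:-1]]
        test = [r for m, r in events if m == window[-1]]
        splits.append((train, test))
    return splits
```

With 90 months of data and period = 10, this yields the nine test months (10th, 20th, ..., 90th) described above.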

B. REVIEW DATA PREPROCESSING
In preprocessing the review data, we attempt to clean the reviews as much as possible using the Natural Language Toolkit (NLTK) 2 in Python. The idea is to remove the punctuation, numbers and special characters in one step using a regex replacement, which replaces everything except letters with a space. Then, we remove shorter words because they generally do not contain useful information. Next, we lowercase all the text to nullify case sensitivity. After that, we remove the stop words from the review data because they mostly add clutter and hardly carry any information. Stop words are terms such as 'it', 'they', 'am', 'a', 'by', 'doing', and 'how'.
To remove stop words from the documents, we have to tokenize the text, i.e., split the string of text into individual tokens or words.
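A minimal sketch of this pipeline is shown below. For self-containment it uses a tiny illustrative stop-word set instead of NLTK's full English list (`nltk.corpus.stopwords.words('english')`), and the minimum word length is an assumed parameter:

```python
import re

# Tiny illustrative stop-word set; the real pipeline would use NLTK's list.
STOP_WORDS = {"it", "they", "am", "a", "by", "doing", "how", "the", "is"}

def preprocess(review, min_len=3):
    """Keep letters only, lowercase, then drop short words and stop words."""
    # One regex pass replaces everything except letters with a space.
    text = re.sub(r"[^A-Za-z]+", " ", review).lower()
    # Tokenize by whitespace and drop short words.
    tokens = [w for w in text.split() if len(w) >= min_len]
    # Remove stop words.
    return [w for w in tokens if w not in STOP_WORDS]
```

For example, `preprocess("It is GREAT!! 100% worth-it, they said.")` keeps only the informative tokens.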

C. EVALUATION METHODS
To evaluate the performance of the RS based on user preference drift and topic evolution, we employ the common evaluation metric RMSE: the root mean square error is widely used for evaluating the performance of RS models. In this paper, we measure the prediction accuracy of the recommender in terms of rating predictions, given by RMSE = sqrt( (1/N) Σ (R_ij − R̂_ij)^2 ), where N is the number of ratings in a test month, R_ij are the ground-truth ratings in R and R̂_ij are the predicted ratings in R̂. Lower RMSE values indicate higher performance in the rating prediction.
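The metric is straightforward to compute; a minimal sketch over paired rating lists:

```python
import math

def rmse(truth, pred):
    """Root mean square error over paired ground-truth/predicted ratings."""
    assert len(truth) == len(pred) and truth, "need equal-length, nonempty lists"
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))
```

For instance, `rmse([3, 5], [4, 4])` gives 1.0, since both predictions are off by one star.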

D. COMPARED METHODS
We conduct experiments on each dataset to assess the performance of the proposed method in comparison with baseline recommendation models: TimeSVD++, Bayesian temporal matrix factorization (BTMF), hidden factor topic (HFT), and temporal collective matrix factorization (TCMF).

1) TimeSVD++
This approach is a baseline temporal recommender model that extends SVD++ by introducing a time-variant bias for each user and item in every individual time step to address the dynamics of user preferences, but it does not exploit auxiliary information to counter data sparsity [25].

2) BAYESIAN PROBABILISTIC TENSOR FACTORIZATION
In the BPTF model, ratings are represented as triples (user, item, time). These triples are organized into a three-dimensional tensor. Finally, the tensor is decomposed, and ratings are predicted using the inner product of the latent factor vectors [57].
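The triple-wise prediction can be sketched as a three-way inner product of user, item and time factor vectors; this is a generic illustration of tensor-factorization prediction, not the exact BPTF implementation:

```python
def predict_rating(u, v, t):
    """Predicted rating as sum_k u[k] * v[k] * t[k] over the
    user, item and time latent factor vectors."""
    return sum(uk * vk * tk for uk, vk, tk in zip(u, v, t))
```

Each latent dimension k contributes only when the user, item and time factors all agree on it.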

3) HIDDEN FACTOR AS TOPICS
In HFT, topics in review texts are associated with item parameters using LDA topic modeling, which has been shown to perform well in terms of rating predictive accuracy, to overcome data sparsity, but temporal data are not considered [36].

4) BAYESIAN TEMPORAL MATRIX FACTORIZATION
BTMF is a temporal recommender model based on temporal matrix factorization (TMF). It achieves the best performance among the matrix factorization schemes by introducing priors for the hyperparameters of TMF to capture the conditional distributions of users, items and user feedback on items [61].

5) TEMPORAL COLLECTIVE MATRIX FACTORIZATION
The TCMF model captures preference dynamics through a joint decomposition model that extracts the user temporal patterns from the rating and multimodal information between consecutive time points. However, the model does not track the hidden topic factors from review information to discover the evolution of users' preference topics [42].

6) TMRevCo
The model is based on a matrix factorization model that factorizes the rating matrix into latent user and item factors for rating prediction. The model focuses on the dynamic user factor and the item correlation measure in CoFactor, and associates the item factors of CoFactor with those of matrix factorization [55].
All these approaches are tested in the same experimental environment. In the compared approaches, the regularization parameters, learning rate, and size of the latent factors are optimized experimentally. The most precise values are either determined by empirical results or suggested by the original papers. For temporal recommendation models including TimeSVD++, we train the model by fitting the data in the training set and learn the model parameters by the stochastic gradient descent (SGD) algorithm as suggested in [25], which ignores the review texts when generating the prediction in the test month, with the factor dimension (f) set to 10. The settings of BPTF given by [57] are adopted in our experiments; consequently, we set µ0 = 0, ν0 = K, β0 = 1, W0 = I, ν̃0 = 1, and W̃0 = 1, where L is 200 and K is set from 5 to 50, with the additional setting Z0 = I in BTMF as suggested in [61]; we chose the best configuration, which has a latent space dimension of 10 in the hyper-priors. Meanwhile, the compared models addressing data sparsity include HFT and TCMF. For HFT, the offset (B) and bias terms (Bi) are initialized by the average ratings and the global item bias, the parameters are fit using L-BFGS, and we set n = 0.1 as suggested in [36]. TCMF is set with regularization parameters λ = β = γ = 1, as suggested in [42]; we fix the convergence tolerance to 10^−4 and maxIter = 10^3 and obtain the best results with the temporal regularizer parameter λ = 1. For TMRevCo, we set λc = 0.001 and λr = 0.05 as suggested by [55].

E. EXPERIMENTAL RESULTS
The results are evaluated in terms of RMSE and are shown in Table 5. We compare the proposed method with the temporal models TimeSVD++, BTMF, and BPTF. Additionally, we compare our method with the HFT model, which incorporates the rating and hidden topic factors to enhance ratings and address data sparsity but ignores the preference dynamics. The TMRevCo model combines the dynamic user factor of TimeSVD++ with the hidden topics of the review texts mined by the topic model, applying the item correlation measure in CoFactor and associating the item factors of CoFactor. Meanwhile, the TCMF model considers the user preference drift and alleviates the data sparsity problem, but it does not track the hidden topic evolution of reviews that may dynamically change over time. The proposed model outperforms the compared models, and the best performance is indicated in bold font. This occurs because the proposed model observes the dynamics of user interactions on ratings and reviews to capture user preference drifts and addresses the data sparsity problem with hidden topic review evolution. Figure 6 shows a comparison of the proposed method with state-of-the-art approaches in terms of RMSE. The results also show the effect on performance accuracy for the examined models over the number of latent factors (k). We observe that all models improve their learning ability, and the best average result is at k = 10.
From the results of our experiments, we can see the changes in topic interests. For example, the intensity of each topic is shown in Figure 7. The ordinate represents the intensity of the top 3 topics in the Baby dataset in each time slice, calculated with Equation (10). Figure 7 shows that the intensity of topic 1 is higher than that of the other topics during time slices 4-7, demonstrating that the influence of topic 1 is strongest. In addition, the intensity of each topic changes with time, showing that each topic goes through an evolutionary process that is consistent with reality. Next, we also track topics from reviews automatically using the dynamic LDA-based and NMF-based methods in each time period. These methods take different approaches to modeling texts, corresponding to probabilistic generative processes and geometrically linear combinations, respectively. Therefore, in Figure 5 we compare the results for different topic numbers (5, 10, 15, 20, 50) for both methods on each dataset; the hyperparameters α and β in LDA are set to 1 and 0.01, respectively, and NMF is run for 100 iterations.
The proposed NMF-based method tends to outperform the LDA-based method for short-review topic mining and benefits in the dynamic case, which merits some analysis. LDA and NMF can both automatically learn topics from short texts, but they model texts in different manners, corresponding to a probabilistic generative process and geometrically linear combinations, respectively. Meanwhile, general review texts are short and sparse and therefore ordinarily exhibit few word-word co-occurrences, which is clearly unhelpful for statistical topic models such as LDA. Furthermore, the LDA-based variant of our model uses stochastic Gibbs sampling for each word in the texts, which introduces significant variance in learning and inference, especially for sparse and short texts. In contrast, the NMF-based approach first encodes the whole corpus with term frequency-inverse document frequency (TF-IDF) weights, which considers not only the frequency but also discriminative information (IDF) for each word. In conclusion, the greater amount of encoded information and the deterministic algorithm likely enable NMF to produce higher-quality topics than LDA from short texts [14], leading to improved accuracy.

VI. DISCUSSION
Summarizing our results, our model offers a novel form of temporal dynamics modeling for tracking the evolution of users' preferences over time: it captures the transition of users' preferences in the latent factor space and exploits users' opinions in review texts, using a topic modeling technique at different time steps, to find hidden topic evolutions that can explain why a user assigns a score. However, temporal models such as TimeSVD++, BPTF and BTMF focus only on rating interactions; they ignore the data sparsity problem, and the sparse data leads to low prediction accuracy. On the other hand, HFT can exploit review topics and latent factors to enhance the ratings and solve the rating sparsity issue, but it does not capture preference drifts. TCMF combines multimodal information with scores to reduce data sparsity, but its multimodal data do not uncover the hidden topic factors from reviews that explain the score at each time step. TMRevCo considers temporal dynamics and item correlation dimensions on reviews but does not capture the transition of users' preferences at each time step. Our results clearly show that the proposed model outperforms the compared models. This occurs because the proposed model observes the dynamics of user interactions on ratings and reviews to capture user preference drifts and addresses the data sparsity problem with hidden topic review evolution.

VII. CONCLUSIONS
This paper aims to enhance dynamic CF on recommender systems under volatile conditions in which both users' preferences and item properties change dynamically over time. Moreover, the existing CF models mainly rely on solving data sparsity by adding side information to improve the performance. We proposed a model to capture the dynamics of users' preferences in the rating matrix by using a joint decomposition method to extract user latent transition patterns and combine latent factors, together with the associated topic evolution of review texts, by using dynamic topic modeling. We evaluated the accuracy on seven real datasets while addressing the data sparsity problem, and the experimental results show that the model leads to a significant improvement compared with the baseline and state-of-the-art methods.
In the future, we plan to extend this work to incorporate external users' interests, such as social relationships and fashion trends that can change over time. Another interesting extension to our model would be the ability to capture evolving, emerging and fading topic interests of users to improve the prediction performance of the dynamic recommendation system.