CRecSys: A Context-Based Recommender System Using Collaborative Filtering and LOD

Linked Open Data (LOD) is an emerging Web technology for storing and publishing structured data in the form of interlinked knowledgebases like <monospace>DBpedia</monospace>, Freebase, Wikidata, and Yago. It aggregates structured data from multiple domains and can be used to conceptualize a concept of interest. Recently, researchers have shown that incorporating contextual features in recommender systems improves rating prediction accuracy. However, the identification of contextual features for building context-aware recommender systems is a major bottleneck. To this end, in this article, we present the development of a context-based recommender system, <monospace>CRecSys</monospace>, for item rating prediction in the movie domain. <monospace>CRecSys</monospace> extracts item-based contextual features from the underlying dataset and generates an <monospace>RDF</monospace> graph to model items and their contextual features, computing context-based item similarity using graph matching techniques and item-based collaborative filtering. It uses LOD and two well-known movie data sources, Rotten Tomatoes and IMDB, for item profiling over a dataset of 1300 movies. <monospace>CRecSys</monospace> is experimentally evaluated on two movie datasets, one generated by the authors and the other the <monospace>MovieLens-1M</monospace> benchmark dataset. <monospace>CRecSys</monospace> is also compared with ten baselines and two state-of-the-art recommendation methods, and performs significantly better. It is also empirically established that <monospace>CRecSys</monospace> effectively deals with some open challenges of traditional recommender systems, such as the <italic>cold-start</italic> and <italic>limited content</italic> problems.


I. INTRODUCTION
The open nature of Web 2.0 has resulted in the uncontrolled generation of a plethora of information, leading to the information overload problem. E-commerce is one of the exponentially growing web-based services, adding hundreds of thousands of products and services and attracting a large number of buyers and sellers on a daily basis. As a result, e-commerce platforms are also facing the information overload problem. To deal with it, researchers from academia and industry have proposed various recommendation techniques to filter out irrelevant items and recommend only those items that are relevant to users' requirements, interests, and profiles [1]. Most of the world's large corporations are successfully using one or another form of recommender system technology to facilitate their customers; notable examples are Netflix's movie recommender system, Amazon's product recommender system, and Last.fm's song recommender system. Although researchers have proposed several recommender systems for different domains, and organizations are successfully using their customized recommender systems, there are several challenges, like the cold-start, black-box recommendation, limited content, and data sparsity problems, that lower the efficacy of recommender systems. To this end, many researchers have considered the development of context-aware recommender systems, which incorporate contextual features for recommendation, as a possible solution. However, most of the existing approaches have used only user-decision contextual features, ignoring the item-based contextual features.
(The associate editor coordinating the review of this manuscript and approving it for publication was Zhe Xiao.)

A. WHY CONTEXT IN RECOMMENDER SYSTEMS?
A recommender system predicts ratings and recommends items based on users' interests, browsing history, and preferences. In the recommendation process, recommender systems select user-centric relevant items using filtering algorithms like collaborative filtering and content-based filtering. These filtering algorithms do not consider the services and conditional usage of items, where conditional usage represents the different conditions under which a user consumes an item, such as what products are used by which users and when. For example, a user may prefer to watch different movies at different places (home or theater), with different companions (family or spouse), and at different times (weekend or weekday). To satisfy the constraints of conditional usage, contextual information needs to be incorporated into traditional recommender systems to improve their recommendation and rating prediction accuracy. Abowd et al. [1] defined context as follows: ''context is any information that can be used to characterize the situation of an entity such as person, place, or object which is relevant in the interaction between the entity and an application, including the user and application themselves''. On the other hand, Cantador and Castells [2] defined context as ''the background topics under which activities of a user occur within a given unit of time''. Contextual features can be grouped into three categories: user-based, item-based, and decision-based [8]. Figure 1 presents a multi-layer contextual graph illustrating all three categories of contextual features [8]. Though context-aware recommender systems improve rating prediction and recommendation accuracy, there are still some open challenges that hamper their performance. The first issue is the lack of datasets for context learning. Although there exist some datasets like LDOS-CoMoDa [25] and DePaulMovie [27], they are neither live nor comprehensive because they are generated using questionnaire-based approaches.
The second issue with existing context-aware recommender systems is that they are based on user decisions, adding a third dimension, such as time, location, companion, or place, to the traditional recommender systems; they do not consider user- or item-based contextual features.

B. LINKED OPEN DATA
Linked Open Data (LOD) is a collection of multiple knowledgebases, such as DBpedia, Freebase, and Yago, that are interlinked with each other and contain structured data related to different domains. It is developed using standard web technologies like HTTP, URI, and the Resource Description Framework (RDF), where a URI is a unique identifier representing an entity and RDF is a data model representing data in a machine-understandable triple form ⟨subject, predicate, object⟩. The nucleus of LOD is DBpedia, which is connected to every other knowledgebase of the LOD. The linked datasets are used in many applications, including the development of item-based context-aware recommender systems. Catherine and Cohen [11] and Noia [12] discussed the procedure of using knowledge graphs in recommender systems, and emphasized that the integration of knowledge graphs in recommender systems can address various challenges and issues like the data sparsity, limited content, and cold-start problems. Data sparsity is a major issue in recommender systems which arises due to the unavailability of sufficient ratings for the items. Lacking sufficient ratings on data items degrades the performance of recommender systems because it becomes difficult to find the most similar items in the system [47]. Similarly, the limited content problem arises due to the unavailability of sufficient content for the items. This problem further leads to the over-specialization issue, in which recommender systems are not able to recommend novel items [47]. Finally, the cold-start problem can occur for both users and items, in cases where either some users have rated very few items or some items have received very few ratings; it degrades the overall performance of the recommender system [47].
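The ⟨subject, predicate, object⟩ triple model described above can be explored in memory with a simple pattern-matching helper. The following is a minimal, self-contained sketch (the `match` function and the example triples are illustrative, not part of CRecSys); `None` plays the role of a SPARQL variable:

```python
def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern; None is a wildcard,
    analogous to a variable in a SPARQL basic graph pattern."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# A toy in-memory RDF store of movie facts.
triples = [
    ("Gravity", "genre", "ScienceFiction"),
    ("Gravity", "act", "SandraBullock"),
    ("The_Martian", "genre", "ScienceFiction"),
]

# All outgoing features of "Gravity":
gravity_features = match(triples, s="Gravity")
```

Fixing the predicate while leaving subject and object open (`match(triples, p="genre")`) retrieves all movies sharing a contextual dimension, which is the access pattern used when interlinked resources are compared.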

C. OUR CONTRIBUTIONS
In this article, we present the development of a context-based recommender system, CRecSys, for the movie domain, which uses item-based collaborative filtering (IBCF) for rating prediction and recommendation. It is a major extension of one of our previously published conference papers [46], considering larger datasets, additional evaluation metrics, and comparison with many baselines and state-of-the-art methods from different perspectives, including cold-start users and the limited content problem. In line with [17], CRecSys applies Natural Language Processing (NLP) and Information Extraction (IE) techniques, including LDA, over a context-representing movie dataset to identify various contextual features (both representational and interactional), such as topic, subject, genre, certification, and cast-performance, for movie profiling. It generates a labeled RDF graph to model items and their contextual features for computing context-based similarity between items. In order to compute similarity between items using contextual features, CRecSys uses node-based graph matching techniques over the RDF graphs. The efficacy of CRecSys is established through experiments and validation using different evaluation metrics, such as error-based metrics (Mean Absolute Error and Root Mean Square Error) and decision support-based metrics (Precision, Recall, and F-score). CRecSys is also compared with ten baselines and two state-of-the-art methods, and performs significantly better. On empirical analysis, we found that CRecSys effectively deals with the cold-start and limited content problems, in comparison to the baselines and state-of-the-art methods.
The rest of the paper is organized as follows. Section II presents a brief review of the existing literature on context-aware recommendation, LOD-based recommender systems, and similarity computation using graph-based techniques. Section III presents a brief introduction to the preliminary concepts. Section IV presents a detailed description of the proposed CRecSys and its context-based recommendation approach, including contextual feature extraction, labeled RDF graph generation, contextual feature-based semantic similarity computation, and rating prediction using item-based collaborative filtering. Section V presents the experimental setup and evaluation results. Finally, Section VI concludes the paper with future directions of research.

II. RELATED WORKS
In this section, we present a brief review of the existing literature that has used both item-based and user-decision-based contextual features for rating prediction and recommendation. We also review approaches that have utilized LOD and graph techniques to design recommender systems. Finally, we present various graph-based techniques that are used to compute similarity between the nodes of a graph.

A. CONTEXT-AWARE RECOMMENDER SYSTEM
The aim of recommender systems is to recommend the most similar and relevant items based on users' profiles, interests, preferences, and interactions. In the existing literature, researchers have presented numerous content-based, collaborative filtering, and hybrid filtering recommendation techniques. A Recommender System (RS) is generally represented using the user and item dimensions as R : user × item → rating. To handle the recommender system challenges and to improve rating prediction and recommendation accuracy, Adomavicius et al. [3] introduced a third dimension, context, and redefined the recommender system as R : user × item × context → rating, termed a Context-Aware Recommender System (CARS). In such systems, contextual features are incorporated through contextual modeling, pre-filtering, and post-filtering algorithms.
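Of the three strategies just mentioned, contextual pre-filtering is the simplest to illustrate: ratings are restricted to those matching the target context, reducing R : user × item × context → rating back to an ordinary two-dimensional problem. A minimal sketch (the tuple layout and context labels are illustrative assumptions):

```python
def prefilter(ratings, target_context):
    # ratings: list of (user, item, context, rating) tuples.
    # Contextual pre-filtering keeps only the ratings whose context matches
    # the target, so a standard 2D recommender can run on the result.
    return [(u, i, r) for (u, i, c, r) in ratings if c == target_context]

ratings = [
    ("u1", "m1", "weekend", 5),
    ("u1", "m2", "weekday", 3),
    ("u2", "m1", "weekend", 4),
]
weekend_slice = prefilter(ratings, "weekend")
# weekend_slice now holds only the weekend user-item ratings
```

Post-filtering would instead run the 2D recommender on all ratings and adjust the output list per context; pre-filtering trades data volume for contextual fidelity.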
As discussed in [8], context can be divided into three categories: (i) user-based context, (ii) item-based context, and (iii) decision-based context. In the existing literature, the incorporation of item contextual features into context-based recommendation techniques is rare. To this end, Dourish [5] explained that an item's contextual features can be categorized into representational and interactional contextual features. Representational contexts are the attributes of users and items defining their characteristics. For example, in the movie domain, genre, sub-genre, cast, director, and certification can be considered representational contextual features. Both user-related and item-related representational contextual features are explicitly encoded in datasets and do not change over time. Moreover, representational contexts are delineable, stable, and separate from activity [5]. On the other hand, interactional contextual features are extracted from the review documents written by users. Interactional contexts are relational properties between object and activity; they are dynamically defined and arise from the activities themselves. We use both representational and interactional contextual features for item profiling. Yao et al. [8] introduced the construction of a Multi-Layer Contextual Graph (MLCG) which includes user, item, and user-decision-based contextual features. Their approach used implicit feedback data as contextual features to design the MLCG and then applied ranking algorithms for context-based recommendations. Allahyari and Kochut [17] proposed a probabilistic topic model which incorporates movie contexts with user interests, where the contextual information is represented as a subset of the items' feature space. To extract contextual information of movies, an external knowledgebase, DBpedia, is used. In line with [17], the proposed CRecSys also uses LOD to extract contextual information of movies.
The approaches discussed so far used item contextual features to design context-based recommendations. However, various approaches have also used user-decision context features in context-based recommendation techniques. These approaches can be used for the incorporation of context in traditional recommender systems and for the extraction of contextual features. The matrix factorization-based CARS, also known as Context-Aware Matrix Factorization (CAMF), was initially introduced in [37], where the authors proposed three new models: CAMF-C (context not dependent on items), CAMF-CI (context represented in item-context pairs), and CAMF-CC (a single model for each context-item pair). The paper used non-probabilistic matrix factorization to split the user-item rating matrix into two small matrices. On the other hand, probabilistic matrix factorization was used in CARS for point-of-interest recommendation, and the ratings of unrated items were determined based on review helpfulness votes [38], [39]. Ning and Karypis [40] introduced the Sparse Linear Method (SLIM), a new traditional matrix factorization approach to predict top-N recommendations for sparse ratings on unrated items. The paper handled the high-sparsity challenge and reduced the learning time of the models. In this direction, Zheng et al. [41] extended SLIM and proposed a new matrix factorization approach, the Contextual Sparse Linear Method (CSLIM).
The contextual features used in the approaches discussed above are either explicit (directly obtained from entities) or implicit (obtained from a system monitoring user-item interactions). Baltrunas et al. [28] introduced a music-based recommender system, InCarMusic, which predicts songs based on user decisions. In addition, there are some CARS in which contextual information is obtained through inference. Hariri et al. [31] proposed a context-based music recommender system where latent topics were extracted using topic modeling techniques and used as contextual features. Similarly, Lahlou et al. [32] introduced a text classification technique to infer contextual features from review documents. Table 1 presents a list of datasets used in CARS. Although there are various approaches for CARS, most of them used synthetic contextual feature-based datasets and questionnaire approaches for integrating contextual features, considering only user decisions and ignoring the user and item contexts. Moreover, the sparsity in the available datasets is high because it is difficult to find all contextual features for users through inference mechanisms. To handle these issues, our proposed CRecSys generates a real-world item-based contextual feature dataset where both explicit and inference methods are used to identify contextual features. The generated dataset is able to handle various issues with CARS, such as the data sparsity and limited content problems. The contextual features of the users are extracted by applying LDA over the review documents generated by them.

B. LOD IN RECOMMENDER SYSTEMS
Before the introduction of LOD, most recommender systems used semantic-aware and ontology-based approaches for rating prediction [18]. LOD represents and publishes textual data in a structured format using a graph-based data model and semantic web technologies, generating labeled RDF graphs. Incorporation of LOD in an existing recommender system requires content-based and collaborative filtering methods [19]. In the last few years, researchers have presented various LOD-based recommender systems. Passant [20] proposed a music recommendation system using DBpedia-based features, and evaluated the proposed semantic similarity measure using linked data properties. In another approach, Noia et al. [21] proposed a content-based recommender system for the movie domain using three knowledgebases, viz. DBpedia, LinkedMDB, and Freebase.
Oliveira et al. [22] integrated LOD with a recommender system using the LinkedMDB and DBpedia knowledgebases. The proposed method predicts a rating for a given input item, recommends items based on the user's history, and also recommends trending items to users independently of their history. Musto et al. [23] presented a graph-based recommender system using LOD. The authors first applied the PageRank algorithm on the LOD-based features and then fed them into the recommender system. In continuation of this work, Musto et al. [24] developed a hybrid recommender system using popularity, content, collaborative filtering, LOD, bipartite graph, and tripartite graph-based features. Different feature combinations were given as input to three classification models for rating prediction and recommendation. The accuracy values of these approaches confirm the efficacy of LOD-enabled recommender systems, which provide significantly better rating prediction in comparison to content-based, collaborative filtering, matrix factorization, and PageRank-based recommendation algorithms. However, LOD-based features are hardly used in existing context-based recommender systems, except the one presented in [17]. Our proposed CRecSys uses contextual features extracted from LOD to develop an efficient context-aware recommender system.

C. GRAPH-BASED SIMILARITY MEASURES
Graph-based similarity measures can be used to compute the semantic association between the nodes of a graph, which can be words or documents. There are various measures to compute similarity between two or more graphs [14], [16], and the most common approaches to compute inter-graph similarity are based on graph isomorphism, maximum/minimum common sub-graph (super-graph), and iteration. To compute the similarity between two graphs using an iterative method, first an initial similarity score is assigned to each node of both graphs, and thereafter the similarity scores are repeatedly updated using an update function; the update process is repeated until the values converge to a stationary distribution. In this direction, Kleinberg [34] proposed an iterative method to identify authoritative information in a hyperlinked environment, which was further modified in [16]. The iteration-based similarity measure by Blondel et al. [16] is given in equation (1), where E_A and E_B are the sets of edges of graphs G_A and G_B, respectively. Zager and Verghese [35] improved equation (1) and presented it for both edge and node similarity calculation, as given in equations (2) and (3), respectively. In these equations, S_e(u, v) represents the edge similarity score for edges u ∈ G_A and v ∈ G_B, and S_n(u, v) represents the node similarity score for nodes u ∈ G_A and v ∈ G_B.
Heymans et al. [36] proposed to consider both similar and dissimilar terms to compute node- and edge-based similarity. To identify similar terms, the original graph and its complement are used; whereas, for dissimilar terms, each graph and the complements of all other graphs in the graph network are used. One major issue with these approaches is that they do not consider all natural and desirable properties of graphs, such as a fixed value range for the similarity score, assignment of zero similarity to nodes that have no in-degree or out-degree, and reflexivity of nodes for graph-based similarity computation [10]. Our proposed approach applies the graph matching algorithm presented in [10] over the labeled RDF graphs, which represent LOD-based contextual features.

III. PRELIMINARIES
This section presents a brief description of various concepts like item context, LDA, and notion of similarities in graphs that are used to design our proposed CRecSys.

A. ITEM CONTEXTUAL FEATURES
The contextual features of an item represent constraints and contexts for its consumption by users. For example, in the movie domain, certification, cast, sub-genre, based-on, and director are contextual features. An item along with its contextual features can be defined as I_n(c_1, c_2, ..., c_k), where n = 1, 2, ..., m indexes the m items and {c_1, c_2, ..., c_k} are the k contexts. Items have multiple contexts, wherein each context can have a set of values. For example, I_1(c_1, c_2) = {⟨genre: horror⟩, ⟨sub-genre: dystopia⟩} represents two contextual dimensions, c_1 and c_2, of item I_1, where c_1 is genre and c_2 is sub-genre. The contextual values of c_1 and c_2 are horror and dystopia, respectively. As discussed in section II, most of the existing CARS have used user decisions as context; only a few approaches have used both user- and item-based contexts. Allahyari and Kochut [17] presented an item-driven contextual feature-based method for rating prediction and recommendation. The authors used actors, genres, directors, and other item-driven contextual features, which are extracted using LOD. Figure 2 presents the contextual feature representation of the ''Sicario'' movie using the DBpedia and Wikidata knowledgebases.
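The I_1(c_1, c_2) example above maps naturally onto a dictionary keyed by contextual dimensions. A minimal sketch (the `item_1` profile and the `context_values` helper are illustrative, not CRecSys internals):

```python
# An item profile keyed by contextual dimensions, mirroring the
# I_1(c_1, c_2) = {<genre: horror>, <sub-genre: dystopia>} example.
item_1 = {
    "genre": "horror",
    "sub-genre": "dystopia",
}

def context_values(item, dims):
    # Project an item profile onto a chosen subset of contextual dimensions,
    # silently skipping dimensions the item does not define.
    return {d: item[d] for d in dims if d in item}
```

In practice each dimension could hold a set of values (e.g., several cast members); the single-valued form is kept here only to match the running example.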

B. LATENT DIRICHLET ALLOCATION
Latent Dirichlet Allocation (LDA) is a generative probabilistic and statistical modeling method to extract topics from a text corpus [6]. The basic idea behind LDA is that groups of contextually similar terms constitute different topics. The standard plate diagram of LDA shows each of the N words and its assigned topic. The parameters α and β represent the per-document topic and per-topic word distribution priors, respectively. Moreover, θ and φ represent the topic and word distributions, respectively, in the text corpus, and z represents the topic assigned to the word w, the N-th word in the M-th document. Table 2 presents a set of topics and their associated word distributions generated by LDA over the review documents of The Revenant, Sicario, and The Visit movies.
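The generative process behind the θ, φ, z, and w variables described above can be sketched directly with the standard library: θ ~ Dirichlet(α), φ_k ~ Dirichlet(β), then for each word a topic z ~ Categorical(θ) and a word w ~ Categorical(φ_z). This is a sketch of LDA's generative story only (topic inference, as used for the review documents, runs this model in reverse); all names and the toy vocabulary are illustrative:

```python
import random

def sample_dirichlet(alpha, k, rng):
    # Symmetric Dirichlet sample via normalized Gamma draws.
    xs = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    s = sum(xs)
    return [x / s for x in xs]

def sample_categorical(probs, rng):
    # Draw an index with the given probabilities (inverse-CDF sampling).
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def generate_document(n_words, n_topics, vocab, alpha, beta, rng):
    # theta: per-document topic distribution, theta ~ Dirichlet(alpha)
    theta = sample_dirichlet(alpha, n_topics, rng)
    # phi_k: per-topic word distributions, phi_k ~ Dirichlet(beta)
    phi = [sample_dirichlet(beta, len(vocab), rng) for _ in range(n_topics)]
    words = []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)   # topic for this word position
        w = sample_categorical(phi[z], rng)  # word drawn from topic z
        words.append(vocab[w])
    return words
```

Small α concentrates each document on few topics, and small β concentrates each topic on few words, which is why word groups like those in Table 2 emerge.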

C. NOTIONS OF SIMILARITY IN GRAPHS
There are numerous graph matching algorithms to find similarity between two graphs [13]-[15], which have a wide range of applications in Web searching, social network analysis, biological network analysis, and so on. The most widely used notions of similarity in graphs are isomorphism, edit distance, statistical methods, and iterative methods. A brief description of each is presented in the following paragraphs.
• Isomorphism: It is based on bijective function to identify structural similarity between the nodes of two or more graphs. The adjacency matrices of two isomorphic graphs are structurally same [13].
• Edit distance: It calculates the minimum cost for transforming a graph into another form. The number of edit operations like addition and deletion of nodes and edges determines the cost of graph transformation. It is used to compute the cost function for an optimal match between two graphs [14].
• Statistical methods: This is a family of similarity measures defined using graph statistics like diameter, degree distribution, and betweenness that are used to evaluate the similarity of graph structures [15].
• Iterative methods: These methods compute similarity between nodes or edges of a graph based on their overlapping neighbors. Iterative methods are applied when the nodes of a graph (or two graphs) have overlapping neighbors [16]. The proposed CRecSys uses a node-based iterative method to compute similarity between the nodes of each pair of directed graphs. At each iteration, the similarity score distribution of the nodes of each graph converges towards a static distribution. Equation (4) presents an iterative process to compute the similarity between nodes u and v, where x_in(u, v) and x_out(u, v) represent the common in-neighbors and out-neighbors, respectively, of nodes u and v. The iterative process is repeated until the difference between the similarity scores of two consecutive iterations, the k-th and (k+1)-th, is less than or equal to a threshold δ, i.e., |x^(k+1)(u, v) − x^(k)(u, v)| ≤ δ.
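Since equation (4) is not reproduced here, the following is a SimRank-style sketch of the iterative idea only, not the exact CRecSys update: each pair's score is recomputed from the current scores of its in-neighbor and out-neighbor pairs until the largest change drops below δ. All graph data and names are illustrative:

```python
def iterative_similarity(in_nbrs, out_nbrs, nodes, delta=1e-4, max_iter=100):
    # x[(u, v)]: similarity of node u to node v. A node is fully similar to
    # itself (reflexivity); every other pair starts at 0.
    x = {(u, v): 1.0 if u == v else 0.0 for u in nodes for v in nodes}
    for _ in range(max_iter):
        nxt, max_change = {}, 0.0
        for u in nodes:
            for v in nodes:
                if u == v:
                    nxt[(u, v)] = 1.0
                    continue
                # Propagate current scores over in- and out-neighbor pairs.
                pairs = [(a, b) for a in in_nbrs[u] for b in in_nbrs[v]]
                pairs += [(a, b) for a in out_nbrs[u] for b in out_nbrs[v]]
                score = sum(x[p] for p in pairs) / len(pairs) if pairs else 0.0
                nxt[(u, v)] = score
                max_change = max(max_change, abs(score - x[(u, v)]))
        x = nxt
        if max_change <= delta:  # convergence threshold, like delta in the text
            break
    return x
```

Note that a pair with no in- or out-neighbors scores 0, matching the modified isolated-node property adopted later in Section IV-C.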

IV. PROPOSED METHODOLOGY
This section presents a detailed description of the proposed CRecSys to predict ratings of unrated items using contextbased semantic similarity. Starting with the extraction of contextual features from LOD and movie data sources, this section further proceeds with the discussion of labeled RDF graph generation, semantic similarity computation, and rating estimation. A detailed description of each module of CRecSys is presented in the following sub-sections.

A. CONTEXTUAL FEATURES EXTRACTION
This section presents the extraction process of contextual features for the movie domain. The contextual features can be categorized into two groups: representational contextual features and interactional contextual features [5]. The representational contextual features are known a priori for the items; e.g., genre, director, cast, and certificate in the movie domain. On the other hand, interactional contextual features are inferred from interactions between users and items, written in the form of reviews. The representational contextual features are extracted from both LOD and movie data sources, whereas interactional features are extracted only from movie data sources. Table 3 presents a list of notations and their descriptions used in the rest of the paper.

1) LOD-BASED CONTEXTUAL FEATURES
As discussed earlier, LOD is a cloud of multiple knowledgebases that are interlinked with each other. We have used LOD to extract representational contextual features. LOD is a collection of RDF statements that are modeled as a labeled directed graph containing 3-tuples (R, L, S), where R = {r_1, r_2, ..., r_|R|} is a set of resources (nodes), L = {l_1, l_2, ..., l_|L|} is a set of links (predicates), and S = {s_1, s_2, ..., s_|S|} is a set of statements, wherein each statement, known as a triple, represents the association between a pair of resources through a link. For example, a triple statement ⟨r_2, l_2, r_3⟩ ∈ S represents that resources r_2, r_3 ∈ R are linked through l_2 ∈ L. In an RDF graph, resource nodes are subjects/objects like movie title, director, cast crew, and musician, whereas labeled edges are predicates representing the association between subject and object. A contextual feature of a resource r ∈ R in LOD is presented as a ⟨property, value⟩ pair like ⟨act, Sandra Bullock⟩, as shown in figure 4 for the resource node ''Gravity'', where act is a contextual feature and Sandra Bullock is its value. In LOD, the contextual features of a resource node r are its incoming and outgoing predicates. Contextual features in the proposed CRecSys are the descriptive features of items. Figure 4 presents resource nodes and their corresponding contextual features. In this figure, the ''The Martian'' and ''Gravity'' movies are the resource nodes, r = {The Martian, Gravity}, the predicates (links) carry the contextual features, l = {direction, topic, act, genre, subject}, and ←r = {Matt Damon, Sandra Bullock} and →r = {Survival, Solitude, Space} represent the incoming and outgoing contextual values, respectively, for predicate l. The contextual features along with contextual values for a resource r are computed using equation (5).
In equation (5), ←r_cf and →r_cf represent the incoming and outgoing contextual features and values for the resource node r, as given in equations (6) and (7), where v represents the contextual value.
Using equations (5), (6), and (7) together, the complete contextual profile of a resource node can be computed. In order to extract contextual features from LOD, first, the URIs of the movies are identified and mapped to the respective movie names. The movie-mapped URIs are used with SPARQL end-points to extract contextual features from the LOD cloud as RDF triples. Table 4 presents a few exemplar contextual features of ''The Martian'' movie extracted from DBpedia in the form of ⟨property, value⟩ pairs. It should be noted that contextual features like dbo:wikiPageExternalLink, dbo:thumbnail, and dbp:image that do not provide semantic information are filtered out.
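The SPARQL extraction step can be sketched as follows. The query string is a standard basic graph pattern fetching all outgoing ⟨property, value⟩ pairs of a movie resource; the `build_context_query` and `keep_feature` names, and the exact DBpedia URI, are illustrative assumptions, and posting the query to a live endpoint is omitted:

```python
# Properties that carry no semantic information and are filtered out,
# as with dbo:wikiPageExternalLink and dbo:thumbnail in the text.
EXCLUDED_PROPERTIES = {
    "http://dbpedia.org/ontology/wikiPageExternalLink",
    "http://dbpedia.org/ontology/thumbnail",
}

def build_context_query(movie_uri):
    # SPARQL query retrieving every outgoing <property, value> pair of the
    # movie resource; it would be sent to a SPARQL end-point such as DBpedia's.
    return (
        "SELECT ?property ?value WHERE { "
        f"<{movie_uri}> ?property ?value . "
        "}"
    )

def keep_feature(prop):
    # Drop properties that provide no semantic information.
    return prop not in EXCLUDED_PROPERTIES
```

A second query with the pattern `?subject ?property <movie_uri>` would gather the incoming features, completing the in/out profile of equations (6) and (7).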

2) MOVIE DATA SOURCE-BASED CONTEXTUAL FEATURES
The movie data sources (IMDB and Rotten Tomatoes) are used to extract both representational and interactional contextual features. A few representational features, like certification and ratings, are not available in LOD; therefore, movie data sources are used to extract them. Interactional contextual features capture the interaction between users and movies in terms of the ratings and reviews provided by users. The reviews provide valuable information about various aspects and contexts of the items, such as entities, events, entity actions, keywords, special events, and comparisons (with other entities), containing contextual information like how, when, where, and with whom a user consumed an item. We have applied LDA over the movie review documents to identify these contextual features.

B. RDF GRAPH GENERATION
After the extraction of contextual features from LOD and movie data sources, the next task is to generate RDF graphs using these features. Figure 5 presents an RDF graph in which subjects and objects are represented as nodes using rectangles and ovals, respectively, and predicates connect each pair of resources (subject and object) and are represented using labeled dashed edges. The generated RDF graphs represent movies in triple form, ⟨subject, predicate, object⟩. For example, in the ⟨Interstellar, director, Christopher Nolan⟩ triple, Interstellar and Christopher Nolan are the subject and object, represented using a rectangle and an oval, respectively, and director is the predicate connecting them.
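Building such a graph amounts to indexing the extracted triples into incoming and outgoing adjacency maps, so that each resource node's neighbors can be looked up directly. A minimal sketch (function name and data layout are illustrative):

```python
def build_rdf_graph(triples):
    # Index <subject, predicate, object> triples into outgoing and incoming
    # adjacency maps keyed by resource node.
    out_edges, in_edges = {}, {}
    for s, p, o in triples:
        out_edges.setdefault(s, []).append((p, o))  # labeled out-edge of s
        in_edges.setdefault(o, []).append((p, s))   # labeled in-edge of o
    return out_edges, in_edges
```

The two maps correspond to the outgoing and incoming contextual values of a node, which is exactly the neighborhood information the similarity computation in the next subsection consumes.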

C. CONTEXT-BASED SEMANTIC SIMILARITY
In this section, we formulate the proposed CRecSys approach to compute contextual feature-based similarity between items. In line with [10], [16], CRecSys calculates similarity between two nodes based on their overlapping contextual features. The proposed semantic similarity metric holds the following properties.
• Two nodes i and j, where i ∈ G_A and j ∈ G_B, are considered similar in [36] if they do not have any in-neighbors and out-neighbors, i.e., they are isolated nodes; this is only applicable to directed graphs. We have modified this property in this study such that two nodes i and j are completely dissimilar (i.e., their similarity score is 0) if the nodes do not have any in-neighbors and out-neighbors.
• The similarity score between a pair of nodes (i, j) is in a particular range. This property defines that the computed semantic similarity between nodes is always within a range (0, 1).
• Every node of a graph is related to itself. This is equivalent to reflexive property and defines that every node or edge in a graph is similar to itself.
The extracted contextual features of items (nodes) are assigned an initial weight. Existing state-of-the-art methods do not distinguish between rarely and frequently occurring contextual features and assign a binary value of 1 and 0 to matched and unmatched nodes, respectively. Unlike the binary assignment in the existing approaches, our proposed approach assigns higher weights to rarely occurring features and lower weights to frequently occurring features, as defined in equation (8). In this equation, f_l and f_m are the features of nodes i and j, respectively, and C(f_l) represents the completeness score of feature f_l. Distinctive features are highly informative in comparison to frequently occurring features, which are less informative. The completeness of a feature f_l is calculated using equation (9), where n represents the number of resources (i.e., movies) containing the contextual feature f_l, and N represents the total number of resources in the labeled RDF graph.
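Since equations (8) and (9) are not reproduced here, the following is only a plausible instantiation of the weighting idea: completeness as the fraction n/N of resources carrying a feature, and an IDF-style weight that grows as the feature becomes rarer. The exact functional form used by CRecSys may differ; names and data are illustrative:

```python
import math

def completeness(feature, item_features):
    # item_features: item -> set of contextual features in the RDF graph.
    # Completeness = n / N, the fraction of resources carrying the feature.
    n = sum(1 for feats in item_features.values() if feature in feats)
    return n / len(item_features)

def feature_weight(feature, item_features):
    # IDF-style assumption: rare (distinctive) features are more informative
    # and receive higher weight; a ubiquitous feature gets the minimum 1.0.
    c = completeness(feature, item_features)
    return 0.0 if c == 0 else 1.0 + math.log(1.0 / c)
```

Under this scheme a feature shared by every movie (e.g., a common genre) contributes least, while a feature held by a single movie contributes most, mirroring the rare-versus-frequent distinction made in the text.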
Thereafter, an initial similarity score is computed between each pair of resource nodes i and j using equations (10) to (12). The initial similarity value is based on the in-degree and out-degree of i and j, as presented in equations (10) and (11), where deg_in(i), deg_in(j) and deg_out(i), deg_out(j) denote the in-degrees and out-degrees of nodes i and j, respectively.
Finally, the context-based semantic similarity (CBSS) between each pair of resource nodes i and j is updated in an iterative manner, based on the completeness of their in-neighbors and out-neighbors, as given in equations (14) and (15), respectively. In these equations, sim is the similarity between i and j based on their contextual features (f_i and f_j) computed using equation (8), and inSim(i, j) and outSim(i, j) represent the in-degree- and out-degree-based similarity between i and j. This process is repeated until convergence, i.e., until the change in similarity scores between successive iterations falls below a small threshold.
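Since equations (10) to (15) are not reproduced here, the sketch below is a SimRank-style approximation of the iterative scheme under our own assumptions: each pairwise score blends a feature-overlap term with the average similarity of the nodes' neighbors, and isolated pairs score 0 (property 1 above). All names and parameter values are illustrative.

```python
def iterative_similarity(neighbors, feature_sim, c=0.8, eps=1e-4, max_iter=100):
    """SimRank-style iteration (cf. equations (14)-(15)): node similarity blends
    feature overlap with the mean similarity of the nodes' neighbors."""
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else feature_sim(a, b) for a in nodes for b in nodes}
    for _ in range(max_iter):
        new, delta = {}, 0.0
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0          # reflexive property
                    continue
                na, nb = neighbors[a], neighbors[b]
                if na and nb:
                    nbr = sum(sim[(x, y)] for x in na for y in nb) / (len(na) * len(nb))
                else:
                    nbr = 0.0                  # isolated pair: completely dissimilar
                new[(a, b)] = (1 - c) * feature_sim(a, b) + c * nbr
                delta = max(delta, abs(new[(a, b)] - sim[(a, b)]))
        sim = new
        if delta < eps:                        # converged
            break
    return sim

# Toy example: nodes with feature sets and a small neighborhood graph.
features = {"i": {"action", "marvel"}, "j": {"action"}, "k": {"drama"}}
def fsim(a, b):
    union = features[a] | features[b]
    return len(features[a] & features[b]) / len(union) if union else 0.0

graph = {"i": {"j"}, "j": {"i"}, "k": set()}
S = iterative_similarity(graph, fsim)
```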

D. RATING ESTIMATION
This section presents the process of rating estimation for the unrated items using item-based collaborative filtering (IBCF) and CBSS, in line with the work reported in [7]. To estimate the rating of a user u on an unrated item i, the top-k items most similar to i, denoted I_u^k, are first identified using CBSS. The estimated rating of u on i, r̂_ui, is then computed as the weighted average of the ratings of u on each j ∈ I_u^k, i.e., r_uj, where each r_uj is weighted by the corresponding CBSS score, as given in equation (16). However, certain users may provide systematically low or high ratings because of their critical nature or biases, adversely affecting the estimated rating. To handle this issue, like [7], we use a first-order bias approximation in rating estimation, as given in equation (17), where b_ui = b_u + µ + b_i, µ is the average rating over all movies, and b_u and b_i are the observed user and item rating deviations, respectively.
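A minimal sketch of a bias-adjusted item-based estimate in the spirit of equation (17); the exact equation is not reproduced here, and the helper names and toy data are our own.

```python
def predict(u, i, ratings, cbss, k=30):
    """Bias-adjusted item-based CF in the spirit of equation (17).
    ratings: dict (user, item) -> rating; cbss: dict (item, item) -> similarity."""
    mu = sum(ratings.values()) / len(ratings)          # global mean rating
    def b_user(user):                                  # observed user deviation b_u
        devs = [r - mu for (uu, _), r in ratings.items() if uu == user]
        return sum(devs) / len(devs) if devs else 0.0
    def b_item(item):                                  # observed item deviation b_i
        devs = [r - mu for (_, ii), r in ratings.items() if ii == item]
        return sum(devs) / len(devs) if devs else 0.0
    # Top-k items rated by u that are most similar to i under CBSS.
    rated = [j for (uu, j) in ratings if uu == u and j != i]
    top_k = sorted(rated, key=lambda j: cbss.get((i, j), 0.0), reverse=True)[:k]
    num = sum(cbss.get((i, j), 0.0)
              * (ratings[(u, j)] - (mu + b_user(u) + b_item(j))) for j in top_k)
    den = sum(cbss.get((i, j), 0.0) for j in top_k)
    return mu + b_user(u) + b_item(i) + (num / den if den else 0.0)

# Toy data: two users, three movies, and CBSS scores for the target item "m3".
ratings = {("u1", "m1"): 4.0, ("u1", "m2"): 5.0, ("u2", "m3"): 3.0}
cbss = {("m3", "m1"): 0.9, ("m3", "m2"): 0.1}
r_hat = predict("u1", "m3", ratings, cbss)
```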

V. EXPERIMENTAL SETUP AND RESULTS
This section presents the experimental evaluation of our proposed CRecSys method. Starting with a brief description of the dataset curation process, evaluation metrics, and the baseline methods, it presents the performance evaluation results of CRecSys in comparison to the baseline methods and two state-of-the-art methods, MORE (MOvie REcommendation) [4] and PICSS (Partial Information Content Semantic Similarity) [33], both of which use semantic similarity-based methods for rating prediction. It also presents a comparative analysis of CRecSys against the baseline and state-of-the-art methods in dealing with cold-start users. Finally, it presents an empirical analysis of CRecSys, showing the impact of LOD in dealing with the limited content problem.

A. DATASET CURATION
As discussed earlier, existing movie-domain datasets, such as LDOS-CoMoDa [25] and DePaulMovie [27], generally do not contain item-based contextual features. Therefore, we crawled and constructed a new movie dataset from IMDB, Rotten Tomatoes, and DBpedia, which we named Movie LOD. For this, we developed a crawler 1 in Python using the urllib 2 and beautifulsoup 3 libraries to extract data from the aforementioned movie data sources. Since LOD provides SPARQL endpoints for each domain to consume linked data on the Web, we have used SPARQLWrapper, 4 a Python library, to access the linked data endpoints. Table 5 presents brief statistics of our curated dataset.
The crawled movie dataset contains contextual features like based on, about, and cast, along with different categories of data, such as users' reviews and ratings, as shown in table 5. The movies at IMDB are rated by users on a 10-point scale, where 1 and 10 represent the lowest and highest ratings, respectively. We retrieved a total of 191,050 ratings from 49,080 distinct users over 1,300 movies. Thereafter, we applied LDA over the movie review documents to identify latent topics that represent contextual features. In addition, data retrieved using the subject property (dct:subject) of LOD are filtered and segmented into sub-genres using phrases like about and based on. Finally, the contextual features are modeled as a labeled RDF graph.
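For illustration, a DBpedia query of the kind described above, retrieving dct:subject values for a movie resource, might look as follows; the authors' exact queries are not shown, so this example is our own.

```sparql
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?subject WHERE {
  <http://dbpedia.org/resource/Avengers:_Age_of_Ultron> dct:subject ?subject .
}
```

Such a query can be issued against the DBpedia SPARQL endpoint, e.g., via the SPARQLWrapper library mentioned above.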
As described in table 5, we have also used a benchmark dataset, MovieLens-1M, 5 which is frequently used for the empirical evaluation of movie recommender systems. To generate contextual features for all movies of the MovieLens-1M dataset, each movie is mapped to a DBpedia entry, and a mapping table 6 is used to identify its contextual features, in line with [17].

B. EVALUATION METRICS
This section presents a detailed description of the metrics that are used to evaluate our proposed CRecSys rating prediction model, and to perform comparative analysis with the baselines and state-of-the-art methods. The evaluation metrics used in this study are briefly described in the following paragraphs.
• Error-based metrics: These metrics evaluate the prediction error of the filtering algorithms by computing the difference between actual and predicted ratings. We have used the MAE and RMSE error-based metrics for evaluating the rating prediction methods. MAE is defined as the average of the absolute differences between the actual and predicted ratings over a set of items [9], as given in equation (18). RMSE is computed as the square root of the average of the squared differences between the actual and predicted ratings, as given in equation (19); unlike MAE, RMSE penalizes large errors more heavily. In equations (18) and (19), r_ui and r̂_ui are the actual and predicted ratings, respectively, of user u on item i, and T represents the test dataset.
• Decision support-based metrics: This category of metrics evaluates the accuracy of a recommender system based on the list of items recommended to a user, and reports results in terms of Precision, Recall, and F-score. In the context of recommender systems, Precision and Recall are computed over the sets of relevant and recommended items. Relevant items are those that the users liked, based on their actual ratings, whereas recommended items are those selected on the basis of predicted ratings.
Precision is the fraction of relevant recommended items to the total number of recommended items, as defined in equation (20). On the other hand, Recall is the fraction of relevant recommended items to the total number of relevant items in the dataset, as given in equation (21). Finally, F-score is the harmonic mean of Precision and Recall, as given in equation (22).
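The metrics above are standard; a minimal, self-contained sketch of equations (18) to (22) (function names and toy data ours):

```python
import math

def mae(pairs):
    """Mean absolute error over (actual, predicted) pairs (equation (18))."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

def rmse(pairs):
    """Root mean squared error (equation (19)); penalizes large errors."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))

def precision_recall_f(relevant, recommended):
    """Decision-support metrics over item sets (equations (20)-(22))."""
    hits = len(relevant & recommended)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score

err_pairs = [(4.0, 3.5), (5.0, 4.0), (3.0, 3.0)]   # (actual, predicted) ratings
prec, rec, fsc = precision_recall_f({"m1", "m2", "m3"}, {"m2", "m3", "m4"})
```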

C. BASELINE METHODS
In order to establish the efficacy of the proposed CRecSys method, we have considered 10 baseline methods viz. non-negative matrix factorization (NMF), singular-value decomposition (SVD), SVD++, three variants of k-nearest neighbors (KNN), normal predictor, baseline, slope one, and co-clustering for experimental evaluation. A brief description of these baseline methods is presented in the following paragraphs.
• NMF, SVD, and SVD++ are matrix factorization (MF) methods, in which the user-item interaction matrix is factorized into two lower-rank matrices – a user-interest matrix and an item-feature matrix. MF helps to identify latent preferences of users and latent features of items.
• The co-clustering method is based on the pair-wise interactions of two entities simultaneously. Co-clustering-based rating prediction is presented in equation (23), where C_u and C_i are the average ratings of the user's cluster and the item's cluster, respectively, and C_ui represents the average rating of the co-cluster [42].
• The Slope One rating prediction method is based on user and item average ratings, as presented in equation (24). In this equation, dev_j,i represents the average rating deviation of item j with respect to item i, R_j is the set of relevant items, and ū is the user's average rating [43].
• K-nearest neighbors (KNN) is a memory-based collaborative filtering approach which uses the user-item rating matrix to predict ratings. The neighborhood in KNN is formed using user-based or item-based similarity. The KNN-based rating prediction is presented in equation (25), where sim(i, j) represents the similarity between items i and j, r_uj represents user u's rating on item j, and k represents the number of neighbors considered. For centered KNN, equation (25) is modified by adding µ_u, the mean rating of user u. Similarly, for KNN-baseline, equation (25) is modified by adding b_ui, where b_ui = b_u + µ + b_i, µ is the mean of all ratings, and b_u and b_i are the observed user and item rating deviations, respectively.
• The baseline rating prediction method uses both user and item biases to predict ratings for unrated items, as presented in equation (26). In this equation, µ represents the average rating, and b_i and b_u are the observed item and user rating deviations, respectively.
• The normal predictor predicts random ratings for users by sampling from a normal distribution N(µ, σ) fitted to the rating-based training set R_train, where µ and σ represent the mean and standard deviation, respectively, computed using maximum likelihood estimation, as presented in equations (27) and (28). In these equations, r_ui is the rating given by user u on item i.
• PICSS [33] uses LOD to compute semantic similarity between items for rating prediction. PICSS computes the information content of each feature in LOD and uses a modified Tversky ratio model to compute the semantic similarity between two items, as given in equation (31).
In this equation, PICSS(i, j) is the semantic similarity between items i and j, and PIC(F_i) represents the set of partial information content values of the features of item i. The partial information content of a feature reflects its distinctiveness: the more widely a feature is shared across the dataset, the less distinctive it is. This signifies that resources (items) with more distinctive features are more informative. The comparative MAE and RMSE results of CRecSys over the Movie LOD dataset are shown in figure 6.
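PICSS's exact formulation weights features by partial information content; since equation (31) is not reproduced here, the sketch below shows a generic weighted Tversky ratio of the same family, with our own illustrative IC weights and parameter values.

```python
def tversky_sim(Fi, Fj, ic, alpha=0.5, beta=0.5):
    """Weighted Tversky ratio over item feature sets (cf. equation (31));
    ic maps each feature to its information content weight."""
    common = sum(ic[f] for f in Fi & Fj)   # shared features
    only_i = sum(ic[f] for f in Fi - Fj)   # features distinctive to item i
    only_j = sum(ic[f] for f in Fj - Fi)   # features distinctive to item j
    denom = common + alpha * only_i + beta * only_j
    return common / denom if denom else 0.0

# Illustrative information-content weights: rarer features carry more IC.
ic = {"superhero": 0.2, "time-travel": 1.5, "ensemble-cast": 0.9}
s = tversky_sim({"superhero", "time-travel"}, {"superhero", "ensemble-cast"}, ic)
```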
Similarly, CRecSys performs 8.51% better in terms of MAE and 9.17% better in terms of RMSE in comparison to SVD++ over the MovieLens-1M dataset, as shown in figure 7. It can be observed from figure 8 that CRecSys performs 4.62% better in terms of Precision, 5.77% better in terms of Recall, and 4.72% better in terms of F-score, in comparison to SVD++ over the Movie LOD dataset. Similarly, it can be observed from figure 9 that CRecSys performs 6.65% better in terms of Precision, 7.83% better in terms of Recall, and 7.28% better in terms of F-score, in comparison to SVD++ over the MovieLens-1M dataset.

F. COMPARATIVE EVALUATION RESULTS OF CRecSys WITH STATE-OF-THE-ART METHODS
In this evaluation, CRecSys is compared with two state-of-the-art methods, MORE [4] and PICSS [33], which have also used LOD to compute semantic similarity between items for rating prediction. Though both MORE and PICSS are similar to our proposed work, they consider only features extracted from LOD to compute semantic similarity. CRecSys, on the other hand, uses multiple movie data sources, LOD, and review documents to extract contextual features, which are important for determining the most similar movies. For this comparative evaluation, the test dataset includes only those users who have rated more than 5 movies.

G. DEALING WITH COLD-START USERS
In this section, we present an empirical evaluation of CRecSys in comparison to the baseline and state-of-the-art methods to deal with the problem of rating prediction for cold-start users. The cold-start problem occurs when some users rate very few items, and rating prediction for such users is still an open challenge in the field of recommender systems. The cold-start problem may occur for items as well, when some items receive very few user ratings. However, in this study, we have considered rating prediction for cold-start users only.

1) CRecSys VS. BASELINE METHODS
To perform an empirical evaluation of CRecSys and the baseline methods, we considered all users who rated at most 5 items as cold-start users, following [44], [45]. Hence, we repeated the experiment discussed in the previous section on the dataset containing only cold-start users. Figures 10 and 12 present the comparative results of CRecSys and the baseline methods for cold-start users over the Movie LOD and MovieLens-1M datasets, respectively. One possible reason behind the weaker performance of the baseline methods is data sparsity, which is very high for cold-start users. Among the baseline methods, SVD++ performed best because it considers implicit ratings. However, CRecSys performed significantly better than SVD++ because it uses contextual features that help identify similar items for users who have rated very few items. In comparison to SVD++, CRecSys showed improvements of 19.79% and 16.35% over Movie LOD, and 5.67% and 4.92% over MovieLens-1M, in terms of MAE and RMSE values, respectively. Similarly, CRecSys also performed better in terms of Precision, Recall, and F-score over both datasets.
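The cold-start selection criterion above (users with at most 5 ratings) can be sketched as follows; the function name and toy data are ours.

```python
from collections import Counter

def cold_start_users(ratings, threshold=5):
    """Users with at most `threshold` ratings, the criterion used in [44], [45]."""
    counts = Counter(u for (u, _item) in ratings)
    return {u for u, c in counts.items() if c <= threshold}

# (user, item) -> rating; u1 has 10 ratings, u2 only 2 (a cold-start user).
ratings = {("u1", f"m{k}"): 4 for k in range(10)}
ratings.update({("u2", "m1"): 3, ("u2", "m2"): 5})
```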

2) CRecSys VS. STATE-OF-THE-ART METHODS
In this section, we present a comparative evaluation of CRecSys vs. the state-of-the-art methods, viz. MORE [4] and PICSS [33], to deal with the problem of cold-start users. As in the previous experiment, we considered the dataset containing only cold-start users, i.e., users who rated at most 5 items. Figure 14 presents the comparison results of CRecSys with MORE and PICSS for cold-start users. CRecSys improved MAE and RMSE by 12.2% and 8.26% over Movie LOD, and by 4.82% and 3.15% over the MovieLens-1M dataset. Similarly, it can be observed from figure 15 that CRecSys outperforms both MORE and PICSS in terms of Precision, Recall, and F-score values over both datasets.
FIGURE 11. Comparative performance evaluation of CRecSys vs. baseline methods in terms of Precision, Recall, and F-score for rating prediction for cold-start users over the Movie LOD dataset.

H. IMPACT OF LOD TO DEAL WITH THE LIMITED CONTENT PROBLEM
In this section, we present an empirical evaluation of CRecSys in dealing with the limited content problem. As discussed earlier, the limited content problem arises due to the unavailability of sufficient content for items. To handle this issue, we have integrated LDA and LOD into CRecSys to find the most similar items (movies). Table 8 presents the top-5 movies similar to ''Avengers: Age of Ultron'' using both LDA and LOD, and using LDA alone. It can be observed from this table that the similar movies identified using both LDA- and LOD-based features are far closer in terms of subject, genre, and themes than those identified using only LDA-based features. Table 9 presents the impact on the MAE and RMSE values of integrating LOD into CRecSys. It can be observed from this table that, in comparison to CRecSys-LDA at k = 30, CRecSys-LDA+LOD shows improvements of 5.37% and 5.31% in terms of MAE and RMSE values, respectively. Thus, the integration of LDA and LOD not only resolves the limited content problem but also improves the MAE and RMSE values.

VI. CONCLUSION AND FUTURE WORK
In this article, we have proposed a context-based recommender system, CRecSys, which computes semantic similarity between movies using contextual features and predicts ratings for unrated movie items. We have also curated a movie dataset containing contextual features from two movie data sources and LOD. The main advantage of integrating movie data sources and LOD is to define movie context from a broader perspective. We have further extended the item representation approach by incorporating latent topics extracted from movie review documents using LDA. The novelty of CRecSys lies in predicting ratings using items' contextual features and item-based collaborative filtering. The efficacy of CRecSys is evaluated using well-known metrics like MAE, RMSE, Precision, Recall, and F-score. Moreover, it is compared with ten baseline recommendation methods and two state-of-the-art methods – MORE [4] and PICSS [33] – and performs significantly better. One of the distinguishing advantages of CRecSys is its ability to handle the cold-start problem more effectively than the baseline and state-of-the-art methods; its rating prediction results for cold-start users are significantly better than both. It is also found that the incorporation of LOD improves the performance of CRecSys, mainly in dealing with the limited content problem. The application of deep learning techniques, particularly word representation models, over textual data to identify contextual features is a promising direction for future research on the development of context-aware recommender systems.