Addressing the Cold-Start Problem in Collaborative Filtering Through Positive-Unlabeled Learning and Multi-Target Prediction

The cold-start problem is one of the main challenges in recommender systems and specifically in collaborative filtering methods. Such methods, albeit effective, typically cannot handle new items or users that have no prior interaction activity in the system. In this paper, we propose a novel two-step approach to address the cold-start problem. First, we view the user-item interactions in a positive-unlabeled (PU) learning setting and reconstruct the interaction matrix between users and warm items, detecting missing links and recommending warm items to existing users. Second, an inductive multi-target regressor is trained on this reconstructed interaction matrix and subsequently predicts interactions for new items that enter the system. To the best of our knowledge, this is the first time that such a two-step PU learning method has been proposed to address the cold-start problem in recommender systems. To evaluate the proposed approach, we employed four benchmark datasets from the movie and news recommendation domains with explicit and implicit feedback. We compared our method against three competitor approaches that address the cold-start problem and showed that our proposed method significantly outperforms them, achieving in one case an increase of 16.9% in terms of NDCG.


The associate editor coordinating the review of this manuscript and approving it for publication was Fabrizio Messina.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION
In the era of digitization and e-commerce, people use online platforms to find their desired products and services. Online platforms can provide an enormous catalogue of items or services to their users; nevertheless, each user is usually interested in only a very small fraction of such a catalogue. This makes the role of personalization and recommender systems pivotal. Recommender systems (RSs) are intelligent methods that learn users' preferences and recommend relevant items to them. RSs use user-item interaction history data as well as other types of available information, such as item and user side-information (i.e., features that describe the users/items in the system), to infer users' preferences. Generally, there are two main categories of RSs: content-based (CB) and collaborative filtering (CF) recommenders. CB RSs recommend items whose attributes match the target user profile. However, the main pitfall of CB RSs is that they typically provide over-specialized recommendations and are unable to recommend any diverse content. On the other hand, CF RSs use the interactions of other users to infer the preferences of the target user. While CF RSs provide more surprising and usually more accurate recommendations compared to CB RSs [1], [2], they also suffer from some weaknesses. One of the main pitfalls of CF RSs is that they cannot serve new items or users that do not have any prior interaction data in the system. This issue is denoted as the cold-start problem and it is particularly challenging [3], hindering the performance of many applications of RSs. For instance, e-commerce websites need to satisfy the new users in order to gain their trust. 
Another example stems from the real-estate market where an online platform should immediately recommend a newly advertised property to the relevant users. Therefore, handling the cold-start problem both effectively and efficiently is crucial for RSs. Two types of cold-start problems are distinguished. New entities (users or items) are called hard cold-start entities when no interactions exist for them or soft cold-start entities when the number of known interactions is very limited [4]. In this paper we focus on hard cold-start entities as they are the most challenging. Apart from cold-start entities, the system cold-start problem may occur [5], where a new RS is introduced to the users, which is not the focus of this paper.
Different approaches have been applied over the years to address the cold-start problem. The simplest way is to serve the new entities (users or items) with non-personalized recommendation such as popularity-based recommendations. Another approach relies on the exploitation of the user or item side-information to predict interactions between such new entities and ones that already exist in the system. Thus, the side-information is employed on top of the collaborative information between users and items to enable a CF RS that serves new items or new users. In this study, we focus on the second approach, handling new items by utilizing the relevant side-information.
Moreover, some approaches try to overcome the cold-start problem by employing classification or regression models built on the side-information of users or items. Such approaches handle missing user-item pairs (i.e., unrecorded interactions) as negative instances while training their models. This is denoted as the closed-world assumption [6] and it is often applied as it allows for effective machine learning models to be utilized for solving the recommendation problem. However, we argue against such an assumption, as user-item pairs without any prior interactions are effectively unlabeled data and should not be considered as negative instances. To this end, we consider the recommendation problem as a positive-unlabeled (PU) learning task [6]. PU learning is the setting where a learning model is trained on only positive and unlabeled data [6]. This setting naturally fits the recommendation problem, as a typical RS has access to positive user-item interactions (e.g., clicks, likes, scores, purchases) and the rest are unlabeled. The latter means that although there are no recorded interactions between the user and the item, the item might be interesting to the user if it were presented to them. Typical RSs are driven through inference models, such as matrix completion (factorization) or graph learning methods. Such methods are typically transductive, meaning that the model requires the new items to be already present in the training process. Whenever a new item arrives, the model has to be re-trained (sometimes partly) in order to be able to provide any predictions. This is a crucial bottleneck that often impairs the performance of online RSs or even makes their application impossible. Although there are a few approaches that extend typical matrix completion or graph learning methods to incorporate side-information, most of them rely on plain neighborhood information and they often underperform when it comes to new items or users.
On the other hand, multi-target prediction (MTP) models can learn from the set of existing users or items and their related features that are already available in the system (i.e. training set). Next, the trained model can be used to predict interactions (probabilities) between cold-start items and the users of the platform. More specifically, multi-target prediction (MTP), also referred to as multi-output prediction, is an extension of standard classification or regression tasks where models learn to predict multiple outputs at the same time [7]. The fundamental assumption behind MTP is that each instance is associated to multiple targets, which are correlated with each other. Therefore, beyond the obvious computational advantages of such a methodology over learning a separate model per target, the model can benefit from existing correlations between the targets and therefore improve its predictive performance. MTP can be divided into multi-target classification (i.e., the targets have categorical values) and multi-target regression (MTR). A special case is multi-label classification where one has only binary values for each target.
In this article, we address the recommendation task through the scope of PU learning, proposing an effective two-step approach to address the cold-start problem in CF. More specifically, in the first step we reconstruct the user-item interaction matrix via semi-supervised learning and collaborative filtering. This way, we identify possible links between users and warm items (items with previous interactions), mitigating sparsity and class imbalance. This inferred set of interactions consists of positive, reliable negative, and predicted user-item interactions. In the second step we train a multi-target regressor (MTR) using relevant side-information of warm items as the features and the inferred user-item interactions as the targets. Then we are able to handle cold-start items using the trained MTR and the features of the new items, thereby accurately and efficiently predicting the user preferences for these new items. As it is known that there is no single model that performs generally best on all problems, we do not commit ourselves to a single algorithm in the different steps of our approach. Instead, in each step we use the model that performs the best among candidate models on a validation set.
For evaluation purposes, after having built our two-step PU learning model, we compare it against other methods from the literature that address the cold-start problem in recommendation using different approaches. In this study we focus on cold-start items, as most benchmark datasets contain rich item-related feature representations. Nevertheless, cold-start users can be treated in the same way.
Our contributions can be summarized as follows:
• We propose a novel two-step learning approach to address the cold-start problem in recommendation. Our method is the first approach that combines collaborative filtering and multi-target prediction into a PU learning framework.
• We conducted a thorough evaluation study testing our method in the domains of both movie and news recommendation, showing that our approach achieves superior performance to all its competitors.
• We show to the RS community that recommendations for new items (users) can benefit from prior user-item matrix reconstruction.
This paper is organized as follows: in Section II studies about addressing the cold-start problem in RSs are reviewed. Next, in Section III, we discuss our proposed two-step approach. In Section IV, we describe the datasets and the experimental setup of our evaluation study. Next, the results of comparing the proposed method to different methods addressing the cold-start problem in the literature are reported and discussed in Section V. Finally, we conclude and outline some directions for future work in Section VI.

II. RELATED WORK
One way to address the cold-start problem is to serve the new entities with non-personalized recommendations such as random-based, recency-based or popularity-based recommendations. Wang [8] proposed a non-personalized approach called "ZeroMat", which uses Zipf's Law for the user-item rating distribution and performs better than the random-based RS w.r.t. relevance and fairness. In this paper we exploit side-information to address the item cold-start problem. To provide personalized recommendations, one can use a hybrid approach that switches to CB or knowledge-based RSs when there is no available interaction for the new entities [9]. Kawai et al. [10] proposed two hybrid approaches based on content-based filtering and Latent Dirichlet Allocation (LDA) to address the cold-start problem. The difference between these two proposed approaches is whether the topics of the side-information are independent of the topics of the items. Tahmasebi et al. [11] proposed another hybrid approach based on profile expansion to address the cold-start problem. They used users' demographic information to augment the user neighborhoods and expanded the interaction matrix with additional ratings using two heuristic strategies. Feng et al. [12] proposed a hybrid approach which combines Probabilistic Matrix Factorization (PMF) and Bayesian Personalized Ranking (BPR) to address the user soft cold-start problem. Using this combination, their model is capable of exploiting both explicit and implicit feedback from users.
Another direction to address the cold-start problem in CF methods is to extend the CF methods, such as matrix factorization, with side-information in order to serve new entities. Collective Matrix Factorization (CMF) [13], [14] is an extension of matrix factorization where instead of factorizing only the interaction matrix between users and items, it collectively factorizes the interaction matrix as well as the item/user side-information matrix based on a common low-dimensional feature space. Saveski and Mantrach [15] extended the CMF optimization problem by adding non-negativity constraints on the factorized matrices for the sake of interpretability of the factors. They also considered the manifold assumption in the objective function, i.e., if two items are close in the real feature space they should also be close in the learned low-dimensional feature space. They called this method Local Collective Embeddings (LCE).
When it comes to PU learning, there have been many approaches that employ a combination of clustering and classification techniques to treat PU learning tasks. For instance, Liu and Peng [16] proposed a clustering-based method followed by an extension of tf-idf to identify strong negative samples prior to document categorization. In [17], k-means was combined with Rocchio [18] to mine strong positive as well as reliable negative examples. k-means was once more employed in [19] to extract strong negative and positive examples, employing SVM for the end task of classification. PU learning for categorical data was addressed in [20] where strong negative and positive samples were identified using kNN and a distance measure denoted as DIstance Learning for Categorical Attributes (DILCA) designed specifically for categorical data. Most of these techniques were originally designed for classification tasks without an extension to more complex tasks such as recommendation. Last, most PU learning methods focus on the detection of reliable negative samples prior to the application of a classifier, discarding the rest of the unlabeled data. In our approach, we discard no information. Instead, we assign a fuzzy score to ambiguous user-item pairs and let the multi-target prediction models learn from the whole data corpus.

III. METHODOLOGY
In this section we explain the proposed approach. We use the notations defined in Table 1. In recommendation tasks usually there are two main sets of entities, the users and the items. Let $U = \{u_1, u_2, \ldots, u_m\}$ and $I = \{i_1, i_2, \ldots, i_n\}$ be two finite sets, representing users and items, respectively. The already known interactions between such items and users are stored in an interaction matrix $Y \in \mathbb{R}^{m \times n}$, which can contain ratings when the user feedback is explicit or binary values ($y(u_i, i_j) \in \{0, 1\}$) when the user feedback is implicit. In both cases this interaction matrix is typically very sparse, i.e. there is typically a tiny percentage of positive user-item interactions while most of the pairs are marked as zero. This setting inherently falls under the scope of PU data. This means that user $u_i$ likes item $i_j$ if we have a positive rating but when $y(u_i, i_j) = 0$ the result is inconclusive. Indeed, a zero value is ambiguous and could mean that the user does not like the corresponding item, but could also mean that the item has not yet been presented to the user. More specifically, the task of a CF-based RS, given the sparse interaction matrix between users and items, is modeling the user preferences over the unseen items and generating ranked lists of recommendations. As it was mentioned, the hard cold-start problem occurs when new entities enter the system and there are no historical interactions for these new entities in the interaction matrix $Y$. Therefore CF-based RSs are unable to learn the preferences of these new entities.
In this paper we focus on the item cold-start problem, where a set of new items $I_C = \{i^c_1, i^c_2, \ldots, i^c_w\}$ enters the system and the RS should recommend them to relevant users. The only information that is given for these new items is their side-information. For instance, in a movie recommendation task, item-related side-information could be movie genres and cast. For the warm items the feature matrix $X \in \mathbb{R}^{n \times f}$ and the interaction matrix $Y \in \mathbb{R}^{m \times n}$ are given, while for the cold-items only the feature matrix $X_C \in \mathbb{R}^{w \times f}$ is given ($w$ is the number of cold-start items and $f$ is the number of item features). The interaction matrix $Y_C \in \mathbb{R}^{m \times w}$ is not observed, i.e., it contains only zeros. In this paper we propose a recommendation approach that recommends these new items $I_C$ to the most relevant users. Our methodology is motivated by the profound link between RSs and PU learning. More specifically, we propose a new methodology that treats user-item interaction data as PU data. Our approach first reconstructs the interaction matrix $Y$, detecting any missing links between users and items that are already present in the system. This way, our approach forces matrix $Y$ to become less sparse, mitigating class imbalance and removing part of the noise that innately exists due to the limited user feedback. We subsequently tackle the cold-start problem by handling new items through their related side-information. More specifically, we propose the training of multi-target prediction models, such as tree-ensembles, upon such a reconstructed and information-enriched interaction matrix. The underlying assumption is that the multi-target prediction model would learn from a substantially less sparse and more informative interaction set.

A. RECONSTRUCTING THE SPARSE INTERACTION MATRIX
The first step of the proposed approach is to learn the users' preferences on the warm items. As shown in Fig. 1, we fit a CF-based RS on the interaction matrix to learn the user preferences. The fitted model ($f_{CF}$) is then used to reconstruct the whole interaction matrix between users and warm items. The elements of this reconstructed matrix ($\hat{Y}$) are:

$$\hat{y}(u_i, i_j) = \begin{cases} y(u_i, i_j), & \text{if } y(u_i, i_j) \neq 0 \\ f_{CF}(u_i, i_j), & \text{if } y(u_i, i_j) = 0 \end{cases} \tag{1}$$

Based on Eq. 1, the reconstructed matrix $\hat{Y}$ contains the real feedback from users when it is available or the predicted feedback when it is missing.
The choice of f CF only depends on the type of feedback. Below, we illustrate for some example CF methods dealing with explicit or implicit feedback, how they can be plugged into our approach. In the absence of a priori preference, we propose to compare multiple CF methods in a validation set and then select the best performing one to reconstruct the interaction matrix between users and warm items.
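The reconstruction rule described above (observed feedback is kept, missing entries are filled with CF predictions) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the helper name `reconstruct` and the toy matrices are assumptions, and `Y_pred` stands in for the output of whichever fitted CF model is selected.

```python
import numpy as np

def reconstruct(Y, Y_pred):
    """Keep observed feedback; fill unlabeled (zero) entries with CF predictions."""
    Y = np.asarray(Y, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    if Y.shape != Y_pred.shape:
        raise ValueError("Y and Y_pred must have the same shape")
    observed = Y != 0  # a zero marks an unlabeled user-item pair
    return np.where(observed, Y, Y_pred)

# Toy data (made up): 3 users x 4 warm items, ratings 1-5, 0 = unlabeled.
Y = np.array([[5, 0, 3, 0],
              [0, 4, 0, 0],
              [1, 0, 0, 2]])
# Hypothetical predictions of a fitted CF model for every user-item pair.
Y_pred = np.array([[4.6, 2.1, 3.2, 1.5],
                   [3.0, 3.9, 2.2, 1.1],
                   [1.4, 2.5, 2.0, 2.3]])

Y_hat = reconstruct(Y, Y_pred)
print(Y_hat)
```

The reconstructed matrix is dense, which is what the second step later uses as its target set.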

1) RECONSTRUCTING THE INTERACTION MATRIX WITH EXPLICIT FEEDBACK
Pure Singular Value Decomposition (SVD) [21] and Nonnegative Matrix Factorization (NMF) [22] are CF methods that decompose the interaction matrix into two low-rank matrices for users and items. The learned user and item matrices in NMF contain non-negative values. The interaction matrix can be reconstructed with these two CF methods using Eq. 2:

$$f_{CF}(u_i, i_j) = p_{u_i}^{\top} q_{i_j} \tag{2}$$

where $p_{u_i}$ and $q_{i_j}$ are the learned latent features of user $u_i$ and item $i_j$, respectively. SLIM [23] is a linear method that learns a sparse square matrix $W$ of aggregation coefficients by solving an optimization problem regularized with L1 and L2 norms. The interaction matrix can be reconstructed with SLIM using Eq. 3:

$$f_{CF}(u_i, i_j) = x_{u_i}^{\top} w_{i_j} \tag{3}$$

where $x_{u_i}$ is the rating vector of user $u_i$ and $w_{i_j}$ is the learned sparse size-$n$ column vector of aggregation coefficients for item $i_j$. User-based and item-based KNN (UKNN and IKNN) are memory-based CF methods that predict the missing interactions using the interactions of neighbor users/items. The missing scores in the interaction matrix are predicted by UKNN and IKNN using the weighted average of the scores of neighbor users/items. The weight of each neighbor is the similarity of its interaction vector with the interaction vector of the target user/item.
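As a rough illustration of the latent-factor reconstruction used by SVD-style methods (a score is the inner product of a user vector and an item vector), the sketch below uses a plain truncated SVD from NumPy. The function name `svd_factors`, the toy ratings and the choice of `k` are assumptions for illustration; this is not the exact Pure SVD implementation evaluated in the paper.

```python
import numpy as np

def svd_factors(Y, k):
    """Rank-k user factors P (m x k) and item factors Q (n x k) from a truncated SVD."""
    U, s, Vt = np.linalg.svd(np.asarray(Y, dtype=float), full_matrices=False)
    P = U[:, :k] * s[:k]  # singular values are absorbed into the user factors
    Q = Vt[:k, :].T
    return P, Q

# Toy explicit-feedback matrix (made up): 3 users x 4 items, 0 = unlabeled.
Y = np.array([[5, 0, 3, 0],
              [0, 4, 0, 0],
              [1, 0, 0, 2]], dtype=float)

P, Q = svd_factors(Y, k=2)
scores = P @ Q.T  # scores[i, j] is the inner product of user i's and item j's factors
print(np.round(scores, 2))
```

With `k` equal to the rank of `Y` the product reproduces `Y` exactly; smaller values of `k` give the smoothed scores that fill the unlabeled entries.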

2) RECONSTRUCTING THE INTERACTION MATRIX WITH IMPLICIT FEEDBACK
Bayesian Personalized Ranking (BPR) [24], Weighted Approximate-Rank Pairwise (WARP) [25] and Weighted Regularized Matrix Factorization (WRMF) [26] are CF methods for implicit feedback. BPR is a learning-to-rank CF method that uses pairwise preferences to learn users' and items' latent features. WARP is another CF method for implicit feedback that was initially proposed for annotating images, but was later used as a learning-to-rank RS. WRMF uses the alternating-least-squares optimization approach to learn its parameters. All three methods learn users' and items' latent features ($p$ and $q$) and therefore the interaction matrix can be reconstructed using Eq. 2.
MVAE [27] is a CF RS for implicit feedback based on variational autoencoders, with the assumption that the user logs follow a multinomial distribution. Given a trained MVAE recommender on $X$, the interaction matrix can be reconstructed row by row by:

$$f_{CF}(u_i, \cdot) = f_{\theta}(p_{u_i}) \tag{4}$$

where $f_{\theta}$ is the decoder with the learned parameters $\theta$, $p_{u_i} = \mu_{\phi}(x_{u_i})$ represents the learned latent features for user $u_i$ and $\mu_{\phi}$ is the learned encoder. Last, the matrix $\hat{Y}$ reconstructed with the mentioned methods or any other CF method may need re-scaling to have the same scale as the original interaction matrix $Y$.
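The text does not specify how the re-scaling is done. One simple possibility, shown here purely as an assumption, is a linear min-max mapping of the predicted scores onto the range of the original feedback (for example 1 to 5 for ratings, or 0 to 1 for implicit feedback). The helper name `rescale` is hypothetical.

```python
import numpy as np

def rescale(scores, lo, hi):
    """Linearly map scores onto [lo, hi]; constant inputs map to the midpoint."""
    scores = np.asarray(scores, dtype=float)
    s_min, s_max = scores.min(), scores.max()
    if s_max == s_min:
        return np.full_like(scores, (lo + hi) / 2.0)
    return lo + (scores - s_min) * (hi - lo) / (s_max - s_min)

# Made-up raw CF scores on an arbitrary scale, mapped onto a 1-5 rating scale.
raw = np.array([[-2.0, 0.0],
                [2.0, 6.0]])
scaled = rescale(raw, lo=1.0, hi=5.0)
print(scaled)
```

A linear map preserves the ranking of the scores, so it changes only the target values seen by the regressor in the second step, not the order of items per user.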

B. CASTING THE COLD-START RECOMMENDATION PROBLEM AS MULTI-TARGET REGRESSION
The second step of the proposed approach is to fit an MTR using warm items as training instances and users as targets (see Figure 2). Features of the warm items $X$ are considered as inputs to the MTR and the reconstructed matrix $\hat{Y}$ from the previous step is used as the target set:

$$f_{MTR} : X \rightarrow \hat{Y}^{\top} \tag{5}$$

where the transpose places the warm items in rows, so that each item is a training instance and each user is a target. The trained MTR $f_{MTR}$ is then used to predict the scores of the users for the cold-items $X_C$:

$$\hat{Y}_C = f_{MTR}(X_C)^{\top} \tag{6}$$

where $\hat{Y}_C$ contains the predicted preferences of the users on the cold-items. These predictions are then used to decide, for each cold-item, which users should receive it as a recommendation. MTR versions of tree-ensemble algorithms, such as Random Forests (RF) [28] or Extremely Randomized Trees (ERT) [29], have proven very effective. RF consists of a collection of multiple decision trees. The tree growing process is driven by a splitting criterion, selecting the best split. Many such splitting criteria exist, with variance reduction being the most typical one. A key factor of RF is the diversity that is enforced among the trees by utilizing bootstrap replicates of the training set as well as implementing a random selection mechanism of the features during the tree learning process. ERT is an extension of RF where, similar to RF, each tree of the ensemble is trained using a random subset of the features as split-candidates in each node. The difference in ERT is that for every feature from the $\kappa$ selected ones, one split threshold is picked at random. Next, the best split from the $\kappa$ picked ones is selected.
Both RF and ERT have been innately extended to MTP by transferring the splitting criterion to the multi-output space. More specifically, the criterion is computed over the whole set of outputs, typically as the sum of the variance of each output.
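To make the multi-output splitting criterion concrete, the sketch below computes the impurity of a node as the sum of the per-output variances and the gain of a candidate split as the size-weighted impurity reduction. The size weighting of the children is a common formulation and is an assumption here, as are the function names and the toy data.

```python
import numpy as np

def multi_output_impurity(T):
    """Sum over outputs (columns) of the variance of each output."""
    return np.asarray(T, dtype=float).var(axis=0).sum()

def split_gain(x, T, threshold):
    """Impurity reduction of splitting instances on feature values x <= threshold."""
    x = np.asarray(x, dtype=float)
    T = np.asarray(T, dtype=float)
    left = x <= threshold
    right = ~left
    if not left.any() or not right.any():
        return 0.0  # degenerate split: one child is empty
    n = len(x)
    children = (left.sum() * multi_output_impurity(T[left])
                + right.sum() * multi_output_impurity(T[right])) / n
    return multi_output_impurity(T) - children

# Toy node (made up): 4 items, one feature, 2 outputs (users).
x = np.array([0.0, 0.0, 1.0, 1.0])
T = np.array([[1.0, 5.0],
              [1.0, 5.0],
              [4.0, 2.0],
              [4.0, 2.0]])
gain = split_gain(x, T, threshold=0.5)
print(gain)
```

In this toy node both outputs are separated perfectly by the split, so the gain equals the full parent impurity.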
Tree-ensemble learning algorithms are computationally very efficient. They are inductive methods, naturally memory efficient, and can be very easily parallelized, as every tree in the ensemble can grow independently. Last, tree ensembles are also known for their innate interpretability, since they can first provide the user with a feature ranking, disclosing the features that are crucial for a prediction, and second, provide a set of rules that explain a specific prediction. The latter can be further leveraged with existing tree-approximation strategies [30]. Other MTP models also exist, for example, multi-target K-Nearest Neighbors Regressor (KNNR) is a simple approach based on KNN, where the predictions are based on averaging the outputs of the k nearest neighbors of the test sample.
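The second step can be sketched with scikit-learn, whose `RandomForestRegressor`, `ExtraTreesRegressor` and `KNeighborsRegressor` accept a two-dimensional target array. The data below is random and purely illustrative, the hyperparameters are arbitrary assumptions, and the variable names are hypothetical; the only point is the shape convention (items as rows, users as target columns).

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
n_users, n_warm, n_cold, n_features = 8, 40, 5, 6

X = rng.random((n_warm, n_features))    # warm-item features
Y_hat = rng.random((n_users, n_warm))   # stand-in for the reconstructed matrix
X_C = rng.random((n_cold, n_features))  # cold-item features

models = {
    "RF": RandomForestRegressor(n_estimators=50, random_state=0),
    "ERT": ExtraTreesRegressor(n_estimators=50, random_state=0),
    "KNNR": KNeighborsRegressor(n_neighbors=5),
}

predictions = {}
for name, model in models.items():
    model.fit(X, Y_hat.T)                       # one row per warm item, one column per user
    predictions[name] = model.predict(X_C).T    # back to users x cold items

# For each cold item, the 3 users with the highest predicted score (RF model).
top_users = np.argsort(-predictions["RF"], axis=0)[:3]
print(top_users)
```

Because the models are inductive, scoring a newly arrived item only requires a call to `predict` on its feature vector, with no re-training.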

IV. EXPERIMENTAL SETUP

A. DATASET DESCRIPTION
In this paper we used four datasets from two different domains with explicit and implicit feedback. In particular, we used the MovieLens-1m and MovieLens-20m datasets [31] (hereafter referred to as ML-1m and ML-20m, respectively), which contain users' explicit ratings on movies, as well as the Adressa [32] and Globo [33] datasets, which contain implicit user feedback on news articles. These datasets are described in Table 2. In the movie datasets the genres and cast of movies are available and used as item features. In the Adressa dataset, for each news article the related keywords, authors and topic are available and considered as features describing the news articles. For the Globo dataset, the article embeddings generated by a deep neural network model [34] based on article text and tags are used as item features.

B. EXPERIMENT DESIGN
As mentioned in the previous section (Sec. III), we propose to select the best CF model and the best MTR model among candidate models before generating recommendations for the cold-start items. Users and items with fewer interactions than a threshold are dropped from the experiments (this threshold is 30, 100, 100 and 50 for ML-1m, ML-20m, Adressa and Globo, respectively). We use a cross-validation scheme to avoid any information leakage between the model selection step and the evaluation of the cold-start recommendation task (see Fig. 3). We first randomly split the dataset items into two disjoint parts, one for the model selection step (Fig. 3a) and the other for evaluating the cold-start recommendation task (Fig. 3b). In the first part, we select the best CF model and the best MTR based on 5-fold cross validation (CV). For selecting the best CF model, five interactions per item are considered as test interactions in the test fold. For selecting the best MTR, the items in the test fold are considered as test items. The hyperparameters of the models (CF and MTR) are internally tuned in one of the folds (the detailed information about the selected hyperparameters is reported in Appendix VI). In the selected fold the parameters are tuned again based on CV. To tune the hyperparameters of CF methods and cold-start RSs, we used "forest_minimize" from the "scikit-optimize" library and for MTR methods we used "GridSearchCV" from "scikit-learn". For CF methods and cold-start RSs we used NDCG, and for MTR methods we used MAE to select the best hyperparameters.
Then, when the best CF model, the best MTR as well as their corresponding hyperparameters are fixed, we use the second part of the datasets to evaluate the item cold-start recommendation task. The second part of the datasets is also split in 5-fold CV and each time we use the items in the test fold as the test cold-items to evaluate our proposed approach. The items in the other four folds are combined with the items from the model selection part, for a final matrix completion calculation and for training a final MTR model, using the selected approaches.
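The item-level splitting scheme described above can be sketched with scikit-learn utilities. The catalogue size, the 50/50 proportion between the two parts and the random seeds are assumptions made for illustration; the text only states that the two parts are disjoint and that the second part is split into five folds.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

n_items = 100  # toy catalogue size (made up)
item_ids = np.arange(n_items)

# Part (a) is used for model selection, part (b) for cold-start evaluation.
selection_items, evaluation_items = train_test_split(
    item_ids, test_size=0.5, random_state=0
)

folds = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for warm_idx, cold_idx in kfold.split(evaluation_items):
    cold_items = evaluation_items[cold_idx]  # held out as cold-start items
    # Remaining evaluation items are merged with the model-selection items
    # and act as warm items for matrix completion and MTR training.
    warm_items = np.concatenate([selection_items, evaluation_items[warm_idx]])
    folds.append((warm_items, cold_items))

for warm_items, cold_items in folds:
    print(len(warm_items), len(cold_items))
```

Keeping the cold items of each fold out of both the matrix completion and the regressor training is what prevents leakage into the cold-start evaluation.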
Last, we evaluate statistically significant differences between two methods by employing a Wilcoxon signed-rank test [35] (α = 0.01) to show that our proposed approach is significantly better compared to the second best comparative method in the results. Typically for one to apply such a test, more than ten paired independent observations are required. As we did not have such volume of data at our disposal, we computed the test on the different folds and repeated the experiments with three different random seeds (overall 15 paired observations).
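A paired comparison of this kind can be run with `scipy.stats.wilcoxon`. The 15 score pairs below are synthetic placeholders (5 folds times 3 seeds), not results from the paper, and the two-sided default alternative is an assumption, since the text does not state which alternative was used.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# Synthetic NDCG values for 15 paired runs (5 folds x 3 seeds); NOT real results.
proposed = 0.30 + 0.01 * rng.random(15)
# The hypothetical second-best method is made worse by a distinct margin per run.
second_best = proposed - (0.010 + 0.001 * np.arange(15))

statistic, p_value = wilcoxon(proposed, second_best)
alpha = 0.01
print(f"statistic={statistic:.3f}, p={p_value:.6f}, significant={p_value < alpha}")
```

The test is applied to the per-run differences, so the two arrays must list the same fold and seed combinations in the same order.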

C. COMPETITOR APPROACHES
In this section, we first present the CF and MTR approaches from which we select the best performing models for the two parts of our approach. In the first step, which is selecting the CF model, SVD [21], NMF [22], UKNN [36], IKNN [37] and SLIM [23] are considered for the datasets with explicit feedback and BPR [24], WRMF [26], WARP [25] and MVAE [27] are used for datasets with implicit feedback. For the second step RF, ERT and KNNR are used as MTRs in the proposed approach.
Finally, to evaluate our proposed approach we compare it against the following competitor approaches that address the cold-start problem:
• CB: the classic CB RS that aggregates users' previous interactions to create user profiles and then recommends each cold-item to the users whose profiles have the highest cosine similarity with the cold-item features.
• CMF: the Collective Matrix Factorization (CMF) [14] method explained in Section II. In this method we collectively factorize the interaction matrix and the item feature matrix.
• LCE: the Local Collective Embeddings (LCE) [15] approach explained in Section II, which extends CMF with non-negativity constraints and the locality assumption.
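The CB baseline in the list above can be sketched as follows. Building a user profile as the interaction-weighted sum of the features of the items the user interacted with is one possible aggregation and is an assumption here; the function name `cb_scores` and the toy matrices are likewise hypothetical.

```python
import numpy as np

def cb_scores(Y, X, X_C, eps=1e-12):
    """Cosine similarity between user profiles (Y @ X) and cold-item features."""
    Y, X, X_C = (np.asarray(a, dtype=float) for a in (Y, X, X_C))
    profiles = Y @ X  # users x features: sum of features of interacted items
    p_norm = np.maximum(np.linalg.norm(profiles, axis=1, keepdims=True), eps)
    c_norm = np.maximum(np.linalg.norm(X_C, axis=1, keepdims=True), eps)
    return (profiles / p_norm) @ (X_C / c_norm).T  # users x cold items

# Toy data (made up): 3 users, 3 warm items, 2 binary features, 2 cold items.
Y = np.array([[1, 1, 0],
              [0, 0, 1],
              [1, 0, 1]])
X = np.array([[1, 0],
              [1, 0],
              [0, 1]])
X_C = np.array([[1, 0],
                [0, 1]])

scores = cb_scores(Y, X, X_C)
best_user_per_cold_item = scores.argmax(axis=0)
print(np.round(scores, 3), best_user_per_cold_item)
```

The small `eps` guard avoids a division by zero for users or items with an all-zero vector.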

D. EVALUATION MEASURES
To evaluate the models in each step of the proposed approach we use different evaluation measures. To select the best CF RS in the first step, we use three measures, namely recall, mean average precision (MAP) and normalized discounted cumulative gain (NDCG). Recall is a standard information retrieval measure that reflects the proportion of relevant items that are recommended. MAP and NDCG are rank-sensitive relevance measures. In the second step, we use the mean absolute error (MAE) averaged over the targets to evaluate the predictions of MTRs. Finally, NDCG and MAP are used to evaluate the proposed approach and the other competitor methods that address the cold-start problem.

V. RESULTS AND DISCUSSION
The results of applying the proposed approach with different base models are summarized in Tables 3, 4, 5 and 6. As shown in Table 3, in both datasets with explicit feedback (ML-1m and ML-20m), SLIM performs the best w.r.t. all three evaluation measures. For implicit datasets, as reported in Table 4, MVAE has the best performance in all three evaluation measures compared to the other CF-based RSs. Therefore, we select SLIM and MVAE to reconstruct the interaction matrix between users and items for datasets with explicit and implicit feedback, respectively. Next, using the reconstructed matrices and items' features, three MTRs are trained and evaluated. As reported in Table 5, the best performing MTRs are RF and ERT in our explicit and implicit datasets, respectively.
The results of applying our proposed approach, PULCO, with the selected CF and MTR methods, together with the competitor methods that address the cold-start problem, are summarized in Table 6. As shown in the table, PULCO has superior performance over all the competing methods and statistically significantly outperforms the second best approach in all datasets. In the explicit feedback datasets, the LCE method is the second best while in the implicit datasets, CMF is the second best performing approach. Although the CB approach is quite simple and straightforward, it performs relatively well in comparison to CMF and LCE. A possible reason is that the item feature space is relatively rich and therefore the CB approach can effectively model user profiles.
In the proposed approach we used different base models for each step of the algorithm. We selected the CF baselines based on the results of the award-winning paper [38], which showed that the memory-based approaches (UKNN and IKNN), SLIM and MVAE outperform recent complex deep neural network based approaches. We selected RF, ERT and KNNR, as they are well-established multi-target regression models and are also computationally efficient. Nevertheless, we should highlight that the proposed approach does not depend on a specific CF or MTR model and it is robust enough to accommodate other possible combinations as well, handling the cold-start problem both efficiently and effectively.
Our proposed approach fits real-world recommender systems well, as the latter typically have limited interactions in the user-item matrix. This means that most of the possible user-item interactions have not been recorded; nonetheless, it is likely that the user would be interested in some items if they were presented to them. Our method takes this fact into account, reconstructing the matrix between users and warm items prior to handling any cold-start items. This way, the inferred set of interactions consists of positive, reliable negative, and predicted interactions between users and warm items. As reflected in the obtained results, this methodology gives us an advantage over the competitor methods.
In real-life recommendation tasks, before a system is put in production, usually several types of CF methods are compared via A/B testing with real users in order to select the best performing one. This effectively removes the need for the CF selection step of our proposed approach. Once this best performing CF method is known, it can be used to reconstruct the interaction matrix for the first step of our proposed approach. Multiple repetitions of such comparisons between several CF approaches will not be needed in such real-life applications, something that also extends to the second step of our proposed approach. Furthermore, the second step with the tree-ensemble methods can be parallelized and implemented effectively. Serving the cold-start items does not therefore present a large computational burden on the whole recommendation task. This is a crucial advantage of inductive models over transductive competitors. The latter require the new items to be already present in the training process, which is not usually possible. In case recommendations have to be provided for new items, transductive models need to be re-trained, a process that is typically computationally very expensive and therefore makes the online application of corresponding RSs particularly difficult.

VI. CONCLUSION
In this paper we have addressed the cold-start problem in recommendation through PU learning. More specifically, we have deployed an effective two-step approach integrating powerful CF methods with fast and accurate multi-target prediction models. In the first step, we reconstructed the interaction matrix between users and warm items via a collaborative filtering recommender, identifying reliable negative user-item pairs and assigning a score to the rest of the (ambiguous) unlabeled data. Next, in the second step, we trained a multi-target regressor on warm item features and the reconstructed interaction matrix from the first step, efficiently predicting scores for hard cold-items. We showed that the proposed approach significantly outperforms the extended versions of matrix factorization for the cold-start problem, i.e., the collective matrix factorization and local collective embeddings models, as well as the content-based recommender system in all four datasets considered. The proposed approach is flexible and robust in the sense that (1) it depends neither on the type of feedback (implicit or explicit) nor on the choice of CF or MTR models, and (2) it does not require retraining the model when a new cold item arrives.
The application of our work to the online learning setting would be a great direction for future research. In addition, the extension of our work to the field of multi-view learning, where one could integrate multi-modal feature sets of items or users to handle item or user cold-start cases, would be interesting. Last, it would be very interesting to extend this approach to pairwise learning handling user-item pairs by integrating user and item feature sets in a unified framework.