A Novel Time-aware Food recommender-system based on Deep Learning and Graph Clustering

Food recommender-systems are considered an effective tool to help users adjust their eating habits and achieve a healthier diet. This paper aims to develop a new hybrid food recommender-system to overcome the shortcomings of previous systems, such as ignoring food ingredients, the time factor, cold start users, cold start food items and community aspects. The proposed method involves two phases: user-based recommendation and food content-based recommendation. Graph clustering is used in the first phase to cluster users, and a deep-learning based approach is used in the second phase to cluster food items. In addition, a holistic approach is employed to account for time and user-community related issues in a way that improves the quality of the recommendation provided to the user. We compared our model with a set of state-of-the-art recommender-systems using five distinct performance metrics: Precision, Recall, F1, AUC and NDCG. Experiments using a dataset extracted from "Allrecipes.com" demonstrated that the developed food recommender-system performed best.


I. INTRODUCTION
THE internet has become an important part of people's daily lives and is used in various tasks, ranging from leisure (e.g., chatting with other users, shopping, searching for hotels, travel deals) to professional development (e.g., using a web platform to develop professional services) [1]-[4]. The tremendous amount of information from tens of thousands of sources that a user can access as part of a request creates considerable uncertainty and ambiguity that can easily divert the user from the original request [5]-[7]. Although search engines have attempted to address the problem of information redundancy in recent decades, they have not been very successful in personalizing search results and reducing the amount of noisy information [8], [9]. Many of these systems return the same results even for users with completely different profiles and interests. In recent years, researchers have become more interested in recommender-systems as one of the most successful personalization tools on the web [10]-[12]. Among other uses, a recommender-system can help the user identify the right service, reduce information overload, guide the user towards some personalized behavior, and find the user's favorite items within a large amount of information. In a typical recommender-system, users' interests are discovered and items and services are recommended accordingly.
In a variety of lifestyle applications and services, food recommendation plays an important role as a tool that assists users in changing their behavior and adopting a healthy lifestyle [13]-[15]. Typically, food recommendation attempts to provide the user with a personalized recommendation in terms of recipes, scale of change and time required to achieve specific objectives that might be associated with diet requirements or any lifestyle demand [16]-[18]. Traditionally, food recommendation has received little research attention compared to recommendation in other leisure and entertainment fields (e.g., music, book, shopping recommendation systems), possibly due to cultural barriers and the difficulty of predicting what people might like to eat. This is despite the fact that lifestyle and diet related illnesses, such as obesity and diabetes, account for almost 60% of total deaths [19].
The process of generating a food recommendation is often viewed as a machine learning task [20], [21]. Therefore, it is crucial to understand the user's food preferences accurately in order to build an effective food recommender-system. Even in health-oriented food services, the user can only be encouraged to pursue a recommendation if the recommended food matches his/her taste preferences [22]-[25].
In recent decades, many recommender-systems have been developed to predict a person's preferences and/or guide his/her choice according to some predefined objectives [15], [19], [26]-[30]. Although previous food recommender-systems have shown good performance in learning users' preferences by mapping their historical interactions with food items and recipes, these systems still suffer from the following drawbacks:
1) Ingredients of foods: Most previous food recommender-systems [29], [30] rely primarily on historical user ratings to draw food recommendations through a collaborative filtering approach that ignores food ingredients. Yet a given food is usually preferred by an individual because it contains ingredients he/she likes to eat, so ignoring ingredients may overlook important aspects of the recommendation. For example, foods containing chicken wings may be a person's favorite, while he/she may be allergic to some types of spices that can be used during the food preparation. Therefore, collaborative filtering recommender-systems may not be enough to account for such user preferences and constraints.
2) Time factor: Traditional recommender-systems [19], [26]-[28] are based on the premise that users with similar preferences in the past will have similar tastes in the future. Accordingly, these recommender-systems use static data and ignore potential changes in the user's food preferences, diet or lifestyle that can occur over time in realistic scenarios.
3) Cold start users and cold start foods: Because users often rate just a few foods, traditional collaborative filtering-based food recommender-systems have difficulty recognizing the neighbors of the active user or similar foods. Accordingly, collaborative filtering-based food recommender-systems are only able to suggest foods to users who have rated enough foods. Cold start users, who have rated only a few food items, are thereby ignored. Similarly, new food items (cold start foods) that have not yet attracted enough ratings from users are ignored as well by such a collaborative filtering-based approach.
4) Users' community: Another issue ignored in existing recommender-systems is the user's neighborhood or community aspect. Intuitively, the community aspect can be utilized to predict the rating of an unseen food item and the success likelihood of a given diet, extrapolating from the activities of active users in the neighborhood. Typically, the community aspect can be handled using clustering-based models. Nevertheless, it has been shown that such an approach suffers from several other difficulties that are inherent to the clustering techniques employed (e.g., choosing the optimal number of clusters, efficiency of the similarity measures employed).
To overcome the above-mentioned shortcomings, this paper develops an original combination of collaborative filtering-based and content-based recommender-systems that simultaneously tackles the aforementioned four issues. In particular, the proposal takes into account users' similarity as well as foods' similarity based on their ingredients, while taking into consideration the time factor and the user's community aspects. The method is called Time-aware food recommender-system based on Deep Learning and Graph Clustering (TDLGC). In short, TDLGC recommends the favorite foods to the active user in two phases: (1) User-based rating prediction, and (2) Food-based rating prediction. In the first phase, the user-based ratings are predicted by considering the users' community and the users' similarity matrix. In the second phase, the foods are grouped into several clusters by a deep learning-based clustering algorithm, and then the rating of unseen foods is predicted. Following these two phases, the Top-N foods are recommended. The proposed model has several novelties compared to previous food recommender-systems:
1) Ingredients-aware food recommender-system: Unlike traditional collaborative filtering-based food recommender-systems, our model integrates both a collaborative filtering-based model (user-based phase) and a content-based model (food-based phase). As a result, a set of foods that both suit the user's preferences and utilize his/her previous ratings are recommended.
2) Time-aware food recommender-system: A novel time-aware similarity measure that takes into account changes in food preferences or diet over time is developed in this paper. This makes the proposal suitable for cases where users change their ratings or preferences over time.
3) Trust-aware food recommender-system: A trust-aware food recommender-system is developed to overcome the cold start user and cold start food problems of traditional collaborative filtering-based food recommender-systems. Our proposed model builds a trust network of users based on trust (follower-following) statements to predict user ratings efficiently. The trust network generation plays an important role in addressing the neighbor selection problem. Trust statements can be used to predict the rating of unseen items in food recommender-systems since there is a high correlation between users' trust and the user ratings-based similarity measure. The user's trust network and the user ratings-based similarity are integrated in this study to address the data sparsity problem by utilizing knowledge that is stored outside of the user's local similarity neighborhood.
4) Community-aware food recommender-system: Contrary to previous works where users' communities are not considered in the food recommendation process, our model explicitly accounts for such aspects, and the optimal number of user clusters is determined automatically. Moreover, by using a graph-based representation where edge weights are calculated according to the user ratings-based similarity and the trust network, the proposed method accommodates sparse datasets.
This paper is organized as follows. Section 2 reviews the previously used models for food recommender-systems. Section 3 describes the problem formulation and details our developed model. The experimental results and comparison with the state-of-the-art food recommender-systems are discussed in Section 4. Finally, Section 5 presents the conclusion and outlines future work perspectives.

II. BACKGROUND
The first part of this section introduces preliminary concepts in recommender-systems. This is followed by a concise review of previous food recommender-systems, their limitations, and associated challenges.

A. RECOMMENDER-SYSTEM
Considering the growing volume of information available on the World Wide Web, along with the decreasing quality of its content, users face a significant time constraint when it comes to searching for the information they need [31]-[33]. This is especially true for commercial and/or entertainment online services where, due to the overwhelming abundance of products, identifying a relevant selection that would attract the user's interest is very challenging [34], [35]. As part of the user's decision-making process, recommender-systems attempt to provide suggestions based on the user's tastes, behaviors, and context [36]. Typically, recommender-systems use different input mechanisms depending on the model and algorithms employed to generate a recommendation [37]-[40]. In this context, input data sources may consist of the following:
User Profile: includes demographic and economic attributes such as age, gender, place of birth, education, occupation, and place of residence that can be employed for a better performance in identifying and predicting the user's interests.
User Rating: concerns the numerical values assigned to specific items as part of user's review. Often, Likert scale [41], which assumes the user's attitude can be captured on a continuum scale from strongly agree to strongly disagree (i.e., value 5 for strongly agree, 1 for strongly disagree, 3 for undecided), is employed for this purpose.
User Comments: concern the textual description made by the users on specific items to report their opinion and thoughts on those items. In this case, Natural Language Processing (NLP) techniques will be required by the system to process textual information and infer useful insights [42].
User Behaviors: concern the user's evaluation inferred from the user's clicks and website visits. Metrics related to the number of site visits, browsing history, the length of time spent on a particular web page, and the types of items purchased are often employed to assess such behavior [43].
User Search Keywords: concern the list of keywords employed by the user to search for specific items. This reflects the search behavior of the user when seeking the items of interest.
On the other hand, depending on the nature of the recommender-system application, the output can be as follows:
Recommendation List: corresponds to the list of items that best suit the user's preference / search query and/or popularity. This list may be presented to the user as the top N items, by filling in links on the corresponding web pages, or by removing negative suggestions.
Predicted Rate: concerns the rating that can be assigned to new items according to the similarity scores with the already rated items. Predicted values must coincide with the user-item matrix, i.e. predicted rate must be within the range of actual rate.
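The range constraint on predicted ratings can be illustrated with a minimal sketch. The function name `clip_to_rating_scale` and the default bounds (a 1 to 5 scale) are assumptions made for illustration and are not taken from the original system.

```python
import numpy as np

def clip_to_rating_scale(predicted, low=1.0, high=5.0):
    """Clamp raw predicted ratings to the scale used in the user-item matrix.

    `low` and `high` are assumed bounds of a 1..5 rating scale.
    """
    return np.clip(np.asarray(predicted, dtype=float), low, high)

# Raw model outputs may fall slightly outside the valid range.
print(clip_to_rating_scale([0.4, 3.7, 5.6]))
```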
In case where users' feedback and comments are available, the automatic textual analysis of such comments can also reveal useful insights that may enhance the quality of the recommendation. Nevertheless, such an analysis is not popular as yet, possibly due to the computational complexity and inherent challenges associated with natural language processing tasks [44].
The variety of data sources that can be used by recommender-systems can be categorized into user-item ratings, user attribute profiles, and item attribute profiles. This yields three types of recommender-systems: collaborative filtering-based models, content-based models, and hybrid models. In collaborative filtering, the past interactions of the user, such as his/her user-item ratings, are analyzed. In content-based analysis, the item profile is analyzed, while in hybrid models, mixed methods are created by combining several types of recommendation methods (e.g., collaborative and content-based) [45].
A number of previous works [4], [12], [36], [45] considered the recommender-system problem as a classification task. They used the inferred user and item as feature vectors, and the rating as the class label. In [12] after formulating the recommender system as a classification problem, Funk-SVD decomposition is utilized for user and item features extraction to expand the two-dimensional location feature vector. Then, a decision tree-based classification algorithm is employed for final rating prediction. Yu et al. [45] formulated the collaborative filtering-based recommender system as an ensemble learning problem. Then, in order to select a combination of user-and item-auxiliary domains, Pareto Ensemble Pruning is used. VOLUME 4, 2022

B. FOOD RECOMMENDER-SYSTEM
The increasing variety of foods and busy lifestyles has made it difficult for people to make healthful food decisions in a way to lower their risk of chronic diseases [46]- [48]. The use of food recommender-systems is seen as a tool to help users to change their eating habits or suggest healthier food choices [49], [50]. Diet and food are complex domains that present many challenges for recommendation technologies. It is therefore necessary to collect thousands of food items and ingredients for making appropriate recommendations. Moreover, since food items and ingredients are often combined in recipes rather than consumed separately, building a system that recommends both entities becomes challenging. This subsection summarizes relevant studies in food recommender-systems.
Trang et al. [51] reviewed research on recommendersystems in the healthy food domain. According to their analysis, food recommendation plays an essential role in supplying users with food items that meet their preferences and nutritional needs as well as encouraging them to adhere to positive eating habits. Moreover, they pointed out that future work is needed on issues regarding user profile analysis, recommendation techniques, changing eating habits, explanation generation and group recommendation.
Similarly to other types of recommender-systems, food recommendation can also be categorized into three types: collaborative filtering, content-based, and hybrid models. In [30], three different collaborative filtering-based food recommender-systems were proposed. In the first method, food is recommended according to the user's preferences. The second one recommends healthy foods only, while the third one recommends recipes that are both healthy and user-friendly. Nevertheless, these food recommender-systems ignore food ingredients in the recommendation process, which negatively impacts their performance. Adaji et al. [52] suggested a personalized food recommender-system that takes into account the user's personality. Unlike collaborative filtering-based food recommender-systems such as [30], this study utilizes the user's community information to analyze users' preferences in depth. Because users with the same personality traits often share many similar characteristics, these traits have been used to provide personalized food recommendations. By generating two types of networks to capture the relationships between food ingredients, Teng et al. [53] put forward a content-based food recommender-system. In their model, recipe ratings are predicted with characteristics extracted from ingredient networks and nutritional data. Similarly, Yang et al. [54] developed a personalized nutrient-based food recommender-system that aids users in meeting their nutritional expectations, dietary limitations, and fine-grained food preferences. Recent studies [26], [27], [53] indicated that food contents should be considered in collaborative filtering-based models. In this regard, in [27], a hierarchical attention approach was developed to simultaneously integrate collaborative information and food contents to improve the performance of the food recommender-system. Meng et al. [28] put forward a heterogeneous multi-task learning mechanism for food recommendation. Even though this method can yield reasonable recommendations in multi-task learning scenarios, it ignores potential relationships among users. Moreover, it ignores the time factor, which renders the method unable to accommodate changes in user taste over time. By integrating food nutrition and user preferences, a general model for daily food plan recommendation was developed in [19].
In [26], a deep learning-based food recommender-system using a graph convolutional network was suggested. Despite exploiting ingredient-ingredient, ingredient-food, and food-user interactions, this method does not utilize user-user interactions such as the follower-following network and the user community. Table 1 summarizes the main characteristics of the previously studied models. Based on this investigation, the following observations can be drawn from the reviewed food recommender-systems:
• The time factor in user ratings is ignored.
• With the exception of [52], user communities are ignored, and the analysis of user-user interaction is rather light.
• Trust relationships among users, when available, are neglected.
• Only a few works [26], [27], [53] attempted to integrate user ratings and food content, which is a vital component of any food recommender-system.
In order to overcome these shortcomings, this paper proposes a novel hybrid recommender-system. Through the use of user-based and content-based models, as well as a trust network and user communities, the proposed method addresses the four aforementioned issues simultaneously, while attempting to improve the final accuracy of the recommender-system.

III. PROPOSED METHOD
In the remainder of this study, we assume that (i) there exists a user community that conveys a minimum trust level among its individuals; (ii) each user carries his/her own ratings of a set of food items (each food item consists of a number of ingredients) that describe his/her diet preferences; and (iii) the user's preferences can change over time and these historical changes are fully known.
The developed recommender-system should therefore account for the above three assumptions. For this purpose, the cornerstone idea of our TDLGC recommender-system lies in its integration of the concepts of Deep Learning (DL) and Graph Clustering (GC) in a way that takes into account the user's trust network as well as time-stamped historical user ratings. Overall, the conceptual framework of the developed model, highlighted in Figure 1, has two distinct phases: (1) User-based rating prediction, and (2) Food-based rating prediction. In the first phase, (i) by utilizing both the user ratings and the follower-following network, the user-user similarity matrix as well as the users' trust network are generated. Then, (ii) based on the user similarities and the trust network, the given user set is mapped onto a weighted graph. In the next step, (iii) a novel time-aware graph clustering algorithm is proposed to cluster the users into different groups. Finally, (iv) utilizing the user clusters from the previous step, the user similarity and the historical ratings, new user-based ratings are predicted. In the second phase, (i) the food ingredients are embedded using a deep learning-based technique. Then, (ii) using the associated embedding vectors, the similarities between different foods are assessed. Finally, (iii) using the food similarities, the rating of unseen foods is predicted. After these two phases, the Top-N foods are recommended to the target user based on the user-based prediction and the food-based prediction. In the remainder of this section, the problem formulation is stated and then the details of each phase of the proposed food recommender-system are provided.

A. PROBLEM FORMULATION
Consider a food recommender-system with $N$ users and $M$ food items. Let $U = \{u_1, u_2, u_3, \dots, u_N\}$ and $F = \{f_1, f_2, f_3, \dots, f_M\}$ be the set of users and the set of food items, respectively. Let $R$ be the user-food matrix, which contains users' ratings of individual food items. We assume a Likert scale is employed so that all ratings take values in $\{1, 2, 3, 4, 5\}$. Moreover, each element $u_j$ of $U$ representing a given user can be assigned a profile with different characteristics such as age, gender, weight, height, location, etc. In our case, we restrict ourselves to a basic scenario where the user profile contains only one element, the user ID. Similarly, each element of $F$ can be assigned a set of characteristics such as ingredients, calories, sugar, fat, etc. In our formulation, each food item $f_i$ is defined by its ingredients only; that is, if $IngSet = \{ing_1, ing_2, ing_3, \dots, ing_m\}$ stands for the set of all known ingredients, then the set of ingredients of food $f_i$ is $I_i = \{ing_{\sigma(1)}, ing_{\sigma(2)}, ing_{\sigma(3)}, \dots, ing_{\sigma(k_i)}\}$, where $k_i$ is the number of ingredients in food $f_i$ and $\sigma$ is some permutation of the integers $\{1, 2, 3, \dots, m\}$. The interaction among users is described using a follower-following network $Follower(U, E, W)$ (where $E$ and $W$ stand for the set of network edges and their corresponding weights, respectively). This follower-following network generates a trust network $TrustG(U, Tr)$ and an overall users network $G(U, E, W)$ that accounts for users' similarity according to their rating inputs. Finally, the availability of time stamps on users' ratings enables us to monitor the evolution of the network at regular intervals. Here, $t(u_j, i)$ denotes the time stamp of the recorded rating of user $u_j$ for food $f_i$. In our study, as all ratings were extracted between 2000 and 2018, we used monthly intervals, yielding 132 monthly periods. As a result, the $t(u_j, i)$ value for ratings recorded in the first month of 2000 will be 1, and that of ratings recorded in the last month of 2018 will be 132.
The main function of our food recommender-system is to predict how a user $u_j$ rates the food $f_i$. Below we summarize the notations employed:

B. USER-BASED PREDICTION
In the user-based prediction phase, the rating (with respect to each user) of an unseen food item (with a given set of ingredients) is predicted using the knowledge about user network and trust relationship (through a new clustering based strategy that uses a new user-similarity metric) and timely available previous ratings (as in matrix R). This contributes to solving the cold start problem. Figure 2 displays the general scheme of the proposed user-based rating prediction method for a simple dataset with nine users. In the remainder of this subsection, the details of the developed user-based prediction are explained.

1) User similarity calculation
In the first step of the proposed method, the time-aware similarities between different users are calculated using a new time-aware similarity measure. Formally, the similarity $sim(u_i, u_j)$ between user $u_i$ and user $u_j$ is defined in Eq. (1), where $r_k(u_i)$ is the rating given to food $f_k$ by user $u_i$, and $\bar{r}(u_i)$ is the average rating given by user $u_i$. Likewise, $A_{u_i,u_j}$ is the set of foods that are rated by both users $u_i$ and $u_j$. $TW_{(u_i,u_j,k)}$ denotes the accumulative weight of the ratings of users $u_i$ and $u_j$ for food $f_k$, considering the time stamps of those ratings. This weight is calculated as in Eq. (2), where $t(u_i, k)$ indicates the time period of the recorded rating of user $u_i$ for food $f_k$, $TP$ indicates the maximum number of time periods, and $\lambda$ is a user-controlled parameter that adjusts the impact of the time factor. A higher (resp. lower) value of $\lambda$ indicates an increasing (resp. decreasing) importance of time in the similarity score. Recall that user ratings are split into monthly time intervals because of the sparsity of the 18 years of collected ratings (so $TP$ is set to 132). In the case of denser user-food ratings, weekly or even daily time periods can be reliable alternatives.
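To make the role of the time weight concrete, the following is a minimal sketch of a time-weighted Pearson-type similarity. The exact forms of Eq. (1) and Eq. (2) are not reproduced here: the exponential decay in `time_weight`, the placement of the weights in the denominator, and all function names are assumptions chosen only so that recent ratings count more and a larger `lam` strengthens the effect of time.

```python
import numpy as np

def time_weight(t_i, t_j, TP=132, lam=0.5):
    # Hypothetical decay (an assumption, not the paper's Eq. (2)):
    # ratings recorded close to the last period TP get a weight near 1.
    return np.exp(-lam * ((TP - t_i) + (TP - t_j)) / TP)

def time_aware_similarity(ratings_i, ratings_j, times_i, times_j, TP=132, lam=0.5):
    """Time-weighted Pearson-type similarity between two users.

    ratings_*: dict food_id -> rating; times_*: dict food_id -> period index.
    """
    common = sorted(set(ratings_i) & set(ratings_j))  # co-rated foods A_{ui,uj}
    if not common:
        return 0.0
    mean_i = np.mean(list(ratings_i.values()))  # average rating of user i
    mean_j = np.mean(list(ratings_j.values()))  # average rating of user j
    di = np.array([ratings_i[k] - mean_i for k in common])
    dj = np.array([ratings_j[k] - mean_j for k in common])
    tw = np.array([time_weight(times_i[k], times_j[k], TP, lam) for k in common])
    denom = np.sqrt(np.sum(tw * di ** 2)) * np.sqrt(np.sum(tw * dj ** 2))
    if denom == 0:
        return 0.0
    return float(np.sum(tw * di * dj) / denom)
```

With weights applied to both the numerator and the denominator, the score stays in the interval [-1, 1], as for the ordinary Pearson coefficient.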

2) Generate Trust Network
The trust network plays an important role in solving the neighbor selection challenge identified in traditional recommender-systems. Previous studies [55]-[57] showed that users who trust each other often exhibit similar rating profiles. As a result, trust relationships, if available, can be used as an additional insight to predict unseen items in traditional collaborative filtering systems. In our case, the available follower-following network can be utilized to derive trust relationships. In essence, if user $u_i$ follows user $u_j$, it is assumed that user $u_i$ trusts user $u_j$. Therefore, the users' trust network is considered as an unweighted and undirected graph $TrustG(U, Tr)$, where $U$ is the set of users and $Tr$ is the set of edges between different users. Equivalently, this can also be represented as a weighted graph in which the edge weight between users $u_i$ and $u_j$ is set to 1 if user $u_i$ trusts user $u_j$ or user $u_j$ trusts user $u_i$, and to zero otherwise.
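The construction described above can be sketched as a symmetric 0/1 adjacency matrix built from follower-following pairs. The function name `build_trust_matrix` and the integer user indices are illustrative assumptions.

```python
import numpy as np

def build_trust_matrix(n_users, follows):
    """Build the undirected trust matrix from (follower, followee) index pairs.

    Entry [i, j] is 1 if user i follows user j or user j follows user i.
    """
    trust = np.zeros((n_users, n_users), dtype=int)
    for a, b in follows:
        if a != b:  # ignore self-follows
            trust[a, b] = 1
            trust[b, a] = 1  # trust is treated as symmetric
    return trust

# Example: user 0 follows user 1, user 2 follows user 0.
print(build_trust_matrix(3, [(0, 1), (2, 0)]))
```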

3) Graph Representation of users
In this step, the user set $U$ is mapped onto a weighted graph $G(U, E, W)$, where $E$ is the set of edges among all users and $W$ contains the calculated similarities between different users in $U$. In the user-based prediction model, a combination of the Pearson correlation coefficient and the trust relationship is utilized to calculate the edge weights between different users, as given in Eq. (3), where $Tr(u_i, u_j)$ is the explicit trust score between users $u_i$ and $u_j$ calculated in the trust network generation step, while $sim(u_i, u_j)$ corresponds to the user similarity calculated using the proposed time-aware similarity measure. $\alpha$ denotes a control parameter in the unit interval that adjusts the contribution of the trust and user-similarity components.
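One simple way to combine the two components with a control parameter in the unit interval is a convex combination, sketched below. Which component is multiplied by `alpha` and which by `1 - alpha` is an assumption here, as is the function name `edge_weights`; the sketch only illustrates how a single parameter can trade off trust against rating-based similarity.

```python
import numpy as np

def edge_weights(sim, trust, alpha=0.5):
    """Combine trust and similarity matrices into graph edge weights.

    Assumed form: w = alpha * trust + (1 - alpha) * sim.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in the unit interval")
    sim = np.asarray(sim, dtype=float)
    trust = np.asarray(trust, dtype=float)
    w = alpha * trust + (1.0 - alpha) * sim
    np.fill_diagonal(w, 0.0)  # no self-loops in the user graph
    return w
```

Because trusted users receive a non-zero weight even when they share no co-rated foods, such a combination keeps the graph connected on sparse rating data.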

4) User Clustering
Choosing an appropriate neighborhood for the target user is one of the most important problems of any recommender-system. Indeed, identifying relevant neighbors for a given target user enables the recommender-system to predict ratings accurately. Clustering-aware recommendation is one of the most effective ways to overcome collaborative filtering shortcomings and to improve the overall quality of the neighborhood selection process. For this purpose, a clustering-based recommender-system is employed. Our review of existing user clustering methods in recommender-systems showed the following shortcomings:
• the need to specify the number of clusters before performing user clustering;
• the density of users in a cluster, which is one of the most important criteria in user clustering, is not considered;
• all users are considered equally, while certain influential users should have a greater impact on the clustering process.
To address these shortcomings, the proposed method utilizes the recently introduced graph clustering-based algorithm [58] to group the users into several clusters. This algorithm uses a fast parallel model for community detection in a large graph. The algorithm is shown to be faster than previous methods, e.g., [59], [60], for user clustering and it is able to determine the number of clusters automatically.

5) User-based rating prediction
In the case of the user-based rating prediction, the rating of food $f_k$ by user $u_i$ is predicted using Eq. (4), where $w(u_i, u_j)$ indicates the edge weight between user $u_i$ and user $u_j$ that has been calculated using Eq. (3), and $C_i$ corresponds to the community of users to which user $u_i$ belongs.
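A minimal sketch of a community-restricted prediction is given below. It uses a mean-centered weighted average over the members of the target user's community who rated the food, which is a common neighborhood-based form and is an assumption here rather than a reproduction of Eq. (4). The function name, the convention that 0 marks an unrated entry, and the fallback to the user's mean are likewise illustrative.

```python
import numpy as np

def predict_user_based(R, W, communities, i, k):
    """Predict the rating of food k by user i from the user's community.

    R: ratings matrix (0 = unrated, an assumed convention).
    W: user-user edge weights. communities: cluster label per user.
    """
    R = np.asarray(R, dtype=float)
    rated = R > 0
    # Fall back to the global mean for users without any rating.
    mean_i = R[i][rated[i]].mean() if rated[i].any() else R[rated].mean()
    num, den = 0.0, 0.0
    for j in np.flatnonzero(communities == communities[i]):
        if j == i or not rated[j, k]:
            continue  # only community members who rated food k contribute
        mean_j = R[j][rated[j]].mean()
        num += W[i, j] * (R[j, k] - mean_j)
        den += abs(W[i, j])
    if den == 0:
        return float(mean_i)
    return float(np.clip(mean_i + num / den, 1.0, 5.0))
```

Restricting the sum to the community keeps the neighborhood small, so the cost per prediction depends on the cluster size rather than on the total number of users.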

C. FOOD-BASED PREDICTION
In the Food-based prediction phase, the rating of unseen foods is predicted based on food similarities and previous ratings. The aim of this phase of the proposed system is to solve the cold start problem utilizing cluster structure between foods. The item cold start problem is a difficult and common problem in traditional recommender-systems, where no prior ratings have been recorded for certain items. To address this issue, item clustering methods are often employed in recommender-systems. In this paper, we advocate a food clustering based on their ingredients. Therefore, the proposed framework utilizes a technique that converts food ingredients into an embedding vector. Figure 3 displays the outline of the developed food-based rating prediction for a simple dataset with seven foods. Moreover, in the remainder of this subsection, the details of proposed Food-based prediction model are explained.

1) Food deep embedding
In this step, each food is mapped to an n-dimensional real-valued vector. Our proposed food clustering model utilizes the large variant of Bidirectional Encoder Representations from Transformers (BERT-Large) [61] to map the foods to contextualized embeddings. By the end of 2018, BERT had established itself as a pioneer in many Natural Language Processing tasks [62]: by conditioning on both left and right context in all layers, BERT is able to pre-train deep bidirectional representations from unlabeled text, beyond the reach of traditional language representation models. With just one additional output layer, the pre-trained BERT model can be employed in a wide range of applications. To use Natural Language Processing (NLP) techniques for food clustering, each food $f_i$ is considered as a sentence and the ingredients of that food, $I_i = \{ing_{\sigma(1)}, ing_{\sigma(2)}, ing_{\sigma(3)}, \dots, ing_{\sigma(k_i)}\}$, are considered as the words of that sentence. The inputs to the feature extraction procedure are the sentences (i.e., foods) and tokens (i.e., ingredients), and the output is a JSON file containing simulated embeddings from different layers of BERT. Tokens are represented as n-dimensional vectors that capture the context of their appearance. The final step is to compute the average of all token representations belonging to the sentence in order to produce a contextualized representation of the food. The overview of our BERT-based food embedding method is shown in Figure 4.
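The final averaging step can be sketched independently of the encoder. The snippet below assumes that per-ingredient token vectors have already been extracted; the function name `food_embedding` and the toy two-dimensional vectors are purely illustrative, and no BERT call is made.

```python
import numpy as np

def food_embedding(token_vectors):
    """Average per-ingredient (token) vectors into one food (sentence) vector."""
    vectors = np.asarray(token_vectors, dtype=float)
    if vectors.ndim != 2 or vectors.shape[0] == 0:
        raise ValueError("expected a non-empty (n_tokens, dim) array")
    return vectors.mean(axis=0)

# Toy example: a food with two ingredient vectors of dimension 2.
print(food_embedding([[1.0, 2.0], [3.0, 4.0]]))
```

Mean pooling yields a fixed-length vector regardless of how many ingredients a food has, which is what the later distance computation requires.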

2) Food Similarity Calculation
By employing contextualized embeddings, food ingredients can be captured. Therefore, in our food clustering method, the proximity of foods in the vector space is used as a measure of similarity. Foods with nearby vectors are assumed to share some ingredients. In the proposed food-based rating prediction model, a clustering step is implemented to devise groups of similar foods according to the distance between their representations in the vector space. Euclidean distance is used to evaluate the similarity between foods. Formally, let $f_i = \{f_{i1}, \dots, f_{iL}\}$ and $f_j = \{f_{j1}, \dots, f_{jL}\}$ be the contextualized representation vectors of food $f_i$ and food $f_j$, respectively. The similarity between food $f_i$ and food $f_j$ is then calculated from the Euclidean distance between these two vectors, where $f_{il}$ denotes the $l$-th dimension of the contextualized representation vector of the food $f_i$.
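As an illustration, pairwise Euclidean distances between food vectors can be computed with broadcasting. The mapping from distance to similarity used in `distance_to_similarity` (1 / (1 + d)) is an assumption introduced only to obtain a score that grows as foods get closer; the function names are hypothetical.

```python
import numpy as np

def euclidean_distances(E):
    """Pairwise Euclidean distances between the rows of E (one row per food)."""
    E = np.asarray(E, dtype=float)
    diff = E[:, None, :] - E[None, :, :]  # shape (n_foods, n_foods, L)
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_to_similarity(D):
    # Assumed monotone mapping: distance 0 gives similarity 1.
    return 1.0 / (1.0 + D)

D = euclidean_distances([[0.0, 0.0], [3.0, 4.0]])
print(D)
print(distance_to_similarity(D))
```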

3) Food Clustering
The food clustering algorithm developed in this paper employs the Deep Embedded Clustering (DEC) technique [63], which reduces the distance between similar embedding vectors in the embedding space. DEC uses AutoEncoders (AE) and Kullback-Leibler (KL) divergence to decrease the data dimensions and to enhance the embedding vector representation. In particular, the AE uses both feedforward and backpropagation passes to determine the encoder and decoder weight values and to predict the class label of the input data in an unsupervised mode.
In essence, DEC redefines the food embedding space as a Z-space using a stacked AE. The latter is built from several stacked neural network layers and maps the food vector $f_i$ onto the Z-space. The stacked AE employs a greedy layer-wise training phase, which includes two stages: pre-training and fine-tuning. By addressing the vanishing gradient issue while carrying out unsupervised learning for each layer of the neural network, it enhances the training performance of the deeper neural network. This improves the capability of the network to represent the input data, denoted $s_i$, so that various vectors can be represented. Following the pre-training process, the encoder and decoder are concatenated to perform fine-tuning. Moreover, in our method, the non-linear SeLU function [64] is employed in all layers other than the first hidden layers of the encoder and decoder. Furthermore, the dropout algorithm is utilized to reduce the probability of overfitting. The initial Z-space representation after the fine-tuning phase consists of the latent space layer of the encoder, $z_i$. Moreover, the cluster centroids $\{\mu_j\}_{j=1}^{k}$ are refined by iteratively updating $z_i$. The fitness function in this clustering scheme minimizes the difference between the soft assignment $q_{ij}$ and the target distribution $p_{ij}$ using the Kullback-Leibler divergence. Following DEC [63], the probability $q_{ij}$ of assigning the point $z_i$ to cluster $j$ can be calculated as follows:

$$q_{ij} = \frac{\left(1 + \lVert z_i - \mu_j \rVert^2\right)^{-1}}{\sum_{j'} \left(1 + \lVert z_i - \mu_{j'} \rVert^2\right)^{-1}} \tag{6}$$

Moreover, to improve the clustering coupling, $q_{ij}$ can be refined iteratively into the target distribution as follows:

$$p_{ij} = \frac{q_{ij}^2 / CF_j}{\sum_{j'} q_{ij'}^2 / CF_{j'}} \tag{7}$$

where $CF_j$ denotes the soft cluster frequency of cluster $j$, which can be calculated as below:

$$CF_j = \sum_{i} q_{ij} \tag{8}$$

Finally, the objective function based on the Kullback-Leibler divergence is calculated as follows:

$$L = KL(P \,\Vert\, Q) = \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}} \tag{9}$$
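The clustering objective described above can be sketched in a few lines. This is a minimal illustration of the standard DEC quantities from [63] (soft assignment with a Student's t kernel, sharpened target distribution, KL loss), assuming one degree of freedom for the kernel; it omits the autoencoder and the gradient updates, and the latent points and centroids are toy values.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def soft_assignment(z: Sequence[Sequence[float]],
                    centroids: Sequence[Sequence[float]]) -> Matrix:
    """q[i][j]: probability of assigning latent point z[i] to centroid j."""
    q = []
    for point in z:
        kernel = [
            1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(point, mu)))
            for mu in centroids
        ]
        total = sum(kernel)
        q.append([value / total for value in kernel])
    return q


def target_distribution(q: Matrix) -> Matrix:
    """p[i][j]: q squared, normalized by the soft cluster frequencies."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p: Matrix, q: Matrix) -> float:
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
        if p_ij > 0.0
    )


# Toy latent points forming two groups, and two centroids (assumed values).
latent = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]]
centers = [[0.0, 0.0], [4.0, 4.0]]
q = soft_assignment(latent, centers)
p = target_distribution(q)
print(kl_divergence(p, q))
```

Squaring the soft assignments pushes each point towards its most likely cluster, so minimizing the divergence gradually makes the assignments more confident.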

4) Food-based rating prediction
In the case of the food-based prediction, the rating of food $f_i$ for user $u$ is predicted as below:

$$p^{f\text{-}based}_i(u) = \bar{r}_i + \frac{\sum_{f_j \in C_{f_i}} sim(f_i, f_j)\,\left(r_j(u) - \bar{r}_j\right)}{\sum_{f_j \in C_{f_i}} \left| sim(f_i, f_j) \right|} \tag{10}$$

where $r_j(u)$ is the rating of food $f_j$ given by user $u$. Similarly, $\bar{r}_i$ is the average rating of food $f_i$. $sim(f_i, f_j)$ corresponds to the similarity score between food $f_i$ and food $f_j$, which can be calculated by (5), and $C_{f_i}$ denotes the set of foods belonging to the same cluster as food $f_i$.
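A cluster-restricted, similarity-weighted prediction of this kind can be sketched as follows. The mean-centred weighted average used here is the common neighbourhood form and is an assumption; the function name, the dictionary layout and the toy numbers are hypothetical.

```python
from typing import Callable, Dict, Iterable


def predict_food_based(
    target: str,
    user_ratings: Dict[str, float],
    food_means: Dict[str, float],
    cluster: Iterable[str],
    sim: Callable[[str, str], float],
) -> float:
    """Predict the user's rating of `target` from rated foods in its cluster."""
    numerator = 0.0
    denominator = 0.0
    for other in cluster:
        if other == target or other not in user_ratings:
            continue  # only foods the user has actually rated contribute
        weight = sim(target, other)
        numerator += weight * (user_ratings[other] - food_means[other])
        denominator += abs(weight)
    if denominator == 0.0:
        # No rated neighbour in the cluster: fall back to the food's mean rating.
        return food_means[target]
    return food_means[target] + numerator / denominator


# Toy data (assumed): average ratings per food and one user's ratings.
means = {"pasta": 3.0, "pizza": 4.0, "salad": 2.0}
ratings_of_user = {"pizza": 5.0, "salad": 2.0}
prediction = predict_food_based(
    "pasta", ratings_of_user, means, ["pasta", "pizza", "salad"],
    sim=lambda a, b: 0.5,  # constant similarity, for illustration only
)
print(prediction)
```

The fallback to the food's average rating is one simple way to handle a user with no ratings inside the cluster.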

D. TOP-N RECOMMENDATION
After calculating the user-based prediction and the food-based prediction, the final prediction of food $f_i$ for user $u$ is defined as a convex combination of the user-based and food-based predictions:

$$p_i(u) = \beta\, p^{u\text{-}based}_i(u) + (1 - \beta)\, p^{f\text{-}based}_i(u) \tag{11}$$

where $p^{u\text{-}based}_i(u)$ and $p^{f\text{-}based}_i(u)$ are the user-based prediction and the food-based prediction on food $f_i$ for user $u$, respectively. The parameter $\beta \in [0, 1]$ controls the trade-off between the user-based and food-based predictions.
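The combination and the subsequent Top-N selection can be sketched as below. The helper names `combine` and `top_n` and the toy scores are hypothetical; ties are broken alphabetically only to make the output deterministic.

```python
from typing import Dict, List


def combine(user_based: float, food_based: float, beta: float) -> float:
    """Convex combination of the two predictions, with beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * user_based + (1.0 - beta) * food_based


def top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n foods with the highest final predicted rating."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [food for food, _ in ranked[:n]]


# Toy per-food predictions for one user (assumed values).
user_based = {"pasta": 4.0, "pizza": 3.0, "salad": 5.0}
food_based = {"pasta": 3.0, "pizza": 4.5, "salad": 2.0}
final = {f: combine(user_based[f], food_based[f], beta=0.6) for f in user_based}
print(top_n(final, 2))
```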

IV. EXPERIMENTAL RESULTS
In this section, several experiments are designed to assess the efficiency of the developed food recommender-system. Moreover, the proposed system is compared with other state-of-the-art food recommender-systems. The following subsections provide details of the dataset, evaluation measures, results, sensitivity analysis and discussion.

A. DATASET
Due to the inaccessibility of relevant user-food rating information, the currently available public food datasets, such as Food-101 [65] and Yummly [66], are not appropriate for evaluating our food recommender-system. We therefore collected our own user-food rating dataset by crawling the Allrecipes.com website. With 1.5 billion visits per year, it is considered to be one of the most visited food-oriented websites. The crawled dataset is described in Table 2. Besides, since food ingredients must be identified from the crawled text as part of the food-similarity calculation, natural language processing is needed. For this purpose, a simple string matching technique based on NLTK (Natural Language Toolkit) was employed to identify ingredients from a predefined list of ingredients. Ingredient formalization and pre-processing of the input foods are performed prior to their use in the main phases of the proposed method. This pre-processing includes tokenization, stemming, and stop-word removal. To remove noisy terms, a default stop-word list is consulted and adapted to fit our needs. By stemming, inflected words are reduced to their stem, base, or root forms. The current paper employs Porter's stemming method for this pre-processing phase [67].
In Table 3, part of the input food set, their raw ingredients and output ingredients set are shown. Moreover, in Table 4 part of the User-Food ranking matrix is shown.
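The ingredient pre-processing pipeline (tokenization, stemming, stop-word removal, matching against a predefined list) can be sketched with the standard library alone. Everything named below is an assumption for illustration: the stop-word set, the ingredient list, and in particular `naive_stem`, which is a crude plural stripper standing in for Porter's stemmer and is not the method used in the paper.

```python
import re
from typing import List

# Hypothetical stop words, including measurement and preparation terms.
STOP_WORDS = {"a", "an", "and", "of", "the", "or", "cup", "tablespoon",
              "chopped", "fresh", "clove"}
# Hypothetical predefined ingredient list (already in stemmed form).
KNOWN_INGREDIENTS = {"tomato", "onion", "garlic", "olive oil"}


def naive_stem(token: str) -> str:
    """Very rough plural stripping; a stand-in for Porter stemming."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "y"
    if len(token) > 4 and token.endswith("oes"):
        return token[:-2]
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def extract_ingredients(raw_text: str) -> List[str]:
    """Tokenize, stem, drop stop words, then match known ingredients."""
    tokens = [naive_stem(t) for t in re.findall(r"[a-z]+", raw_text.lower())]
    cleaned = " ".join(t for t in tokens if t not in STOP_WORDS)
    return [
        ingredient
        for ingredient in sorted(KNOWN_INGREDIENTS)
        if re.search(rf"\b{re.escape(ingredient)}\b", cleaned)
    ]


print(extract_ingredients("2 cups chopped tomatoes, 1 onion and 3 cloves of garlic"))
```

Stemming before stop-word removal lets a single stop-word entry such as "cup" also cover "cups"; matching on word boundaries allows multi-word ingredients such as "olive oil".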

B. EVALUATION MEASURES
The leave-one-out technique is generally employed to evaluate different recommender-systems. This technique compares the predicted rating obtained by the recommender-system with the rating taken from the dataset, iterating through the entire dataset. In our experiment, the leave-one-out technique is employed to compare the performance of the developed model with other recommender-systems. In addition, five well-known criteria for recommender-systems, namely Precision, Recall, F1, AUC and NDCG, are utilized for measuring the effectiveness of the developed food recommender-system.
Precision, Recall and F1-measure are three of the most frequently used metrics in the information retrieval domain. To calculate these metrics, a confusion matrix is utilized to categorize the items into four groups. This matrix places the relevant items that the system recommends to the user in the true positive (TP) category, and the relevant items that the system fails to recognize as relevant for the user in the false negative (FN) category. Likewise, irrelevant items that are recommended fall in the false positive (FP) category, and irrelevant items that are not recommended fall in the true negative (TN) category. Precision is defined as the ratio of the relevant items that are recommended to the total number of recommended items:

$$Precision = \frac{TP}{TP + FP}$$

Recall is defined as the ratio of the relevant items that are recommended to the total number of all relevant items, as follows:

$$Recall = \frac{TP}{TP + FN}$$

There is a clear conflict between the Precision and Recall criteria. Increasing the number of top-recommended items increases the number of relevant items retrieved, and hence the recall, while decreasing the precision. By combining Precision and Recall, F1 provides an appropriate weighted combination measure:

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$

The precision and recall of recommendation algorithms cannot be evaluated directly, because we would need to know whether each item is relevant, which means every item would have to be rated by the user. Therefore, in our experiments, Precision@N, Recall@N and F1@N are employed (N being the size of the recommendation list). These measures can be calculated as follows:

$$Precision@N = \frac{|Rel_u \cap Rec_u^N|}{N}, \qquad Recall@N = \frac{|Rel_u \cap Rec_u^N|}{|Rel_u|}, \qquad F1@N = \frac{2 \times Precision@N \times Recall@N}{Precision@N + Recall@N}$$

where $|Rel_u|$ indicates the number of items that are relevant to the user $u$, and $Rec_u^N$ denotes the list of the $N$ items recommended to $u$.

AUC is the area under the Receiver Operating Characteristic (ROC) curve, which plots recall against fallout. An ideal recommender-system would produce a ROC curve that rises straight up to 1.0 recall at 0.0 fallout until all relevant items are retrieved. Thus, the objective is to maximize the area under the ROC curve, and AUC can be utilized as a single criterion to evaluate the performance of the recommender-system. AUC represents the probability that a classifier will rank a randomly selected positive instance higher than a randomly selected negative instance.
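The list-based metrics and the probabilistic reading of AUC can be sketched for a single user as follows. The function names and toy data are hypothetical, and averaging over users, which an actual evaluation would need, is left out.

```python
from typing import Dict, List, Set, Tuple


def precision_recall_f1_at_n(
    recommended: List[str], relevant: Set[str], n: int
) -> Tuple[float, float, float]:
    """Precision@N, Recall@N and F1@N for one user's ranked list."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total > 0.0 else 0.0
    return precision, recall, f1


def auc(scores: Dict[str, float], relevant: Set[str]) -> float:
    """Probability that a relevant item is scored above an irrelevant one."""
    positives = [s for item, s in scores.items() if item in relevant]
    negatives = [s for item, s in scores.items() if item not in relevant]
    if not positives or not negatives:
        raise ValueError("need at least one relevant and one irrelevant item")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


ranked = ["a", "b", "c", "d"]
print(precision_recall_f1_at_n(ranked, {"a", "c", "e"}, n=4))
print(auc({"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}, {"a", "c"}))
```

The pairwise AUC computation is quadratic in the number of items, which is acceptable for a sketch but would be replaced by a rank-based formula on large catalogues.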
Normalized Discounted Cumulative Gain (NDCG), which takes values in the unit interval, is another measure that was utilized in our experiments. NDCG assigns higher values to hits at higher positions in the ranking list. A high NDCG value therefore indicates that relevant items tend to appear near the top of the recommendation list.
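A minimal NDCG@N sketch for one user is given below. It assumes binary relevance and the usual logarithmic position discount; the paper does not spell out its gain function, so both choices are assumptions.

```python
import math
from typing import List, Set


def ndcg_at_n(recommended: List[str], relevant: Set[str], n: int) -> float:
    """NDCG@N with binary gains and a log2 position discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recommended[:n], start=1)
        if item in relevant
    )
    # Ideal ranking: all relevant items (up to n) placed at the top.
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0.0 else 0.0


print(ndcg_at_n(["a", "b", "c"], {"a"}, 3))  # hit at rank 1
print(ndcg_at_n(["b", "a", "c"], {"a"}, 3))  # same hit at rank 2 scores lower
```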
Besides, to evaluate the efficiency of the proposed recommender-system, after dataset preparation, all of the recorded ratings were randomly partitioned into three subsets: a training set (70%), a validation set (10%), and a testing set (20%). Then, only the users that appear in both the training and testing subsets are retained.
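The partitioning and the user filter can be sketched as follows. Ratings are represented as hypothetical `(user, food, rating)` tuples, and the seed and helper names are assumptions made for reproducibility of the example.

```python
import random
from typing import List, Tuple

Rating = Tuple[str, str, float]


def split_ratings(
    ratings: List[Rating],
    train_frac: float = 0.7,
    valid_frac: float = 0.1,
    seed: int = 0,
) -> Tuple[List[Rating], List[Rating], List[Rating]]:
    """Randomly partition ratings into training, validation and testing sets."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_valid = round(valid_frac * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remaining ratings (about 20%)
    return train, valid, test


def keep_shared_users(
    train: List[Rating], test: List[Rating]
) -> Tuple[List[Rating], List[Rating]]:
    """Keep only ratings of users present in both the training and testing sets."""
    shared = {r[0] for r in train} & {r[0] for r in test}
    return ([r for r in train if r[0] in shared],
            [r for r in test if r[0] in shared])


toy = [(f"u{i % 10}", f"f{i}", 5.0) for i in range(100)]
train, valid, test = split_ratings(toy, seed=42)
train, test = keep_shared_users(train, test)
print(len(train), len(valid), len(test))
```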

C. RESULTS
To measure the efficiency of the developed food recommender-system, various experiments are designed. In our experiments, the parameter optimization technique (Bayesian-based) developed in [68] was employed to set the optimal values for the λ, α and β parameters. Accordingly, the values of λ, α and β parameters are set to 2.5, 0.5 and 0.6, respectively.
In the first part of our experiments, the impact of the two phases of our developed recommender-system is evaluated. In Table 5, the performance of the developed food recommender-system is compared to the case where the items are recommended only according to User-based rating prediction or to the case where the foods are recommended only according to Food-based rating prediction. The results of Table 5 show that the recommender-system method performs significantly better when the final rank is predicted based on the combination of User-based and Food-based phases. For instance, the improvement is 12.3 % with respect to Precision@10, 11.3% with respect to Recall@10, 11% with respect to AUC and 16.4% with respect to NDCG@10 measure.
In the next experiment, the impact of trust network generation is investigated. Table 6 reports the performance of the developed recommender-system in the case where trust network (follower-following) is considered and the case where it is ignored. As it can be seen in this table, considering the trust statements among users has clearly improved the performance of the proposed recommender method. This improvement is 5%, 10%, 12.4%, 3.6% and 3.1% with respect to Precision@10, Recall@10, F1@10, AUC and NDCG@10, respectively.
In the next experiment, we assess the impact of accounting for the time stamp in the user-based prediction. For this purpose, Table 7 compares the performance of the developed recommender-system when the items are recommended with and without considering the time stamp of ratings. The results of this table indicate that the proposed time-aware food recommender-system predicts ratings much better than the non-time-aware recommender-system (e.g., 13.4% improvement in Precision@10 and 12.5% improvement in Recall@10).
Next, the performance of the developed system is compared with three state-of-the-art food recommender-systems, namely LDA [29], HAFR [27] and FGCN [26]. To achieve more reliable assessments, ten independent runs were conducted. In each run, the data is randomly divided into training data (60% of the initial data), test data (30% of the initial data) and validation data (10% of the initial data). The training and validation data are used in the learning process, while the test data is utilized to assess the recommended items. To perform fair experiments, the comparative systems are evaluated on the same training/testing sets.
The results of this comparison shown in Table 8 indicate that the developed food recommender-system obtained the highest Precision, Recall, F1, AUC, and NDCG values compared to other systems (e.g., 2.6% Precision@10 improvement on the best state-of-the-art method (FGCN), 1.3% improvement with respect to Recall@10 and 6% improvement with respect to NDCG@10 metric).
Next, to measure the rank efficiency of the developed system, the Recall and NDCG of the Top-N recommendations, with ranking positions N ranging from 5 to 20, are shown in Figure 5 and Figure 6, respectively. We notice that the developed food recommender-system outperforms the other systems in terms of both Recall and NDCG. These results indicate that the proposed system enhances the representation learning effectively and robustly.

D. SENSITIVITY ANALYSIS OF THE PARAMETERS
Like many other recommender-systems, the developed model requires several parameters to be specified. Some of these parameters, such as the number of recommendations, must be set in all recommender-system methods and are not specific to our developed model. Here, the three parameters specific to the developed model are investigated: the time weight, the trust weight, and the User-based/Food-based control parameter.
The λ parameter controls the effect of rating time in the similarity calculation. This parameter can be set to any value between zero and ∞, where a high (resp. low) λ value translates into a greater (resp. smaller) impact of the time factor on the overall similarity score. Figure 7 shows the sensitivity of the final performance of the developed model to λ. The results compare the performance of our recommender-system in terms of Precision@10, Recall@10, F1@10 and NDCG@10 for different values in the range [0.5, 4]. The reported results indicate that the best performance is reached when λ is set to 2.5, the value used in our experiments.
Secondly, we investigate how the α parameter affects the performance of the developed recommender-system. The parameter α adjusts the weight of the trust term and the similarity term in the generated user graph. This parameter can be set to any value between zero and one. A large (resp. low) value of α indicates a high (resp. low) importance of the trust term in the graph construction process. It can be concluded from these results that the proposed food recommender-system achieves the highest performance when α is set to 0.5.
To search for the appropriate value of the User-based/Food-based control parameter, β, different experiments were designed to show how the performance changes with different values of that parameter. β ensures a trade-off between the user-based and food-based contributions. This parameter can be set to any value between zero and one. If β is set to a value close to one, the impact of the user-based contribution increases. On the other hand, when β is set to a value close to zero, the impact of the food-based term increases. Figure 9 shows the sensitivity analysis for the β parameter. The experiment evaluates the performance of the recommender-system on the Precision@10, Recall@10, F1@10 and NDCG@10 measures for different β values. The results show that, in most cases, the developed recommender-system achieves the best performance when β is set to 0.6.
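A sensitivity analysis of this kind amounts to sweeping one parameter over a grid while the others are held fixed. The sketch below shows the mechanics for β; `toy_metric` is a synthetic stand-in that peaks at 0.6 purely for illustration and does not reproduce any measured result, and a real `evaluate` callable would retrain or rescore the recommender and return, for example, Precision@10.

```python
from typing import Callable, Dict, List, Tuple


def sweep_parameter(
    evaluate: Callable[[float], float], values: List[float]
) -> Tuple[float, Dict[float, float]]:
    """Evaluate each candidate value and return the best one with all scores."""
    results = {value: evaluate(value) for value in values}
    best = max(results, key=results.get)
    return best, results


def toy_metric(beta: float) -> float:
    """Synthetic score curve (assumption); not derived from the experiments."""
    return -((beta - 0.6) ** 2)


beta_grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
best_beta, scores_by_beta = sweep_parameter(toy_metric, beta_grid)
print(best_beta)
```

The same helper applies unchanged to λ and α by passing a different grid and evaluation function.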

E. DISCUSSION
In this study, we collected user ratings for different types of foods by crawling the Allrecipes website. Our novel food recommender-system was trained and evaluated for food recommendation on this dataset. The average Precision@10, Recall@10, F1@10, AUC, and NDCG@10 of our model were 0.0721, 0.0691, 0.0705, 0.6812 and 0.0497, respectively, which are significantly higher than those of other state-of-the-art food recommender-systems (Table 8).
Additionally, to assess the rank efficiency of the developed system, the reported results show the performance of the Top-N recommendations with ranking positions ranging from 5 to 20. The developed food recommender-system achieved significantly better performance in terms of Recall and NDCG scores than the competing systems (Figures 5 and 6).
As mentioned earlier, the proposed food recommender-system combines two independent phases: user-based rating prediction and food-based rating prediction. The impacts of these two phases of the developed recommender-system were evaluated in our experiments. The results indicated that the proposed hybrid food recommender-system is on average about 11.8% more efficient than the case in which only the user-based phase is utilized and about 11.9% more efficient than the case where only the food-based phase is employed for the final recommendation (Table 5).
Moreover, the reported results showed that when the explicit user trust network is used in the developed recommender-system, the performance improves on average by 5.3% compared to the case where trust information is ignored (Table 6).
In our developed method for the user graph representation, the time stamps of the recorded ratings are considered. In fact, in the user similarity calculation phase, which has a great impact on the final food recommendation, the time of the ratings is taken into account: higher weights are assigned to the newer ratings and, conversely, lower weights are assigned to the older ones (Eq. (1) and Eq. (2)). Therefore, due to the higher weights of recent ratings, if the user's taste changes over time, our food recommender-system efficiently captures this effect in the final recommendation. Our experiments also showed that when the similarity measure takes the time factor into account, the final performance of the developed system is on average 9.51% better than when this factor is ignored (Table 7).
Nutrition is the essential basis for health and is related to the onset of preventable non-communicable diseases. To promote healthy eating and reduce the burden of preventable non-communicable diseases, the World Health Organization (WHO) has recently urged the development of tools for nutrient profiling to facilitate healthier choices [69]. Such tools should also be capable of accommodating individualized conditions and circumstances. For instance, for individuals with certain health conditions such as type-2 diabetes, it is vital to also consider the food ingredients in order to provide healthy food recommendations while taking into account their preferences. A distinguishing aspect of our proposed recommender-system is that food ingredients are considered when making the recommendations. This shows that our proposed method has the potential to be further extended into a health-aware food recommender-system.
In the remainder of this subsection, the reasons for the enhanced performance of the developed food recommender-system compared to other systems are discussed. These are grounded in the key innovations that were incorporated into the developed system, which make the model perform better than other state-of-the-art food recommender-systems:
• People typically choose a food because it has ingredients they enjoy eating. Therefore, for any food recommendation platform to be effective, it is essential to take the ingredients into account.
• Many previous food recommender-systems, including collaborative filtering-based models, ignore food ingredients and are therefore inefficient at learning user preferences and recommending users' favorite foods.
• Users' ratings as well as food ingredients are considered in this paper for the final food recommendations. Therefore, a set of foods will be recommended based on the user's preferences and his/her past ratings.
• In previous food recommender-systems, timing information in historical user ratings is ignored. Nevertheless, it is acknowledged that users' preferences, including their diets, might change over time in realistic scenarios, and the recommender needs to take the time factor into account. Ignoring changes in food preferences over time may result in inefficient food recommendations. An innovative time-aware similarity metric is developed in this study that takes into account changes in food preferences or diets over time. Therefore, this food recommender-system is quite different from previous works, which ignore the time of user ratings and cannot be applied to real-life food recommender-systems.
• Cold start users and cold start foods are two important issues in recommender-systems, especially in food recommender-systems, where the users often rate just a few foods. In this paper, the user community and the user-trust network, as inferred from follower-following relationships, are employed to overcome the cold start user, cold start food and neighborhood selection problems of traditional collaborative filtering-based food recommender-systems. Moreover, to address the data sparsity problem, this study incorporates user ratings-based similarity and the user's community to utilize knowledge outside the user's local neighborhood of similarity.

V. CONCLUSION AND FUTURE WORKS
With the development and increasing popularity of the Internet and the growing number of web users, recommender-systems that select items that are reasonably appropriate to the needs of users are gradually becoming more widespread. A variety of lifestyle applications rely on food recommender-systems, which are integral parts of many lifestyle services. A novel hybrid food recommender-system is developed in this paper to overcome the shortcomings of previous food recommender-systems, such as ignoring food ingredients, time stamps, cold start users, cold start foods and user communities. Using user-based and content-based models together with time information, the trust network, and user communities, the proposed method addresses these issues simultaneously and aims to improve the final accuracy of the recommender-system. The proposed method involves two phases: food content-based recommendation and user-based recommendation. Graph clustering is used in the first phase, and a deep-learning based approach is used in the second phase to cluster both users and food items. The model has been compared to recent food recommender-systems, including the LDA, HAFR and FGCN methods, with respect to five different metrics: Precision, Recall, F1, AUC and NDCG. The experimental results indicated that the developed food recommender-system achieved the best performance and outperformed the state-of-the-art food recommender-systems by a noticeable margin.
In future work, we aim to incorporate side information about users (e.g., gender, age, weight, height, location, and culture) into the food recommendation framework to further improve the final performance of the food recommendation. In addition, a proper eating habit can lessen the severity of symptoms associated with non-infectious diseases. We therefore also aim to use the nutritional characteristics of each food as additional information and to recommend foods according to each person's health status and diseases.
MEHRDAD ROSTAMI received the B.Sc. degree with honors in Computer Engineering from Razi University, Kermanshah, Iran, in 2012, and the M.Sc. degree with honors in Artificial Intelligence from University of Kurdistan, Sanandaj, Iran, in 2014. He is currently a Ph.D. student and researcher at University of Oulu, Oulu, Finland. His research interests include pattern recognition, medical diagnosis, machine learning, recommender-systems, big data mining, and bioinformatics.
MOURAD OUSSALAH is currently working as a research professor at the University of Oulu, Faculty of ITEE, in Finland. He received a PhD degree in Robotics and Artificial Intelligence from the University of Paris XII in 1998. Prior to his current affiliation, he held research and academic positions at KU Leuven, City University of London and Birmingham University, UK. Dr. Oussalah has worked extensively on information processing, data mining and text mining, where he has published more than 200 technical papers and led several projects. He is a Fellow of the Royal Statistical Society and a Senior Member of IEEE.
VAHID FARRAHI received his PhD in Medical Physics and Technology from the Faculty of Medicine, University of Oulu, Finland in 2021. Vahid is currently a postdoctoral fellow at the same research unit, where he conducts multidisciplinary problem-oriented research with a focus on machine learning and data-driven solutions. His doctoral thesis investigating the health implications of accelerometer-determined activity behaviors was funded by MSCA COFUND and was given the grade of 'passed with distinction', which is only awarded to exceptionally high-quality theses representing the top 15% in other fields. VOLUME 4, 2022