Group Recommendation Based on Heterogeneous Graph Algorithm for EBSNs


including sociology, economics, historiography, computer science, and information systems, as this subject is understood to have a major impact on society.
Enabled by the widespread adoption of communication tools such as computer technologies and the internet, online social networks have been adopted in everyday life for most people in many, if not most, human societies. Facebook 1 and Twitter 2 are among the most popular online social networks, allowing users to share thoughts, make friends and communicate with each other, among many other features. Moreover, some social networks have also been created that are oriented towards specific media or subjects, such as Flickr 3 for photo sharing, YouTube 4 for video sharing, SlideShare 5 for sharing slides, and ResearchGate, 6 for researchers. The number of online social network users has massively increased over the last two decades, such that many people tend to be more active in online social networks than in the offline, physical social network, i.e., the physical society. It is clear that online social networks have become an important part of our collective life.
In recent years, EBSNs such as Meetup 7 and Eventbrite 8 have also emerged [1], [2]. These services provide convenient platforms for people to create, distribute, and organize social events. Typically, users can post new events on EBSNs ranging from informal meetings to formal activities, as well as join existing events, post comments, share photos, etc. In addition to these typical online social network activities, EBSNs also promote face-to-face offline social interactions, in which some members participate in an event in the physical world. Thus, EBSNs not only provide an online virtual space where users can exchange thoughts and share experiences, but also capture offline in-person social interactions. Specifically, an EBSN consists of both online social network (i.e., group membership, event publication) and offline social network (i.e., event participation). Therefore investigating such networks may be expected to be of considerable benefit. Many EBSNs have attracted massive userbases and have been experiencing rapid business growth. Increasing numbers of people have been joining activities advertised online in the real world, and people with common interests often prefer to share their experiences and thoughts in private online groups. For example, as of 2020, Meetup had 49 million members, creating some 15,106 social events per day. As such networks continue to grow and thrive, the present work is motivated by the special feature of EBSNs, that is, their structure as combined heterogeneous social networks.
The tendency of people to come together and form groups or communities is inherent in the structure of society, and the processes by which such groups take shape and attract members have been a theme in a considerable body of social science research [3], [4], [5]. With widespread adoption of the Internet, new forms of organizing such as online communities have become prevalent on nearly every online social media platform. Among global Internet users, 76% have participated in an online community [6]. Although their functionality varies markedly, these platforms all provide facilities enabling multi-person social interactions. Businesses use online communities to facilitate peer-to-peer customer support [7], build brand loyalty [8] and foster knowledge sharing and collaboration among their employees [9], while individuals join online groups to exchange information, to interact with likeminded others, and to organize and participate in collaborative work or collective action [10], [11]. The development of Internet technologies has significantly transformed online communities into a new type of entity, with the potential to create considerable benefits by promoting user interactions and information diffusion.

1 www.facebook.com
2 twitter.com
3 www.flickr.com
4 www.youtube.com
5 www.slideshare.net
6 www.researchgate.net
7 www.meetup.com
8 www.eventbrite.com
The proliferation of online communities, despite benefiting both social media and users in many ways, has also created challenges for Internet users, who must find online communities to join from amid an overwhelming volume and variety of such communities. Over the last few years, the growth of online users has slowed down, whereas the number of Internet offerings (public websites and communities) has grown exponentially [12]. Social media users, therefore, generally must expend considerable effort to choose the most useful online communities for a given purpose. However, ordinary users often cannot express their preferences accurately when using vertical search engines to find groups, and may even prefer to simply be told directly which groups they should join [13]. Moreover, groups play a significant role in EBSNs, in which events are organized by a group, and users can join groups relevant to their interests. People within a given group may be considered to have only weak friendship ties with each other.
Consequently, helping users find suitable online groups is a promising area for the application of recommender systems, which has been of interest both to industry and to academic researchers. Studies have been conducted on recommendation in other applications, such as recommending friends [14], [15], events [16], driving routes [17], movies/music [18], [19], image tags [20], news [21], learning performance [22], etc. However, thus far group recommendation has been explored only sporadically [23], [24], especially for EBSNs.
Currently, group recommendation in EBSNs is typically based only on interest information filled in by the users, which is not very effective, or on the idea that groups joined by the friends of a user are also likely to be joined by this user. Both these methods may be considered quite naïve and do not reflect users' real intentions. Hence, in this study, we propose a new comprehensive group recommender system based on a random walk algorithm to address this problem. To the best of our knowledge, no prior research has been conducted on group recommendation based on a random walk algorithm in EBSNs, which are identified as SNs comprising both online and offline social interactions. In fact, our proposed recommender framework is not restricted to group recommendation. It can also be easily extended to other kinds of recommendations, such as friend recommendation and event recommendation [25], [26]. To demonstrate the effectiveness of the proposed approach, we focus on group recommendation, i.e., personalized recommendation of event-based groups to users.

VOLUME 11, 2023
The present work includes three main contributions to the relevant literature, as given below.
(1) We propose a unified group recommendation framework. Specifically, by constructing a heterogeneous, augmented graph, the proposed approach can incorporate all the necessary information into a single social graph to make better recommendations, which should be more suitable for EBSNs requiring integration of both online and offline social networks. Furthermore, new information can be easily added to boost the effectiveness of our proposed recommendation system with little change to the algorithm.
(2) We augment the random walk algorithm to adapt to the constructed heterogeneous augmented graph containing various types of nodes and links in a single graph. In prior works, random walk algorithms have been applied in homogeneous graphs containing only relationships among entities of a given type. Our approach allows the incorporation of more types of nodes into a single graph and makes the assignment of link weights more effective in a simple manner, because the proposed graph structure already captures the inherent indirect relationships among seemingly unrelated entities.
(3) We conducted extensive experiments on real datasets to compare our proposed algorithm to other commonly used recommendation algorithms. The results demonstrate the superiority of our novel group recommendation method based on a heterogeneous augmented graph structure and a random walk algorithm.
The remainder of this paper is structured as follows. We describe related works in Section II and present the heterogeneous augmented graph construction method and group recommendation method based on random walk in Section III. The dataset and evaluation metrics used are introduced in Section IV. We present experimental results verifying the efficacy of our proposed recommender system in Section V and conclude the work in Section VI.

II. RELATED WORK
The present work is closely related to three key areas of research, including communities on social network services (SNS), community recommender systems, and random walk algorithm-based recommendation.

A. COMMUNITIES/GROUPS ON SOCIAL NETWORK SERVICES
The term "community" denotes a collection of people who have joined an explicitly defined group within an organization that supports the formation of such groups and provides facilities for the members to exchange information with each other and engage in other joint activities. In some circumstances, other terms such as "group" are commonly used with the same meaning that we apply to "community" here. Thus, we will use the terms "community" and "group" interchangeably in this paper. Traditional social communities or groups provide many offline interactions for their members, including opportunities for affiliation or companionship, social support, information exchange capabilities, and support for collective action.
With the development of technical infrastructure, the Internet has created new opportunities and platforms (such as Flickr, Facebook, etc.) for people to interact. Likewise, social network services (SNS) provide basic online communication capabilities to support the development of interpersonal relationships, feelings of companionship, and perceptions of affiliation. Communities in SNSs also encourage a variety of online interactions such as discussion and knowledge sharing, access to information and dissemination of ideas, as well as other collective activities such as software development or political action.
Unlike these traditional social structures, which function either online or offline, EBSNs are distinct as an emergent type of social network identified by the co-existence of both online and offline social interactions. Reference [2] is the first work to comprehensively study properties of EBSNs, including time and location patterns, network properties, and several popular issues in social networks such as community detection and information flow. Since then, this new type of social structure has attracted increasing research interest. For example, Liu et al. explored the roles of event size and interactivity in social networking behaviors [2], while Zhang et al. exploited matrix factorization to model interactions on EBSNs by considering location features, social features, and implicit patterns simultaneously in a unified model [13]. Xu and Liu proposed a semantic-enhanced and context-aware hybrid collaborative filtering method for event recommendation [27]. However, prior research on this topic has only presented algorithms for certain problems, and further research remains necessary on EBSN-based community/group recommendation.

B. ONLINE GROUP RECOMMENDER SYSTEM
In this section, we provide an overview of three categories of group recommender systems, including content-based, collaborative-filtering-based, and hybrid recommender systems [13], [27], [28], [29].
Content-based recommender systems have been designed to use the content correlation between items and past preferences of active users for recommendation. For example, Spertus et al. presented an empirical comparison of community recommendations for a given user, based on their similarity to communities to which users actually belonged [30]. Bagher et al. proposed the concept of trends to capture the interests of users in selecting items among different groups of similar items [31].
Collaborative filtering is regarded as one of the most widely used and successfully developed recommendation approaches [32]. Collaborative filtering methods may be further divided into two typical classes, namely memory-based and model-based approaches [33]. Memory-based approaches mainly focus on finding similar users or items for recommendations. Two algorithms, namely the Pearson correlation coefficient (PCC) algorithm [34] and the vector space similarity (VSS) algorithm [35], have commonly been applied in both user-based and item-based approaches as similarity computation metrics. In contrast to the memory-based approaches, the model-based approaches to collaborative filtering fit prediction models based on training data and then apply them to predict users' preferences on items [36], [37]. Algorithms in this class include clustering methods [38], [39], aspect models [40] and Bayesian networks [36], [41]. Liu et al. presented a recommendation model fusing social relations and item contents with user ratings by modifying the model using a Bayesian Probabilistic Matrix Factorization algorithm [42]. He et al. proposed the use of neural networks to improve collaborative filtering [43]. Although collaborative filtering has been shown to be effective when users have expressed enough ratings to share common ratings with other users, it has tended to perform poorly for so-called cold-start users [44], [45], [46].
Hybrid recommender systems combine both content-based recommendation and collaborative filtering to make predictions [28], [47]. Generally, four classes of approaches have been used to construct hybrid recommender systems. The first class combines separate collaborative and content-based recommenders, either by combining their outputs linearly or by using an evaluation metric to dynamically choose the best recommender [48], [49]. The second class adds content-based characteristics to collaborative models, which can help address the cold-start user problem of collaborative filtering by employing user information in the prediction process [28]. The third class uses a collaborative content-based approach, incorporating collaborative factors into content-based models [50]. The fourth class unifies content-based and collaborative filtering [21], [51], [52]. For example, Reference [51] described a unified probabilistic model combining collaborative and content information in a coherent manner to yield better recommendations overall in terms of accuracy and cold-start items. Reference [21] proposes a personalised news recommendation framework named HYPNER that combines collaborative-filtering-based and content-based filtering methods. Reference [52] incorporates public contextual metadata and paper-citation relationship information into both content-based and collaborative filtering approaches separately to enhance the recommendation accuracy.
In contrast to the abovementioned prior research on community recommendation, event-based communities mainly consist of users, attributes, and offline events, all of which could influence users' decisions and thus need to be considered [13]. In this study, we propose a random walk-based recommendation model to accommodate the special nature of EBSN community recommendation, using both attributes and structure properties for recommendation, and applying the ideas of user-based and entity-based collaborative filtering approaches in a coherent manner.

C. RANDOM WALK-BASED ALGORITHM
The random walk was first introduced by Pearson [53] and has subsequently been used in many fields such as ecology, physics, and computer science. In general, a random walk is a mathematical formalization of a path formed by a succession of random steps. Compared with other algorithms [56], [57], random walk algorithms are commonly used on graphs [54], [55], where each edge is tagged with a specified probability of traveling via this edge in the random walk. Similarity and transitive associations between nodes can be easily computed based on random walks, which makes them effective in recommender systems [58], [59], [60], [61].
Bogers presented the ContextWalk algorithm, incorporating different types of contextual information in a traditional random walk algorithm [62]. This technique modeled the browsing process of a user on a video website by taking random walks over a contextual graph. In the most similar approach to that of the present work available in the relevant literature, Yin et al. proposed a unified framework for link recommendation using a random walk algorithm [63]. In their graph construction process, both users and attributes were treated as nodes, and a link between two nodes was defined to exist if the nodes represented persons who were friends or one node representing a person included an attribute denoted by the other node. Jiang et al. proposed a Bayesian personalized ranking (BPR)-based machine learning method called Hete-Learn to learn the weights of links in an information network. In order to model user preferences for personalized recommendation, a generalized random walk with restart model was proposed [60]. Manju et al. sought to solve the cold start problem in research paper recommendation by integrating social network interaction factors based on a random walk [64]. Zheng et al. proposed a personalized tag recommendation for social images based on convolution features and a weighted random walk [20]. In particular, for a given image, they selected its visual weights and determined the weight of each neighbor by mining the influence of user group metadata. Afterwards, the weighted random walk algorithm was implemented on a neighbor-tag bipartite graph.
Our work differs from prior works in several key aspects. First, most recommendation studies (e.g., those based on random walk algorithms) have been limited to a single graph including only one or two types of nodes. In contrast, our work is oriented to EBSNs, which possess richer information than other applications, including both online and offline social networks. Our proposed approach constructs a heterogeneous graph containing various kinds of nodes and edges. Prior works on traditional social networks are not readily applicable to EBSNs. Second, although friend/item recommendation problems for EBSNs have frequently been studied, group recommendation has not been studied previously. On some popular EBSNs, such as Meetup, groups are recommended based only on interests and common groups between friends within a group, which is naïve and not very effective, as shown in Section V.

III. GROUP RECOMMENDATION
In this section, we present a group recommendation framework based on a random walk algorithm. We first introduce the formal definition of our group recommendation problem and then elaborate on our recommendation framework, which is divided into two parts, including graph construction and the random walk algorithm. The data flow of our recommender system is illustrated in Fig. 1. In our proposed recommender system, we first construct a heterogeneous augmented graph capturing all the available information related to the recommendation, including event participation, group membership, and interest information. Based on the relationship graph, we apply the random walk algorithm to calculate recommendation scores (i.e., the link relevance) for every group and then select the top N groups to recommend to the user. A termination condition controls the number of iterations of the random walk algorithm.

A. PROBLEM DEFINITION
In this study, we propose a recommender system to recommend groups to users in EBSNs. In this subsection, we first give a formal definition of the problem of group recommendation.
The group recommendation problem is defined as the task of recommending some groups that are most likely to interest a given user based on their activity history and personal information. That is, we want to maximize the probability that the user will join the recommended groups in the future. If we can recommend suitable groups to users, users will benefit from increased convenience in finding relevant interest groups, while groups can accept more promising members.  Formally, given the groups' memberships, groups' interests, users' interests, events' organizers (describing which group created a given event) and events' participation records, we aim to recommend a list of groups that may interest the given user in order of the joining probability (i.e., recommendation score). That is, the recommended groups should be ordered in decreasing order of recommendation scores with regard to the given user and should not contain any group that the user has already joined.
In the following subsections, we introduce our proposed group recommendation framework. The group recommendation problem is modeled as link recommendation, which can be solved using a random walk algorithm on a graph constructed from social networks. Thus, our framework consists of two parts, including a graph construction process and a random walk algorithm.

B. GRAPH CONSTRUCTION
In this subsection, we describe the proposed approach to construct a heterogeneous augmented graph from given social network information having different types of nodes and edges, each of which indicates a different physical meaning. We first present the graph structure, and then describe the weight calculation of edges of various types.

1) GRAPH STRUCTURE
Based on the group memberships, group interests, user interests, event organizers and event participation records, a heterogeneous augmented graph denoted as G(V,E), where V is the node set and E is the edge set, is defined in Table 1 for node types and Table 2 for edge types.
It may be observed that there exist four different types of nodes and five types of edges in the constructed graph. We provide an example as follows. Let there exist two user nodes $u_1$ and $u_2$, three group nodes $g_1$, $g_2$ and $g_3$, two event nodes $e_1$ and $e_2$, and two interest nodes $i_1$ and $i_2$. Group $g_1$ has two members $u_1$ and $u_2$, one interest attribute $i_1$, and once held an event $e_1$, which was participated in by user $u_2$. Group $g_2$ has no members, an interest $i_2$, and organized an event $e_2$, which was participated in by user $u_1$. Group $g_3$ has only one member $u_2$. Additionally, user $u_1$ has interests $i_1$ and $i_2$, and user $u_2$ has interest $i_1$. According to the above description, the graph is constructed as shown in Fig. 2.
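The example above can be written down as a small adjacency structure. The following is a minimal Python sketch, not the authors' implementation: the function names `build_graph` and `node_type`, the string identifiers such as `"u1"`, and the convention that the first letter of an identifier encodes the node type are all assumptions made for illustration.

```python
from collections import defaultdict

# Relations taken from the worked example in the text.
memberships = [("u1", "g1"), ("u2", "g1"), ("u2", "g3")]      # user-group edges
group_interests = [("g1", "i1"), ("g2", "i2")]                # group-interest edges
organized = [("g1", "e1"), ("g2", "e2")]                      # group-event edges
participation = [("u2", "e1"), ("u1", "e2")]                  # user-event edges
user_interests = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]   # user-interest edges


def build_graph(*edge_lists):
    """Return {node: set of neighbours}; every relation is stored in both directions."""
    adj = defaultdict(set)
    for edges in edge_lists:
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def node_type(node):
    """Hypothetical convention: the first letter of the id gives the node type."""
    return {"u": "user", "g": "group", "e": "event", "i": "interest"}[node[0]]


graph = build_graph(memberships, group_interests, organized,
                    participation, user_interests)
```

All four node types and five edge types live in one structure, so adding a new relation type only requires passing one more edge list to `build_graph`.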
It should be noted that the graph thus constructed is not strictly an undirected graph, as an edge is assigned different weights in its two directions. That is, the weight w(a, b) may differ from the weight w(b, a). We elaborate on this point in subsection III-B2.
The graph G(V, E) is inherently a heterogeneous augmented graph as there are different types of nodes and different types of edges within a single graph. This graph captures all the information that EBSNs provide in a natural manner. We do not assume two members within a given group are friends, because, in fact, they typically only have a weak friendship with each other, and it is even possible that some members do not know of the existence of others. Instead, we retain the original indirect relationship between members within a given group, that is, they are considered to have relations because they have joined the same group. Similarly, the relationships among users who participated in the same event are expressed in terms of a participation relationship. The preservation of direct relationships enables a more natural expression of the complicated relationships in EBSNs and also enables a simple but effective weight assignment scheme, as detailed in the next subsection.

2) WEIGHT CALCULATION
Although the heterogeneous augmented graph constructed above is able to depict the structure of EBSNs in a natural and direct way, it remains insufficient to express the strength of the relationships contained in it. In other words, the importance degree of relationships (i.e., edges) has not yet been included in the graph. In this subsection, we introduce our approach to assigning importance (i.e., weight) to each relationship (i.e., edge). The weight assigned to an edge measures the importance of the end node to the start node and the strength of a relationship. We adopt a uniform weighting scheme, which can depict the weights of different edges, as our heterogeneous augmented graph is already able to express the complicated relationships and importance degree differences due to indirect relationships, as explained below after the introduction of our applied weighting scheme.
We describe the weight calculation for each type of edge in Table 3. It should be clear that w(a, b) denotes the weight for the edge (a, b), and g, u, i, e denote the group, user, interest, and event node, respectively, as shown in Table 1. As mentioned above, w(a, b) does not necessarily equal w(b, a).
As we can see, there are ten weight functions for five types of edges. Although the graph thus constructed resembles an undirected graph, the weight functions for the two directions of an edge are distinct and express different underlying meanings. Hence the graph may be considered similar to a directed graph. Moreover, constraints on the damping factors are included in the weight functions for normalization. From the perspective of the random walk process, the weight of an edge represents the probability of walking via this edge; thus the weights of all edges leaving a node should sum to 1. We set the constraints as given in Equations (1)-(4), so that the weights originating from any one node sum to 1. For example, for all the edges originating from a group node $g$,

$$\sum_{e \in N_e(g)} w(g, e) + \sum_{i \in N_i(g)} w(g, i) + \sum_{u \in N_u(g)} w(g, u) = 1.$$
Although defining a weight function for each type of edge may seem somewhat complicated and tedious, in this case it is quite straightforward. All functions are defined by a uniform weighting scheme in which each outgoing edge shares an equal proportion of weight, all of which sum to 1. Thus, the weight scheme is quite simple and easy to apply. Thanks to the informative structure of the heterogeneous graph constructed previously, the simple weight scheme is able to express the valuable relationships and importance degrees among every pair of entities (i.e., group, user, interest, and event). For example, a user $u_1$ who participated in quite a number of events organized by the group $g_1$ should be important to the group $g_1$, i.e., the user $u_1$ and the group $g_1$ should have a strong relationship ($w(u_1, g_1)$ should be large). Although we simply use a uniform weighting scheme, so that each user node in the group $g_1$ receives an equal weight, there exist other paths between the group node $g_1$ and the user $u_1$ via the event nodes attended by $u_1$, so that user node $u_1$ also gains ranking from group node $g_1$ indirectly via these event nodes. In other words, the edge weight can be thought of as the probability of walking via this edge, and the closeness between two nodes can be thought of as the probability of walking from one node to the other via multiple paths in the graph. For example, possible paths from a group node $g$ to a user node $u$ are:
• group node $g$ to user node $u$ directly via the group-to-user edge
• group node $g$ to user node $u$ indirectly via the group-to-interest edge and the interest-to-user edge
• group node $g$ to user node $u$ indirectly via the group-to-event edge and the event-to-user edge
We will elaborate on this idea in Section III-C.
Hence, the seemingly complex indirect relationship has been captured naturally in the underlying graph structure, and the uniform weighting scheme is able to capture the strength of the inherent relationship.
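To make the interplay between uniform weights and indirect paths concrete, here is a minimal Python sketch. It assumes the simplest reading of the uniform scheme, w(a, b) = 1 / (number of edges leaving a), and does not reproduce the damping factors of Table 3 or the constraints of Equations (1)-(4). The helper names `uniform_weights` and `path_mass` and the toy adjacency map are hypothetical.

```python
def uniform_weights(adj):
    """Give every edge leaving node a an equal share: w(a, b) = 1 / |neighbours(a)|."""
    return {(a, b): 1.0 / len(nbrs) for a, nbrs in adj.items() for b in nbrs}


def path_mass(weights, adj, src, dst):
    """Sum of the one-step and two-step walk probabilities from src to dst."""
    direct = weights.get((src, dst), 0.0)
    indirect = sum(weights[(src, mid)] * weights.get((mid, dst), 0.0)
                   for mid in adj[src] if mid != dst)
    return direct + indirect


# Toy graph: u1 and u2 are both members of g1 and share interest i1,
# but only u2 attended the event e1 organized by g1.
adj = {
    "g1": {"u1", "u2", "i1", "e1"},
    "u1": {"g1", "i1"},
    "u2": {"g1", "e1", "i1"},
    "i1": {"g1", "u1", "u2"},
    "e1": {"g1", "u2"},
}
w = uniform_weights(adj)
```

In this toy graph the direct weights `w[("g1", "u1")]` and `w[("g1", "u2")]` are equal, yet `path_mass(w, adj, "g1", "u2")` exceeds `path_mass(w, adj, "g1", "u1")` because of the extra group-to-event-to-user path, which is the effect described above.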

C. RANDOM WALK ON HETEROGENEOUS GRAPH
In this section, we present a random walk with restart algorithm based on the newly constructed heterogeneous graph to simulate group membership behavior over time. This algorithm assumes that if a random walk on the graph from a given user node, guided by the edge weights (which indicate the probability of moving from the start node to the end node), finally arrives at a group node, it is highly likely that the given user would be interested in joining the group in the future. The assumption is straightforward and captures some intrinsic features of group membership prediction. For example, if a group shares more attributes with a given user, or the user once participated in some events organized by the group, or the group has been joined by most of the friends 9 of this user, the probability that a walk proceeds from the user node to this group node is high, indicating that the group is considered promising for the user to join in the near future. Moreover, if the group node is close to the given user node, the random walk probability to that group node is also high, which is consistent with the fact that groups near the user in the graph are more likely to be joined by the user, as their relationship is constructed via few intermediaries. The convergent probabilities of the random walk starting from a given user node are taken as the link relevance between the user node and the respective nodes in the probability distribution, which is simply the recommendation probability discussed above. Here, we use the random walk with restart algorithm on the heterogeneous graph with user, event, interest and group nodes to calculate the link relevance for a particular user node $u^*$.

9 The "friends" here mean the co-members of the user. That is, if two users join the same group, they can be called "friends". Although this is not a definition of friendship, we use it here for clarity.
Indeed, the heterogeneous graph described above is homogenized by node coding and weight normalization in the proposed approach. Although there are different types of nodes and edges, we do not need to differentiate them in the random walk process. We can treat the heterogeneous graph as a normal homogeneous graph when we perform the random walk process. Thus, the conventional random walk with restart algorithm can be applied to our group recommender system with few modifications.
The random walk with restart function used to calculate the link relevance for a particular user $u^*$ is shown in Equation 5.

$$r_{a^+} = \sum_{a \in \{u, e, g, i\}} \sum_{a' \in N_a(a^+)} w(a', a^+) \times r_{a'} + \theta \times r^{(0)}_{a^+} \tag{5}$$

where $a^+$ is an arbitrary node in the graph; $r_{a^+}$ is the link relevance of $a^+$ with regard to the user node $u^*$, i.e., the probability of walking to the node $a^+$ from the user node $u^*$; $a \in \{u, e, g, i\}$, where $u$, $e$, $g$, $i$ denote user node, event node, group node, and interest node, respectively; $N_a(a^+)$ is the set of type-$a$ nodes connected to node $a^+$; $a'$ is a specific instance of the type-$a$ nodes; $w(a', a^+)$ is the weight from node $a'$ to node $a^+$; $r_{a'}$ is the link relevance of node $a'$; $r^{(0)}_{a^+}$ is the initial link relevance of $a^+$; and $\theta$ is the restart probability.

Algorithm 1 shows our proposed group recommendation algorithm based on random walk with restart. The input of the algorithm includes the EBSN graph $G$, the particular user $u^*$ for whom to recommend groups, the maximum number of iterations $i_{max}$, a threshold $t$ used to control the termination of the algorithm, and the number of groups $n$ to recommend. The output is an ordered list of recommended groups in non-increasing order of link relevance. In the pseudo-code for Algorithm 1, Lines 1-5 perform initialization. Lines 6-13 comprise an iterative process to update the link relevance of each node. The iteration stops when either the number of iterations reaches the maximum allowed by the parameter $i_{max}$ (Line 6), or the total difference in link relevance over all nodes between the previous iteration and the current iteration is less than the threshold $t$ (Lines 10-12). Line 14 sorts the group nodes in non-increasing order of link relevance, and Line 15 returns the first $n$ recommended groups.
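The iteration described above can be sketched in Python as follows. This is an illustrative sketch under several stated assumptions rather than the authors' implementation: it uses plain 1/out-degree weights instead of the Table 3 weight functions, it places all initial and restart mass on the target user (the usual random-walk-with-restart convention), and it scales the propagated term by (1 - theta) so that the scores remain a probability distribution. The function names and the `"g"` prefix test for group nodes are hypothetical.

```python
def random_walk_with_restart(adj, start, theta=0.15, i_max=200, t=1e-9):
    """Iterate r <- (1 - theta) * sum_a' w(a', a) * r[a'] + theta * r0[a]."""
    # Assumed uniform weights: each edge leaving src carries 1 / out-degree(src).
    weights = {(src, dst): 1.0 / len(nbrs)
               for src, nbrs in adj.items() for dst in nbrs}
    # Assumed restart vector: all mass on the target user.
    r0 = {a: 1.0 if a == start else 0.0 for a in adj}
    r = dict(r0)
    for _ in range(i_max):
        nxt = {a: theta * r0[a] for a in adj}
        for (src, dst), w in weights.items():
            nxt[dst] += (1.0 - theta) * w * r[src]
        diff = sum(abs(nxt[a] - r[a]) for a in adj)  # total change over all nodes
        r = nxt
        if diff < t:  # second termination condition
            break
    return r


def recommend_groups(adj, user, n, **kwargs):
    """Rank groups the user has not joined by link relevance and return the top n."""
    relevance = random_walk_with_restart(adj, user, **kwargs)
    candidates = [a for a in adj if a.startswith("g") and a not in adj[user]]
    return sorted(candidates, key=lambda g: relevance[g], reverse=True)[:n]


# Adjacency map of the example graph from Section III-B1.
ADJ = {
    "u1": {"g1", "e2", "i1", "i2"}, "u2": {"g1", "g3", "e1", "i1"},
    "g1": {"u1", "u2", "i1", "e1"}, "g2": {"i2", "e2"}, "g3": {"u2"},
    "e1": {"g1", "u2"}, "e2": {"g2", "u1"},
    "i1": {"g1", "u1", "u2"}, "i2": {"g2", "u1"},
}
```

For user `u1`, group `g2` is reachable through both a shared interest and an attended event, whereas `g3` is reachable only through the co-member `u2`, so `recommend_groups(ADJ, "u1", 2)` ranks `g2` ahead of `g3`.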

IV. DATA DESCRIPTIONS AND EXPERIMENTAL DESIGN
We conducted extensive experiments on real datasets to demonstrate the effectiveness of our proposed recommender system. In this section, we describe the dataset, metrics and baseline methods used to evaluate the performance of our proposed system, while the experimental results are provided in Section V.

A. DATASET
The datasets used were originally crawled from the Meetup EBSN, 10 provided by the authors of a prior work [2]. The data statistics are shown in Table 4, where RSVP represents event registration. In this dataset, the online social network is constructed by capturing the membership of online social groups (i.e., membership of a user in a group forms a user-group relationship) and the attributes of users and groups. An offline social network is generated in the same way based on the participation in social events and their organizers.

10 https://github.com/wuyuehit/meetup_dataset

Algorithm 1 Group Recommendation Algorithm
Input: EBSN $G = \langle V, E, W \rangle$, user $u^*$, maximum iteration times $i_{max}$, a threshold $t$, recommended number $n$
Output: An ordered list of groups recommended for $u^*$
1: if $a^+ = u^*$ then
2:   $r_{a^+} \leftarrow 0$
3: else
4:   $r_{a^+} \leftarrow 1$
5: end if
6: for $i \leftarrow 1$ to $i_{max}$ do
7:   for each $a^+ \in V$ do
8:     $r_{a^+} \leftarrow \sum_{a \in \{u, e, g, i\}} \sum_{a' \in N_a(a^+)} w(a', a^+) \times r_{a'} + \theta \times r^{(0)}_{a^+}$
9:   end for
10:   if $\mathrm{diff}(\{r_{a^+}\}_i, \{r_{a^+}\}_{i-1}) < t$ then
11:     break
12:   end if
13: end for
14: Sort the group node set in non-increasing order based on $r_{a^+}$
15: return the first $n$ groups in the set

B. EVALUATION CRITERIA
To evaluate the quality of our recommender system, we adopt four standard evaluation measures, namely mean reciprocal rank (MRR), precision, recall, and F-Measure, which have been commonly employed in prior recommender system research [65]. We compared our algorithm with the baseline methods in terms of these metrics. MRR is defined in Equation 6.

$$MRR = \frac{1}{|S|} \sum_{u \in S} \frac{1}{rank_u} \tag{6}$$

where $S$ is the set of sampled user nodes and $rank_u$ is the rank of the first correctly recommended group for user $u$.
Precision is defined as the number of correct groups recommended divided by the total number of groups recommended, and recall is defined as the ratio of the number of correct groups recommended to the number of all true groups that the user actually joins.
Formally, precision and recall are defined by Equation 7 and Equation 8.

$$P@K = \frac{1}{|S|} \sum_{u \in S} \frac{\text{\# of correct groups in top } K \text{ groups}}{K} \tag{7}$$

$$R@K = \frac{1}{|S|} \sum_{u \in S} \frac{\text{\# of correct groups in top } K \text{ groups}}{\text{\# of true groups in the ground truth}} \tag{8}$$

where P@K and R@K represent the precision and recall in the top K recommended groups. F-Measure is a combined metric of precision and recall, as defined by Equation 9.

$$F@K = \frac{2 \times P@K \times R@K}{P@K + R@K} \tag{9}$$
In the experiment, we performed 4-fold cross validation to evaluate the performance of the proposed method. In particular, we partitioned the membership dataset into four parts, three of which were used for prediction, and one of which was used for testing. To remove noisy data and guarantee the reliability of the experimental results, we randomly sampled 100 users who had joined at least 5 groups but not most of the groups. For each user, we recommended top-K groups.
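The four measures above can be computed as in the following sketch. It is illustrative only: the function names and the toy data are hypothetical, and the F-Measure is assumed to be the harmonic mean of the averaged P@K and R@K, which is the common definition but is an assumption here.

```python
def reciprocal_rank(ranked, truth):
    """1 / rank of the first correct group, or 0 if none is recommended."""
    for pos, group in enumerate(ranked, start=1):
        if group in truth:
            return 1.0 / pos
    return 0.0


def precision_at_k(ranked, truth, k):
    """Fraction of the top-k recommended groups that are correct."""
    return sum(1 for g in ranked[:k] if g in truth) / k


def recall_at_k(ranked, truth, k):
    """Fraction of the user's true groups found in the top-k list."""
    if not truth:
        return 0.0
    return sum(1 for g in ranked[:k] if g in truth) / len(truth)


def evaluate(recs, truths, k):
    """Average MRR, P@K, R@K over the sampled users, then derive F@K.

    recs:   dict user -> ranked list of recommended groups
    truths: dict user -> set of groups held out in the testing partition
    """
    users = list(recs)
    mrr = sum(reciprocal_rank(recs[u], truths[u]) for u in users) / len(users)
    p = sum(precision_at_k(recs[u], truths[u], k) for u in users) / len(users)
    r = sum(recall_at_k(recs[u], truths[u], k) for u in users) / len(users)
    # Assumption: F-Measure taken as the harmonic mean of averaged P@K and R@K.
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return {"MRR": mrr, "P@K": p, "R@K": r, "F@K": f}


# Hypothetical toy data for two sampled users.
recs = {"a": ["g1", "g2", "g3"], "b": ["g4", "g5", "g6"]}
truths = {"a": {"g2"}, "b": {"g4", "g6"}}
print(evaluate(recs, truths, k=2))
```

Computing the F-Measure per user and then averaging is an equally plausible reading; the two variants generally give different values.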

C. EXPERIMENTAL DESIGN

1) BENCHMARK MODELS
We compared our proposed approach with five baseline methods. In general, two important kinds of features need to be considered in the recommendation: semantic features (i.e., the interest) and relationship structure features. However, the baseline methods only consider one of these two features. The baseline methods used for comparison are listed as follows.
• Random recommendation (denoted by RecByRandom): Randomly choose some groups to recommend. The performance of this method is expected to be the worst.
• Interest-based recommendation (denoted by RecByInterest): This method is the most popular group recommendation method used by some EBSNs such as Meetup [66]. It only considers the interest similarity between groups and users to perform recommendation. That is, groups with similar interests will be recommended to users.
• Interest- and neighborhood-based recommendation (denoted by RecByInt&Com): This recommendation method is also frequently used by some EBSNs and social networks [14]. It considers two factors in the recommendation, i.e., interest similarity (same as RecByInterest) and collaborative friendship (i.e., group memberships in the context of EBSNs). In short, groups joined by a user's friends are taken as the recommendation candidates, which are then ranked by their common interests.
• Latent factor model with location features (denoted by PTARMIGAN) [13]: This recommendation method combines a latent factor model with location features for EBSNs. However, because the source code and some implementation details are unavailable, we directly use the published precision results, as we use the same dataset provided in [2].
• Katz Centrality [67]: This recommendation method is based on the graph structure (specifically, the paths in the social network graph), calculating the recommendation score of a group g with regard to a user u by using Equation 10.

$$\mathrm{score}(u, g) = \sum_{l=1}^{4} \tau^{l} \cdot \left| path^{\langle l \rangle}_{u,g} \right| \tag{10}$$

where $\tau$ is the damping factor and $path^{\langle l \rangle}_{u,g}$ is the set of all length-$l$ paths from node $u$ to node $g$. We consider the paths with lengths of no more than 4.
Our method is referred to as HeteroRandom.
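The truncated Katz score used as a baseline can be sketched as follows. This is a minimal illustration rather than the evaluated implementation: the function name `katz_scores`, the value of τ, and the toy adjacency list are hypothetical, and the sketch counts walks (which may revisit nodes), as in the usual adjacency-matrix formulation of the Katz measure.

```python
from collections import defaultdict


def katz_scores(adj, source, tau=0.5, max_len=4):
    """Sum tau**l * (number of length-l walks from source) for l = 1..max_len.

    adj: dict mapping each node to an iterable of its neighbors.
    Returns a dict node -> truncated Katz score with regard to source.
    """
    scores = defaultdict(float)
    counts = {source: 1}  # exactly one walk of length 0: stay at the source
    for length in range(1, max_len + 1):
        next_counts = defaultdict(int)
        for node, c in counts.items():
            for nb in adj.get(node, ()):
                next_counts[nb] += c  # extend every walk by one edge
        for node, c in next_counts.items():
            scores[node] += (tau ** length) * c
        counts = next_counts
    return scores


# Hypothetical chain: u1 - g1 - u2 - g2.
adj = {"u1": ["g1"], "g1": ["u1", "u2"], "u2": ["g1", "g2"], "g2": ["u2"]}
scores = katz_scores(adj, "u1")
print(scores["g1"], scores["g2"])  # g1 scores higher: it is reachable in one step
```

Groups would then be ranked for each user by this score, with shorter connections dominating because of the factor τ^l.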

2) EXPERIMENTAL SETTINGS
To obtain good recommendation results, the parameters in the weighting and link relevance functions must be tuned, as they affect the effectiveness of the recommendation. There are seven parameters to be tuned, i.e., α, α', β, γ, γ', δ in Equations (1)-(4) and the restart probability θ in Equation (5); the remaining parameters (i.e., α'', β', γ'', δ') can be derived easily according to Equations (1)-(4). We manually tune these parameters in the following steps.
1) Initialize all parameters to an average value, e.g., α = α' = 0.33, β = 0.5, γ = γ' = 0.33, δ = 0.5, θ = 0.5.
2) For each parameter, either increase or decrease it, with the other parameters fixed at their current best values, until the metric becomes worse; this yields the local best value for this parameter. In the tuning experiment, we use the precision metric, and we observe a monotonic relationship between each parameter and the metric. That is, if decreasing the parameter makes the metric better, increasing it will not. Hence the exploration only takes place in one direction.
3) Repeat Step 2) for a fixed number of rounds (10 in our experiments).
Through intensive manual tuning as described above, we obtained the best parameter settings, shown in Table 5.
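The tuning steps above amount to a coordinate-wise search, which can be sketched as follows. This is a hedged illustration only: the function `tune`, the step size, the parameter bounds, and the toy objective are hypothetical, and the sketch ignores any constraints that tie the parameters together in Equations (1)-(4).

```python
def tune(params, evaluate, step=0.05, rounds=10, lo=0.0, hi=1.0):
    """Coordinate-wise search: move one parameter at a time while the metric improves.

    params:   dict name -> initial value
    evaluate: callable taking a params dict and returning the metric (higher is better)
    """
    best = dict(params)
    best_score = evaluate(best)
    names = list(params)
    for _ in range(rounds):
        for name in names:
            for direction in (+1, -1):
                improved = False
                while True:
                    trial = dict(best)
                    trial[name] = round(best[name] + direction * step, 10)
                    if not lo <= trial[name] <= hi:
                        break
                    score = evaluate(trial)
                    if score <= best_score:
                        break  # metric stopped improving in this direction
                    best, best_score = trial, score
                    improved = True
                if improved:
                    break  # only one direction can help, so skip the other
    return best, best_score


# Hypothetical stand-in for "precision as a function of the parameters".
def toy_metric(p):
    return -(p["theta"] - 0.7) ** 2 - (p["beta"] - 0.2) ** 2


best, score = tune({"theta": 0.5, "beta": 0.5}, toy_metric, step=0.1)
print(best)
```

In practice `evaluate` would run the recommender with the trial parameters and return the precision on the held-out partition, which makes each step expensive and motivates the fixed number of rounds.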

V. EXPERIMENTAL ANALYSIS
In this section, we evaluate our proposed recommender method and compare it with the baseline methods to demonstrate its effectiveness. We compared our proposed method with the baseline methods in terms of the evaluation metrics introduced in Section IV-B.

A. EXPERIMENTAL RESULTS
In this subsection, we present the results of the baseline methods and of our random walk algorithm in terms of MRR, precision, recall and F-Measure.
The results in terms of the MRR metric are listed in Table 6, from which it may be observed that our method had the highest MRR value of 0.511, indicating that, on average, the first correctly recommended group appeared within the top two positions of the recommendation list. In comparison, the other three methods achieved only very poor MRR values. Katz Centrality achieved a relatively high MRR (0.345), which is mostly attributed to our informative graph structure.
Precision results of the compared methods are shown in Fig. 4 and Table 7, from which it may be observed that our method outperformed the other methods for all P@K metrics. Katz Centrality achieved relatively good precision, which, however, was primarily due to our proposed graph construction method. The latent factor model with location features achieved a relatively high precision of between 10% and 20%, which is still far lower than that of our method. The other three methods, i.e., random recommendation, interest-based recommendation and interest- and neighborhood-based recommendation, achieved nearly the same precision values, which were significantly below 10%. Our method achieved 42% precision in the top 1 group and over 30% precision in the top 5 groups. As K increased, the precision of all methods decreased because most users subsequently joined fewer than 20 groups (as simulated by the testing partition).
We present the recall results for the different methods in Fig. 5 and Table 8, from which it may be observed that our method, denoted as HeteroRandom, achieved significantly better recall than the other methods. Notably, our method attained nearly 50% recall at R@100. The RecByInt&Com method also achieved higher recall than RecByInterest, Katz Centrality and Random when K exceeded 100, reaching nearly 20% at R@500. F-Measure results are shown in Fig. 6 and Table 9. It may be observed that our proposed method delivered the best performance in terms of F-Measure. Moreover, we found that in order to achieve a relatively good F-Measure, the number of recommended groups should be between 5 and 50, which indicates how many groups should be recommended to users in EBSNs.

B. FURTHER DISCUSSION
From the experimental results, it is observed that our proposed approach performs significantly better than all the baseline methods in terms of the four measures, i.e., MRR, precision, recall and F-Measure. It is interesting that Katz Centrality also achieves good performance in terms of MRR and precision. We attribute this to the heterogeneous graph structure we designed, which is informative and captures a variety of relationships in EBSNs. The randomness of our approach gives it more generalizability, and the ''restart'' probability avoids local optima. The lack of these properties may explain the poor recall of Katz Centrality. Other approaches (i.e., RecByInt&Com, RecByInterest) that only consider interest or friend relationships lose the opportunity to utilize other useful information, e.g., the link relevance.

VI. CONCLUSION
In this paper, we have proposed a group recommender system for EBSNs based on a novel heterogeneous augmented graph construction method and a random walk algorithm. The heterogeneous augmented graph constructed in the proposed method is able to capture all relationships and valuable information in EBSNs, while the structure of the heterogeneous augmented graph naturally expresses the intrinsic complex relationships and their strengths, even with a simple uniform weighting scheme. To utilize the heterogeneous augmented graph structure and weight information for group recommendation, we adopted a random walk with restart algorithm on the constructed graph, homogenized by node coding and weight normalization. The results of extensive experiments show that our recommender system achieved better recommendation results in terms of MRR, recall, precision and F-Measure, outperforming other methods by a wide margin. Furthermore, our group recommendation framework is generic and can be extended to other recommendation applications such as event recommendation and friend recommendation with relatively minor modifications.
Despite its good performance, the proposed group recommender system has room for improvement. First, the parameter tuning is still tedious. In the near future, parameter tuning may be conducted using machine learning to obtain the optimal values automatically. Second, graph mining of large-scale graphs such as those formed by social media networks is time-consuming, especially for iterative algorithms such as random walk. Future research may further optimize the proposed group recommendation framework through parallel execution.