I. Introduction
Tags have recently become increasingly important in online social communities such as Flickr, Zooomr, and YouTube. In Flickr, for example, users can upload personal photos and describe them with keywords; these keywords are called tags. According to the study of tagging motivation in [1], people tag so that their photos are easy to retrieve and easy for other users with similar hobbies to find; at the same time, they want to search easily for photos that interest them and to find users who share their hobbies. However, not all the tags that users assign are related to the photos. Users also annotate their photos with tags that carry little meaning for others: sometimes they use years, months, and people's names; sometimes they use tags they have made up themselves (such as “HelloMyLove”); sometimes they use private notations that nobody else can understand (such as “####”) [2]; and sometimes they make spelling mistakes. As a result, most of the tags are irrelevant and noisy.
Research on finding high-quality tags [6], tag suggestion, and tag recommendation for items has therefore become popular. The authors of [6] define a tag as high quality if it helps the community understand an important aspect of an item. The methods in [3], [6], [8], [9], [12] find high-quality tags for websites, movies, photos, and documents. Most of them use only textual features of tags, such as tag co-occurrence frequency. In this paper, by contrast, we extract social features of tags from users' social activities. These social features let us extract useful information about users' activities. We conduct our study on Flickr users and tags. In Flickr, users can not only upload photos but also mark other users' photos as favorites. From the photos a user has marked as favorites, we can infer that user's favorite topics.
Therefore, based on Flickr users and their activities, we extract social features of tags and use machine learning to extract representative tags for users, that is, tags that are strongly related to the users' favorite topics. We also use traditional textual features for comparison and to identify the most useful features. Furthermore, using the extracted representative tags, we can suggest other users who have similar hobbies, as well as interest groups.
The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 introduces the features of tags. Section 4 describes the algorithm and the training model. Section 5 presents the experimental results. Finally, Section 6 summarizes the conclusions of the paper.