Automatically building social networks requires extracting pairwise relations between individuals. This paper proposes supervised learning of social networks from a set of documents. Given a small subset of known relations between individuals, the problem of learning a social network is recast as a text classification problem. The relation between each pair of individuals is represented by a vector of words produced by merging all documents associated with the two individuals. The known relation serves as the label for the relation vector, and the merged documents together with their labels are used as training data. With this transformation, a text classifier such as an SVM can be used to learn the unknown relations. We show that there is a link between the intrinsic sparsity of social networks and the class distribution imbalance of the training data. To rebalance the training data, a multiple resampling method, which combines undersampling of the majority class and oversampling of the minority class, is employed. The proposed framework is applied to a friend of a friend (FOAF) data set and evaluated by the macro-averaged F-measure.
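The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pair_vectors`, `rebalance`, `macro_f1`), the reading of "merging" as the union of both individuals' documents, the whitespace tokenization, and the choice of the midpoint between class sizes as the resampling target are all assumptions made for illustration. The sketch also performs a single resampling pass, whereas the paper refers to a multiple resampling method, and it omits the classifier itself (the abstract suggests an SVM).

```python
import random
from collections import Counter
from itertools import combinations


def pair_vectors(docs_by_person):
    """Build one bag-of-words Counter per pair of individuals.

    Assumption: "merging" means concatenating every document of both people.
    """
    vectors = {}
    for a, b in combinations(sorted(docs_by_person), 2):
        merged = " ".join(docs_by_person[a] + docs_by_person[b])
        vectors[(a, b)] = Counter(merged.lower().split())
    return vectors


def rebalance(samples, labels, seed=0):
    """Undersample the larger class and oversample the smaller class.

    Assumption: every class is resampled to the midpoint between the
    smallest and largest class sizes.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    sizes = [len(items) for items in by_class.values()]
    target = (min(sizes) + max(sizes)) // 2

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Undersample: draw without replacement.
            chosen = rng.sample(items, target)
        else:
            # Oversample: keep all items, then draw extras with replacement.
            chosen = items + rng.choices(items, k=target - len(items))
        out_samples.extend(chosen)
        out_labels.extend([label] * target)
    return out_samples, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)
```

Because a sparse network has far more unrelated pairs than related ones, the "no relation" label dominates the pair vectors; `rebalance` is where that imbalance would be addressed before training, and the macro average in `macro_f1` gives the rare "relation" class the same weight as the common one.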
Date of Conference: 7-10 Oct. 2007