Abstract:
The growing popularity of micro-blogging sites such as Twitter, which let users exchange short messages (tweets), is an impetus for data analytics tasks with varied purposes, ranging from business intelligence to national security. A large number of users rely on Twitter for event updates and sentiment expression. Because tweets are generally unstructured and do not follow grammatical rules, parsing techniques tend to perform poorly on them, owing to incorrect part-of-speech assignment to individual words. In this paper, we propose an n-gram based statistical approach that identifies significant terms and uses them for vector-space modelling of the tweets. We then propose a social graph generation method that treats tweets as nodes and the degree of similarity between a pair of tweets as the weight of the edge between them. The social graph is decomposed into clusters using the Markov Clustering technique, wherein each cluster corresponds to a particular event. The experiment is carried out on a corpus of 3100 tweets related to the Israel-Gaza conflict, the Delhi assembly election, and the Union Budget 2015. The experimental results are encouraging and indicate the efficacy of the proposed social graph generation and event classification methods.
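The pipeline described above (n-gram vectors, a weighted similarity graph, Markov Clustering) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how significant terms are selected, so raw unigram and bigram counts stand in for them here. The function names, the cosine similarity measure, the `threshold` and `inflation` values, and the sample tweets are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical sample tweets, two per event (not from the paper's corpus).
TWEETS = [
    "gaza ceasefire talks resume",
    "ceasefire talks resume in gaza",
    "delhi election results announced",
    "delhi election results today",
    "union budget 2015 highlights",
    "union budget 2015 tax changes",
]

def ngrams(text, n_max=2):
    """Count all word n-grams up to length n_max (assumed stand-in for 'significant terms')."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def similarity_graph(tweets, threshold=0.1):
    """Adjacency matrix: tweets are nodes, similarity is the edge weight."""
    vecs = [ngrams(t) for t in tweets]
    n = len(vecs)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(vecs[i], vecs[j])
            if s >= threshold:  # assumed cut-off to drop weak edges
                W[i][j] = W[j][i] = s
    return W

def normalize_columns(M):
    """Rescale each column to sum to 1 (column-stochastic matrix)."""
    n = len(M)
    sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
    return [[M[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def markov_cluster(W, inflation=2.0, iterations=30):
    """Basic Markov Clustering: alternate expansion and inflation."""
    n = len(W)
    # Add self-loops, then make the matrix column-stochastic.
    M = normalize_columns([[W[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        # Expansion: square the matrix (two-step random walks).
        M = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power, then renormalize columns.
        M = normalize_columns([[v ** inflation for v in row] for row in M])
    # Each non-empty row lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if M[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

if __name__ == "__main__":
    print(markov_cluster(similarity_graph(TWEETS)))
    # [[0, 1], [2, 3], [4, 5]]
```

On this toy input each cluster groups the two tweets about the same event. On real data the outcome depends strongly on the term selection, the similarity threshold, and the inflation parameter, none of which are given in the abstract.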
Published in: 2015 Second International Conference on Soft Computing and Machine Intelligence (ISCMI)
Date of Conference: 23-24 November 2015
Date Added to IEEE Xplore: 25 February 2016
Electronic ISBN: 978-1-4673-9819-0