Social Media Data Analysis Using MapReduce Programming Model and Training a Tweet Classifier Using Apache Mahout | IEEE Conference Publication | IEEE Xplore


Abstract:

Twitter, a micro-blogging service, generates a large amount of data every minute because it gives people a chance to express their thoughts and feelings quickly and clearly on any topic. Extracting the desired information from this big data requires high-performance parallel computing tools supported by machine learning algorithms. Emerging big data processing frameworks (e.g., Hadoop) can handle such data effectively. In this paper, we first introduce a novel approach that uses the MapReduce algorithm to automatically classify Twitter data obtained from the British Geological Survey (BGS) by tweet post date and by the country in which each tweet was posted. The data were collected using specific keywords such as landslide, landslides, mudslide, landfall, landslip, and soil sliding. We then propose a model that determines whether tweets are landslide-related, using the Naïve Bayes machine learning algorithm with an n-gram language model on Mahout. This paper also describes an algorithm for the pre-processing steps that make the semi-structured Twitter text data ready for classification. The proposed methods allow the BGS and other interested parties to see the names and number of countries from which tweets are sent, the number of tweets sent from each country, the dates and time intervals of the tweets, and whether the tweets are related to landslides.
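
As a rough illustration of the MapReduce-style grouping of tweets by country and post date described in the abstract, the following single-process Python sketch mimics the map, shuffle, and reduce phases. It is not the authors' Hadoop implementation: the record fields `country` and `created_at`, the ISO-style timestamp format, and all function names are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import chain


def map_tweet(tweet):
    """Map step: emit ((country, date), 1) for one tweet record.

    The 'country' and 'created_at' fields are hypothetical; the abstract
    does not specify the schema of the BGS data.
    """
    date = tweet["created_at"][:10]  # assumes an ISO-like 'YYYY-MM-DD...' timestamp
    yield (tweet["country"], date), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one (country, date) key."""
    return key, sum(values)


def count_tweets(tweets):
    """Run map, shuffle, and reduce in sequence within a single process."""
    mapped = chain.from_iterable(map_tweet(t) for t in tweets)
    return dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())


# Made-up sample records, for illustration only.
sample = [
    {"country": "GB", "created_at": "2016-01-05T10:15:00", "text": "Landslide closes road"},
    {"country": "GB", "created_at": "2016-01-05T18:40:00", "text": "Another landslip reported"},
    {"country": "IN", "created_at": "2016-01-06T07:02:00", "text": "Mudslide after heavy rain"},
]

counts = count_tweets(sample)
for (country, date), n in sorted(counts.items()):
    print(country, date, n)
```

On a real cluster the map and reduce functions would run in parallel across nodes and the framework would perform the shuffle; the sketch only shows how the key choice `(country, date)` yields per-country, per-date tweet counts.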
Date of Conference: 18-21 November 2018
Date Added to IEEE Xplore: 09 December 2018
Conference Location: Paris, France
