Representative News Generation using Automatic Clustering in Big Data Environment | IEEE Conference Publication | IEEE Xplore

Representative News Generation using Automatic Clustering in Big Data Environment


Abstract:

There are 43,000 online media outlets in Indonesia, each publishing at least one to two news articles every hour. This amount of information exceeds human processing capacity and has several impacts on readers, such as confusion and psychological stress. In this research we propose a new system that processes incremental news data and provides a mechanism for determining representative news by applying an Automatic Clustering algorithm. The system consists of 4 main functions: (1) Data Acquisition and Preprocessing, (2) Keyword Feature Extraction, (3) Data Aggregation and Automatic Clustering, and (4) Incremental Clustering. News articles that carry the same information are grouped together using an information-retrieval approach. The system runs in a big data environment so that it can process large amounts of data. Over a whole day, the system collected 3,000 news articles in its database. The collected articles were processed with Automatic Clustering and automatically grouped into 389 clusters. One cluster is identified as the unknown cluster, and the clusters are evaluated with single-member clusters excluded. In the experimental study, the system achieved a performance of 93.51%.
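The abstract does not give the details of the Automatic Clustering or Incremental Clustering steps, so the following is only a minimal sketch of the general idea of assigning incoming articles to clusters by keyword overlap. It swaps in a simple single-pass, threshold-based scheme with Jaccard similarity, which is not necessarily the authors' algorithm; the class name `IncrementalClusterer`, the `threshold` value, and the sample keywords are all hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two keyword sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class IncrementalClusterer:
    """Single-pass clustering over keyword sets (illustrative stand-in only)."""

    def __init__(self, threshold: float = 0.3) -> None:
        # Assumed similarity cut-off; the source does not specify one.
        self.threshold = threshold
        self.clusters: List[Dict] = []

    def add(self, doc_id: str, keywords: Set[str]) -> int:
        """Assign one incoming article to a cluster and return its index."""
        best_idx, best_sim = -1, 0.0
        for idx, cluster in enumerate(self.clusters):
            sim = jaccard(keywords, cluster["keywords"])
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx >= 0 and best_sim >= self.threshold:
            # Similar enough: join the existing cluster and widen its keywords.
            cluster = self.clusters[best_idx]
            cluster["members"].append(doc_id)
            cluster["keywords"] |= keywords
            return best_idx
        # Otherwise the article starts a new cluster of its own.
        self.clusters.append({"keywords": set(keywords), "members": [doc_id]})
        return len(self.clusters) - 1

    def multi_member_clusters(self) -> List[Dict]:
        """Clusters with more than one member (single-member ones excluded)."""
        return [c for c in self.clusters if len(c["members"]) > 1]
```

Because each article is compared only against the clusters that already exist, new articles can be added as they arrive without re-clustering the earlier ones. The number of clusters is not fixed in advance; it follows from the data and the chosen threshold.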
Date of Conference: 27-28 September 2019
Date Added to IEEE Xplore: 18 November 2019
Conference Location: Surabaya, Indonesia

I. Introduction

Online media is the third generation of mass media, after print media (first generation) and electronic media (second generation) [1]. The first generation comprises newspapers, tabloids, magazines, and books; the second comprises radio, television, and movies or videos. The penetration of the Internet and related technology plays an important role in the transformation of mass media. Since that transformation, information spreads quickly and easily: journalists report incidents directly, and news content can be accessed anywhere and at any time. Establishing a media outlet has also become easy, requiring only a website or a mobile app. As a result, the number of online media outlets has grown significantly. Based on data from the Minister of Communication and Information [2], there were 43,000 Indonesian online media outlets in 2018. Meanwhile, a single online media outlet can produce one to two news articles every hour [3]. The volume of news will keep growing along with the number of online media outlets, so the total amount of news already produced is very large.
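To give a rough sense of scale, the two figures quoted above can be combined into a back-of-the-envelope daily estimate. The sketch below assumes, purely for illustration, that every one of the 43,000 outlets publishes at the stated rate around the clock, so the result is a theoretical upper range and not a measured count; the function name `daily_article_range` is hypothetical.

```python
from typing import Tuple


def daily_article_range(outlets: int, per_hour_low: int, per_hour_high: int,
                        hours_per_day: int = 24) -> Tuple[int, int]:
    """Low and high estimates of articles per day across all outlets."""
    low = outlets * per_hour_low * hours_per_day
    high = outlets * per_hour_high * hours_per_day
    return low, high


# 43,000 outlets at one to two articles per hour, assumed for all 24 hours.
low, high = daily_article_range(43_000, 1, 2)
print(f"{low:,} to {high:,} articles per day")
```

Under these assumptions the estimate comes to roughly one to two million articles per day.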
