Performance Comparison of Apache Hadoop and Apache Spark for COVID-19 data sets | IEEE Conference Publication | IEEE Xplore

Abstract:

Today the world is fighting the global COVID-19 pandemic, which has caused more than 5 million deaths and severely damaged the world economy. The global spread of COVID-19 has triggered innovative research in distributed computing using big data management tools. Big data analytics tools are used to better understand how the virus spreads, to detect and track COVID-19 symptoms, to estimate risk factors, symptoms, diagnostics, and other vital information, and to control the spread. This paper presents a review of big data solutions that have been adopted to solve research issues in healthcare by performing distributed computing on massive datasets. In the proposed work, Apache Hadoop with the MapReduce framework and Apache Spark are used to perform analytics on COVID-19 datasets in a parallel and distributed manner. Both frameworks have configuration parameters that can be modified to improve job performance and efficiency. This paper compares the performance of the two major big data platforms, Hadoop and Spark. The execution time and throughput of both frameworks are analyzed for different input data sizes. The results show that both platforms can effectively process huge amounts of data using parallel and distributed computing, and that performance depends on the input data size and the configuration parameters. The results also show that Spark has a significantly faster computation time than Hadoop for smaller data sets.
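
The two metrics named in the abstract, execution time and throughput, can be illustrated with a minimal sketch. The abstract does not say how the authors define or measure throughput, so the definition used here (input size in MB divided by wall-clock seconds) is an assumption, and the names `throughput_mb_per_s` and `time_job` are hypothetical helpers, not part of Hadoop or Spark.

```python
import time


def throughput_mb_per_s(input_size_mb, elapsed_s):
    """Throughput under the assumed definition: data processed per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return input_size_mb / elapsed_s


def time_job(job, input_size_mb):
    """Run job() once and return (elapsed seconds, throughput in MB/s).

    `job` is any zero-argument callable; in a real comparison it would
    submit a Hadoop or Spark job and block until the job finishes.
    """
    start = time.perf_counter()
    job()
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return elapsed, throughput_mb_per_s(input_size_mb, elapsed)


if __name__ == "__main__":
    # Stand-in workload (hypothetical): sum a range instead of a cluster job.
    elapsed, mbps = time_job(lambda: sum(range(1_000_000)), input_size_mb=8)
    print(f"elapsed={elapsed:.4f}s throughput={mbps:.1f} MB/s")
```

Repeating such a measurement across several input sizes and configuration settings would give the kind of execution-time and throughput comparison the abstract describes, though the paper's actual procedure may differ.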
Date of Conference: 20-22 January 2022
Date Added to IEEE Xplore: 25 February 2022
Conference Location: Tirunelveli, India