General-Purpose vs. Specialized Data Analytics Systems: A Game of ML & SQL Thrones | IEEE Conference Publication | IEEE Xplore


Abstract:

Over the past decade, a plethora of systems have emerged to support data analytics in various domains such as SQL and machine learning, among others. In each of the data analysis domains, there are now many different specialized systems that leverage domain-specific optimizations to efficiently execute their workloads. An alternative approach is to build a general-purpose data analytics system that uses a common execution engine and programming model to support workloads in different domains. In this work, we choose representative systems of each class (Spark, TensorFlow, Presto and Hive) and benchmark their performance on a wide variety of machine learning and SQL workloads. We perform an extensive comparative analysis on the strengths and limitations of each system and highlight major areas for improvement for all systems. We believe that the major insights gained from this study will be useful for developers to improve the performance of these systems.
Date of Conference: 09-12 December 2019
Date Added to IEEE Xplore: 24 February 2020
Conference Location: Los Angeles, CA, USA
