Abstract:
Recently, many data analytics systems have focused on adopting serverless, a newly emerging compute resource that offers the scalability and agility to handle peak workloads in a timely and cost-efficient manner; this approach is known as serverless data analytics (SDA). Unfortunately, these systems may encounter a cost bottleneck because they ignore the per-unit-time cost of serverless, which is up to 5.8 times more expensive than a traditional compute resource such as a virtual machine (VM) for the same compute capacity. In addition, SDA may encounter a performance bottleneck because serverless performs worse than a VM. In this paper, we first study and report when serverless is beneficial for data analytics. Then, we present Cocoa, a scalable, compute cost-aware data analytics system that exploits serverless and VMs together to achieve composite benefits. A Cocoa prototype was implemented on Spark. Evaluation results show that exploiting heterogeneous compute resources together opens a richer cost-performance tradeoff space, and they identify substantial opportunities for future research on serverless-enabled systems.
Date of Conference: 04-08 October 2021
Date Added to IEEE Xplore: 22 November 2021