Abstract:
Tail latency has been the defining performance metric for interactive services since the inception of cloud computing. Although various hardware and software techniques have been employed to improve tail latency for these applications, recent trends across the cloud system stack require revisiting them. Over the past few years, cloud hardware has become increasingly heterogeneous, and cloud software has been dominated by event-driven modular programming frameworks and the proliferation of artificial intelligence. To guarantee tail latency in this new landscape, several system advances are required. In this paper, we review what tail latency means for cloud services, the key innovations that improved it in the past, the trends that require revisiting those innovations, and the advances that will be required to meet tail latency constraints in the next generation of warehouse-scale computers.
Published in: IEEE Micro (Volume: 44, Issue: 5, Sept.-Oct. 2024)