Abstract:
In data-centers, the composite origin and bursty nature of traffic, the small bandwidth-delay product, and the tiny switch buffers lead to unusual congestion patterns that are not handled well by traditional end-to-end congestion control mechanisms such as those deployed in TCP. Existing works address the problem by modifying TCP to adapt it to the idiosyncrasies of data-centers. While this is feasible in private environments, it remains almost impossible to achieve in practice in public multi-tenant clouds, where a multitude of operating systems, and thus congestion control protocols, co-exist. In this work, we design a simple switch-based active queue management scheme to deal with such congestion issues adequately. Our approach requires no modification to TCP, which enables seamless deployment in public data-centers via switch firmware updates. We present a simple analysis to show the stability and effectiveness of our approach, then discuss real implementations in software and in hardware on the NetFPGA platform. Numerical results from ns-2 simulation and experimental results from a small testbed cluster demonstrate the effectiveness of our approach in achieving high overall throughput, good fairness, smaller flow completion times (FCT) for short-lived flows, and a significant reduction in the tail of the FCT distribution.
Published in: IEEE Transactions on Cloud Computing (Volume: 12, Issue: 1, Jan.-March 2024)