Flow Control for Latency-Critical RPCs

In modern datacenters, the time a request spends waiting in a server’s queue is a major contributor to the end-to-end tail latency of μs-scale remote procedure calls. In traditional TCP, congestion control handles in-network congestion, while flow control was designed to avoid memory overruns in streaming scenarios. Flow control is therefore oblivious to the load on the server when it processes short requests from multiple clients at very high rates. Acknowledging flow control as the mechanism that controls queuing on the end-host, we propose a different flow control mechanism that depends on application-specific service-level objectives and controls the waiting time in the receiver’s queue by adjusting the incoming load accordingly. We design this latency-aware flow control mechanism as part of TCP, maintaining a wire-compatible header format and introducing no extra messages. We implement a proof-of-concept userspace TCP stack on top of DPDK and show that the new flow control mechanism prevents applications from violating service-level objectives in a single-server environment by throttling incoming requests. We demonstrate the true benefit of the approach in a replicated, multi-server scenario, where independent clients leverage the flow-control signal to avoid directing requests to overloaded servers.
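To make the idea concrete, here is a minimal sketch of how a receiver might turn a latency objective into an admission window, and how a client might use such windows to choose among replicas. It is an illustration under stated assumptions, not the mechanism implemented in the paper: the function names, the per-request window unit, the mean-service-time estimate, and the "largest window wins" policy are all hypothetical.

```python
def advertised_window(slo_us, mean_service_us, queued, workers=1):
    """Hypothetical receiver-side rule: how many more requests can be
    admitted while the expected queueing delay stays within the SLO.

    Assumes a request with `queued` requests ahead of it waits roughly
    queued * mean_service_us / workers microseconds.
    """
    if mean_service_us <= 0 or workers <= 0:
        raise ValueError("mean_service_us and workers must be positive")
    # Largest queue depth whose expected wait does not exceed the SLO.
    budget = int(slo_us * workers // mean_service_us)
    # Never advertise a negative window; zero tells clients to back off.
    return max(0, budget - queued)


def pick_replica(windows):
    """Hypothetical client-side rule: send to the replica that last
    advertised the most headroom, skipping replicas with a zero window.

    `windows` maps a server name to its most recently advertised window.
    Returns None when every replica is currently closed.
    """
    open_replicas = {s: w for s, w in windows.items() if w > 0}
    if not open_replicas:
        return None
    return max(open_replicas, key=open_replicas.get)


if __name__ == "__main__":
    # 100 us SLO, 10 us mean service time, 3 requests already queued.
    print(advertised_window(100, 10, 3))
    # One overloaded replica ("a") and two with headroom.
    print(pick_replica({"a": 0, "b": 4, "c": 2}))
```

In this sketch an overloaded server closes its window rather than letting its queue grow past the latency budget, which is the signal independent clients could use to steer requests elsewhere.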

Presented at:
KBNets’18: ACM SIGCOMM 2018 Afternoon Workshop on Kernel Bypassing Networks, Budapest, Hungary, 20-09,2018

 Record created 2018-06-11, last modified 2018-06-11
