Energy Proportionality and Workload Consolidation for Latency-Critical Applications

Energy proportionality and workload consolidation are important objectives for increasing efficiency in large-scale datacenters. Our work focuses on achieving these goals in the presence of applications with microsecond-scale tail latency requirements. Such applications represent a growing subset of datacenter workloads and are typically deployed on dedicated servers, which is the simplest way to ensure low tail latency across all loads. Unfortunately, dedicated deployment also leads to low energy efficiency and low resource utilization during the frequent periods of medium or low load. We present the OS mechanisms and dynamic control needed to adjust core allocation and voltage/frequency settings based on the measured delays for latency-critical workloads. This allows for energy proportionality and frees the maximum amount of resources per server for other background applications, while respecting service-level objectives. The two key mechanisms allow us to detect increases in queuing latencies and to re-assign flow groups between the threads of a latency-critical application in milliseconds without dropping or reordering packets. We compare the efficiency of our solution to the Pareto-optimal frontier of 224 distinct static configurations. Dynamic resource control saves 44%–54% of processor energy, which corresponds to 85%–93% of the Pareto-optimal upper bound. Dynamic resource control also allows background jobs to run at 32%–46% of their standalone throughput, which corresponds to 82%–92% of the Pareto bound.
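
The abstract describes a controller that adjusts core allocation and voltage/frequency settings based on measured delays. The following is a minimal sketch of what one step of such a feedback loop could look like, not the paper's actual policy. The thresholds, the frequency steps, the core count, the order in which resources are added or released, and the names `Config` and `step` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; none of them come from the paper.
FREQS_GHZ = [1.2, 1.6, 2.0, 2.4]  # assumed available frequency steps
MAX_CORES = 8                      # assumed number of cores on the server
HIGH_WATERMARK_US = 50.0           # assumed queuing delay that triggers scale-up
LOW_WATERMARK_US = 10.0            # assumed queuing delay that permits scale-down


@dataclass(frozen=True)
class Config:
    cores: int     # cores given to the latency-critical application
    freq_idx: int  # index into FREQS_GHZ


def step(cfg: Config, queue_delay_us: float) -> Config:
    """Return the next configuration for one measured queuing delay."""
    if queue_delay_us > HIGH_WATERMARK_US:
        # Latency at risk: raise the frequency first, then add a core
        # (this ordering is an arbitrary choice for the sketch).
        if cfg.freq_idx < len(FREQS_GHZ) - 1:
            return Config(cfg.cores, cfg.freq_idx + 1)
        if cfg.cores < MAX_CORES:
            return Config(cfg.cores + 1, cfg.freq_idx)
    elif queue_delay_us < LOW_WATERMARK_US:
        # Ample slack: release a core first, then lower the frequency.
        if cfg.cores > 1:
            return Config(cfg.cores - 1, cfg.freq_idx)
        if cfg.freq_idx > 0:
            return Config(cfg.cores, cfg.freq_idx - 1)
    # Within the band, or already at a limit: keep the current setting.
    return cfg
```

In this sketch a caller would invoke `step` once per measurement interval and then apply the returned configuration. Cores that the latency-critical application releases become available to background jobs, which is the consolidation effect the abstract refers to.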

Published in:
Proceedings of the 2015 ACM Symposium on Cloud Computing
Presented at:
2015 ACM Symposium on Cloud Computing, Kohala Coast, HI, USA, August 27-29, 2015

Record created 2015-07-20, last modified 2018-03-17
