PerfIso: Performance Isolation for Commercial Latency-Sensitive Services

Iorgulescu, Calin; Azimi, Reza; Kwon, Youngjin; Elnikety, Sameh; Syamala, Manoj; Narasayya, Vivek; Herodotou, Herodotos; Tomita, Paulo; Chen, Alex; Zhang, Jack; Wang, Junhua

conference paper

January 1, 2018

Proceedings of the 2018 USENIX Annual Technical Conference

USENIX Annual Technical Conference (ATC)

Large commercial latency-sensitive services, such as web search, run on dedicated clusters provisioned for peak load to ensure responsiveness and tolerate data center outages. As a result, the average load is far lower than the peak load used for provisioning, leading to resource under-utilization. The idle resources can be used to run batch jobs, completing useful work and reducing overall data center provisioning costs. However, this is challenging in practice due to the complexity and stringent tail-latency requirements of latency-sensitive services. Left unmanaged, the competition for machine resources can lead to severe response-time degradation and unmet service-level objectives (SLOs).

This work describes PerfIso, a performance isolation framework which has been used for nearly three years in Microsoft Bing, a major search engine, to colocate batch jobs with production latency-sensitive services on over 90,000 servers. We discuss the design and implementation of PerfIso, and conduct an experimental evaluation in a production environment. We show that colocating CPU-intensive jobs with latency-sensitive services increases average CPU utilization from 21% to 66% for off-peak load without impacting tail latency.

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/165208

Type

conference paper

Web of Science ID

WOS:000508006700040

Authors

Iorgulescu, Calin; Azimi, Reza; Kwon, Youngjin; Elnikety, Sameh; Syamala, Manoj; Narasayya, Vivek; Herodotou, Herodotos; Tomita, Paulo; Chen, Alex; Zhang, Jack; Wang, Junhua

Publication date

2018-01-01

Publisher

USENIX Association

Published in

Proceedings of the 2018 USENIX Annual Technical Conference

ISBN of the book

978-1-939133-02-1

Publisher place

Berkeley

Start page

519

End page

531

Subjects

efficient

tail

Peer reviewed

REVIEWED

EPFL units

LABOS

Event name

USENIX Annual Technical Conference (ATC)

Event place

Boston, MA

Event date

Jul 11-13, 2018

Available on Infoscience

February 8, 2020