Managing Tail Latency in Datacenter-Scale File Systems Under Production Constraints

Misra, Pulkit A.; Borge, Maria F.; Goiri, Inigo; Lebeck, Alvin R.; Zwaenepoel, Willy; Bianchini, Ricardo

doi:10.1145/3302424.3303973

conference paper

January 1, 2019

Proceedings of the Fourteenth EuroSys Conference 2019 (EuroSys '19)

14th EuroSys Conference

Distributed file systems often exhibit high tail latencies, especially in large-scale datacenters and in the presence of competing (and possibly higher priority) workloads. This paper introduces techniques for managing tail latencies in these systems, while addressing the practical challenges inherent in production datacenters (e.g., hardware heterogeneity, interference from other workloads, the need to maximize simplicity and maintainability). We implement our techniques in a scalable distributed file system (an extension of HDFS) used in production at Microsoft. Our evaluation uses 70k servers in 3 datacenters, and shows that our techniques reduce tail latency significantly for production workloads.

Type

conference paper

DOI

10.1145/3302424.3303973

Web of Science ID

WOS:000470898700017

Authors

Misra, Pulkit A.; Borge, Maria F.; Goiri, Inigo; Lebeck, Alvin R.; Zwaenepoel, Willy; Bianchini, Ricardo

Publication date

2019-01-01

Publisher

Association for Computing Machinery

Published in

Proceedings of the Fourteenth EuroSys Conference 2019 (EuroSys '19)

ISBN of the book

978-1-4503-6281-8

Publisher place

New York

Peer reviewed

REVIEWED

EPFL units

LABOS

Event name	Event place	Event date
14th EuroSys Conference	Dresden, Germany	March 25-28, 2019

Available on Infoscience

July 4, 2019

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/158811