
Infoscience

doctoral thesis

Hardware and Software Support for RPC-Centric Server Architecture

Sutherland, Mark Johnathon  
2022

Online services have become ubiquitous, and global demand for them has driven enterprises to construct gigantic datacenters to run their software. Such facilities have also recently become a substrate for third-party organizations, owing to the advantages of moving infrastructure to the cloud. The task of developing, releasing, and maintaining software at datacenter scale has given rise to a software architecture built from many independent microservices, each accomplishing a single role and communicating through an enforced API, most commonly Remote Procedure Calls (RPCs). As microservices have become standard practice for datacenter-scale software, the datacenter's underlying components must support them efficiently. The increasing adoption of microservice architectures implies a drastic growth in network communication, because each microservice receives and creates many RPCs that often execute for only a few microseconds (μs). Delivering an interactive, low-latency service to users therefore becomes more challenging, because each request involves more interactions with the components implementing the communication stack. It is particularly difficult to ensure that the latency of the slowest responses, called the "tail latency", is acceptable to the service's users. Datacenter system design is therefore undergoing a rapid shift to enable programmers to reap the benefits of microservices without their performance quandaries. Handling RPCs from μs-scale software at the line rates of today's NICs, which deliver up to 400 Gbps, is an open challenge that will require designing all layers of the communication stack to natively support RPC semantics. Although the network and protocol layers have drastically improved their performance by treating RPCs as a primary design objective, server hardware has not yet done so. Therefore, we posit that now is the time for an RPC-centric server architecture to emerge, allowing server endpoints to match the performance of their surrounding system components.

To that end, this thesis introduces hardware and software support for RPC-centric server architecture. We first make the case that today's hardware-terminated network transport protocols grossly over-provision buffering because they are agnostic to the latency constraints inherent in each RPC, and that simply exposing such RPC-level information to hardware allows 1.25-2.2x better performance. Motivated by prior work demonstrating the RPC stack's burdensome cost, we then show how a previously proposed RPC stack accelerator can be integrated with the implementation of our aforementioned NIC protocol. Finally, we propose new NIC-driven load balancing policies that boost microservice throughput via improved locality while maintaining tail latency guarantees. Our proposals improve 99th-percentile tail latency in data stores by 2-5.5x, and reduce instruction cache misses in stateless microservices by 1.1-1.8x. In summary, we present evidence that designing and implementing a server's NIC hardware to natively support RPC semantics removes protocol scalability bottlenecks and enables microservices to enjoy further performance benefits.
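
The abstract reports results in terms of 99th-percentile tail latency. As an illustration only (not taken from the thesis), the sketch below shows why a high percentile is reported rather than the mean: a small fraction of slow RPCs barely moves the average but dominates the tail. The latency values are made up, and `percentile_nearest_rank` is a hypothetical helper using the nearest-rank definition of a percentile; other definitions interpolate and can give slightly different values.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest sample that is >= p percent of all samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical RPC latencies in microseconds: 98 fast requests, 2 slow ones.
latencies_us = [10.0] * 98 + [500.0] * 2

mean_us = sum(latencies_us) / len(latencies_us)
p50_us = percentile_nearest_rank(latencies_us, 50)
p99_us = percentile_nearest_rank(latencies_us, 99)

print(f"mean = {mean_us:.1f} us, median = {p50_us:.1f} us, p99 = {p99_us:.1f} us")
```

In this made-up sample the median stays at the fast-path latency while the 99th percentile lands on the slow requests, which is the behavior that tail latency guarantees are meant to bound.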

Type
doctoral thesis
DOI
10.5075/epfl-thesis-8017
Author(s)
Sutherland, Mark Johnathon  
Advisors
Falsafi, Babak • Daglis, Alexandros
Jury

Prof. Martin Jaggi (president) ; Prof. Babak Falsafi, Prof. Alexandros Daglis (thesis directors) ; Prof. Edouard Bugnion, Prof. Mark Silberstein, Dr. Steven Reinhardt (examiners)

Date Issued

2022

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2022-09-05

Thesis number

8017

Total of pages

256

Subjects

datacenters • servers • microservices • remote procedure calls • hardware • load balancing • tail latency • network protocols • queueing theory • co-design

EPFL units
PARSA  
Faculty
IC  
School
IINFCOM  
Doctoral School
EDIC  
Available on Infoscience
September 5, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/190516