Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Conferences, Workshops, Symposiums, and Seminars
  4. Achieving high-throughput State Machine Replication in multi-core systems
 
conference paper

Achieving high-throughput State Machine Replication in multi-core systems

Santos, Nuno
•
Schiper, Andre  
2013
2013 Ieee 33Rd International Conference On Distributed Computing Systems (Icdcs)
33rd IEEE International Conference on Distributed Computing Systems (ICDCS)

The traditional architecture used by implementations of Replicated State Machines (RSM) does not fully exploit modern multi-core CPUs. This is increasingly the limiting factor in their performance, because network speeds are increasing much faster than the single-thread performance of CPUs. Thus, when deployed on Gigabit-class networks and exposed to a workload of small to medium size client requests, RSMs are often CPU-bound, as they are only able to leverage a few cores, even though many more may be available. In this work, we revisit the traditional architecture of a RSM implementation, showing how it can be parallelized so that its performance scales with the number of cores in the nodes. We do so by applying several good practices of concurrent programming to the specific case of state machine replication, including staged execution, workload partitioning, actors, and non-blocking data structures. We describe and test a Java prototype of our architecture, based on the Paxos protocol. With a workload consisting of small requests, we achieve a six times improvement in throughput using eight cores. More generally, in all our experiments we have consistently reached the limits of the network subsystem by using up to 12 cores, and do not observe any degradation when using up to 24 cores. Furthermore, the profiling results of our implementation show that even at peak throughput contention between threads is minimal, suggesting that the throughput would continue scaling given a faster network.

  • Details
  • Metrics
Type
conference paper
DOI
10.1109/Icdcs.2013.11
Web of Science ID

WOS:000333267200027

Author(s)
Santos, Nuno
Schiper, Andre  
Date Issued

2013

Publisher

Ieee

Publisher place

New York

Published in
2013 Ieee 33Rd International Conference On Distributed Computing Systems (Icdcs)
ISBN of the book

978-0-7695-5000-8

Total of pages

10

Series title/Series vol.

IEEE International Conference on Distributed Computing Systems

Start page

266

End page

275

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LSR-IC  
Event name
33rd IEEE International Conference on Distributed Computing Systems (ICDCS)
Available on Infoscience
June 2, 2014
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/103854
Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés