JPaxos: State machine replication based on the Paxos protocol

Kończak, Jan • de Sousa Santos, Nuno Filipe • Żurkowski, Tomasz • Wojciechowski, Paweł T. • Schiper, André
2011

State machine replication is a technique for making services fault-tolerant by replicating them over a group of machines. Although the theory of state machine replication has been studied extensively, the engineering challenges of converting a theoretical description into a fully functional system are less well understood. This creates difficulties for implementors, who face many engineering decisions that are crucial to the performance and stability of a replicated system. In this report, we address this problem by describing the design and implementation of JPaxos, a fully functional implementation of state machine replication based on the MultiPaxos protocol. Our description covers the basic modules of a state machine replication implementation, such as snapshotting of service state, state transfer, and keeping all replicas up to date, but focuses mainly on three aspects: recovery mechanisms, batching and pipelining optimizations, and a scalable threading architecture. We present several recovery algorithms that differ in their use of stable storage and in their system assumptions, including some that use stable storage only once per recovery. Batching and pipelining are well-known optimizations commonly used in state machine replication. With JPaxos we have studied their interaction in detail, and we provide guidelines for tuning these mechanisms for a variety of systems. Finally, the threading architecture of JPaxos was designed to scale with the number of cores while minimizing complexity, to reduce the risk of concurrency bugs.
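
To make the batching and pipelining optimizations mentioned in the abstract concrete, the following is a minimal illustrative sketch in Java. It is not taken from the JPaxos code base: the class name `BatchingSketch`, the methods `buildBatches` and `wavesNeeded`, and the parameters `maxBatchBytes` and `window` are all hypothetical names chosen for this example. Batching is modeled as packing client requests into one consensus instance up to a byte limit, and pipelining as a bound on how many instances may be in flight at once, under the simplifying assumption that all in-flight instances complete together.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of batching and pipelining; not the JPaxos implementation. */
public class BatchingSketch {

    /**
     * Batching: groups requests, in arrival order, into batches whose total
     * payload does not exceed maxBatchBytes. A single request larger than the
     * limit is placed in a batch of its own.
     */
    public static List<List<byte[]>> buildBatches(List<byte[]> requests, int maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        int currentBytes = 0;
        for (byte[] request : requests) {
            // Close the current batch if adding this request would exceed the limit.
            if (!current.isEmpty() && currentBytes + request.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(request);
            currentBytes += request.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    /**
     * Pipelining: with at most `window` instances in flight at once, and
     * assuming (for simplicity) that all in-flight instances finish together,
     * this returns how many successive waves are needed to order all batches.
     */
    public static int wavesNeeded(int batchCount, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        return (batchCount + window - 1) / window; // ceiling division
    }

    public static void main(String[] args) {
        List<byte[]> requests = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            requests.add(new byte[400]); // five 400-byte client requests
        }
        List<List<byte[]>> batches = buildBatches(requests, 1000);
        System.out.println("batches: " + batches.size());
        System.out.println("waves with window 2: " + wavesNeeded(batches.size(), 2));
    }
}
```

In this toy model, a larger `maxBatchBytes` reduces the number of instances needed, while a larger `window` lets more of them overlap. The report itself studies how the two settings interact on real systems; the sketch only shows the two knobs, not the tuning guidelines.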

Type
report
Author(s)
Kończak, Jan
de Sousa Santos, Nuno Filipe  
Żurkowski, Tomasz
Wojciechowski, Paweł T.
Schiper, André  
Date Issued
2011
Total of pages
38
Subjects
Fault tolerance • Distributed Systems • State Machine Replication • Paxos • Implementation
Written at
EPFL

EPFL units
LSR-IC  
Available on Infoscience
July 30, 2011
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/69874
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved