A new look at atomic broadcast in the asynchronous crash-recovery model

Atomic broadcast in particular, and group communication in general, have mainly been specified and implemented in a system model where processes do not recover after a crash. This model is called crash-stop. Its drawback is that it cannot express algorithms that tolerate the crash of a majority of processes. This limitation has led to the extension of the crash-stop model into the so-called crash-recovery model, in which processes have access to stable storage on which they periodically log their state. Logging allows a process to recover a previous state after a crash. However, the existing specifications of atomic broadcast in the crash-recovery model are not satisfactory, and the paper explains why. The paper also proposes a new specification of atomic broadcast in the crash-recovery model that addresses these issues. Specifically, the new specification makes it possible to distinguish between a uniform and a non-uniform version of atomic broadcast. The non-uniform version logs less information and is thus more efficient. Both the uniform and the non-uniform versions have been implemented and compared with a published atomic broadcast algorithm. Performance results are presented.
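The crash-recovery idea described in the abstract, logging state to stable storage and reloading it after a crash, can be illustrated with a minimal sketch. This is not the algorithm from the paper: the class `RecoverableProcess`, its methods, the `log` flag, and the use of a JSON file as "stable storage" are all assumptions made only for illustration.

```python
import json
import os
import tempfile


class RecoverableProcess:
    """Hypothetical toy process that logs its state to a file (stable storage)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.delivered = []  # volatile state, lost when the process crashes
        self.recover()

    def recover(self):
        # On (re)start, reload the last logged state if a log exists.
        if os.path.exists(self.log_path):
            with open(self.log_path, "r", encoding="utf-8") as f:
                self.delivered = json.load(f)
        else:
            self.delivered = []

    def log_state(self):
        # Write to a temporary file, force it to disk, then rename it over
        # the old log so a crash never leaves a half-written log behind.
        tmp_path = self.log_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.delivered, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.log_path)

    def deliver(self, message, log=True):
        # Record a delivered message; optionally persist the new state.
        self.delivered.append(message)
        if log:
            self.log_state()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "p1.log")
        process = RecoverableProcess(path)
        process.deliver("m1")
        process.deliver("m2")
        process.deliver("m3", log=False)  # kept in memory only
        # Simulate a crash and restart: a fresh object reads the same log.
        recovered = RecoverableProcess(path)
        return recovered.delivered


if __name__ == "__main__":
    print(demo())  # ['m1', 'm2']
```

The message delivered without logging is lost across the simulated crash, which shows the general trade-off behind logging less: fewer writes to stable storage, at the cost of what a process can remember after it recovers. How the paper's uniform and non-uniform versions decide what to log is not shown here.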

Published in:
Proceedings of the 24th IEEE Symposium on Reliable Distributed Systems
Presented at:
24th IEEE Symposium on Reliable Distributed Systems (SRDS'05), Orlando, USA, October 26-28, 2005

Note: The status of this file is: EPFL only

 Record created 2005-11-25, last modified 2018-03-17
