A new look at atomic broadcast in the asynchronous crash-recovery model

Mena, Sergio; Schiper, André

doi:10.1109/RELDIS.2005.6

2005

Abstract

Atomic broadcast in particular, and group communication in general, have mainly been specified and implemented in a system model where processes do not recover after a crash. This model is called crash-stop. Its drawback is its inability to express algorithms that tolerate the crash of a majority of processes. This has led to the extension of the crash-stop model to the so-called crash-recovery model, in which processes have access to stable storage where they periodically log their state. This allows them to recover a previous state after a crash. However, the existing specifications of atomic broadcast in the crash-recovery model are not satisfactory, and the paper explains why. The paper also proposes a new specification of atomic broadcast in the crash-recovery model that addresses these issues. Specifically, the new specification makes it possible to distinguish between a uniform and a non-uniform version of atomic broadcast. The non-uniform version logs less information, and is thus more efficient. The uniform and non-uniform versions of atomic broadcast have been implemented and compared with a published atomic broadcast algorithm. Performance results are presented.

Details

Title A new look at atomic broadcast in the asynchronous crash-recovery model

Author(s) Mena, Sergio ; Schiper, André

Published in Proceedings of the 24th IEEE Symposium on Reliable Distributed Systems

Pages 202-211

Conference 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05), Orlando, USA, October 26-28, 2005

Date 2005

Keywords

Distributed systems; Atomic broadcast; Crash-recovery model; Group communication; Fault tolerance

DOI https://doi.org/10.1109/RELDIS.2005.6

Laboratories LSR

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IC Archives > LSR - Distributed Systems Laboratory
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2005-11-25
