The Overhead of Consensus Failure Recovery

Dutta, P; Guerraoui, R; Keidar, I

doi:10.1007/s00446-006-0017-6

Dutta, P; Guerraoui, R; Keidar, I

2007

Download

Formats

Format
BibTeX
MARC
MARCXML
DublinCore
EndNote
NLM
RefWorks
RIS

Files

Abstract

Many reliable distributed systems are consensus-based and typically operate under two modes: a fast normal mode in failure-free synchronous periods, and a slower recovery mode following asynchrony and failures. A lot of work has been devoted to optimize the normal mode, but little has focused on optimizing the recovery mode. This paper seeks to understand whether the recovery mode is inherently slower than the normal mode. In particular, we consider consensus algorithms in the round-based eventually synchronous model of [DLS88], where t out of n processes may fail by crashing, messages may be lost, and the system may be asynchronous for arbitrarily long, but eventually the system becomes synchronous and no new failure occurs (we say that the system becomes stable). For t >= n/3, we prove a lower bound of three rounds for achieving a global decision whenever the system becomes stable, and we contrast this with a bound of two rounds when t < n/3. We then give matching algorithms for both t >= n/3 and t < n/3.

Details

Title The Overhead of Consensus Failure Recovery

Author(s) Dutta, P ; Guerraoui, R ; Keidar, I

Published in Distributed Computing

Volume 19

Pages 373–386

Date 2007

DOI https://doi.org/10.1007/s00446-006-0017-6

Laboratories DCL

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > DCL - Distributed Computing Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2006-10-05

Actions

Preview

Select file: