On the Implementation and Use of Message Logging

Elnozahy, Elmootazbellah N.; Zwaenepoel, Willy

1994

Abstract

We present a number of experiments showing that for compute-intensive applications executing in parallel on clusters of workstations, message logging has higher failure-free overhead than coordinated checkpointing. Message logging protocols, however, result in much shorter output latency than coordinated checkpointing. Therefore, message logging should be used for applications involving substantial interactions with the outside world, while coordinated checkpointing should be used otherwise. We also present an unorthodox message logging design that uses coordinated checkpointing with message logging, departing from the conventional approaches that use independent checkpointing. This combination of message logging and coordinated checkpointing offers several advantages, including improved failure-free performance, bounded recovery time, simplified garbage collection, and reduced complexity. Meanwhile, the new protocols retain the advantages of the conventional message logging protocols with respect to output commit. Finally, we discuss three “lessons learned” from an implementation of various message logging protocols.

Details

Title On the Implementation and Use of Message Logging

Author(s) Elnozahy, Elmootazbellah N. ; Zwaenepoel, Willy

Conference Proceedings of the Twenty-Fourth Fault-Tolerant Computing Symposium, June 1994

Date 1994

Laboratories LABOS

Record creation date 2005-10-20
