Efficient Testing of Recovery Code Using Fault Injection

Marinescu, Paul D.; Candea, George

doi:10.1145/2063509.2063511

2011

Abstract

A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing at http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system, with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60%, entirely automatically, without requiring new tests or human involvement.
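
The idea of a trigger that injects an error return code together with its side effect at the application/library boundary can be illustrated with a small Python sketch. This is not LFI's interface or implementation: `FakeLibrary`, `CallCountTrigger`, `FaultInjector`, and `read_with_retry` are hypothetical names invented for this illustration, and the "library" is a plain Python object standing in for a C library that returns -1 and sets an errno value on failure.

```python
import errno


class FakeLibrary:
    """Hypothetical stand-in for a C library: returns -1 and sets errno on failure."""

    def __init__(self):
        self.errno = 0

    def read(self, fd, nbytes):
        # In this sketch the real call always succeeds and "reads" nbytes.
        self.errno = 0
        return nbytes


class CallCountTrigger:
    """Hypothetical trigger: fires only on the n-th intercepted call."""

    def __init__(self, nth_call):
        self.nth_call = nth_call
        self.calls = 0

    def fires(self):
        self.calls += 1
        return self.calls == self.nth_call


class FaultInjector:
    """Sits between the application and the library and injects a fault on demand."""

    def __init__(self, library, trigger, return_code, errno_value):
        self.library = library
        self.trigger = trigger
        self.return_code = return_code
        self.errno_value = errno_value
        self.errno = 0

    def read(self, fd, nbytes):
        if self.trigger.fires():
            # Injected fault: error return code plus its side effect (errno).
            self.errno = self.errno_value
            return self.return_code
        result = self.library.read(fd, nbytes)
        self.errno = self.library.errno
        return result


def read_with_retry(lib, fd, nbytes, max_attempts=3):
    """Application code under test; the EINTR retry is its recovery path."""
    recoveries = 0
    for _ in range(max_attempts):
        result = lib.read(fd, nbytes)
        if result >= 0:
            return result, recoveries
        if lib.errno != errno.EINTR:
            break  # unrecoverable error: give up
        recoveries += 1  # recovery code: retry after an interrupted call
    return -1, recoveries


# Inject EINTR on the first call so the retry path is exercised.
lib = FaultInjector(FakeLibrary(), CallCountTrigger(nth_call=1), -1, errno.EINTR)
outcome = read_with_retry(lib, fd=3, nbytes=16)
```

Without the injector, the retry branch in `read_with_retry` would never execute in this sketch, which mirrors the abstract's point that recovery code rarely runs unless faults are deliberately introduced.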

Details

Title Efficient Testing of Recovery Code Using Fault Injection

Author(s) Marinescu, Paul D. ; Candea, George

Published in ACM Transactions on Computer Systems

Volume 29

Issue 4

Pages 11

Date 2011

Keywords

Reliability; Fault injection; Automated testing

Language English

DOI https://doi.org/10.1145/2063509.2063511

Laboratories DSLAB

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > DSLAB - Dependable Systems Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2012-06-25
