Efficient Testing of Recovery Code Using Fault Injection

Marinescu, Paul D.; Candea, George

doi:10.1145/2063509.2063511

research article

2011

ACM Transactions on Computer Systems

A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing at http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.
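The idea of a trigger that injects an error at the application/library boundary can be illustrated with a minimal Python sketch. This is not LFI's interface or implementation: LFI operates on native binaries and shared libraries, whereas this analogue swaps a function in Python's `os` module. The names `CallCountTrigger`, `inject_fault`, and `read_config` are hypothetical, and raising `OSError` with an errno value stands in for "error return code plus side effect".

```python
import errno
import os
from contextlib import contextmanager


class CallCountTrigger:
    """Hypothetical trigger: fires on the n-th intercepted call (1-based)."""

    def __init__(self, n):
        self.n = n
        self.calls = 0

    def should_inject(self):
        self.calls += 1
        return self.calls == self.n


@contextmanager
def inject_fault(module, func_name, trigger, err):
    """Temporarily wrap module.func_name so it fails when the trigger fires."""
    original = getattr(module, func_name)

    def stub(*args, **kwargs):
        if trigger.should_inject():
            # Python analogue of "return -1 and set errno" in a C library.
            raise OSError(err, os.strerror(err))
        return original(*args, **kwargs)

    setattr(module, func_name, stub)
    try:
        yield
    finally:
        # Always restore the real function, even if the test body raises.
        setattr(module, func_name, original)


def read_config(path, default=""):
    """Application code under test; the except branch is its recovery code."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return default  # recovery path that ordinary tests rarely reach
    try:
        return os.read(fd, 4096).decode()
    finally:
        os.close(fd)
```

Under these assumptions, wrapping a call such as `read_config(path, "fallback")` in `with inject_fault(os, "open", CallCountTrigger(1), errno.EIO):` forces the first `os.open` to fail, so the recovery branch runs without any change to the application code. Later calls inside the same block pass through to the real function.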

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/82265

Type

research article

DOI

10.1145/2063509.2063511

Web of Science ID

WOS:000298638000002

Authors

Marinescu, Paul D.; Candea, George

Publication date

2011

Published in

ACM Transactions on Computer Systems

Volume

29

Issue

4

Start page

11

Subjects

Reliability

Fault injection

Automated testing

Peer reviewed

REVIEWED

EPFL units

DSLAB

Available on Infoscience

June 25, 2012