GUEST EDITORS' INTRODUCTION

 
DSO Exclusive
September 2003


From Fault Tolerance to Security and Back

 

Felix C. Gärtner Swiss Federal Institute of Technology
Levente Buttyán Budapest University of Technology and Economics
Klaus Kursawe IBM Research

We know from everyday experience that computer systems fail for relatively simple reasons: aged hard drives crash, power or network cables are accidentally cut, or spilled coffee produces a short circuit in your keyboard. Today, we know how to handle these types of failures: basically, by decreasing the chances that they happen (for example, keeping individuals away from cables and beverages away from the keyboard) and by employing system redundancy in case a failure does happen (for example, using a redundant array of inexpensive disks, or RAID). In the area of dependability, these techniques have become known as fault removal and fault tolerance.

The advent of global networks such as the Internet, which let us access almost every computer from anywhere in the world, has paved the way to more flexibility (such as worldwide communication via email independent of a user’s whereabouts). But it has also created new risks, as the recent incidents involving computer viruses and worms have demonstrated. Today, your system’s perimeter can no longer be controlled with doors and locks, so keeping certain individuals from accessing your system in unwanted ways is difficult. The intent behind the causes of failure has changed too: a random hardware fault cannot steal your password or other secret data, but an (experienced) hacker can. We have a rich toolset to deal with these types of threats, too, namely computer security methods such as cryptographic protocols and intrusion detection systems.

The PoDSy workshop

Evidently, there are strong similarities but also methodological differences between the ways the fault tolerance and security areas model threats and deal with them. However, precisely naming and formalizing these similarities and differences is still difficult. For example, are the techniques developed in a security-related setting basically applicable in a fault tolerance setting if you equate faults and attacks? This and other questions were explored during the first Workshop on the Principles of Dependable Systems (PoDSy), which was held on 24 June 2003 in San Francisco. The workshop's core was a set of four invited talks by Paulo Veríssimo (University of Lisbon), Ran Canetti (IBM Research), Catherine Meadows (US Naval Research Lab), and John Knight (University of Virginia).

The Whisper protocol

Among the research papers presented at the workshop, the organizing committee selected the Whisper protocol paper by Vinayak Naik, Anish Arora, Sandip Bapat, and Mohamed Gouda. This is a well-written, paradigmatic example of the problems and solutions that play a role in building fault-tolerant and secure systems. We're glad to present an updated version of this article in this month’s DS Online.

The Whisper protocol maintains a shared secret between two network nodes and refreshes it incrementally. The main security properties are forward and backward secrecy, meaning that even if an attacker obtains parts of the secret, the protocol maintains or recovers the shared secret. The fault tolerance properties involve stabilizing behavior when arbitrary (noncritical) variables are perturbed by faults. The protocol is aimed at the emerging field of sensor networks, that is, networks of very small and computationally restricted devices. Whisper handles both types of adversarial behavior in a unique way and is a good example of a pragmatic, practical solution satisfying both fault tolerance and security goals.
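
To make the idea of an incrementally refreshed shared secret more concrete, here is a minimal sketch in Python. It is not the Whisper protocol, whose details are in the paper; it is a generic one-way ratchet in which both nodes derive each new secret from the current one. The names `refresh` and `Node` and the HMAC-SHA-256 construction are assumptions chosen for illustration only.

```python
import hashlib
import hmac


def refresh(secret: bytes, round_no: int) -> bytes:
    """Derive the next secret from the current one with a one-way function.

    Illustrative assumption: HMAC-SHA-256 keyed with the current secret.
    """
    label = b"refresh" + round_no.to_bytes(4, "big")
    return hmac.new(secret, label, hashlib.sha256).digest()


class Node:
    """One endpoint holding the current shared secret (hypothetical class)."""

    def __init__(self, initial_secret: bytes) -> None:
        self.secret = initial_secret
        self.round_no = 0

    def step(self) -> None:
        # Overwrite the old secret; the node keeps no copy of it.
        self.secret = refresh(self.secret, self.round_no)
        self.round_no += 1


# Two nodes start from the same secret and refresh in lockstep.
a = Node(b"initial shared secret")
b = Node(b"initial shared secret")
for _ in range(3):
    a.step()
    b.step()

# Constant-time comparison of the two current secrets.
in_sync = hmac.compare_digest(a.secret, b.secret)
```

Because the derivation is one-way, an attacker who learns the current secret should not be able to compute earlier ones. This sketch does not recover from a compromise and does not stabilize after a fault (for example, if the two round counters drift apart); those are the harder properties that the paragraph above attributes to Whisper.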

What can we learn?

More information on the PoDSy workshop, including a project report, copies of the talk slides, and photographs, is at http://lpdwww.epfl.ch/fgaertner/podsy2003. The site also contains a transcript of the interesting panel discussions on what fault-tolerance people can learn from security people and vice versa (the invited panelists were Yves Deswarte, Leslie Lamport, Roy Maxion, Jonathan Millen, and Neeraj Suri). The panel showed that security and fault tolerance still live separate lives, although some promising prospects are visible.

Acknowledgments
The organizers thank the contributing authors, the invited speakers, the panelists, and the members of the program committee for their valuable contributions to this event. Vital support by the organizing committee of the International Conference on Dependable Systems and Networks (DSN), especially the workshop chair Neeraj Suri, and by the German Research Foundation is also gratefully acknowledged. We also wish to thank the additional IEEE Distributed Systems Online reviewers and the IEEE Distributed Systems Online staff for their support.

Felix C. Gärtner is a postdoctoral student in the Distributed Programming Laboratory at the Swiss Federal Institute of Technology, Lausanne. His research interests cover the fuzzy relationship between fault tolerance and security, as well as the even fuzzier relationship between theoretical research and engineering practices. He has a PhD in computer science from Darmstadt University of Technology. Contact him at fcg@acm.org.

Levente Buttyán is an assistant professor in the Laboratory of Cryptography and System Security at the Budapest University of Technology and Economics. His current research focuses on the design of cryptographic protocols and security aspects of wired and wireless networks, including wireless ad hoc networks. He has a PhD in computer science from the Swiss Federal Institute of Technology, Lausanne. Contact him at buttyan@hit.bme.hu.

Klaus Kursawe is an independent consultant in computer and communications security based in Switzerland. His research interests are in dependable systems, cryptography, and trusted computing platforms. He did his PhD in computer science at IBM Research in collaboration with Saarland University, Germany. Contact him at kursawe@acm.org.

 

DS Online ISSN: 1541-4922
This site and all contents (unless otherwise noted) are Copyright ©2003, Institute of Electrical and Electronics Engineers, Inc. All rights reserved.