JAGR: an autonomous self-recovering application server

This paper demonstrates that the dependability of generic, evolving J2EE applications can be enhanced through a combination of a few recovery-oriented techniques. Our goal is to reduce downtime by automatically and efficiently recovering from a broad class of transient software failures without having to modify applications. We describe here the integration of three new techniques into JBoss, an open-source J2EE application server. The resulting system is JAGR (JBoss with application-generic recovery), a self-recovering execution platform. JAGR combines application-generic failure-path inference (AFPI), path-based failure detection, and micro-reboots. AFPI uses controlled fault injection and observation to infer paths that faults follow through a J2EE application. Path-based failure detection uses tagging of client requests and statistical analysis to identify anomalous component behavior. Micro-reboots are fast reboots we perform at the sub-application level to recover components from transient failures; by selectively rebooting only those components that are necessary to repair the failure, we reduce recovery time. These techniques are designed to be autonomous and application-generic, making them well suited to the rapidly changing software of Internet services.
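As a rough illustration of how inferred failure paths could guide a selective reboot, the following minimal sketch may help. It is not JAGR's implementation: the function names, the component names, the `CALLERS` graph, and the "smallest set that explains the symptoms" rule are all assumptions made for this example, and real fault injection is replaced by a simulated propagation over a hypothetical call graph.

```python
def infer_failure_paths(components, inject_fault):
    """AFPI-style step (illustrative): inject a fault into each component in
    turn and record which components are then observed to fail."""
    return {c: frozenset(inject_fault(c)) | {c} for c in components}


def choose_reboot_target(failure_paths, anomalous):
    """Pick the component whose inferred failure set covers every anomalous
    component while being as small as possible (an assumed selection rule)."""
    anomalous = set(anomalous)
    candidates = [c for c, affected in failure_paths.items() if anomalous <= affected]
    if not candidates:
        return None  # nothing explains the symptoms; a caller might restart everything
    return min(candidates, key=lambda c: (len(failure_paths[c]), c))


# Hypothetical call graph: component -> components that call it directly.
CALLERS = {
    "db_pool": ["cart", "catalog"],
    "cart": ["checkout"],
    "catalog": [],
    "checkout": [],
}


def simulated_injection(component):
    """Stand-in for real fault injection: assume a fault in `component`
    propagates to all of its transitive callers."""
    seen, stack = set(), [component]
    while stack:
        for caller in CALLERS[stack.pop()]:
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


paths = infer_failure_paths(CALLERS, simulated_injection)
target = choose_reboot_target(paths, {"cart", "checkout"})
```

In this toy setup, anomalies in `cart` and `checkout` are explained by a fault in either `cart` or `db_pool`; the sketch prefers `cart` because rebooting it touches fewer components, which mirrors the abstract's point that rebooting only what is necessary reduces recovery time.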

Published in:
Proceedings of the Autonomic Computing Workshop, Fifth Annual International Workshop on Active Middleware Services (AMS 2003), pp. 168-177

Keywords:
autonomous self-recovering application server; evolving J2EE application; recovery-oriented technique; downtime reduction; transient software failure; open-source J2EE application server; JAGR-JBoss; application-generic recovery; self-recovering execution platform; application-generic failure-path inference; AFPI; path-based failure detection; microreboot; controlled fault injection; fault path inference; client request tagging; statistical analysis; anomalous component behavior identification; component recovery; transient failure; selective rebooting; recovery time reduction; Internet service

 Record created 2006-12-22, last modified 2018-03-17
