Mapping Moving Landscapes by Mining Mountains of Logs: Novel Techniques for Dependency Model Generation

Problem diagnosis for distributed systems is usually difficult, so automated support is needed to quickly identify the root causes of problems such as performance lags or malfunctions. Many of the tools and techniques that perform this task today rely on some dependency model of the system. However, in complex and fast-evolving environments it is practically infeasible to keep such a model up to date manually, so it has to be created automatically. For high-level objects this is in itself a challenging and less studied task. In this paper, we propose three different approaches to discovering dependencies by mining system logs. Our work is inspired by a recently developed data mining algorithm and by collocation extraction techniques from the field of natural language processing. We evaluate the techniques in a case study for Geneva University Hospitals (HUG) and perform large-scale experiments on production data. Results show that all techniques are capable of finding useful dependency information with reasonable precision in a real-world environment.
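As a rough illustration of how a collocation measure might be applied to logs, the following minimal sketch scores pairs of components by pointwise mutual information (PMI) over fixed time windows. It is not one of the paper's three approaches, which the abstract does not specify. The function name `pmi_scores`, the `(timestamp, component)` event format, and the 10-second window are all hypothetical choices made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(events, window=10.0):
    """Score component pairs by PMI of co-occurrence within time windows.

    events: iterable of (timestamp_seconds, component_name) tuples.
    This input format is an assumption, not taken from the paper.
    """
    events = list(events)
    if not events:
        return {}

    # Group the components that log at least once in each fixed-size window.
    start = min(t for t, _ in events)
    buckets = {}
    for t, comp in events:
        buckets.setdefault(int((t - start) // window), set()).add(comp)

    # Count in how many non-empty windows each component and each pair appears.
    n = len(buckets)
    single = Counter()
    pair = Counter()
    for comps in buckets.values():
        single.update(comps)
        pair.update(combinations(sorted(comps), 2))

    # PMI = log2( P(a, b) / (P(a) * P(b)) ), estimated from window counts.
    return {
        (a, b): math.log2((c / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), c in pair.items()
    }


# Hypothetical log events: A and B always log close together, C logs alone.
events = [(0, "A"), (1, "B"), (20, "A"), (21, "B"), (40, "C"), (60, "C")]
scores = pmi_scores(events, window=10.0)
```

A high positive score suggests that two components log together more often than chance would predict, which is only a hint of a possible dependency. Co-occurrence alone gives neither the direction of a dependency nor evidence of causation.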

Published in:
The 32nd International Conference on Very Large Data Bases, 1093-1102
Presented at:
VLDB 2006, Seoul, Korea, September 12-15, 2006

 Record created 2006-08-14, last modified 2020-07-30
