Monitoring and Maintenance Framework for a Distributed System

Helsinki University of Technology | Master's thesis
Date
2006
Major/Subject
Information Technology
Mcode
T-115
Language
en
Pages
88 pp. + 20 pp. of appendices
Abstract
A distributed system is a collection of possibly heterogeneous machines whose distribution is transparent to the user, so that the system appears as a single local machine. Managing a large-scale distributed system is itself a distributed activity, and a consistent approach is needed to manage all the services that constitute a typical distributed system. Distributed system management involves monitoring the activity of the system, making management decisions, and performing control actions to modify the system's behaviour. The purpose of management is to maintain the system's ability to provide its specified services with a prescribed quality of service. Automating this management is a challenge, but automated monitoring and maintenance are necessary to reduce the amount of human intervention. This thesis identifies a set of requirements for the management of a distributed system of security sensors. Existing management tools were analyzed, but no satisfactory solution was found. To address this, a new framework for automated monitoring and maintenance was devised and implemented, based on a theoretical study described in the thesis. The framework has been deployed as part of the IBM Billy Goat project. Its most important features are the ability to monitor the system conveniently and to specify automated maintenance actions that are triggered by events. The requirements, design, and implementation, including the relevant details, are presented. The objectives were achieved: the system was tested successfully, and the results obtained by introducing automated monitoring and actions are very promising. Future work consists of extending the current features so that the implementation is sufficient for a production distributed system. This Master's thesis was carried out for the IBM Zürich Research Laboratory, Switzerland.
Supervisor
Simula, Olli | Biersack, Ernst
Thesis advisor
Zamboni, Diego
Keywords
distributed system management, distributed systems, monitoring, network management, automated maintenance, framework, maintenance, autonomic computing, autonomic system