DEPEND 2013, The Sixth International Conference on Dependability


Self-Recovery Technology in Distributed Service-Oriented Mission Critical Systems for Fault Tolerance

Authors:
Raymundo Garcia-Gomez
Juan Sebastian Guadalupe Godinez-Borja
Pedro Josue Hernandez-Torres
Carlos Perez-Leguizamo

Keywords: Service continuity; fault tolerance; service-oriented architecture; autonomous decentralized systems; fault detection; fault recovery

Abstract:
Mission Critical Systems (MCS) require continuous operation, since a failure may cause economic or human losses. Autonomous Decentralized Service Oriented Architecture (ADSOA) is an approach to designing and developing MCS in which system functionality is divided into service units to provide functional reliability and load balancing, and in which distributed replicas provide high availability. A fault detection technology has previously been proposed for ADSOA. With it, the service units can autonomously detect a degradation of the operational service level at the point where the continuity of the service may be compromised. However, this technology is limited because recovery requires human supervision. In this paper, we propose an autonomous recovery technology that detects such degradation and instructs service units to be cloned gradually in order to recover the operational service level. A prototype has been developed to verify the feasibility of this technology.
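
The detect-then-clone idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the operational service level can be modeled as the number of live replicas of a service unit, and the names `ServiceUnit`, `required_level`, `alive_replicas`, `recover_gradually`, and `step` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceUnit:
    """Hypothetical model of one service unit and its replicas."""
    name: str
    required_level: int   # assumed: replicas needed for the operational service level
    alive_replicas: int   # assumed: replicas currently responding

    def is_degraded(self) -> bool:
        # Detection step: the level has dropped below what the service requires.
        return self.alive_replicas < self.required_level


def recover_gradually(unit: ServiceUnit, step: int = 1) -> list:
    """Clone replicas in small rounds until the required level is restored.

    Returns the replica count observed after each cloning round.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    history = []
    while unit.is_degraded():
        missing = unit.required_level - unit.alive_replicas
        unit.alive_replicas += min(step, missing)  # never overshoot the target
        history.append(unit.alive_replicas)
    return history


# Usage: a unit that needs 5 replicas but has only 2 left is recovered in rounds of 2.
unit = ServiceUnit(name="billing", required_level=5, alive_replicas=2)
levels = recover_gradually(unit, step=2)
```

Cloning in bounded rounds rather than all at once is one plausible reading of "gradually"; in a real deployment each round would presumably be followed by a fresh measurement of the service level before the next round starts.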

Pages: 37 to 41

Copyright: Copyright (c) IARIA, 2013

Publication date: August 25, 2013

Published in: conference

ISSN: 2308-4324

ISBN: 978-1-61208-301-8

Location: Barcelona, Spain

Dates: from August 25, 2013 to August 31, 2013