CLOUD COMPUTING 2011, The Second International Conference on Cloud Computing, GRIDs, and Virtualization
A Generalized Approach for Fault Tolerance and Load Based Scheduling of Threads in Alchemi .Net
Authors:
Vishu Sharma
Manu Vardhan
Shakti Mishra
Dharmender Singh Kushwaha
Keywords: ARMF; FCFS; fault tolerance; load based scheduling
Abstract:
Computational grids are best utilized through a divide and conquer approach when executing a large process. Building a multithreaded application is one efficient way to achieve this: the threads are scheduled on different computational nodes for execution. Alchemi is one of the frameworks that support multithreaded applications, but it does not incorporate any load based scheduling or fault tolerance strategy. In Alchemi, a manager node uses first come first serve (FCFS) scheduling to schedule threads on executors (nodes that execute independent threads), but it does not take into account the CPU load of the nodes on which the executors are running. Moreover, if an executor fails during execution, the manager node reschedules the thread on another executor node. One solution to this problem is to save intermediate results from each thread and reschedule these threads on another executor. We propose an approach that provides fault tolerance in Alchemi by using the Alchemi Replica Manager Framework (ARMF), in which the manager node is replicated on one of its executor nodes. The proposed algorithm is 6 to 16 percent more efficient than FCFS when implemented in Alchemi.
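The contrast the abstract draws between FCFS and load based scheduling can be illustrated with a minimal sketch. It is not the authors' algorithm and does not use the Alchemi API: the function names, the `loads` mapping (executor name to fraction of CPU in use), the fixed rotation used to model FCFS, and the `cost_per_thread` increment are all assumptions made for illustration.

```python
def schedule_fcfs(threads, loads):
    """Assign threads in arrival order, rotating over executors.

    Assumption: FCFS is modelled as a fixed rotation that ignores CPU load.
    """
    names = list(loads)
    return {t: names[i % len(names)] for i, t in enumerate(threads)}


def schedule_load_based(threads, loads, cost_per_thread=0.1):
    """Assign each thread to the executor with the lowest current load.

    Assumption: each assigned thread adds a fixed, hypothetical
    cost_per_thread to the chosen executor's load.
    """
    current = dict(loads)  # copy so the caller's mapping is not modified
    plan = {}
    for t in threads:
        target = min(current, key=current.get)  # least-loaded executor
        plan[t] = target
        current[target] += cost_per_thread
    return plan


# Hypothetical executors: "A" is busy, "B" is nearly idle.
loads = {"A": 0.9, "B": 0.1}
threads = ["t0", "t1", "t2", "t3"]

fcfs_plan = schedule_fcfs(threads, loads)
load_plan = schedule_load_based(threads, loads)
```

With these sample values, the FCFS model places half of the threads on the busy executor, while the load based variant keeps choosing the idle one until the loads converge.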
Pages: 211 to 216
Copyright: Copyright (c) IARIA, 2011
Publication date: September 25, 2011
Published in: conference
ISSN: 2308-4294
ISBN: 978-1-61208-153-3
Location: Rome, Italy
Dates: from September 25, 2011 to September 30, 2011