CLOUD COMPUTING 2015, The Sixth International Conference on Cloud Computing, GRIDs, and Virtualization


Residual Traffic Based Task Scheduling in Hadoop

Authors:
Daichi Tanaka
Masatoshi Kawarasaki

Keywords: distributed computing; Hadoop; MapReduce; job performance; network simulation

Abstract:
In Hadoop job processing, large data transfers are reported to significantly affect job performance. In this paper, we show that performance deterioration in a CPU (Central Processing Unit) heterogeneous environment is caused by delay in the copy phase, which in turn results from heavy load on the inter-rack links of the cluster network. We therefore propose a new scheduling method, Residual Traffic Based Task Scheduling, that estimates the amount of inter-rack data transfer in the copy phase and regulates task assignment accordingly. We evaluate the scheduling method using ns-3 (network simulator-3) and show that it can significantly improve Hadoop job performance.
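
The general idea in the abstract, estimating the inter-rack data still to be transferred in the copy phase and regulating task assignment on that basis, might be sketched as follows. This is only an illustrative sketch and not the authors' algorithm: the `MapOutput` record, the `residual_inter_rack_bytes` and `may_assign_map_task` functions, and the single-threshold rule are all assumptions introduced here, since the abstract does not specify how the estimate is computed or how assignment is regulated.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MapOutput:
    """Hypothetical record of one finished map task's output (not a Hadoop type)."""
    rack: str                        # rack of the node that holds the map output
    pending_bytes: Dict[str, int]    # reducer rack -> bytes not yet copied there


def residual_inter_rack_bytes(outputs: List[MapOutput]) -> int:
    """Sum the bytes still to be copied whose source and destination racks differ."""
    return sum(
        size
        for out in outputs
        for dst_rack, size in out.pending_bytes.items()
        if dst_rack != out.rack      # same-rack copies do not use inter-rack links
    )


def may_assign_map_task(outputs: List[MapOutput], threshold_bytes: int) -> bool:
    """Assumed rule: hold back new map tasks while residual traffic exceeds a threshold."""
    return residual_inter_rack_bytes(outputs) <= threshold_bytes


# Example: two finished map outputs on racks r1 and r2.
outputs = [
    MapOutput(rack="r1", pending_bytes={"r1": 100, "r2": 300}),
    MapOutput(rack="r2", pending_bytes={"r1": 50}),
]
residual = residual_inter_rack_bytes(outputs)   # 300 + 50 inter-rack bytes
```

In this sketch a scheduler would call `may_assign_map_task` before handing out a new task; the threshold value is a free parameter that the abstract does not define.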

Pages: 94 to 102

Copyright: Copyright (c) IARIA, 2015

Publication date: March 22, 2015

Published in: conference

ISSN: 2308-4294

ISBN: 978-1-61208-388-9

Location: Nice, France

Dates: from March 22, 2015 to March 27, 2015