CLOUD COMPUTING 2015, The Sixth International Conference on Cloud Computing, GRIDs, and Virtualization
Residual Traffic Based Task Scheduling in Hadoop
Authors:
Daichi Tanaka
Masatoshi Kawarasaki
Keywords: distributed computing; Hadoop; MapReduce; job performance; network simulation
Abstract:
In Hadoop job processing, large data transfers have been reported to significantly affect job performance. In this paper, we show that, in a CPU (Central Processing Unit) heterogeneous environment, performance deteriorates because the copy phase is delayed by heavy load on the inter-rack links of the cluster network. We therefore propose a new scheduling method, Residual Traffic Based Task Scheduling, that estimates the amount of inter-rack data transfer in the copy phase and regulates task assignment accordingly. We evaluate the scheduling method using ns-3 (network simulator-3) and show that it can improve Hadoop job performance significantly.
Pages: 94 to 102
Copyright: Copyright (c) IARIA, 2015
Publication date: March 22, 2015
Published in: conference
ISSN: 2308-4294
ISBN: 978-1-61208-388-9
Location: Nice, France
Dates: from March 22, 2015 to March 27, 2015