FUTURE COMPUTING 2013, The Fifth International Conference on Future Computational Technologies and Applications
Exploring HADOOP as a Platform for Distributed Association Rule Mining
Authors:
Shravanth Oruganti
Qin Ding
Nasseh Tabrizi
Keywords: Cloud computing; association rule mining; data mining; Hadoop.
Abstract:
Association rule mining is an important data mining technique used to discover associations between items in large datasets. The Apriori algorithm forms the basis for most other association rule mining algorithms. The original Apriori algorithm runs on a single node, which limits its ability to handle large datasets because of the limited computational resources available. Various studies have addressed parallelizing the algorithm. In this paper, Apache Hadoop was chosen as the distributed framework on which to implement the Apriori algorithm and evaluate its performance. The Apriori algorithm was modified to run on Hadoop. Performance analysis shows that Hadoop is a promising platform for distributed association rule mining.
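The abstract does not describe how the Apriori algorithm was modified for Hadoop. As a rough illustration only, the sketch below shows one common way to split Apriori's support counting into a map step and a reduce step. It is a single-process Python simulation, not Hadoop code and not the authors' implementation; the function names (`map_count`, `reduce_count`, `generate_candidates`, `apriori`) and the absolute `min_support` threshold are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def map_count(transaction, candidates):
    """Map step (hypothetical): emit (candidate, 1) for every candidate
    itemset that is fully contained in one transaction."""
    items = set(transaction)
    for cand in candidates:
        if cand <= items:
            yield cand, 1


def reduce_count(pairs, min_support):
    """Reduce step (hypothetical): sum the emitted counts per candidate and
    keep only candidates whose count reaches min_support."""
    totals = defaultdict(int)
    for cand, n in pairs:
        totals[cand] += n
    return {c: n for c, n in totals.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any
    candidate that has an infrequent (k-1)-subset (the Apriori property)."""
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def apriori(transactions, min_support):
    """Run one map/reduce pass per itemset size until no candidates remain.
    Returns a dict mapping each frequent itemset to its support count."""
    candidates = {frozenset([i]) for t in transactions for i in t}
    result = {}
    k = 1
    while candidates:
        # On a real cluster, each mapper would see only its share of the
        # transactions; here all pairs are produced in one process.
        pairs = (p for t in transactions for p in map_count(t, candidates))
        frequent = reduce_count(pairs, min_support)
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    demo = [
        ["bread", "milk"],
        ["bread", "diaper", "beer"],
        ["milk", "diaper", "beer"],
        ["bread", "milk", "diaper"],
        ["bread", "milk", "beer"],
    ]
    for itemset, count in apriori(demo, min_support=3).items():
        print(sorted(itemset), count)
```

In this sketch each itemset size costs one full pass over the transactions, which is why a job-per-pass framework is a natural fit: the map step is independent per transaction, and only the small per-candidate counts need to be combined.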
Pages: 62 to 67
Copyright: Copyright (c) IARIA, 2013
Publication date: May 27, 2013
Published in: conference
ISSN: 2308-3735
ISBN: 978-1-61208-272-1
Location: Valencia, Spain
Dates: from May 27, 2013 to June 1, 2013