Using k-Core Decomposition to Find Cluster Centers for k-Means Algorithm in GraphX on Spark

Cheng, Sheng-Tzong; Chen, Yin-Chun; Tsai, Meng-Shuan

CLOUD COMPUTING 2017, The Eighth International Conference on Cloud Computing, GRIDs, and Virtualization

Authors:
Sheng-Tzong Cheng
Yin-Chun Chen
Meng-Shuan Tsai

Keywords: cloud computing; GraphX; Spark; k-core decomposition; graph-based k-means

Abstract:
Big data analysis is attracting growing attention. In social network applications, much of the data takes the form of a graph, and analyzing graph data requires more computation time. Spark, an in-memory computing framework that became a top-level Apache project in 2014, addresses the long computation time by reusing data held in memory, and so finishes a task in less time than Hadoop. In addition, GraphX, a Spark API (Application Programming Interface), provides a graph-processing interface that makes graph data analysis simple and efficient. This study presents an improved k-means clustering method that integrates k-core decomposition, an important algorithm in community detection, to find the center of each cluster. We implement the clustering algorithm with GraphX and obtain better performance and results than the original k-means clustering method.
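
The abstract does not spell out how the cluster centers are derived from the k-core decomposition, so the following is only a minimal single-machine sketch of one plausible reading, written in plain Python rather than GraphX. It computes core numbers by peeling, picks mutually non-adjacent vertices with the highest core numbers as initial centers, and performs a single assignment step by hop distance. The seed-selection rule, the hop-distance metric, the toy graph, and all function names (`build_adjacency`, `core_numbers`, `pick_seeds`, `assign_by_hops`) are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def core_numbers(adj):
    """Core number of every vertex, found by repeatedly peeling a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Simple O(n^2) selection; a bucket queue would be used at scale.
        v = min(remaining, key=lambda u: (degree[u], u))
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def pick_seeds(adj, core, num_clusters):
    """Assumed rule: take high-core vertices as centers, skipping neighbors of earlier picks.

    May return fewer than num_clusters seeds if the graph is too small or dense.
    """
    ranked = sorted(adj, key=lambda v: (-core[v], -len(adj[v]), v))
    seeds = []
    for v in ranked:
        if all(v not in adj[s] for s in seeds):
            seeds.append(v)
        if len(seeds) == num_clusters:
            break
    return seeds


def assign_by_hops(adj, seeds):
    """One assignment step: multi-source BFS labels each reachable vertex with a nearest seed."""
    label = {s: s for s in seeds}
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        for w in sorted(adj[v]):
            if w not in label:
                label[w] = label[v]
                queue.append(w)
    return label


# Toy graph (hypothetical): two 4-cliques joined through vertex 4, plus a pendant vertex 9.
edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (3, 4), (4, 5),
    (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8),
    (8, 9),
]
adj = build_adjacency(edges)
core = core_numbers(adj)
seeds = pick_seeds(adj, core, num_clusters=2)
labels = assign_by_hops(adj, seeds)
print(core, seeds, labels)
```

On this toy graph the clique vertices end up in the highest core, so the two centers land one in each clique instead of being drawn at random. A full k-means loop would go on to recompute centers and reassign vertices until the labels stop changing, and a GraphX version would express the peeling and the assignment as distributed message-passing steps.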

Pages: 93 to 98

Copyright: Copyright (c) IARIA, 2017

Publication date: February 19, 2017

Published in: conference

ISSN: 2308-4294

ISBN: 978-1-61208-529-6

Location: Athens, Greece

Dates: from February 19, 2017 to February 23, 2017