PATTERNS 2018, The Tenth International Conference on Pervasive Patterns and Applications
Authors:
Ole Kristian Ekseth
Svein-Olaf Hvasshovd
Keywords: Clustering, similarity metrics, data analysis, performance.
Abstract:
In data analysis, approximate cluster algorithms have become widely popular. A popular cluster algorithm is DBSCAN. While a number of software libraries support it, they perform poorly when analysing high-dimensional data. In this work we address this issue. We present a novel method and implementation which significantly boosts the performance of DBSCAN. The result is software which reduces memory consumption by 103 GB for large data sets while reducing execution time by more than 600x (for important similarity metrics). This article presents a high-performance approach to answering region-based similarity queries. While our work is tuned towards DBSCAN, our novel approach to high-performance filtering of pairwise similarity scores may be used in a number of cluster algorithms. The proposed method and software therefore address issues which are known to hamper high-dimensional data analysis.
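For readers unfamiliar with the term, a region-based similarity query in DBSCAN asks for every point within a distance threshold of a given point. The sketch below is a plain brute-force baseline for illustration only, not the method proposed in the article. The function name `region_query`, the parameter `eps`, the sample points, and the choice of Euclidean distance are all assumptions made for this example.

```python
import math

def region_query(points, index, eps):
    """Return the indices of all points within distance eps of points[index].

    Brute-force baseline: compares the query point against every point.
    Euclidean distance is an assumption; DBSCAN accepts other metrics.
    """
    center = points[index]
    return [
        j for j, other in enumerate(points)
        if math.dist(center, other) <= eps
    ]

# Hypothetical sample data for illustration.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (0.5, 0.5)]
neighbours = region_query(points, 0, eps=1.0)
print(neighbours)
```

Answering this query for every point costs time quadratic in the number of points with this naive approach, and each distance computation grows with the dimensionality, which is the kind of cost the abstract refers to.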
Pages: 6 to 11
Copyright: Copyright (c) IARIA, 2018
Publication date: February 18, 2018
Published in: conference
ISSN: 2308-3557
ISBN: 978-1-61208-612-5
Location: Barcelona, Spain
Dates: from February 18, 2018 to February 22, 2018