Home // CONTENT 2012, The Fourth International Conference on Creative Content Technologies // View article
Fast Parallel k-NN Search in High-Dimensional Spaces
Authors:
Hyun-Hwa Choi
Seung-Jo Bae
Kyu-Chul Lee
Keywords: k-nn search; distributed indexing structure; high dimensionality
Abstract:
We are currently witnessing a rapid growth of image data, triggered by the popularity of the Internet and the huge amount of user-generated content from Web 2.0 applications. To address the demanding search needs caused by large-scale image collections, two major approaches for high-dimensional data in cluster systems have been proposed: Speeding up the search by using distributed index structures, and speeding up the search by scanning a Vector Approximation-file (VA-file) in parallel. We propose to combine both techniques to search for large k-nearest neighbors (k-NN) in a high-dimensional space. We develop a distributed index structure, called a Distributed Vector Approximation-tree (DVA-tree), with a two-level structure: the first level is a hybrid spill-tree consisting of minimum bounding spheres, the second level is VA-files. We also introduce a new approximate k-NN search algorithm on this structure and derive cost formulae for predicting the response time of the k-NN search. We then provide a detailed evaluation on large, high dimensional datasets. In an experimental evaluation, we show that our indexing scheme can handle approximate k-NN queries more efficiently for high-dimensional datasets.
Pages: 32 to 37
Copyright: Copyright (c) IARIA, 2012
Publication date: July 22, 2012
Published in: conference
ISSN: 2308-4162
ISBN: 978-1-61208-220-2
Location: Nice, France
Dates: from July 22, 2012 to July 27, 2012