Home // CONTENT 2012, The Fourth International Conference on Creative Content Technologies // View article


Fast Parallel k-NN Search in High-Dimensional Spaces

Authors:
Hyun-Hwa Choi
Seung-Jo Bae
Kyu-Chul Lee

Keywords: k-nn search; distributed indexing structure; high dimensionality

Abstract:
We are currently witnessing a rapid growth of image data, triggered by the popularity of the Internet and the huge amount of user-generated content from Web 2.0 applications. To address the demanding search needs caused by large-scale image collections, two major approaches for high-dimensional data in cluster systems have been proposed: Speeding up the search by using distributed index structures, and speeding up the search by scanning a Vector Approximation-file (VA-file) in parallel. We propose to combine both techniques to search for large k-nearest neighbors (k-NN) in a high-dimensional space. We develop a distributed index structure, called a Distributed Vector Approximation-tree (DVA-tree), with a two-level structure: the first level is a hybrid spill-tree consisting of minimum bounding spheres, the second level is VA-files. We also introduce a new approximate k-NN search algorithm on this structure and derive cost formulae for predicting the response time of the k-NN search. We then provide a detailed evaluation on large, high dimensional datasets. In an experimental evaluation, we show that our indexing scheme can handle approximate k-NN queries more efficiently for high-dimensional datasets.

Pages: 32 to 37

Copyright: Copyright (c) IARIA, 2012

Publication date: July 22, 2012

Published in: conference

ISSN: 2308-4162

ISBN: 978-1-61208-220-2

Location: Nice, France

Dates: from July 22, 2012 to July 27, 2012