Feature Selection for Clustering by Exploring Nearest and Farthest Neighbors

Chen, Chien-Hsing

FUTURE COMPUTING 2012, The Fourth International Conference on Future Computational Technologies and Applications

Authors:
Chien-Hsing Chen

Keywords: feature selection; nearest neighbor; farthest neighbor; salient feature; cluster analysis

Abstract:
Feature selection has been explored extensively for use in several real-world applications. In this paper, we propose a new method that selects a salient subset of features from unlabeled data; the selected features are then used adaptively to identify natural clusters in cluster analysis. Unlike previous methods that select salient features for clustering, our method does not require a predetermined clustering algorithm to identify salient features, and it can potentially ignore noisy features, which improves the identification of salient ones. The method is motivated by a basic characteristic of clustering: a data instance usually belongs to the same cluster as its geometrically nearest neighbors and to a different cluster from its geometrically farthest neighbors. In particular, our method uses instance-based learning to quantify features in the context of the nearest and farthest neighbors of every instance, so that clusters generated by the salient features maintain this characteristic.
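
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a Relief-style rule chosen here for illustration: a feature scores highly when an instance differs from its farthest neighbors much more than from its nearest neighbors along that feature. The function names, the parameter `k`, the Euclidean distance, and the toy data are all assumptions, and the features are assumed to be on comparable scales already.

```python
import math
import random


def neighbor_feature_scores(data, k=3):
    """Score features using each instance's k nearest and k farthest neighbors.

    Illustrative rule (an assumption, not the paper's formula):
    score[j] = mean over instances x of
               (mean |x_j - far_j|) - (mean |x_j - near_j|).
    """
    n = len(data)
    d = len(data[0])
    # Require disjoint nearest and farthest neighbor sets.
    if k < 1 or 2 * k > n - 1:
        raise ValueError("k must satisfy 1 <= k and 2 * k <= n - 1")

    scores = [0.0] * d
    for i, x in enumerate(data):
        # Rank every other instance by Euclidean distance over all features.
        others = sorted(
            (idx for idx in range(n) if idx != i),
            key=lambda idx: math.dist(x, data[idx]),
        )
        nearest, farthest = others[:k], others[-k:]
        for j in range(d):
            near_gap = sum(abs(x[j] - data[m][j]) for m in nearest) / k
            far_gap = sum(abs(x[j] - data[m][j]) for m in farthest) / k
            scores[j] += (far_gap - near_gap) / n
    return scores


def make_toy_data(seed=0, per_cluster=20):
    """Hypothetical data: two clusters split on feature 0; feature 1 is noise."""
    rng = random.Random(seed)
    data = []
    for center in (0.0, 5.0):
        for _ in range(per_cluster):
            data.append([rng.gauss(center, 0.5), rng.gauss(0.0, 0.5)])
    return data


toy_scores = neighbor_feature_scores(make_toy_data(), k=3)
# Rank features from most to least salient under this illustrative rule.
ranking = sorted(range(len(toy_scores)), key=lambda j: toy_scores[j], reverse=True)
```

On the toy data, feature 0 carries the cluster structure, so it should rank ahead of the noise feature. Because neighbors are ranked in the full feature space, a sketch like this can still be misled when noisy features dominate the distances; how the paper handles that case is not stated in the abstract.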

Pages: 80 to 83

Copyright: (c) IARIA, 2012

Publication date: July 22, 2012

Published in: conference

ISSN: 2308-3735

ISBN: 978-1-61208-217-2

Location: Nice, France

Dates: from July 22, 2012 to July 27, 2012