Missing Categorical Data Imputation for FCM Clusterings of Mixed Incomplete Data

Furukawa, Takashi; Ohnishi, Shin-ichi; Yamanoi, Takahiro

COGNITIVE 2014, The Sixth International Conference on Advanced Cognitive Technologies and Applications

Authors:
Takashi Furukawa
Shin-ichi Ohnishi
Takahiro Yamanoi

Keywords: clustering; incomplete data; mixed data; FCM.

Abstract:
Data mining is related to human cognitive ability, and one of its popular methods is fuzzy clustering. The fuzzy c-means (FCM) clustering method is normally applied to numerical data. However, most data in databases are both categorical and numerical. To date, clustering methods have been developed to analyze only complete data. Although we sometimes encounter data sets that contain one or more missing feature values (incomplete data) in data-intensive classification systems, traditional clustering methods cannot be used for such data. We therefore study clustering methods that can handle mixed numerical and categorical incomplete data. In this paper, we propose algorithms that combine a missing categorical data imputation method with distances between numerical data that contain missing values. Finally, we show through a real-data experiment that our proposed method is more effective than clustering without imputation as the missing ratio becomes higher.
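
The abstract does not spell out the imputation rule or the distance used, so the following is only a rough sketch of the two ingredients it names, under stated assumptions. It assumes missing values are marked with None, that a missing category is filled with the most frequent observed category of that feature, and that distances over incomplete numerical vectors use a rescaled partial distance. The function names partial_sq_distance and impute_mode are hypothetical, and neither rule is claimed to be the authors' actual method.

```python
from collections import Counter


def partial_sq_distance(x, v):
    """Squared Euclidean distance using only the observed entries of x.

    Assumption: None marks a missing value in x. The sum over observed
    features is rescaled by (number of features / number observed) so that
    samples with different numbers of missing values stay comparable.
    """
    observed = [(xi, vi) for xi, vi in zip(x, v) if xi is not None]
    if not observed:
        raise ValueError("sample has no observed numerical features")
    total = sum((xi - vi) ** 2 for xi, vi in observed)
    return total * len(x) / len(observed)


def impute_mode(column):
    """Fill missing categories (None) with the most frequent observed one.

    Assumption: mode imputation is a stand-in for the paper's imputation
    method, which the abstract does not describe.
    """
    present = [c for c in column if c is not None]
    if not present:
        raise ValueError("column has no observed categories")
    mode = Counter(present).most_common(1)[0][0]
    return [mode if c is None else c for c in column]
```

For instance, a sample [1.0, None, 3.0] compared with a cluster center [0.0, 5.0, 1.0] uses only the first and third features, and a categorical column ["a", None, "b", "a"] would have its gap filled with "a". In an FCM loop, a distance of this kind would replace the usual squared Euclidean distance in the membership update.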

Pages: 94 to 98

Copyright: Copyright (c) IARIA, 2014

Publication date: May 25, 2014

Published in: conference

ISSN: 2308-4197

ISBN: 978-1-61208-340-7

Location: Venice, Italy

Dates: from May 25, 2014 to May 29, 2014