INTELLI 2013, The Second International Conference on Intelligent Systems and Applications
Authors:
Patrick G. Clark
Jerzy W. Grzymala-Busse
Martin Kuehnhausen
Keywords: Data mining; probabilistic approaches to missing attribute values; rough set theory; probabilistic approximations; parameterized approximations
Abstract:
In this paper, we study probabilistic and rough set approaches to missing attribute values. Probabilistic approaches are based on imputation: a missing attribute value is replaced either by the most probable known attribute value or by the most probable attribute value restricted to a concept. In the rough set approach, we consider two interpretations of missing attribute values: lost and "do not care". Additionally, we apply three definitions of approximations (singleton, subset and concept) and use an additional parameter called alpha. Our main objective was to compare probabilistic and rough set approaches to missing attribute values for incomplete data sets with many missing attribute values. We conducted experiments on six incomplete data sets with as many missing attribute values as possible. In these data sets, an additional incremental replacement of known values by missing attribute values would have resulted in entire records filled with only missing attribute values. Rough set approaches were better for five data sets; for one data set, the probabilistic approach was more successful.
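The two imputation variants named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes symbolic attributes, takes "most probable" to mean "most frequent among the known values", and reads "restricted to a concept" as "restricted to records with the same decision value". The names `MISSING`, `most_common_value`, `impute` and the toy records are all hypothetical.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def most_common_value(values):
    """Return the most frequent known value, or MISSING if none is known."""
    known = [v for v in values if v is not MISSING]
    if not known:
        return MISSING
    return Counter(known).most_common(1)[0][0]


def impute(records, attribute, decision=None):
    """Return copies of `records` with missing `attribute` values filled in.

    With decision=None, the most frequent known value over all records is used.
    With a decision attribute name, the search is restricted to records that
    share the decision value (the concept) of the record being filled.
    """
    result = []
    for rec in records:
        filled = dict(rec)
        if rec[attribute] is MISSING:
            if decision is None:
                pool = records
            else:
                pool = [r for r in records if r[decision] == rec[decision]]
            filled[attribute] = most_common_value(r[attribute] for r in pool)
        result.append(filled)
    return result


# Toy data (invented for illustration only).
records = [
    {"temp": "high", "flu": "yes"},
    {"temp": "high", "flu": "yes"},
    {"temp": MISSING, "flu": "yes"},
    {"temp": "normal", "flu": "no"},
    {"temp": "normal", "flu": "no"},
    {"temp": "normal", "flu": "no"},
    {"temp": MISSING, "flu": "no"},
]

global_fill = impute(records, "temp")
concept_fill = impute(records, "temp", decision="flu")
```

On this toy data the two variants disagree for the third record: the global variant fills in `normal`, the overall majority, while the concept-restricted variant fills in `high`, the majority among the `flu = yes` records.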
Pages: 12 to 17
Copyright: Copyright (c) IARIA, 2013
Publication date: April 21, 2013
Published in: conference
ISSN: 2308-4065
ISBN: 978-1-61208-269-1
Location: Venice, Italy
Dates: from April 21, 2013 to April 26, 2013