INTELLI 2014, The Third International Conference on Intelligent Systems and Applications
Complexity of Rule Sets Induced from Incomplete Data with Lost Values and Attribute-concept Values
Authors:
Patrick G. Clark
Jerzy W. Grzymala-Busse
Keywords: Data mining; rough set theory; probabilistic approximations; MLEM2 rule induction algorithm; lost values; attribute-concept values
Abstract:
This paper presents novel research on the complexity of rule sets induced from incomplete data sets with two interpretations of missing attribute values: lost values and attribute-concept values. Experiments were conducted on 176 data sets, using three kinds of probabilistic approximations (lower, middle and upper) and the Modified Learning from Examples Module, version 2 (MLEM2) rule induction system. In our experiments, the size of the rule set was always smaller for attribute-concept values than for lost values (5% significance level). The total number of conditions was smaller for attribute-concept values than for lost values for 17 of the 24 combinations of data set type and approximation. In the remaining 7 cases, the difference was not statistically significant. Thus, we may claim that attribute-concept values are better than lost values in terms of rule complexity.
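To make the two interpretations and the three approximations concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual rough-set definitions: a lost value ("?") places no constraint on a case, an attribute-concept value ("-") stands for every specified value of that attribute seen in the same concept, and a probabilistic approximation with threshold alpha is the union of the characteristic sets K(x), for x in the concept X, with Pr(X | K(x)) >= alpha. The mapping of lower, middle and upper to alpha = 1, 0.5 and a small positive number is likewise an assumption, and the toy table and all function names are hypothetical.

```python
from fractions import Fraction

LOST, CONCEPT = "?", "-"  # markers for the two interpretations (assumed notation)


def concept_values(cases, decisions, i, attr):
    """Specified values of attr among cases sharing the decision of case i."""
    return {c[attr] for c, d in zip(cases, decisions)
            if d == decisions[i] and c[attr] not in (LOST, CONCEPT)}


def attribute_blocks(cases, decisions, attr):
    """Map each value v of attr to the set of case indices in block [(attr, v)]."""
    blocks = {}
    for i, case in enumerate(cases):
        v = case[attr]
        if v == LOST:
            continue  # a lost value belongs to no block of this attribute
        values = concept_values(cases, decisions, i, attr) if v == CONCEPT else {v}
        for val in values:
            blocks.setdefault(val, set()).add(i)
    return blocks


def characteristic_sets(cases, decisions):
    """K(x) for every case: intersection of K(x, a) over all attributes a."""
    universe = set(range(len(cases)))
    attrs = list(cases[0])
    blocks = {a: attribute_blocks(cases, decisions, a) for a in attrs}
    result = []
    for i, case in enumerate(cases):
        k = set(universe)
        for a in attrs:
            v = case[a]
            if v == LOST:
                continue  # K(x, a) is the whole universe
            values = concept_values(cases, decisions, i, a) if v == CONCEPT else {v}
            if not values:
                continue  # no specified value in the concept: whole universe
            k &= set().union(*(blocks[a][val] for val in values))
        result.append(k)
    return result


def probabilistic_approximation(concept, char_sets, alpha):
    """Union of K(x), x in concept, with Pr(concept | K(x)) >= alpha."""
    approx = set()
    for x in concept:
        k = char_sets[x]
        if Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


def toy_table(missing):
    """Hypothetical five-case table; case 1 has a missing Temperature."""
    cases = [
        {"Temperature": "high", "Headache": "yes"},
        {"Temperature": missing, "Headache": "yes"},
        {"Temperature": "normal", "Headache": "no"},
        {"Temperature": "high", "Headache": "no"},
        {"Temperature": "normal", "Headache": "yes"},
    ]
    decisions = ["yes", "yes", "no", "no", "no"]
    return cases, decisions


# Assumed thresholds: lower = 1, middle = 0.5, upper = small positive value.
ALPHAS = {"lower": Fraction(1), "middle": Fraction(1, 2), "upper": Fraction(1, 1000)}
flu = {0, 1}  # the concept "Flu = yes"

results = {}
for name, marker in (("lost", LOST), ("attribute-concept", CONCEPT)):
    cases, decisions = toy_table(marker)
    k_sets = characteristic_sets(cases, decisions)
    results[name] = {kind: probabilistic_approximation(flu, k_sets, a)
                     for kind, a in ALPHAS.items()}
```

On this toy table, the lost-value interpretation gives a lower approximation of {0} and middle and upper approximations of {0, 1, 4}, while the attribute-concept interpretation gives {0, 1} for all three. Rule induction with MLEM2 would then run on these approximations; that step is outside this sketch.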
Pages: 91 to 96
Copyright: Copyright (c) IARIA, 2014
Publication date: June 22, 2014
Published in: conference
ISSN: 2308-4065
ISBN: 978-1-61208-352-0
Location: Seville, Spain
Dates: from June 22, 2014 to June 26, 2014