INTENSIVE 2012, The Fourth International Conference on Resource Intensive Applications and Services


Mining Interesting Contrast Sets

Authors:
Mondelle Simeon
Robert Hilderman
Howard Hamilton

Keywords: contrast set mining; group differences; data mining

Abstract:
Contrast set mining is a data mining task that aims to discern differences across groups. These groups can be patients, organizations, molecules, or even timelines. A valid contrast set is a conjunction of attribute-value pairs whose distribution differs significantly across groups. The search for valid contrast sets can produce a prohibitively large number of results, which must be filtered further before a domain expert can examine them and act on them. In this paper, we introduce the notion of a minimum support ratio threshold, applied to the ratio of maximum to minimum support across groups. We propose a contrast set mining technique to discover maximal valid contrast sets that meet a minimum support ratio threshold. We also introduce five interestingness measures and demonstrate how they can be used to rank contrast sets. Our experiments on real datasets demonstrate the efficiency and effectiveness of our approach, and the interestingness of the contrast sets discovered.
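
As an illustration of the support ratio described in the abstract, the following is a minimal sketch and not the paper's implementation. It assumes that the support of a contrast set in a group is the fraction of that group's records matching every attribute-value pair, and that the support ratio is the maximum group support divided by the minimum. The function names, the toy records, and the handling of zero support are all hypothetical choices made for this example. The significance test that makes a contrast set valid is not shown.

```python
from typing import Dict, List, Mapping

Row = Mapping[str, str]


def support(rows: List[Row], contrast_set: Mapping[str, str]) -> float:
    """Fraction of rows in one group that match every attribute-value pair."""
    if not rows:
        return 0.0
    hits = sum(
        all(row.get(attr) == value for attr, value in contrast_set.items())
        for row in rows
    )
    return hits / len(rows)


def support_ratio(groups: Dict[str, List[Row]], contrast_set: Mapping[str, str]) -> float:
    """Maximum support across groups divided by minimum support across groups."""
    supports = [support(rows, contrast_set) for rows in groups.values()]
    lowest, highest = min(supports), max(supports)
    if lowest == 0.0:
        # Assumption for this sketch: the ratio is unbounded when some group
        # has zero support, and 0.0 when no group supports the set at all.
        return float("inf") if highest > 0.0 else 0.0
    return highest / lowest


# Hypothetical toy data: two groups of four records each.
groups = {
    "A": [
        {"smoker": "yes", "age": "old"},
        {"smoker": "yes", "age": "old"},
        {"smoker": "yes", "age": "old"},
        {"smoker": "no", "age": "young"},
    ],
    "B": [
        {"smoker": "yes", "age": "old"},
        {"smoker": "no", "age": "old"},
        {"smoker": "no", "age": "young"},
        {"smoker": "yes", "age": "young"},
    ],
}
candidate = {"smoker": "yes", "age": "old"}

ratio = support_ratio(groups, candidate)  # 0.75 / 0.25
min_support_ratio = 2.0  # hypothetical threshold
meets_threshold = ratio >= min_support_ratio
```

Under these assumptions, the candidate set has support 0.75 in group A and 0.25 in group B, so it would pass a threshold of 2.0 and be retained for ranking.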

Pages: 14 to 21

Copyright: Copyright (c) IARIA, 2012

Publication date: March 25, 2012

Published in: conference

ISBN: 978-1-61208-188-5

Location: St. Maarten, The Netherlands Antilles

Dates: from March 25, 2012 to March 30, 2012