IMMM 2012, The Second International Conference on Advances in Information Mining and Management


A New Algorithm for Accurate Histogram Construction

Authors:
Zeineb Dhouioui
Wissem Labbadi
Jalel Akaichi

Keywords: Optimal histograms; query result size estimation; error; query optimization; data summarization

Abstract:
Many commercial relational database systems use histograms to summarize data sets and to approximate the frequency distribution of attribute values. Based on this distribution, a database system estimates query result sizes during query optimization, which in turn supports effective information retrieval. Histograms also help judge whether a data source is reliable, and therefore whether to keep that source in the information retrieval process or remove it. A histogram commonly carries an error that affects the accuracy of the estimation. This work surveys the state of the art on the problem of identifying optimal histograms, studies the effectiveness of these optimal histograms in limiting error propagation in the context of query optimization, and proposes a new algorithm for accurate histogram construction. The theoretical results are confirmed in practice: the proposed histogram yields a low error.
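
To make the idea of histogram-based result size estimation concrete, the following is a minimal sketch of a plain equi-width histogram, a standard baseline and not the algorithm proposed in the paper. The function names `build_equi_width_histogram` and `estimate_range_count`, as well as the assumption that values are spread uniformly inside each bucket, are illustrative choices that do not come from the abstract.

```python
def build_equi_width_histogram(values, num_buckets):
    """Split [min, max] into equal-width buckets and count the values in each.

    Returns (lowest value, bucket width, list of per-bucket counts).
    """
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are identical.
    width = (hi - lo) / num_buckets or 1.0
    counts = [0] * num_buckets
    for v in values:
        # The maximum value is clamped into the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return lo, width, counts


def estimate_range_count(hist, a, b):
    """Estimate how many values fall in [a, b).

    Assumes (hypothetically) a uniform spread inside each bucket, so a bucket
    contributes its count scaled by the fraction of its width that overlaps
    the query range.
    """
    lo, width, counts = hist
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total
```

For the values 1, 1, 1, 2, 3, 10, 10, 10, 10, 20 and two buckets, this sketch estimates 4.5 rows for the range [1, 5.75), while the true count is 5. That gap is the kind of estimation error the abstract refers to; choosing bucket boundaries differently changes how large it is.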

Pages: 154 to 160

Copyright: Copyright (c) IARIA, 2012

Publication date: October 21, 2012

Published in: conference

ISSN: 2326-9332

ISBN: 978-1-61208-227-1

Location: Venice, Italy

Dates: from October 21, 2012 to October 26, 2012