DBKDA 2019, The Eleventh International Conference on Advances in Databases, Knowledge, and Data Applications
A Schema Readability Metric for Automated Data Quality Measurement
Authors:
Lisa Ehrlinger
Gudrun Huszar
Wolfram Woess
Keywords: Data Quality; Metrics; Readability; Semantics
Abstract:
Data quality measurement is a critical success factor for estimating the explanatory power of data-driven decisions. Several data quality dimensions, such as completeness, accuracy, and timeliness, have been investigated, and metrics for their measurement have been proposed. While most research into those dimensions refers to the data values, schema quality dimensions in general, and readability in particular, have received little attention so far. A poorly readable schema has a negative impact on data quality; for example, two attributes with different purposes but synonymous labels may cause attribute values to be inserted incorrectly. We therefore examine the data quality dimension readability at the schema level and introduce a metric for its measurement. The measurement is based on a dictionary approach using a wordnet, which takes into account the semantics of the words used in the schema (e.g., attribute labels). We implemented and evaluated the schema readability metric within the data quality tool QuaIIe.
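As a rough illustration of the dictionary idea mentioned in the abstract, the Python sketch below checks attribute labels against a word list and flags label pairs that look synonymous. It is not the metric defined in the paper or implemented in QuaIIe, whose details are not given here. The names `TOY_WORDNET`, `tokenize`, `known_word_ratio`, and `synonymous_pairs` are hypothetical, and the small in-memory table is an assumed stand-in for a real wordnet.

```python
import re
from itertools import combinations

# Hypothetical stand-in for a wordnet: each known word maps to the set of
# synonym groups it belongs to. A real wordnet would replace this table.
TOY_WORDNET = {
    "customer": {"client.n"},
    "client": {"client.n"},
    "name": {"name.n"},
    "street": {"street.n"},
    "id": {"identifier.n"},
    "identifier": {"identifier.n"},
}


def tokenize(label):
    """Split a label such as 'clientIdentifier' or 'customer_id' into lowercase words."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def known_word_ratio(labels):
    """Share of label words found in the dictionary (assumed readability proxy)."""
    tokens = [t for label in labels for t in tokenize(label)]
    if not tokens:
        return 1.0
    return sum(t in TOY_WORDNET for t in tokens) / len(tokens)


def synonymous_pairs(labels):
    """Label pairs whose words match position by position via a shared synonym group."""
    def senses(label):
        return [TOY_WORDNET.get(t, set()) for t in tokenize(label)]

    pairs = []
    for a, b in combinations(labels, 2):
        sa, sb = senses(a), senses(b)
        if sa and len(sa) == len(sb) and all(x & y for x, y in zip(sa, sb)):
            pairs.append((a, b))
    return pairs
```

For the labels `customer_id`, `clientIdentifier`, `street`, and `xq7`, this sketch treats five of the six words as known and reports `customer_id` and `clientIdentifier` as a synonymous pair, which is the kind of ambiguity the abstract describes.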
Pages: 4 to 10
Copyright: Copyright (c) IARIA, 2019
Publication date: June 2, 2019
Published in: conference
ISSN: 2308-4332
ISBN: 978-1-61208-715-3
Location: Athens, Greece
Dates: from June 2, 2019 to June 6, 2019