ALLDATA 2020, The Sixth International Conference on Big Data, Small Data, Linked Data and Open Data


Synonym Predicate Discovery for Linked Data Quality Assessment Without Requiring the Ontology Semantic Relations

Authors:
Samah Salem
Fouzia Benchikha

Keywords: linked data; quality assessment; semantic relations; synonym predicates; profiling statistics; DBpedia.

Abstract:
In recent years, an increasing number of datasets have been published as part of the Web of Data, reaching more than 1,200 datasets in 2019. However, many of these datasets, which together hold a large number of RDF triples, have no ontology or only an incomplete one. As a result, they increasingly suffer from quality problems. Assessing whether linked data is fit for use remains an open research problem. In this paper, we propose a novel approach for assessing the quality of RDF triples without requiring schema information. It assesses the quality of a dataset by detecting errors and then measuring the error rate, using synonym-predicate techniques, profiling statistics, and quality verification cases. Experiments on the DBpedia dataset give promising results: several data quality issues were detected, such as inaccurate values, redundant predicates, and redundant triples.

Pages: 8 to 13

Copyright: Copyright (c) IARIA, 2020

Publication date: February 23, 2020

Published in: ALLDATA 2020, The Sixth International Conference on Big Data, Small Data, Linked Data and Open Data

ISSN: 2519-8386

ISBN: 978-1-61208-775-7

Location: Lisbon, Portugal

Dates: from February 23, 2020 to February 27, 2020