Home // DBKDA 2017, The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications // View article
A Large Scale Synthetic Social Provenance Database
Authors:
Mohamed Jehad Baeth
Mehmet S Aktas
Keywords: data quality, large-scale database, provenance, social networks, synthetic workflow simulation
Abstract:
Provenance about data derivations in social networks is commonly referred as social provenance, which helps in estimating data quality, tracking of resources, and understanding the ways of information diffusion in social networks. We observed several challenges related to provenance in the social network domain. First, provenance collection systems capture provenance on the fly; however, their collection mechanism may be faulty and have dropped provenance notifications. Hence, social provenance records may be partial, partitioned, or simply inaccurate. Although current provenance systems deliver a source of real provenance data, these systems do not provide a controlled provenance generation environment; and there are few that contain provenance with failures. Synthetic provenance databases are available in other domains, such as e-Science; but there is also a need for such a database in the social networking domain. To address these challenges, this study introduces a large-scale noisy synthetic social provenance database, which includes a high volume of large-size social provenance graphs. It also introduces metrics that can be used to capture such vital information as provenance for calculating data quality and user credibility.
Pages: 16 to 22
Copyright: Copyright (c) IARIA, 2017
Publication date: May 21, 2017
Published in: conference
ISSN: 2308-4332
ISBN: 978-1-61208-558-6
Location: Barcelona, Spain
Dates: from May 21, 2017 to May 25, 2017