International Journal On Advances in Systems and Measurements, volume 12, numbers 3 and 4, 2019


Data Quality Challenges in Weather Sensor Data, Including Identification of Mis-located Sites

Authors:
Douglas Galarus
Rafal Angryk

Keywords: Data Quality; Spatial-Temporal Data; Quality Control; Outlier; Inlier; Bad Data; Ground Truth; Bad Metadata

Abstract:
Developing and evaluating data quality methods for spatio-temporal data involves many challenges, including the real-world cost and infeasibility of verifying ground truth, non-isotropic covariance, near-real-time operation, time-related issues, bad data, bad metadata, and other quality factors. In this paper, we demonstrate the challenges of evaluating spatio-temporal data quality methods for weather sensor data, using a method we developed and other popular interpolation-based methods to conduct model-based outlier detection. We show that a multi-faceted approach is necessary to counteract the impact of outliers, and that evaluation is difficult when the labels of good and bad data are themselves incorrect. We also investigate in depth the challenge of identifying mis-located sites.

Pages: 181 to 197

Copyright: Copyright (c) to authors, 2019. Used with permission.

Publication date: December 30, 2019

Published in: journal

ISSN: 1942-261X