SIMUL 2022, The Fourteenth International Conference on Advances in System Simulation


Experimental Comparison of Some Multiple Imputation Methods From the R Package Mice

Authors:
Wim De Mulder

Keywords: multiple imputation, interval score, R package mice

Abstract:
Missing values are an annoying but common artifact of many real-world data sets. The most convenient solution is simply to discard the variables with missing values. This is not risk-free, however: it may eliminate useful information, and under certain circumstances ignoring missing data may even introduce bias into downstream statistical inferences. A more statistically valid approach is multiple imputation, which fills in plausible values at the locations where values are missing. This paper provides an experimental comparison of some multiple imputation methods from the R package mice on two real-world data sets. Our analysis suggests some interesting hypotheses, e.g., that the absolute number of missing values has a more profound influence on the performance of imputation methods than the relative number of missing values. From the analysis, we draw some guidelines for data analysts who intend to impute missing values. Our work is also of particular relevance for statisticians, as most statistical analyses require complete data.

Pages: 8 to 14

Copyright: Copyright (c) IARIA, 2022

Publication date: October 16, 2022

Published in: conference

ISSN: 2308-4537

ISBN: 978-1-68558-001-8

Location: Lisbon, Portugal

Dates: from October 16, 2022 to October 20, 2022