Challenges and Implications of Missing Data on the Validity of Inferences and Options for Choosing the Right Strategy in Handling Them
International Journal of Statistical Distributions and Applications
Volume 3, Issue 4, December 2017, Pages: 87-94
Received: Sep. 29, 2017; Accepted: Oct. 17, 2017; Published: Nov. 20, 2017
Nicholas Pindar Dibal, Department of Mathematical Sciences, University of Maiduguri, Maiduguri, Nigeria
Ray Okafor, Department of Mathematics, University of Lagos, Lagos, Nigeria
Hamadu Dallah, Department of Actuarial Science, University of Lagos, Lagos, Nigeria
Missing data are a common occurrence in surveys and experimental research, with serious implications for the validity of inferences. Advances in statistical procedures provide better and more efficient methods of handling missing data, yet many researchers still handle incomplete data in ways that affect the results negatively. We review in detail the mechanisms that generate missingness and the appropriate methods of accounting for missing values, so that researchers have adequate knowledge to make an informed choice of method.
Missing Data, Inference, Missingness Mechanisms, Ignorable, Non-Ignorable Missingness, Multiple Imputation
To cite this article
Nicholas Pindar Dibal, Ray Okafor, Hamadu Dallah, Challenges and Implications of Missing Data on the Validity of Inferences and Options for Choosing the Right Strategy in Handling Them, International Journal of Statistical Distributions and Applications. Vol. 3, No. 4, 2017, pp. 87-94. doi: 10.11648/j.ijsd.20170304.15
Copyright © 2017 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.