Non-responses leading to missing data are common in most studies and
cause inefficient and biased statistical inference if ignored. When faced
with missing data, many studies employ a complete case analysis approach
to estimate the parameters of the model. This, however, compromises the
low bias and minimum variance expected of the estimates. Several classical
and model-based techniques for imputing missing values have been described
in the literature. The Bayesian approach to missingness is deemed superior
to the other techniques because it lends itself naturally to missing data
settings: the missing values are treated as unobserved random variables
whose distribution depends on the observed data. This paper examines the
superiority of Bayesian imputation over Multiple Imputation with Chained
Equations (MICE) when estimating logistic panel data models with single fixed
effects. The study validates the superiority of conditional maximum likelihood
estimates for the nonlinear binary choice logit panel model in the presence of
missing observations. A Monte Carlo simulation was designed to determine the
magnitude of the bias and root mean square error (RMSE) arising from MICE and
full Bayesian imputation. The simulation results show that the conditional maximum
likelihood (ML) logit estimator presented in this paper is less biased and more
efficient when Bayesian imputation is performed to handle non-responses.
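
The two criteria named above, bias and RMSE, can be computed from Monte Carlo replications roughly as follows. This is a minimal sketch, not the paper's simulation code: the function name `bias_and_rmse` and the toy numbers are assumptions made for illustration, and the estimates would in practice come from fitting the logit panel model to each imputed data set.

```python
import math

def bias_and_rmse(estimates, true_value):
    """Return (bias, RMSE) of replicated estimates against a known true parameter."""
    if not estimates:
        raise ValueError("need at least one replication")
    # Estimation error in each Monte Carlo replication.
    errors = [est - true_value for est in estimates]
    # Bias: average error across replications.
    bias = sum(errors) / len(errors)
    # RMSE: square root of the average squared error.
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse

# Toy values for illustration only; these are not results from the paper.
toy_estimates = [0.9, 1.1, 1.2, 0.8]
bias, rmse = bias_and_rmse(toy_estimates, true_value=1.0)
```

Running the same function on the estimates obtained under each imputation method gives one (bias, RMSE) pair per method, so the two methods can be compared on the same criteria.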