Superiority of Bayesian Imputation to Mice in Logit Panel Data Models

DOI: 10.4236/ojs.2023.133017, PP. 316-358

Keywords: Panel Data, Imputation, Monte Carlo, Bias, Conditional Maximum Likelihood

Abstract:

Non-response leading to missing data is common in most studies and, if ignored, causes inefficient and biased statistical inference. When faced with missing data, many studies resort to complete case analysis to estimate the model parameters. This, however, compromises the low bias and minimum variance expected of the estimates. Several classical and model-based techniques for imputing missing values have been proposed in the literature. The Bayesian approach is deemed superior to the other techniques because it lends itself naturally to missing-data settings: the missing values are treated as unobserved random variables whose distribution depends on the observed data. This paper examines the superiority of Bayesian imputation over Multiple Imputation by Chained Equations (MICE) when estimating logistic panel data models with single fixed effects. The study validates the superiority of conditional maximum likelihood estimates for the nonlinear binary-choice logit panel model in the presence of missing observations. A Monte Carlo simulation was designed to determine the magnitude of the bias and root mean square error (RMSE) arising from MICE and full Bayesian imputation. The simulation results show that the conditional maximum likelihood (ML) logit estimator presented in this paper is less biased and more efficient when non-response is handled by Bayesian imputation rather than MICE.
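The Monte Carlo evaluation described in the abstract can be illustrated with a minimal sketch. It is not the paper's simulation design: the two-period data-generating process, the sample sizes, the completely-at-random missingness, and every function name below are assumptions made for illustration. Neither MICE nor Bayesian imputation is implemented; the sketch only shows how the bias and RMSE of a conditional ML logit estimate could be measured under complete case analysis.

```python
import math
import random


def sigmoid(u):
    """Logistic function, clipped to avoid overflow in math.exp."""
    u = max(-30.0, min(30.0, u))
    return 1.0 / (1.0 + math.exp(-u))


def simulate_panel(n_units, beta, rng):
    """Two-period binary panel with unit fixed effects (assumed design)."""
    data = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)  # unit-specific fixed effect
        periods = []
        for _ in range(2):
            x = rng.gauss(0.0, 1.0) + 0.5 * alpha  # covariate correlated with alpha
            y = 1 if rng.random() < sigmoid(alpha + beta * x) else 0
            periods.append((x, y))
        data.append(periods)
    return data


def drop_incomplete(data, miss_prob, rng):
    """Complete case analysis: drop a unit if either covariate is missing.

    Missingness is completely at random here, which is an assumption.
    """
    return [unit for unit in data
            if all(rng.random() >= miss_prob for _ in unit)]


def conditional_logit_t2(data, iters=25):
    """Conditional ML for T = 2.

    Among units whose outcome changes, P(y2 = 1) = sigmoid(beta * (x2 - x1)),
    so the fixed effect drops out. Solved by Newton-Raphson on beta.
    """
    dx, d = [], []
    for (x1, y1), (x2, y2) in data:
        if y1 + y2 == 1:  # only switchers carry information
            dx.append(x2 - x1)
            d.append(y2)
    b = 0.0
    for _ in range(iters):
        score = sum(z * (yi - sigmoid(b * z)) for z, yi in zip(dx, d))
        info = sum(z * z * sigmoid(b * z) * (1.0 - sigmoid(b * z)) for z in dx)
        if info <= 0.0:
            break
        b += score / info
    return b


def bias_and_rmse(estimates, true_value):
    """Monte Carlo bias and root mean square error of replicate estimates."""
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, rmse


if __name__ == "__main__":
    rng = random.Random(2023)
    true_beta = 1.0
    estimates = []
    for _ in range(200):  # Monte Carlo replications
        panel = simulate_panel(300, true_beta, rng)
        complete = drop_incomplete(panel, 0.2, rng)
        estimates.append(conditional_logit_t2(complete))
    bias, rmse = bias_and_rmse(estimates, true_beta)
    print(f"bias = {bias:.4f}, RMSE = {rmse:.4f}")
```

To compare imputation methods in this framework, one would replace `drop_incomplete` with a step that fills in the missing covariates and then compare the resulting bias and RMSE across methods.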
