Dealing with Multicollinearity in Factor Analysis: The Problem, Detections, and Solutions

DOI: 10.4236/ojs.2023.133020, PP. 404-424

Keywords: Multicollinearity, Factor Analysis, Biased Factor Loadings, Unreliable Factor Structure, Reduced Stability, Variance Inflation Factor


Abstract:

Multicollinearity in factor analysis has several negative effects: an unreliable factor structure, inconsistent loadings, inflated standard errors, reduced discriminant validity, and factors that are difficult to interpret. It also reduces stability, hinders factor replication, invites misinterpretation of factor importance, increases instability in parameter estimation, reduces the power to detect the true factor structure, compromises model fit indices, and biases factor loadings. Multicollinearity thus introduces uncertainty and complexity and limits generalizability, hampering factor analysis. To detect multicollinearity, researchers can examine the correlation matrix to identify variables with high correlation coefficients. The Variance Inflation Factor (VIF) measures how much the variance of an estimated regression coefficient is inflated by multicollinearity. Tolerance, the reciprocal of VIF, indicates the proportion of variance in a predictor variable not shared with the others. Eigenvalues of the correlation matrix also help assess multicollinearity: values near zero indicate near-linear dependence among variables, while values greater than 1 suggest which factors to retain. Principal Component Analysis (PCA) reduces dimensionality and identifies highly correlated variables. Other diagnostic measures include the condition number and Cook’s distance. To resolve the problem, researchers can center or standardize the data, filter variables, use PCA instead of factor analysis, employ factor scores, merge correlated variables, or apply clustering techniques. Further research is needed to explore different types of multicollinearity, assess the effectiveness of these methods, and investigate the relationship between multicollinearity and other factor analysis issues.
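
As an illustration of the diagnostics named in the abstract, the following is a minimal sketch in Python with NumPy, not the authors' own procedure. It assumes the data are in a numeric array with one column per variable. The function name `collinearity_diagnostics` and the simulated variables `x1`, `x2`, `x3` are hypothetical. The VIF is taken from the diagonal of the inverse correlation matrix, which equals 1 / (1 - R_j^2) for each variable, and the condition number is computed as the square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix, which is one common definition among several.

```python
import numpy as np


def collinearity_diagnostics(X):
    """Compute basic multicollinearity diagnostics for an (n, p) data array.

    Illustrative helper (hypothetical name); returns a dict of NumPy arrays.
    """
    X = np.asarray(X, dtype=float)
    # Correlation matrix of the variables (columns).
    R = np.corrcoef(X, rowvar=False)
    # VIF_j is the j-th diagonal element of the inverse correlation matrix.
    vif = np.diag(np.linalg.inv(R))
    # Tolerance is the reciprocal of VIF, i.e. 1 - R_j^2.
    tolerance = 1.0 / vif
    # Eigenvalues of the symmetric matrix R, sorted in descending order.
    eigenvalues = np.linalg.eigvalsh(R)[::-1]
    # Condition number: sqrt(largest eigenvalue / smallest eigenvalue).
    condition_number = np.sqrt(eigenvalues[0] / eigenvalues[-1])
    return {
        "correlation": R,
        "vif": vif,
        "tolerance": tolerance,
        "eigenvalues": eigenvalues,
        "condition_number": condition_number,
    }


# Simulated example: x2 is nearly a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)

result = collinearity_diagnostics(np.column_stack([x1, x2, x3]))
for name in ("vif", "tolerance", "eigenvalues", "condition_number"):
    print(name, np.round(result[name], 3))
```

In this simulated case the two nearly identical variables should show a large VIF, a tolerance close to zero, and one eigenvalue close to zero, while the unrelated variable should not. Commonly cited rules of thumb (for example, a VIF above 10) are conventions rather than strict cutoffs.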
