Measures of Variability for Qualitative Variables Using the R Software

DOI: 10.4236/ojs.2024.143013, PP. 259-293

Keywords: Variation Ratio, Relative Entropy, Index of Qualitative Variation, Standard Deviation from Mode, Bootstrap Confidence Interval

Abstract:

Although many measures of variability exist for qualitative variables, they are rarely used in social research and are not included in statistical software. This article presents six easily calculated measures of variation for qualitative variables and facilitates their use by means of the R software. Four of the measures are most affected by proximity to a constant random variable, where measures of variability for qualitative variables reach their minimum value of 0: Freeman’s variation ratio, Moral’s universal variation ratio, Kvalseth’s standard deviation from the mode, and Wilcox’s variation ratio. The other two, the Gibbs-Poston index of qualitative variation and Shannon’s relative entropy, are more affected by proximity to a uniform distribution, where the measures reach their maximum value of 1. Both point and interval estimation are addressed; confidence intervals are obtained by bootstrap, using the percentile method and the bias-corrected and accelerated (BCa) percentile method. Two calculation situations are presented: a sample with a single mode and a sample with two or more modes. The standard deviation from the mode is particularly recommended among the six measures, and the universal variation ratio among the three variation ratios.
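As a rough illustration of the point and interval estimation described in the abstract, the sketch below computes four of the six measures and a percentile bootstrap interval. It is written in Python for illustration only; the article itself works in R, and this is not the authors' code. The function names are hypothetical, and the formulas are the commonly cited textbook definitions, which may differ in detail from the article's. Moral's universal variation ratio, Kvalseth's standard deviation from the mode, and the BCa interval are left out because the abstract does not state their formulas. The number of categories is passed explicitly so that bootstrap resamples missing a category are still scaled consistently.

```python
import math
import random
from collections import Counter


def proportions(sample):
    """Observed category proportions (categories with zero count are absent)."""
    n = len(sample)
    return [count / n for count in Counter(sample).values()]


def variation_ratio(sample, n_categories):
    # Freeman's variation ratio: share of cases outside the modal category.
    # n_categories is unused here; kept so all measures share one signature.
    return 1.0 - max(proportions(sample))


def wilcox_modvr(sample, n_categories):
    # Wilcox's variation ratio: Freeman's ratio rescaled to the range [0, 1].
    k = n_categories
    return k * (1.0 - max(proportions(sample))) / (k - 1)


def iqv(sample, n_categories):
    # Index of qualitative variation: k / (k - 1) * (1 - sum of p_i squared).
    k = n_categories
    return k * (1.0 - sum(p * p for p in proportions(sample))) / (k - 1)


def relative_entropy(sample, n_categories):
    # Shannon entropy divided by its maximum, ln(k).
    h = -sum(p * math.log(p) for p in proportions(sample))
    return h / math.log(n_categories)


def percentile_ci(sample, stat, n_categories, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(sample, n_categories)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(
        stat(rng.choices(sample, k=n), n_categories) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = reps[math.floor((alpha / 2) * (n_boot - 1))]
    upper = reps[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Hypothetical sample: three categories with counts 50, 30 and 20.
data = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
for measure in (variation_ratio, wilcox_modvr, iqv, relative_entropy):
    point = measure(data, 3)
    low, high = percentile_ci(data, measure, 3)
    print(f"{measure.__name__}: {point:.3f} (95% CI {low:.3f} to {high:.3f})")
```

All four functions use the largest observed proportion or the full set of proportions, so a sample with tied modes needs no special handling for these particular measures. Note that Freeman's ratio in this form has a maximum of (k - 1)/k, not 1; Wilcox's rescaling is what brings it to the 0 to 1 range.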

References

[1]  Wilcox, A.R. (1973) Indices of Qualitative Variation and Political Measurement. The Western Political Quarterly, 26, 325-343.
https://doi.org/10.1177/106591297302600209
[2]  Agresti, A. and Agresti, B.F. (1978) Statistical Analysis of Qualitative Variation. Sociological Methodology, 9, 204-237.
https://doi.org/10.2307/270810
[3]  Moral de la Rubia, J. (2022) Una Medida de Variación para Datos Cualitativos con Cualquier Tipo de Distribución [A Measure of Variation for Qualitative Data with Any Type of Distribution]. Psychologia, 16, 63-76.
https://doi.org/10.21500/19002386.5642
[4]  Levitt, H.M. (2021) Qualitative Generalization, Not to the Population but to the Phenomenon: Reconceptualizing Variation in Qualitative Research. Qualitative Psychology, 8, 95-110.
https://doi.org/10.1037/qup0000184
[5]  Maxwell, J.A. (2021) Why Qualitative Methods Are Necessary for Generalization. Qualitative Psychology, 8, 111-118.
https://doi.org/10.1037/qup0000173
[6]  Golan, A. and Harte, J. (2022) Information Theory: A Foundation for Complexity Science. Proceedings of the National Academy of Sciences of the United States of America, 119, e2119089119.
https://doi.org/10.1073/pnas.2119089119
[7]  Simpson, E.H. (1949) Measurement of Diversity. Nature, 163, 688.
https://doi.org/10.1038/163688a0
[8]  Li, Y., Garg, H. and Deng, Y. (2020) A New Uncertainty Measure of Discrete Z-Numbers. International Journal of Fuzzy Systems, 22, 760-776.
https://doi.org/10.1007/s40815-020-00819-8
[9]  Weiss, C.H. (2019) On the Sample Coefficient of Nominal Variation. In: Steland, A., Rafajłowicz, E. and Okhrin, O., Eds., Stochastic Models, Statistics and Their Applications, Springer, Cham, 239-250.
[10]  Freeman, L.C. (1965) Elementary Applied Statistics for Students in Behavioral Sciences. John Wiley and Sons, New York.
[11]  Kvalseth, T.O. (1988) Measuring Variation for Nominal Data. Bulletin of the Psychonomic Society, 26, 433-436.
https://doi.org/10.3758/BF03334906
[12]  Gibbs, J.P. and Poston Jr., D.L. (1975) The Division of Labor: Conceptualization and Related Measures. Social Forces, 53, 468-476.
https://doi.org/10.2307/2576589
[13]  Shannon, C.E. (1948) A Mathematical Theory of Communication. The Bell System Technical Journal, 27, 379-423.
https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
[14]  Deacon, D. and Stanyer, J. (2021) Media Diversity and the Analysis of Qualitative Variation. Communication and the Public, 6, 19-32.
https://doi.org/10.1177/20570473211006481
[15]  Evren, A., Tuna, E., Ustaoglu, E. and Sahin, B. (2021) Some Dominance Indices to Determine Market Concentration. Journal of Applied Statistics, 48, 2755-2775.
https://doi.org/10.1080/02664763.2021.1963421
[16]  Gini, C. (1912) Variabilità e Mutabilità: Contributo allo Studio delle Distribuzioni e delle Relazioni Statistiche. Tipografia di Paolo Cuppini, Bologna.
[17]  Keren, S., Svoboda, M., Janda, P. and Nagel, T.A. (2020) Relationships between Structural Indices and Conventional Stand Attributes in an Old-Growth Forest in Southeast Europe. Forests, 11, Article 4.
https://doi.org/10.3390/f11010004
[18]  Sharp, K. (2019) Entropy and the Tao of Counting: A Brief Introduction to Statistical Mechanics and the Second Law of Thermodynamics. Springer, Cham.
https://doi.org/10.1007/978-3-030-35457-2
[19]  Wald, A. (1939) Contributions to the Theory of Statistical Estimation and Testing Hypotheses. Annals of Mathematical Statistics, 10, 299-326.
https://doi.org/10.1214/aoms/1177732144
[20]  Zepeda-Tello, R., Schomaker, M., Maringe, C., Smith, M.J., Belot, A., Rachet, B., Schnitzer, M.E. and Luque-Fernandez, M.A. (2022) Delta Method in Epidemiology: An Applied and Reproducible Tutorial. arXiv: 2206.15310.
[21]  Janczyk, M. and Pfister, R. (2023) Confidence Intervals. In: Janczyk, M. and Pfister, R., Eds., Understanding Inferential Statistics: From A for Significance Test to Z for Confidence Interval, Springer, Heidelberg, 69-80.
https://doi.org/10.1007/978-3-662-66786-6_6
[22]  James, G., Witten, D., Hastie, T., Tibshirani, R. and Taylor, J. (2023) Resampling Methods. In: James, G., Witten, D., Hastie, T., Tibshirani, R. and Taylor, J., Eds., An Introduction to Statistical Learning, Springer, Cham, 201-228.
https://doi.org/10.1007/978-3-031-38747-0_5
[23]  Van de Schoot, R., Depaoli, S., King, R., Kramer, B., Märtens, K., Tadesse, M.G., Vannucci, M., Gelman, A., Veen, D., Willemsen, J. and Yau, C. (2021) Bayesian Statistics and Modelling. Nature Reviews Methods Primers, 1, Article No. 1.
https://doi.org/10.1038/s43586-020-00001-2
[24]  Blaker, H. (2000) Confidence Curves and Improved Exact Confidence Intervals for Discrete Distributions. Canadian Journal of Statistics, 28, 783-798.
https://doi.org/10.2307/3315916
[25]  Zelikman, E., Wu, Y., Mu, J. and Goodman, N. (2022) Star: Bootstrapping Reasoning with Reasoning. Advances in Neural Information Processing Systems, 35, 15476-15488.
[26]  Rousselet, G.A., Pernet, C.R. and Wilcox, R.R. (2021) The Percentile Bootstrap: A Primer with Step-by-Step Instructions in R. Advances in Methods and Practices in Psychological Science, 4.
https://doi.org/10.1177/2515245920911881
[27]  Efron, B. (1987) Better Bootstrap Confidence Intervals. Journal of the American Statistical Association, 82, 171-185.
https://doi.org/10.1080/01621459.1987.10478410
[28]  Canty, A. (2022) Package ‘boot’.
https://cran.r-project.org/web/packages/boot/boot.pdf
[29]  Freedman, D. and Diaconis, P. (1981) On the Histogram as a Density Estimator: L2 Theory. Probability Theory and Related Fields, 57, 453-476.
https://doi.org/10.1007/BF01025868
[30]  Hyndman, R.J. and Fan, Y. (1996) Sample Quantiles in Statistical Packages. American Statistician, 50, 361-365.
https://doi.org/10.1080/00031305.1996.10473566
[31]  Guajardo, S.A. (2024) Assessing Organizational Diversity with the Index of Qualitative Variation. Cambridge Scholars Publishing, Cambridge.
[32]  Feutrill, A. and Roughan, M. (2021) A Review of Shannon and Differential Entropy Rate Estimation. Entropy, 23, Article 1046.
https://doi.org/10.3390/e23081046
[33]  Efron, B. and Narasimhan, B. (2020) The Automatic Construction of Bootstrap Confidence Intervals. Journal of Computational and Graphical Statistics, 29, 608-619.
https://doi.org/10.1080/10618600.2020.1714633
[34]  Ramachandran, K.M. and Tsokos, C.P. (2020) Mathematical Statistics with Applications in R. Academic Press, San Diego.
[35]  Coolidge, F.L. (2020) Statistics: A Gentle Introduction. 4th Edition, Sage Publications, Thousand Oaks.
https://doi.org/10.4135/9781071939000
[36]  Poncet, P. (2022) Package ‘Modeest’. Mode Estimation.
https://cran.r-project.org/web/packages/modeest/modeest.pdf
[37]  Banić, N. and Elezović, N. (2021) TVOR: Finding Discrete Total Variation Outliers among Histograms. IEEE Access, 9, 1807-1832.
https://doi.org/10.1109/ACCESS.2020.3047342
[38]  Nielsen, F. and Nock, R. (2017) MaxEnt Upper Bounds for the Differential Entropy of Univariate Continuous Distributions. IEEE Signal Processing Letters, 24, 402-406.
https://doi.org/10.1109/LSP.2017.2666792
[39]  Moral de la Rubia, J. (2023) Shape Measures for the Distribution of a Qualitative Variable. Open Journal of Statistics, 13, 619-634.
https://doi.org/10.4236/ojs.2023.134030
[40]  Grubbs, F.E. (1950) Sample Criteria for Testing Outlying Observations. Annals of Mathematical Statistics, 21, 27-58.
https://doi.org/10.1214/aoms/1177729885
[41]  Dixon, W.J. (1951) Ratios Involving Extreme Values. Annals of Mathematical Statistics, 22, 68-78.
https://doi.org/10.1214/aoms/1177729693
[42]  Rosner, B. (1983) Percentage Points for a Generalized ESD Many-Outlier Procedure. Technometrics, 25, 165-172.
https://doi.org/10.1080/00401706.1983.10487848
[43]  Nowak-Brzezińska, A. and Łazarz, W. (2021) Qualitative Data Clustering to Detect Outliers. Entropy, 23, Article 869.
https://doi.org/10.3390/e23070869
[44]  Kvålseth, T.O. (1991) Statistical Inference for the Odds Measure of Qualitative Variation. Perceptual and Motor Skills, 72, 115-118.
https://doi.org/10.2466/pms.1991.72.1.115
[45]  Fossett, M. (2017) New Methods for Measuring and Analyzing Segregation. Springer, Cham.
https://doi.org/10.1007/978-3-319-41304-4
[46]  Hutchens, R.M. (2004) One Measure of Segregation. International Economic Review, 45, 555-578.
https://doi.org/10.1111/j.1468-2354.2004.00136.x
[47]  Chakravarty, S.R. and Silber, J. (2007) A Generalized Index of Employment Segregation. Mathematical Social Sciences, 53, 185-195.
https://doi.org/10.1016/j.mathsocsci.2006.11.003
