The normal distribution, with its symmetric and medium-tailed (mesokurtic)
profile, is one of the most important distributions in probability theory,
parametric inference, and the description of quantitative variables. However,
many distributions are non-normal, and knowing that skewness is non-zero helps
to identify them and to decide on the use of techniques and
corrections. Pearson’s skewness coefficient, defined as the standardized signed
distance from the arithmetic mean to the median, is very simple to calculate and
clear to interpret under the normal distribution model, making it an excellent
measure for evaluating the normality assumption when complemented by visual
inspection of a histogram and a box-and-whisker plot. Starting from its variant
without the tripled numerator, Yule’s skewness coefficient, this methodological
article aims to facilitate the use of the latter measure by showing how to
obtain asymptotic and bootstrap confidence intervals for its interpretation.
The formulas are not only presented but also applied to an example in R. A
general interpretation rule of ±0.1 has been suggested, but this can only become relevant when
contextualized in relation to sample size and a measure of skewness with a
population or parametric value of zero. For this purpose, intervals with
confidence levels of 90%, 95%, and 99% were estimated with 10,000 random draws
with replacement from 57 normally distributed sample-populations of
different sizes. The article closes with suggestions for the use of this
measure of skewness.
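
As a rough illustration of the procedure summarized above, the following sketch computes Yule’s skewness coefficient, (mean − median) / standard deviation, and percentile bootstrap intervals at the three confidence levels from 10,000 resamples drawn with replacement. It is written in Python for illustration only (the article itself works in R), and every name in it, as well as the simulated data, is hypothetical. The percentile method and the normal-theory band, which assumes the well-known large-sample variance (π/2 − 1)/n of the coefficient under normality, are choices made for this sketch and are not necessarily the formulas used in the article.

```python
import math
import random
import statistics


def yule_skewness(values):
    """Yule's coefficient: (mean - median) / sample standard deviation."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    # Sample standard deviation with the n - 1 denominator.
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return (mean - median) / sd


def bootstrap_replicates(values, draws=10_000, seed=1):
    """Sorted bootstrap estimates from resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    return sorted(yule_skewness(rng.choices(values, k=n)) for _ in range(draws))


def percentile_interval(sorted_replicates, level):
    """Percentile bootstrap interval (one of several bootstrap variants)."""
    draws = len(sorted_replicates)
    tail = (1.0 - level) / 2.0
    lower = sorted_replicates[round(tail * draws)]
    upper = sorted_replicates[round((1.0 - tail) * draws) - 1]
    return lower, upper


def normal_null_halfwidth(n, level):
    """Half-width of the band around zero expected under normality.

    Assumption of this sketch: the large-sample variance of the
    coefficient for normal data is (pi / 2 - 1) / n.
    """
    z = statistics.NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return z * math.sqrt((math.pi / 2.0 - 1.0) / n)


# Hypothetical data: 200 pseudo-random values from a normal distribution.
data_rng = random.Random(42)
sample = [data_rng.gauss(50.0, 10.0) for _ in range(200)]

estimate = yule_skewness(sample)
replicates = bootstrap_replicates(sample)
print(f"Yule's skewness coefficient: {estimate:.3f}")
for level in (0.90, 0.95, 0.99):
    low, high = percentile_interval(replicates, level)
    band = normal_null_halfwidth(len(sample), level)
    print(f"{level:.0%} bootstrap CI: [{low:.3f}, {high:.3f}]; "
          f"band under normality: +/-{band:.3f}")
```

An interval that contains zero is compatible with symmetry at the chosen confidence level. The normal-theory band narrows as the sample size grows, which is one way to see why a fixed cutoff such as ±0.1 needs to be read in relation to sample size.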