Depth-Based Classification for Distributions with Nonconvex Support

DOI: 10.1155/2013/629184

Abstract:

Halfspace depth has become a popular nonparametric tool for the statistical analysis of multivariate data over the last two decades. One application of data depth recently considered in the literature is the classification problem. The data depth approach is used instead of linear discriminant analysis mostly to avoid parametric assumptions and to obtain a better classifier for data whose distribution is not elliptically symmetric, for example, skewed data. In this paper, we suggest using a weighted version of halfspace depth rather than halfspace depth itself in order to obtain a lower misclassification rate in the case of “nonconvex” distributions. Simulations show that the results of depth-based classifiers are comparable with linear discriminant analysis for two normal populations, while for nonelliptic distributions the classifier based on weighted halfspace depth outperforms both linear discriminant analysis and the classifier based on the usual (nonweighted) halfspace depth.

1. Introduction

Data classification, or discrimination, is an old statistical problem that has been discussed, studied, and applied for more than a century. We therefore recall the problem only very briefly. The goal of classification is to allocate a new observation to one of two (or more) groups. The rule for assigning a new observation to one of the possible groups is built by analyzing available observations with known group assignment (the so-called training set). The most popular classification method is based on the normality assumption and is known as linear discriminant analysis (LDA). LDA is easy to use and quite successful in many cases, in particular for elliptically symmetric distributions. Naturally, there are also nonparametric methods of classification. One motivation for replacing LDA is to avoid the normality assumption and thereby, hopefully, get better results for nonelliptic distributions.
Recently it was proposed to use data depth as the basis for new nonparametric classifiers; see, for example, [1] or [2]. Let us recall the essential facts about data depth. The depth of a point $x$ with respect to a probability measure $P$ is a nonnegative number that measures the “centrality” of $x$ with respect to $P$. In other words, depth should reflect the position of the point with respect to the probability distribution or, in the sample version, with respect to the observed data cloud. Recall also that for multivariate data there is no natural linear ordering available. The depth, as a nonnegative (and bounded) number, allows to
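
To make the notion of depth concrete, the sketch below approximates the sample halfspace (Tukey) depth of a point, that is, the smallest fraction of observations lying in a closed halfspace whose boundary passes through the point. It minimizes over a finite set of random directions and then assigns a new observation to the group in which it is deepest. This is only an illustration under stated assumptions: the names `halfspace_depth` and `max_depth_classify`, the random-direction approximation, and the default of 500 directions are choices made here, not the authors' implementation, and the weighted halfspace depth proposed in the paper is not implemented.

```python
import numpy as np


def halfspace_depth(x, data, n_dirs=500, rng=None):
    """Approximate the sample halfspace depth of x w.r.t. the rows of data.

    Assumption: the infimum over all halfspaces is replaced by a minimum
    over n_dirs random unit directions, so the result is an upper bound
    on the exact sample depth.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    # Random unit vectors serving as halfspace normals.
    dirs = rng.standard_normal((n_dirs, data.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Signed projections of the centred observations on each direction.
    proj = (data - x) @ dirs.T
    # Fraction of observations in the closed halfspace {y : u.(y - x) >= 0}.
    frac = (proj >= 0).mean(axis=0)
    return float(frac.min())


def max_depth_classify(x, groups, n_dirs=500, seed=0):
    """Return the index of the training group in which x has the largest depth.

    The same seed is reused for every group so that all depths are computed
    along the same directions. Ties go to the first group.
    """
    depths = [halfspace_depth(x, g, n_dirs=n_dirs, rng=seed) for g in groups]
    return int(np.argmax(depths))
```

A call such as `max_depth_classify(new_point, [sample_a, sample_b])` returns 0 or 1 depending on which training sample the new point lies deeper in. Because the sample halfspace depth is zero everywhere outside the convex hull of the data, a rule of this kind cannot separate groups at points that lie outside both hulls.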

References

[1]  R. Jörnsten, “Clustering and classification based on the $L_1$ data depth,” Journal of Multivariate Analysis, vol. 90, no. 1, pp. 67–89, 2004.
[2]  A. K. Ghosh and P. Chaudhuri, “On maximum depth and related classifiers,” Scandinavian Journal of Statistics, vol. 32, no. 2, pp. 327–350, 2005.
[3]  J. Tukey, “Mathematics and picturing data,” in Proceedings of the 1975 International Congress of Mathematics, vol. 2, pp. 523–531, August 1975.
[4]  R. Y. Liu, “On a notion of data depth based on random simplices,” Annals of Statistics, vol. 18, pp. 405–414, 1990.
[5]  G. Koshevoy and K. Mosler, “Zonoid trimming for multivariate distributions,” Annals of Statistics, vol. 25, no. 5, pp. 1998–2017, 1997.
[6]  Y. Vardi and C. Zhang, “The multivariate $L_1$-median and associated data depth,” Proceedings of the National Academy of Sciences of the United States of America, vol. 97, no. 4, pp. 1423–1426, 2000.
[7]  R. Y. Liu, R. Serfling, and D. L. Souvaine, Eds., Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications, American Mathematical Society, Providence, RI, USA, 2006.
[8]  D. Hlubinka and S. Nagy, “Functional data depth and classification,” Submitted.
[9]  D. Hlubinka, L. Kotík, and O. Vencálek, “Weighted halfspace depth,” Kybernetika, vol. 46, no. 1, pp. 125–148, 2010.
[10]  A. Hartikainen and H. Oja, “On some parametric, nonparametric and semiparametric discrimination rules,” in Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications, R. Liu, R. Serfling, and D. L. Souvaine, Eds., pp. 61–70, American Mathematical Society, Providence, RI, USA, 2006.
[11]  K. Mosler and R. Hoberg, “Data analysis and classification with the zonoid depth,” in Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications, R. Liu, R. Serfling, and D. L. Souvaine, Eds., pp. 49–59, American Mathematical Society, Providence, RI, USA, 2006.
[12]  Y. Zuo and R. Serfling, “General notions of statistical depth function,” Annals of Statistics, vol. 28, no. 2, pp. 461–482, 2000.
[13]  B. Chakraborty and P. Chaudhuri, “On a transformation and re-transformation technique for constructing an affine equivariant multivariate median,” Proceedings of the American Mathematical Society, vol. 124, no. 8, pp. 2539–2547, 1996.
