


Is Bagging Effective in the Classification of Small-Sample Genomic and Proteomic Data?

DOI: 10.1155/2009/158368


Abstract:

Randomized ensemble methods for classifier design combine the decisions of an ensemble of classifiers designed on randomly perturbed versions of the available data [1–5]. The combination is often done by means of majority voting among the individual classifier decisions [4–6], whereas the data perturbation usually employs the bootstrap resampling approach, which corresponds to sampling uniformly with replacement from the original data [7, 8]. The combination of bootstrap resampling and majority voting is known as bootstrap aggregation or bagging [4, 5].

There has been considerable interest recently in the application of bagging in the classification of both gene-expression data [9–12] and protein-abundance mass spectrometry data [13–18]. However, there is scant theoretical justification for the use of this heuristic, other than the expectation that combining the decisions of several classifiers will regularize and improve the performance of unstable overfitting classification rules, such as unpruned decision trees, provided one uses a large enough number of classifiers in the ensemble [4, 5]. It is also claimed that ensemble rules "do not overfit," meaning that classification error converges as the number of component classifiers tends to infinity [5].

However, the main performance issue is not whether the ensemble scheme improves the classification error of a single unstable overfitting classifier, or whether its classification error converges to a fixed limit; these are important questions, which have been studied in the literature (in particular when the component classifiers are decision trees) [5, 19–23], but the question of main practical interest is whether the ensemble scheme will improve the performance of unstable overfitting classifiers sufficiently to beat the performance of single stable, nonoverfitting classifiers, particularly in small-sample settings. Therefore, there is a pressing need to examine rigorously the suitability and validity of the ensemble approach
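As a rough illustration of the procedure the abstract describes (bootstrap resampling followed by majority voting), the following is a minimal sketch in Python. It is not the paper's experimental setup: the function names `bootstrap_sample`, `train_stump`, and `bagging`, the one-feature decision stump used as the component classifier, and the toy data in the usage note are all assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) points uniformly with replacement from data."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Fit a one-feature threshold classifier (hypothetical base rule).

    sample is a non-empty list of (x, label) pairs with labels 0 or 1.
    The stump predicts `polarity` when x <= t and 1 - polarity otherwise,
    choosing the (t, polarity) pair with the fewest training errors.
    """
    best = None
    for t in sorted({x for x, _ in sample}):
        for polarity in (0, 1):
            errors = sum(
                (polarity if x <= t else 1 - polarity) != y for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: polarity if x <= t else 1 - polarity


def bagging(data, n_classifiers, train, seed=0):
    """Train n_classifiers on bootstrap samples; predict by majority vote."""
    rng = random.Random(seed)
    ensemble = [train(bootstrap_sample(data, rng)) for _ in range(n_classifiers)]

    def predict(x):
        # An odd n_classifiers avoids ties between the two labels.
        votes = Counter(clf(x) for clf in ensemble)
        return votes.most_common(1)[0][0]

    return predict
```

With a small, hypothetical one-dimensional data set such as `[(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]`, calling `bagging(data, 25, train_stump)` returns a `predict` function that takes the majority vote of 25 stumps. Whether such an ensemble outperforms a single stable classifier on small samples is exactly the question the paper raises; the sketch only shows the mechanics.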
