Principal Component Analysis for Authorship Attribution

Keywords: principal components, authorship attribution, stylometry, text categorization, function words, classification task, stylistic features, syntactic characteristics

Abstract:

A common problem in statistical pattern recognition is that of feature selection or feature extraction. Feature selection refers to a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space. However, the transformation is designed in such a way that the data set may be represented by a reduced number of "effective" features and yet retain most of the intrinsic information content of the data; in other words, the data set undergoes a dimensionality reduction. In this paper, data collected by counting words and characters in around a thousand paragraphs of each sample book underwent a principal component analysis performed using neural networks. The first principal component is then used to distinguish the books written by a given author.
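The abstract does not say which neural network was used for the principal component analysis. As a minimal sketch, assuming a single linear neuron trained with Oja's Hebbian rule (one common neural approach to PCA, chosen here for illustration only), the first principal component could be estimated as follows. The function names, the learning rate, and the synthetic three-feature data are hypothetical stand-ins for the paper's per-paragraph word and character counts.

```python
import numpy as np

def first_component_oja(X, lr=0.001, epochs=20, seed=0):
    """Estimate the first principal component with Oja's rule (illustrative)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)              # centre each feature
    w = rng.normal(size=Xc.shape[1])     # random initial weight vector
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in Xc[rng.permutation(len(Xc))]:
            y = w @ x                    # output of the single linear neuron
            w += lr * y * (x - y * w)    # Oja's update keeps ||w|| close to 1
    return w / np.linalg.norm(w)

def make_toy_data(n=500, seed=1):
    """Synthetic data with one dominant direction of variance (not real counts)."""
    rng = np.random.default_rng(seed)
    direction = np.array([3.0, 1.0, 0.5])
    t = rng.normal(size=(n, 1))
    noise = 0.1 * rng.normal(size=(n, 3))
    return t * direction + noise

X = make_toy_data()
w = first_component_oja(X)

# Project every sample onto the learned component; in the paper's setting,
# scores like these would be the quantity compared across books.
scores = (X - X.mean(axis=0)) @ w
```

The learned vector `w` is defined only up to sign, so any comparison with a reference decomposition should use the absolute value of the dot product.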