%0 Journal Article
%T Principal Component Analysis for Authorship Attribution
%A Amir Jamak
%A Alen Savatic
%A Mehmet Can
%J Southeast Europe Journal of Soft Computing
%D 2012
%I
%X A common problem in statistical pattern recognition is that of feature selection or feature extraction. Feature selection refers to a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space. However, the transformation is designed in such a way that the data set may be represented by a reduced number of "effective" features and yet retain most of the intrinsic information content of the data; in other words, the data set undergoes a dimensionality reduction. In this paper, the data collected by counting words and characters in around a thousand paragraphs of each sample book underwent a principal component analysis performed using neural networks. The first principal component is then used to distinguish the books written by a given author.
%K principal components
%K authorship attribution
%K stylometry
%K text categorization
%K function words
%K classification task
%K stylistic features
%K syntactic characteristics
%U http://www.scjournal.com.ba/index.php/scjournal/article/view/10/9