|
- 2019
MISMATCH OF DATA MINING SOFTWARE PREDICTION UNDER POSITIVE DEFINITE MATRIX PROBLEM
Keywords: Data Mining, Positive Definite Matrix, Mismatch
Abstract: Like other analysis software, data mining software (DMS) implements its methods according to a reference manual. Vendors state that their results are obtained consistently with that manual and that their analyses are based on the references it cites. Different DMS packages rely on different references, which may lead to different results. Some analyses can also be run with different convergence and iteration techniques, and the same analysis run with different techniques can produce a different outcome. The aim of this study is to reveal the mismatch between data mining software results for the same analysis. As example data, 270 Borsa İstanbul companies from the 2012 period were used. In order to survive in a competitive environment, companies must balance their monetary and non-monetary assets. Edward I. Altman et al. introduced the Altman Z score (AZS), a procedure that determines from the balance sheet and income table (BS&IT) whether a company is in financial distress. In this study, 21 financial ratios were calculated from the BS&IT items. These ratios are correlated with one another because of the nature of the BS&IT. Principal component analysis (PCA) was applied to address the resulting multicollinearity and to reduce the dimension. The covariance matrix analyzed with PCA was not positive definite, and each DMS uses its own numerical procedure to handle this problem. Companies were labeled categorically as 0 or 1 according to whether the AZS indicated financial distress. Using the reduced set of variables obtained from PCA, the financial distress of the companies was predicted with binary logistic regression (BLR). The BLR prediction performance of the data mining software was compared using ROC curves.
After PCA, BLR was conducted in the IBM Modeler (SPSS), Statistica, Stata, SAS, R, Weka, and Orange data mining software. The mismatch between the results of the data mining software is discussed.
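The abstract does not say how each package detects or handles a covariance matrix that is not positive definite. As a minimal sketch of why such a matrix can arise from interrelated financial ratios, the pure-Python example below builds a sample covariance matrix and tests it with a Cholesky factorization attempt. The data values, the function names `covariance_matrix` and `is_positive_definite`, and the tolerance `tol` are all hypothetical and are not taken from the study or from any of the packages named above.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of equal-length rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def is_positive_definite(matrix, tol=1e-10):
    """Attempt a Cholesky factorization; succeed only if every pivot exceeds tol.

    The tolerance is an illustrative assumption. A different choice of tol
    can change the verdict for nearly singular matrices.
    """
    p = len(matrix)
    lower = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= tol:
                    return False  # zero or negative pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True

# Hypothetical ratios for five firms (not data from the study).
independent = [
    [0.2, 1.5, 0.7],
    [0.4, 1.1, 0.3],
    [0.1, 2.0, 0.9],
    [0.6, 0.8, 0.5],
    [0.3, 1.7, 0.2],
]
# Replace the third ratio with the sum of the first two, mimicking ratios
# that are built from the same balance sheet items.
collinear = [[a, b, a + b] for a, b, _ in independent]

print(is_positive_definite(covariance_matrix(independent)))
print(is_positive_definite(covariance_matrix(collinear)))
```

In this sketch the exactly collinear column makes the covariance matrix singular, so the factorization stops at the last pivot. How a given package reacts at that point (stopping, regularizing, or dropping components) is a design choice of that package, which is one plausible source of the differing results the abstract describes.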
|