2019

PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMS IN DATA MINING AND AN APPLICATION WITH THE R LANGUAGE

Keywords: Classification Method, Classification Algorithms, R Language, Gini Algorithm, C5.0 Algorithm, Confusion Matrix, Performance Evaluation


Abstract:

Knowledge discovery in databases (KDD) is the overall process of discovering previously unknown and useful knowledge in large volumes of data. The first stage of KDD is the ETL (extract, transform, load) process. It consists of the following sequential steps: extracting raw data from a data source, applying data preprocessing, and loading the processed data into data repositories such as databases and data warehouses. Data preprocessing converts raw data into a clean data set suited to the purpose of the project at hand. Data mining is an important part of the knowledge discovery process. Compared with traditional analysis techniques, data mining is a process for extracting understandable, valuable and previously unknown information from large data sets. Data mining techniques are divided into two categories: supervised learning and unsupervised learning. Supervised learning is a type of machine learning. In a supervised learning technique, a classification model, called the training model, is built from a reference, that is, from data whose classes are already known. The built classification model is then used to predict the class of the testing data. Supervised learning techniques include Classification, Decision Tree, Bayesian Classification, Neural Networks and Association Rule Mining. Unsupervised learning is also a type of machine learning. The difference between the two is that unsupervised learning learns from the data without a reference. Therefore, it is not necessary to create a prior model in unsupervised learning. Clustering is one of the unsupervised learning techniques. It separates data into groups, called clusters, in which the objects are similar to each other.
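The supervised workflow described above (build a model from labeled training data, predict the class of the testing data, then compare predictions with the known classes) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the authors' R code. The one-split "decision stump" chosen by Gini impurity is a deliberate simplification, not the Gini or C5.0 algorithm evaluated in the paper, and the data values and all function names (`gini`, `fit_stump`, `predict`, `confusion_matrix`) are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most frequent class label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(xs, ys):
    """Training step: pick the threshold on one numeric feature that
    minimizes the size-weighted Gini impurity of the two resulting groups."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, threshold, majority(left), majority(right))
    _, threshold, left_label, right_label = best
    return {"threshold": threshold, "left": left_label, "right": right_label}


def predict(model, x):
    """Prediction step: assign the class stored for the side x falls on."""
    return model["left"] if x <= model["threshold"] else model["right"]


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))


# Hypothetical toy data: one numeric feature and a two-class label.
train_x = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
train_y = ["low", "low", "low", "low", "high", "high", "high", "high"]
test_x = [1.2, 2.2, 4.0, 6.8, 7.2]
test_y = ["low", "low", "high", "high", "high"]

model = fit_stump(train_x, train_y)                  # build the model
predictions = [predict(model, x) for x in test_x]    # classify the test data
matrix = confusion_matrix(test_y, predictions)       # evaluate
accuracy = sum(a == p for a, p in zip(test_y, predictions)) / len(test_y)

print(model)
print(dict(matrix))
print(accuracy)
```

The diagonal entries of the confusion matrix (pairs where the actual and predicted labels agree) are the correct predictions; accuracy is their share of all test cases. Other performance measures can be derived from the same counts.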
Several data mining techniques have been developed for knowledge discovery from large data sets, including Classification, Clustering, Decision Tree, Bayesian Classification, Neural Networks, Association Rule Mining, Prediction, Sequential Pattern, Genetic Algorithm, Time Series and Nearest Neighbor. The classification method, one of the main methods of data mining, is based on a learning algorithm. It is applied in order to discover hidden patterns in large-scale data. Following the ETL process, a classification model is created by selecting one of the data mining methods. Within the scope of data mining, a pattern is defined as observable, measurable and repeatable information about an entity that is stored in digital form. Classification algorithms that are
