preview

Decision Tree Induction & Clustering Techniques in Sas Enterprise Miner, Spss Clementine, and Ibm Intelligent Miner – a Comparative Analysis

Better Essays

International Journal of Management & Information Systems – Third Quarter 2010

Volume 14, Number 3

Decision Tree Induction & Clustering Techniques In SAS Enterprise Miner, SPSS Clementine, And IBM Intelligent Miner – A Comparative Analysis
Abdullah M. Al Ghoson, Virginia Commonwealth University, USA

ABSTRACT Decision tree induction and Clustering are two of the most prevalent data mining techniques used separately or together in many business applications. Most commercial data mining software tools provide these two techniques but few of them satisfy business needs. There are many criteria and factors to choose the most appropriate software for a particular organization. This paper aims to provide a comparative analysis for three …show more content…

In this way, decision trees provide accuracy and explanatory models where the decision tree model is able to explain the reason of certain decisions using these decision rules. Decision trees could be used in classification applications that target discrete value outcomes by classifying unclassified data based on a pre-classified dataset, for example, classifying credit card applicants into three classes of risk, which are low, medium or high. Also, decision trees could be used in estimation applications that have continuous outcomes by estimating value based on pre-classified datasets, and in this case the tree is called a regression tree, for example, estimating household income. Moreover, decision trees could be used in prediction applications that have discrete or continuous outcomes by predicting future value same as classification or estimation, for example, predicting credit card loan as good or bad. 2.1 Decision Tree Models

Decision tree models are explanatory models, which are English rules so they are easy to evaluate and understand by people. The decision tree model is considered as a chain of rules that classify records in different bins or classes called nodes [1]. Based on the model 's algorithm, every node may have two or more children or have no child, which is called in this case leaf node [1]. Building decision tree models requires partitioning the pre-classified dataset into three parts,

Get Access