10.1.2 Classification Trees

A classification tree is a rule for predicting the class of a nominal (dichotomous or polytomous) dependent variable from the values of its predictors.

Chi-square Analysis

This method is used when analyzing one dependent variable (nominal or ordinal), or a set of dichotomous dependent variables, with several predictors. The method uses an impurity index, which can be written as a function of the relative frequencies of the classes of the dependent variable:

i(t) = F{p(j | t)}, for j = 1, 2, …, J

where p(j | t) is the relative frequency of class j at node t.

i(t) is at its minimum (= 0) when a node contains individuals of only one class; such a node is called a pure node. i(t) is at its maximum when a node contains all classes with equal relative frequency. Search uses entropy as the measure of node impurity.
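The two extremes of the impurity index can be checked numerically. The sketch below assumes Shannon entropy with natural logarithms as the function F; the exact scaling used by Search may differ, and the name node_entropy is hypothetical.

```python
import math
from collections import Counter


def node_entropy(labels):
    """Entropy of the class labels at one node (assumed form of F).

    Computes sum_j p(j|t) * ln(1 / p(j|t)), where p(j|t) is the
    relative frequency of class j among the labels at the node.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return sum((c / n) * math.log(n / c) for c in counts.values())


# Illustrative nodes (made-up labels, not data from the manual).
pure_node = ["a", "a", "a", "a"]    # only one class present
even_node = ["a", "b", "a", "b"]    # two classes, equal frequency
skewed_node = ["a", "a", "a", "b"]  # two classes, unequal frequency

for name, node in [("pure", pure_node), ("even", even_node), ("skewed", skewed_node)]:
    print(name, node_entropy(node))
```

Under this assumption the pure node has impurity 0, and with two classes the evenly mixed node reaches the largest possible value, ln 2; the skewed node falls in between.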

Entropy for group g:

where

x_jgk = frequency (coded 0 to 1) of code j (or value of variable j) of case k in group g.

For a given node, all predictors and all admissible splits are examined, and the predictor and split that maximize the reduction in impurity between the parent node and its descendants are chosen.
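The search over predictors and splits can be sketched as follows. This is a minimal illustration, not the Search implementation: it assumes numeric predictors, binary splits at midpoints between adjacent observed values, Shannon entropy as the impurity, and child impurities weighted by node size. The names entropy, impurity_reduction and best_split, and the sample data, are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of the class labels at a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def impurity_reduction(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


def best_split(rows, labels):
    """Return (reduction, predictor, threshold) for the best binary split.

    rows is a list of dicts mapping predictor name -> numeric value.
    Returns (0.0, None, None) when no split reduces the impurity.
    """
    best = (0.0, None, None)
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds: midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[name] <= thr]
            right = [y for r, y in zip(rows, labels) if r[name] > thr]
            gain = impurity_reduction(labels, left, right)
            if gain > best[0]:
                best = (gain, name, thr)
    return best


# Made-up example: "age" separates the classes perfectly, "income" does not.
rows = [
    {"age": 20, "income": 5},
    {"age": 25, "income": 1},
    {"age": 40, "income": 4},
    {"age": 45, "income": 2},
]
labels = ["no", "no", "yes", "yes"]

gain, predictor, threshold = best_split(rows, labels)
print(predictor, threshold, gain)
```

In this example the split on age at 32.5 produces two pure descendants, so its impurity reduction equals the full parent entropy and it is preferred over every split on income.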

This criterion is applied recursively to the descendants, which in turn become the parents of successive splits. Splitting stops at a node when no admissible split achieves the minimum reduction in impurity or when the resulting nodes would fall below the minimum node size.
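The recursive procedure and its two stopping criteria can be illustrated with a short sketch. It rests on several assumptions that are not taken from Search itself: numeric predictors, binary midpoint splits, Shannon entropy as the impurity, and the threshold values MIN_REDUCTION and MIN_NODE_SIZE, which are hypothetical, as are the function names and the sample data.

```python
import math
from collections import Counter

MIN_REDUCTION = 0.01  # hypothetical minimum reduction in impurity
MIN_NODE_SIZE = 2     # hypothetical minimum number of cases per node


def entropy(labels):
    """Shannon entropy (natural log) of the class labels at a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def candidate_splits(rows, labels):
    """Yield (reduction, predictor, threshold, left_idx, right_idx)."""
    parent = entropy(labels)
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [i for i, r in enumerate(rows) if r[name] <= thr]
            right = [i for i, r in enumerate(rows) if r[name] > thr]
            children = sum(
                len(part) / len(rows) * entropy([labels[i] for i in part])
                for part in (left, right)
            )
            yield parent - children, name, thr, left, right


def grow(rows, labels):
    """Grow a tree recursively; each node is a plain dict."""
    node = {"n": len(labels), "predict": Counter(labels).most_common(1)[0][0]}
    # Keep only splits that satisfy both stopping criteria.
    admissible = [
        s for s in candidate_splits(rows, labels)
        if s[0] >= MIN_REDUCTION and min(len(s[3]), len(s[4])) >= MIN_NODE_SIZE
    ]
    if not admissible:
        return node  # terminal node
    _, name, thr, left, right = max(admissible, key=lambda s: s[0])
    node["split"] = (name, thr)
    # The descendants become the parents of the next round of splits.
    node["left"] = grow([rows[i] for i in left], [labels[i] for i in left])
    node["right"] = grow([rows[i] for i in right], [labels[i] for i in right])
    return node


# Made-up data: one predictor, three classes.
rows = [{"x": v} for v in range(1, 9)]
labels = ["a", "a", "a", "a", "b", "b", "c", "c"]
tree = grow(rows, labels)
print(tree)
```

With these settings the root is split at x = 4.5, the left descendant is already pure and becomes terminal, and the right descendant is split once more at x = 6.5 into two pure nodes that are too small to split further.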