Trees are directed graphs that begin with a single node, called the parent (root) node, and branch into many descendant nodes. The tree-growing methodology consists of recursively splitting each group of individuals into two subgroups, starting from the parent node (the total sample) and continuing until the subgroups contain mostly ‘similar’ individuals, without allowing these subgroups to become too small. A split is considered permissible if the partition of the data based on a predictor variable satisfies the following constraints:

- The predictor ordinality is not violated.
- Minimum group frequency: each descendant subgroup contains at least a minimum number of objects.
- A minimum value of criterion improvement is ensured.

The tree is constructed by recursive partitioning of the dataset. At each node, the procedure examines all possible splits of the data along each predictor variable and selects the split that most reduces some measure of node *impurity*. The impurity of a set of objects is designed to capture how dissimilar the objects are to each other. In the case of regression trees, the impurity measure is the *Sum of Squares*, whereas in the case of classification trees several measures have been proposed in the literature: Gini index, Misclassification, Twoing and Entropy. Search uses entropy as its measure of node impurity. An essential property of the impurity is that it should decrease across the splitting process:

I(t) ≥ α I(t_l) + (1 − α) I(t_r),  0 ≤ α ≤ 1

where t_l and t_r denote the left and right descendant nodes of t, and α is the proportion of cases in t that fall into t_l.
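This decrease can be checked directly when entropy is the impurity measure. The following is a minimal sketch (the names `entropy`, `parent`, `left`, `right` are illustrative, not from the text) that computes the impurity of a parent node and the weighted impurity of a candidate split:

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

# Illustrative data: a balanced parent split into two purer subgroups.
parent = ['a'] * 5 + ['b'] * 5
left = ['a'] * 4 + ['b']
right = ['a'] + ['b'] * 4

alpha = len(left) / len(parent)  # proportion of cases sent to the left node
weighted = alpha * entropy(left) + (1 - alpha) * entropy(right)

# The impurity inequality I(t) >= alpha*I(t_l) + (1-alpha)*I(t_r) holds here.
assert entropy(parent) >= weighted
```

Here the parent entropy is 1 bit (two equally frequent classes), while the weighted child entropy is lower, so the split reduces impurity as required.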

If *X* is an ordered variable, this approach searches all possible values *C* for splits of the form

*X* ≤ *C*  (1)

A case is assigned to the left descendant node if the inequality is satisfied and to the right descendant node if the inequality is not satisfied.
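One common way to enumerate the candidate values *C* for an ordered predictor is to take the midpoints between consecutive distinct sorted values, which yields exactly (*n* − 1) candidate splits for *n* distinct values. A minimal sketch, with illustrative names:

```python
def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted values.

    For n distinct values this returns n - 1 candidate cut-points,
    one per possible split of the form X <= C.
    """
    vs = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(vs, vs[1:])]

# 4 distinct values -> 3 candidate cut-points
print(candidate_thresholds([1.0, 2.0, 2.0, 3.5, 5.0]))  # [1.5, 2.75, 4.25]
```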

If *X* is a categorical (unordered) predictor, the search is over all
splits of the form

*X* ∈ *A*

where *A* is a non-empty subset of the values
taken by *X*.

If *X* is an ordered variable with *n* distinct values at a node, (*n* − 1) splits of the form (1) are examined. Therefore, the order of the computations at each node is linear in the number of distinct values of each predictor. If *X* is an unordered categorical variable, the order of computations increases exponentially with the number of categories: (2^{m−1} − 1) splits, where *m* is the number of categories of *X*.
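The 2^{m−1} − 1 count can be made concrete by enumerating the distinct binary partitions of the category set. The sketch below (illustrative, not the Search implementation) fixes one category on the left side so that mirrored splits are not counted twice:

```python
from itertools import combinations

def categorical_splits(categories):
    """Enumerate the 2**(m-1) - 1 distinct binary partitions of m categories.

    Each split is represented by the left subset A (the case X in A goes left);
    the complement of A forms the right subset.
    """
    cats = sorted(categories)
    anchor, rest = cats[0], cats[1:]  # fixing one category avoids mirror duplicates
    splits = []
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {anchor, *combo}
            if len(left) < len(cats):  # skip the trivial split with an empty right side
                splits.append(left)
    return splits

# m = 3 categories -> 2**(3-1) - 1 = 3 distinct splits
print(len(categorical_splits(['red', 'green', 'blue'])))  # 3
```

For *m* = 4 the same function yields 7 splits, matching 2^{4−1} − 1 and illustrating the exponential growth noted above.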

The heart of the tree-growing process is binary recursive partitioning. It is binary because each parent node is always split into exactly two child nodes. It is recursive because the process can be repeated by treating each child node as a parent node of the tree.

The algorithm makes all possible splits of each predictor variable and selects the variable and the cut-point that provide an optimal split of a group into two subgroups. The procedure is applied recursively to the descendants, which become the parents of successive splits, and so on, until the final nodes are reached. The following steps are taken at each node:

- Define the possible (permissible) splits at the node.
- Select the best split according to a statistical criterion.
- Verify a stopping criterion based on a statistical threshold.
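The steps above can be sketched as a recursive function. This is a minimal illustration for a single numeric predictor, using entropy as the impurity criterion and a minimum group size as the permissibility constraint; the function and parameter names (`grow`, `min_size`) are assumptions for illustration, not part of Search:

```python
from math import log2

def entropy(ys):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(ys)
    counts = {y: ys.count(y) for y in set(ys)}.values()
    return -sum((c / n) * log2(c / n) for c in counts)

def grow(xs, ys, min_size=2):
    """Binary recursive partitioning on one numeric predictor.

    Returns a nested dict tree; stops when a node is pure or too
    small to be split into two permissible subgroups.
    """
    if len(set(ys)) == 1 or len(ys) < 2 * min_size:
        return {'leaf': max(set(ys), key=ys.count)}
    best = None
    for c in sorted(set(xs))[:-1]:  # candidate cut-points of the form X <= c
        left = [y for x, y in zip(xs, ys) if x <= c]
        right = [y for x, y in zip(xs, ys) if x > c]
        if len(left) < min_size or len(right) < min_size:
            continue  # split not permissible: a subgroup would be too small
        a = len(left) / len(ys)
        gain = entropy(ys) - (a * entropy(left) + (1 - a) * entropy(right))
        if best is None or gain > best[0]:
            best = (gain, c, left, right)
    if best is None:
        return {'leaf': max(set(ys), key=ys.count)}
    _, c, left_y, right_y = best
    return {'cut': c,
            'left': grow([x for x in xs if x <= c], left_y, min_size),
            'right': grow([x for x in xs if x > c], right_y, min_size)}
```

For example, `grow([1, 2, 3, 4], ['a', 'a', 'b', 'b'])` selects the cut-point 2 and produces two pure leaf nodes, one per class.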