Mahalanobis distance is a measure of the distance between two points in the space defined by two or more correlated variables. For example, if two variables are uncorrelated, we could plot the points (cases) in a standard two-dimensional scatter plot; the Mahalanobis distances between the points would then be identical to the Euclidean distances. When the variables are correlated, the axes of the plot are non-orthogonal. In that case the simple Euclidean distance is not an appropriate measure, whereas the Mahalanobis distance adequately accounts for the correlations.
Mahalanobis distance, D^{2}, is also a generalized measure of the distance between two groups. The distance between Groups 1 and 2 is defined as
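D^{2}_{12} = \sum_{i=1}^{p} \sum_{j=1}^{p} w^{*}_{ij} (\bar{x}_{i1} - \bar{x}_{i2}) (\bar{x}_{j1} - \bar{x}_{j2})
where p is the number of variables in the model, \bar{x}_{i1} is the mean of the i^{th} variable in Group 1, \bar{x}_{i2} is the mean of the i^{th} variable in Group 2, and w^{*}_{ij} is an element of the inverse of the within-groups covariance matrix.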
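The group distance defined above can be sketched numerically. The following is a minimal illustration, assuming NumPy is available and that the within-groups covariance matrix is the pooled covariance of the groups; the function names `pooled_within_cov` and `mahalanobis_d2` and the sample data are hypothetical and chosen only for this example.

```python
import numpy as np

def pooled_within_cov(groups):
    """Pooled within-groups covariance for a list of (cases x variables) arrays."""
    n_total = sum(len(g) for g in groups)
    p = groups[0].shape[1]
    sscp = np.zeros((p, p))
    for g in groups:
        centered = g - g.mean(axis=0)      # deviations from the group means
        sscp += centered.T @ centered      # within-group sums of squares and products
    return sscp / (n_total - len(groups))  # divide by n - (number of groups)

def mahalanobis_d2(group_a, group_b, within_cov):
    """D^2 between two group centroids, given a within-groups covariance matrix."""
    diff = group_a.mean(axis=0) - group_b.mean(axis=0)
    # Solving the linear system avoids forming the inverse explicitly.
    return float(diff @ np.linalg.solve(within_cov, diff))

# Hypothetical data: two groups measured on two correlated variables.
group_1 = np.array([[2.0, 3.0], [3.0, 5.0], [4.0, 6.0], [5.0, 8.0]])
group_2 = np.array([[5.0, 4.0], [6.0, 6.0], [7.0, 7.0], [8.0, 9.0]])

s_within = pooled_within_cov([group_1, group_2])
print("D^2 between Group 1 and Group 2:", mahalanobis_d2(group_1, group_2, s_within))
```

Passing an identity matrix as `within_cov` reduces the result to the squared Euclidean distance between the two centroids, which is the uncorrelated, unit-variance case.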
When Mahalanobis’ distance is the criterion for variable selection, the Mahalanobis’ distances between all pairs of groups are calculated first. The variable that gives the largest D^{2} for the two groups that are closest (that is, have the smallest D^{2} initially) is selected for inclusion.
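A test of the null hypothesis that the two sets of population means are equal can be based on Mahalanobis’ distance. The corresponding F statistic is
F = \frac{(n - p - 1) n_{1} n_{2}}{p (n - 2)(n_{1} + n_{2})} D^{2}
where n_{1} and n_{2} are the numbers of cases in the two groups and n = n_{1} + n_{2}. Under the null hypothesis, F has p and n - p - 1 degrees of freedom.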
This F value can also be used for variable selection. At each step the variable chosen for inclusion is the one with the largest F value.
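As a rough illustration of how D^{2} converts to an F value, the sketch below assumes the standard two-group form derived from Hotelling's T^{2}, F = (n - p - 1) n_{1} n_{2} D^{2} / [p (n - 2)(n_{1} + n_{2})] with n = n_{1} + n_{2}. The function name `f_from_d2` is hypothetical, and the program described here may compute the statistic differently.

```python
def f_from_d2(d2, n1, n2, p):
    """Convert a two-group Mahalanobis D^2 into an F value (assumed two-group form).

    d2     : Mahalanobis D^2 between the two group centroids
    n1, n2 : number of cases in each group
    p      : number of variables in the model
    Returns the F value and its (numerator, denominator) degrees of freedom.
    """
    n = n1 + n2
    f_value = (n - p - 1) * n1 * n2 / (p * (n - 2) * (n1 + n2)) * d2
    return f_value, (p, n - p - 1)

# Hypothetical inputs: D^2 = 18.75 from two groups of 4 cases on 2 variables.
f_value, dof = f_from_d2(18.75, n1=4, n2=4, p=2)
print(f_value, dof)  # 15.625 (2, 5)
```

For a fixed pair of groups and a fixed number of variables, F is a constant multiple of D^{2}, so ranking candidate variables by either quantity gives the same order.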
Classification is the process by which a decision is made whether a particular case belongs to a particular group.
For each group, we can determine the location of the point that represents the means for all variables in the multivariate space defined by the variables in the model. These points are called group centroids. For each case we can then compute the Mahalanobis distances (of the respective case) from each of the group centroids. We would classify the case as belonging to the group to which it is closest, that is, where the Mahalanobis distance is smallest.
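The nearest-centroid rule described above can be sketched as follows. This is a minimal example assuming NumPy and a known within-groups covariance matrix; the function name `classify_case`, the centroids, and the covariance values are hypothetical.

```python
import numpy as np

def classify_case(case, centroids, within_cov):
    """Return the index of the closest centroid and the D^2 to every centroid."""
    d2 = []
    for centroid in centroids:
        diff = case - centroid
        d2.append(float(diff @ np.linalg.solve(within_cov, diff)))
    return int(np.argmin(d2)), d2   # index 0 is the first group, 1 the second, ...

# Hypothetical centroids for two groups and a strongly correlated covariance matrix.
centroids = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
correlated = np.array([[1.0, 0.9], [0.9, 1.0]])
case = np.array([2.0, 2.0])

group_index, distances = classify_case(case, centroids, correlated)
print(group_index)   # 0: closest to the first centroid once correlation is accounted for

# With an identity matrix the rule reduces to Euclidean distance.
print(classify_case(case, centroids, np.eye(2))[0])   # 1
```

The same case is assigned to different groups under the two metrics, which illustrates why the Euclidean distance is not appropriate when the variables are correlated.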
A classification table can be constructed as follows:
                                  Predicted group membership
Actual group    No. of cases      Group 1        Group 2
Group 1         n_{1}             n_{11}         n_{12}
Group 2         n_{2}             n_{21}         n_{22}
Let n_{1} denote the number of cases that truly belong to Group 1 and n_{2} denote the number of cases that truly belong to Group 2.
n_{11} = Number of cases that belong to Group 1 and are assigned to Group 1 (i.e. correctly classified)
n_{12} = Number of cases that belong to Group 1 but are assigned to Group 2 (i.e. incorrectly classified)
n_{21} = Number of cases that belong to Group 2 but are assigned to Group 1 (i.e. incorrectly classified)
n_{22} = Number of cases that belong to Group 2 and are assigned to Group 2 (i.e. correctly classified)
Total number of cases correctly classified = n_{11} + n_{22}
Percentage of cases correctly classified = 100 (n_{11} + n_{22}) / n
where n = n_{1} + n_{2} is the total number of cases.
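The classification table and the percentage of correctly classified cases can be computed directly from the actual and predicted group labels. The sketch below uses plain Python; the function names `classification_table` and `percent_correct` and the label lists are hypothetical.

```python
def classification_table(actual, predicted, groups=(1, 2)):
    """Count cases by actual group (outer key) and predicted group (inner key)."""
    table = {a: {b: 0 for b in groups} for a in groups}
    for a, b in zip(actual, predicted):
        table[a][b] += 1
    return table

def percent_correct(table):
    """Percentage of cases on the diagonal of the classification table."""
    n = sum(sum(row.values()) for row in table.values())
    correct = sum(table[g][g] for g in table)
    return 100.0 * correct / n

# Hypothetical labels for ten cases.
actual    = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
predicted = [1, 1, 1, 2, 2, 2, 2, 2, 1, 2]

table = classification_table(actual, predicted)
print(table)                   # {1: {1: 3, 2: 1}, 2: {1: 1, 2: 5}}
print(percent_correct(table))  # 80.0
```

Here `table[1][2]` corresponds to n_{12}, the Group 1 cases assigned to Group 2.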
Discran uses a stepwise procedure for including variables in the linear discriminant model. The procedure begins by selecting the individual variable that provides the greatest univariate discrimination (i.e. the largest value of the acceptance criterion). After the first variable is entered, the value of the criterion is re-evaluated for all the remaining variables, and the variable with the largest acceptance criterion value is entered into the model. This procedure is repeated until the number of steps specified by the researcher has been carried out.
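The stepwise idea can be sketched for the two-group case with D^{2} as the acceptance criterion. This is an illustration of forward selection in general, not Discran's actual implementation; it assumes NumPy, and the function names `d2_for_subset` and `forward_select` and the data are hypothetical.

```python
import numpy as np

def d2_for_subset(group_a, group_b, cols):
    """Mahalanobis D^2 between two groups using only the variables in `cols`."""
    a, b = group_a[:, cols], group_b[:, cols]
    ca, cb = a - a.mean(axis=0), b - b.mean(axis=0)
    pooled = (ca.T @ ca + cb.T @ cb) / (len(a) + len(b) - 2)  # within-groups covariance
    diff = a.mean(axis=0) - b.mean(axis=0)
    return float(diff @ np.linalg.solve(pooled, diff))

def forward_select(group_a, group_b, n_steps):
    """Enter one variable per step, always the one giving the largest D^2."""
    selected = []
    remaining = list(range(group_a.shape[1]))
    for _ in range(min(n_steps, len(remaining))):
        # Re-evaluate the criterion for every variable not yet in the model.
        scores = {v: d2_for_subset(group_a, group_b, selected + [v]) for v in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: column 0 does not separate the groups, column 1 separates
# them strongly, and column 2 separates them moderately.
group_a = np.array([[1, 0, 0], [1, 1, 2], [2, 0, 1], [4, 1, 3]])
group_b = np.array([[1, 10, 3], [1, 11, 5], [2, 10, 4], [4, 11, 6]])

print(forward_select(group_a, group_b, n_steps=2))  # [1, 2]
```

The number of steps is supplied by the caller, mirroring the description above; a production procedure would typically also guard against a singular covariance matrix when candidate variables are collinear.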