Multiple discriminant analysis (MDA) is also termed Discriminant Factor Analysis and Canonical Discriminant Analysis. It adopts a perspective similar to that of Principal Components Analysis (PCA), but the two methods differ mathematically in what they maximize: MDA maximizes the differences between the categories of the dependent variable, whereas PCA maximizes the variance in all the variables accounted for by the factor.
Geometrically, the rows of the data matrix can be considered as points in a multidimensional space, as can the group mean vectors. Discriminating axes are determined in this space in such a way that optimal separation of the predefined groups is attained. The first discriminant function maximizes the differences between the categories of the dependent variable. The second function is orthogonal to it (uncorrelated with it) and maximizes the differences between the categories of the dependent variable, controlling for the first factor, and so on. Though mathematically different, each discriminant function is a dimension that differentiates cases into categories of the dependent variable based on their values on the independent variables. The first function is the most powerful differentiating dimension, but later functions may also represent additional significant dimensions of differentiation.
As in Principal Components Analysis, the problem reduces mathematically to the eigenreduction of a real, symmetric matrix. The eigenvalues represent the discriminating power of the associated eigenvectors. The g groups lie in a space of at most g - 1 dimensions, so this is the number of discriminant axes or factors that can be obtained in the common practical situation where n > m > g (n being the number of rows and m the number of columns of the input data matrix). There is one eigenvalue for each discriminant function, and the ratio of two eigenvalues indicates the relative discriminating power of the corresponding functions. For example, if the ratio of the first eigenvalue to the second is 1.6, then the first discriminant function explains 60% more between-group variance in the dependent categories than the second does.
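The eigenreduction described above can be sketched with NumPy. This is an illustration under stated assumptions, not the program's own code: the helper names `total_and_between` and `discriminant_eigen` are hypothetical, both covariance matrices are assumed to be divided by n, the matrix analyzed is assumed to be the inverse total covariance times the between-groups covariance, and the data are random. A Cholesky factor of T is used to turn the problem into one on a real, symmetric matrix.

```python
import numpy as np

def total_and_between(X, labels):
    """Total covariance T and between-groups covariance B (both divided by n, an assumption)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered / n
    B = np.zeros_like(T)
    for g in np.unique(labels):
        rows = X[labels == g]
        d = rows.mean(axis=0) - grand_mean
        B += (len(rows) / n) * np.outer(d, d)   # group weight n_g / n
    return T, B

def discriminant_eigen(T, B):
    """Eigenvalues (descending) and eigenvectors of T^{-1} B via a symmetric problem."""
    L = np.linalg.cholesky(T)            # T = L L'
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T              # symmetric, same eigenvalues as T^{-1} B
    vals, vecs = np.linalg.eigh(M)       # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    axes = L_inv.T @ vecs[:, order]      # map back to eigenvectors of T^{-1} B
    return vals[order], axes

# Hypothetical data: g = 3 groups, m = 4 variables, n = 12 cases (n > m > g).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 4)
X = rng.normal(size=(12, 4)) + labels[:, None] * np.array([2.0, 0.0, 1.0, 0.5])

T, B = total_and_between(X, labels)
eigenvalues, axes = discriminant_eigen(T, B)
n_nonzero = int(np.sum(eigenvalues > 1e-9))   # number of usable axes, at most g - 1
```

With three groups, only two eigenvalues differ from zero beyond rounding error, which matches the bound of g - 1 discriminant axes.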
The relative percentage of a discriminant function equals a function's eigenvalue divided by the sum of all eigenvalues of all discriminant functions in the model. Thus it is the percent of discriminating power for the model associated with a given discriminant function. Relative % is used to decide how many functions are important. Usually, the first two or three eigenvalues are important.
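As a small worked example of the relative percentage, the sketch below divides each eigenvalue by the sum of all eigenvalues. The function name `relative_percentages` and the three eigenvalues are hypothetical and chosen only for illustration.

```python
def relative_percentages(eigenvalues):
    """Return each eigenvalue's share of the total discriminating power, in percent."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return [100.0 * ev / total for ev in eigenvalues]

# Hypothetical eigenvalues for three discriminant functions (illustrative only).
eigenvalues = [3.2, 2.0, 0.8]
shares = relative_percentages(eigenvalues)

# Ratio of the first eigenvalue to the second, as discussed in the text.
ratio_first_to_second = eigenvalues[0] / eigenvalues[1]
```

Here the first two functions together carry most of the discriminating power, which is the kind of pattern the relative percentage is used to detect.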
The procedure for discrimination of three or more groups uses not only the total covariance matrix, but also the between groups covariance matrix. The criterion for selecting the next variable is the trace of a product of these two matrices (generalization of Mahalanobis distance for two groups). After selecting the new variable to be entered, discriminant factor analysis is performed and Discran provides the overall discriminant power and the discriminant power of the first three factors. Cases are classified according to their distances from the centroids of the groups. In each step, the program calculates and prints the classification table and the percentage of correctly classified cases for both the basic and test samples.
The distance of a case x from the center of group g at step q is defined as a linear function of x that depends on $T_q$, the total covariance matrix (calculated for the cases from all groups) for the variables included in step q. A case is assigned to the group for which this distance is smallest.
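The distance and the covariance elements can be written out explicitly. What follows is a sketch under standard textbook definitions and not necessarily the program's exact formulas: the symbols $d_{gq}$, $\bar{x}_g$, $\bar{x}_i$, $x_{ki}$ and the divisor $n$ are assumptions introduced here.

```latex
\begin{align}
d_{gq}(x) &= (x - \bar{x}_g)^{\top} T_q^{-1} (x - \bar{x}_g) \\
          &= x^{\top} T_q^{-1} x \;-\; 2\,\bar{x}_g^{\top} T_q^{-1} x \;+\; \bar{x}_g^{\top} T_q^{-1} \bar{x}_g, \\
t_{ij}    &= \frac{1}{n} \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j).
\end{align}
```

The term $x^{\top} T_q^{-1} x$ is the same for every group, so comparing groups only requires the remaining two terms, which are linear in $x$. This is one way to read the description of the distance as a linear function.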
The classification table and the percentage of cases correctly classified are derived in the same way as for discrimination between two groups.
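The assignment rule and the classification table can be sketched as follows. The code assumes a Mahalanobis-type distance to each group centroid based on the inverse of the total covariance matrix (divided by n), which is an assumption rather than the program's documented formula. The function names and the toy data are hypothetical.

```python
import numpy as np

def classify_by_distance(X, labels, X_new=None):
    """Assign each case to the group whose centroid is nearest in the T^{-1} metric."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    X_new = X if X_new is None else np.asarray(X_new, dtype=float)
    groups = np.unique(labels)
    centered = X - X.mean(axis=0)
    T_inv = np.linalg.inv(centered.T @ centered / len(X))   # inverse total covariance
    centroids = np.array([X[labels == g].mean(axis=0) for g in groups])
    # diff[i, k, :] is case i minus centroid k
    diff = X_new[:, None, :] - centroids[None, :, :]
    dist = np.einsum("ikj,jl,ikl->ik", diff, T_inv, diff)
    return groups[dist.argmin(axis=1)], dist

def classification_table(actual, assigned):
    """Cross-tabulate actual groups (rows) against assigned groups (columns)."""
    actual = np.asarray(actual)
    assigned = np.asarray(assigned)
    groups = np.unique(actual)
    table = np.array([[np.sum((actual == a) & (assigned == p)) for p in groups]
                      for a in groups])
    percent_correct = 100.0 * np.trace(table) / table.sum()
    return table, percent_correct

# Hypothetical, well-separated toy data: three groups of four cases, two variables.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [10, 0], [11, 0], [10, 1], [11, 1],
              [0, 10], [1, 10], [0, 11], [1, 11]], dtype=float)
labels = np.repeat([0, 1, 2], 4)

assigned, dist = classify_by_distance(X, labels)
table, percent_correct = classification_table(labels, assigned)
```

Passing a separate `X_new` classifies cases that were not used to estimate the centroids and covariance, which corresponds to applying the same rule to a test sample.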
The variable selected in step q is the one that maximizes the trace of the matrix $T_q^{-1} B_q$, where $T_q$ is the total covariance matrix used in step q and $B_q$ is the matrix of covariances between groups for the same variables.
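One selection step based on the trace criterion can be sketched as below. The sketch assumes the criterion is the trace of the inverse total covariance matrix times the between-groups covariance matrix, with between-groups elements weighted by the group share n_g / n and both matrices divided by n. These normalizations, the function names and the data are assumptions made for illustration.

```python
import numpy as np

def trace_criterion(X, labels, columns):
    """trace(T_q^{-1} B_q) for the variables listed in `columns`."""
    Xq = np.asarray(X, dtype=float)[:, columns]
    labels = np.asarray(labels)
    n = Xq.shape[0]
    grand_mean = Xq.mean(axis=0)
    centered = Xq - grand_mean
    T = centered.T @ centered / n
    B = np.zeros_like(T)
    for g in np.unique(labels):
        rows = Xq[labels == g]
        d = rows.mean(axis=0) - grand_mean
        B += (len(rows) / n) * np.outer(d, d)   # assumed weight n_g / n
    return float(np.trace(np.linalg.solve(T, B)))

def select_next_variable(X, labels, selected):
    """Return the not-yet-selected column that maximizes the trace criterion."""
    n_columns = np.asarray(X).shape[1]
    candidates = [j for j in range(n_columns) if j not in selected]
    scores = {j: trace_criterion(X, labels, list(selected) + [j]) for j in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical data: column 0 separates the groups strongly, column 1 is noise,
# column 2 separates them moderately.
X = np.array([[0.0, 1.0, 0.0], [0.2, -1.0, 1.0], [-0.1, 0.5, 2.0],
              [5.0, 0.8, 1.0], [5.1, -0.9, 2.0], [4.9, 0.3, 3.0],
              [10.0, -0.7, 2.0], [10.2, 1.1, 3.0], [9.9, 0.2, 4.0]])
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

first, first_scores = select_next_variable(X, labels, selected=[])
second, second_scores = select_next_variable(X, labels, selected=[first])
```

For a single variable the criterion reduces to the ratio of between-groups variance to total variance, so the strongly separating column is chosen first.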
The following part of analysis is performed in one of the three following circumstances:
The distances from each group are calculated using the variables retained in the step. The assignment of cases to the groups is done according to the above criterion.
The matrix $T_q^{-1} B_q$ is analyzed. The two eigenvectors corresponding to the two largest eigenvalues of this matrix are the two discriminant factorial axes. The discriminant power of each factor is measured by the corresponding eigenvalue. Since the program provides the discriminant power for only the first three factors, the sum of all eigenvalues allows the level of the remaining eigenvalues, i.e. those that are not printed, to be estimated.
For a case, the value of a discriminant factor is calculated as the scalar product of the case's vector, containing the variables retained in the step, and the eigenvector corresponding to the factor. Note that these values are not printed, but they are used in a graphical representation of the cases in the space of the first two factors.
For a group mean, the value of a discriminant factor is calculated in the same way, replacing the case vector by the group mean vector.
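The factor values for cases and group means can be sketched as scalar products with the first two eigenvectors. The sketch assumes the analyzed matrix is the inverse total covariance times the between-groups covariance (both divided by n) and uses uncentered case vectors, as the text describes. The function name `discriminant_axes` and the data are hypothetical.

```python
import numpy as np

def discriminant_axes(X, labels, n_axes=2):
    """Leading eigenvalues and eigenvectors of T^{-1} B (assumed form of the matrix)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = len(X)
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered / n
    B = np.zeros_like(T)
    for g in np.unique(labels):
        rows = X[labels == g]
        d = rows.mean(axis=0) - grand_mean
        B += (len(rows) / n) * np.outer(d, d)
    # T^{-1} B is not symmetric, so the general solver is used and real parts are kept.
    vals, vecs = np.linalg.eig(np.linalg.solve(T, B))
    order = np.argsort(vals.real)[::-1][:n_axes]
    return vals.real[order], vecs.real[:, order]

# Hypothetical data: nine cases, three variables, three groups.
X = np.array([[0.0, 1.0, 0.0], [0.2, -1.0, 1.0], [-0.1, 0.5, 2.0],
              [5.0, 0.8, 1.0], [5.1, -0.9, 2.0], [4.9, 0.3, 3.0],
              [10.0, -0.7, 2.0], [10.2, 1.1, 3.0], [9.9, 0.2, 4.0]])
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

eigenvalues, axes = discriminant_axes(X, labels)

# Factor values: scalar product of each case vector with each eigenvector.
case_scores = X @ axes

# Same calculation with the group mean vectors in place of the case vectors.
centroids = np.array([X[labels == g].mean(axis=0) for g in np.unique(labels)])
centroid_scores = centroids @ axes
```

The two columns of `case_scores` are the coordinates that would be plotted in the space of the first two factors, with `centroid_scores` marking the group means in the same space.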
The distances from each group are calculated in the same way, and assignment of cases to the groups is done following the same rules as for the basic sample.