The task of discriminant analysis is to find the best linear
discriminant function(s) of a set of variables which reproduce(s),
as far as possible, an a priori grouping of the cases considered.
A stepwise procedure is used in this program, i.e. in each step the
most powerful variable is entered into the discriminant function.
The criterion function for selecting the next variable depends on the
number of groups specified (between 2 and 20). In the case of two
groups the Mahalanobis distance is used. When the number of groups is
greater than 2, the variable selection criterion is the trace of a
product of the covariance matrix for the variables involved and the
inter-class covariance matrix at a particular step. This is a
generalization of the Mahalanobis distance defined for two groups.

Besides executing the main discriminant analysis steps on a basic
sample, there are two optional possibilities: checking the power of
the discriminant function(s) with the help of a test sample, in which
the group assignment of the cases is known (as in the basic sample)
but whose cases were not used in the analysis, and classifying the
cases of an anonymous sample, in which the group assignment is unknown
or at least not used, with the help of the discriminant function(s)
provided by the analysis.
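The two-group stepwise selection described above can be illustrated with a minimal Python sketch. It assumes that the Mahalanobis distance is computed from the pooled within-group covariance matrix, which is a common convention but is not stated in this write-up; the function names and the tiny data set are hypothetical and are not part of DISCRAN.

```python
def mean_vector(rows, cols):
    """Mean of each selected column over the given cases."""
    n = len(rows)
    return [sum(r[c] for r in rows) / n for c in cols]

def pooled_covariance(g1, g2, cols):
    """Pooled within-group covariance matrix (assumed convention)."""
    k = len(cols)
    s = [[0.0] * k for _ in range(k)]
    for rows in (g1, g2):
        m = mean_vector(rows, cols)
        for r in rows:
            d = [r[c] - mu for c, mu in zip(cols, m)]
            for i in range(k):
                for j in range(k):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def mahalanobis_sq(g1, g2, cols):
    """Squared distance D^2 = d' S^-1 d between the two group means."""
    m1, m2 = mean_vector(g1, cols), mean_vector(g2, cols)
    diff = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled_covariance(g1, g2, cols), diff)
    return sum(d * wi for d, wi in zip(diff, w))

def stepwise_select(g1, g2, n_vars, step_max):
    """At each step enter the variable that maximizes D^2."""
    selected, history = [], []
    for _ in range(min(step_max, n_vars)):
        scores = {v: mahalanobis_sq(g1, g2, selected + [v])
                  for v in range(n_vars) if v not in selected}
        best = max(scores, key=scores.get)
        selected.append(best)
        history.append((best, scores[best]))
    return history

# Hypothetical data: two groups, three variables per case.
group1 = [(1.0, 10.0, 5.0), (2.0, 11.0, 3.0), (3.0, 10.5, 4.0), (2.5, 9.5, 6.0)]
group2 = [(2.0, 20.0, 4.0), (3.0, 21.0, 6.0), (4.0, 19.5, 5.0), (3.5, 20.5, 3.0)]
for variable, d2 in stepwise_select(group1, group2, n_vars=3, step_max=3):
    print(variable, round(d2, 3))
```

In this toy data the second variable (index 1) separates the groups most strongly, so it is entered first. The `step_max` argument plays a role similar to a limit on the number of steps; how DISCRAN itself stops the procedure is not covered by this sketch.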
Case and variable selection. The standard filter is available to
select a subset of cases from the input data. Further subsetting is
possible with the sample and group variables. Analysis variables are
selected with the VARS parameter.

Transforming data. IDAMS Recoding may be used.

Weighting data. A variable can be used to weight the input data; this
weight variable may have integer or decimal values. When the value of
the weight variable for a case is zero, negative, missing or
non-numeric, the case is always skipped; the number of cases so
treated is printed.

Treatment of missing data. The MDVALUES parameter is available to
indicate which missing data values, if any, are to be used to check
for missing data. Cases with missing data in the sample variable, the
group variable and/or the analysis variables can optionally be
excluded from the analysis.
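The weighting rule above (skip a case whose weight is zero, negative, missing or non-numeric, and report how many were skipped) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and a missing weight is represented here by `None`, which is an assumption rather than the way IDAMS encodes missing data.

```python
import math

def usable_weight(raw):
    """Return the weight as a float, or None if the case must be skipped."""
    try:
        w = float(raw)
    except (TypeError, ValueError):
        return None          # missing (None) or non-numeric value
    if math.isnan(w) or w <= 0:
        return None          # zero or negative weights are not usable
    return w

def apply_weights(cases, weight_key):
    """Split cases into (case, weight) pairs and a count of skipped cases."""
    kept, skipped = [], 0
    for case in cases:
        w = usable_weight(case.get(weight_key))
        if w is None:
            skipped += 1
        else:
            kept.append((case, w))
    return kept, skipped

# Hypothetical cases with a weight stored under the key "w".
cases = [{"w": 2}, {"w": 0}, {"w": -1}, {"w": None}, {"w": "abc"}, {"w": "1.5"}]
kept, skipped = apply_weights(cases, "w")
print(len(kept), "cases kept,", skipped, "cases skipped")
```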
Input dictionary. (Optional: see the parameter PRINT). Variable
descriptor records, and C-records if any, only for variables used in
the execution.

Number of cases in samples. The number of cases in the basic, test and
anonymous samples according to the sample definition parameters.

Revised number of cases in samples. The number of cases in the basic,
test and anonymous samples revised according to the sample and group
definition parameters. Note that the revised figures may be smaller
than the non-revised ones for the basic and the test samples if the
groups defined do not completely cover the samples.

Basic sample. (Optional: see the parameter PRINT). The identification
and the analysis variables of the cases in the basic sample are
printed by groups; the groups are separated from each other by a line
of asterisks.

Test sample. As for basic sample.

Anonymous sample. As for basic sample except that there are no groups.

Univariate statistics. For each variable used in the analysis the
program prints the group means and standard deviations as well as the
total mean.

Stepwise procedure results (for each step)

Step number. The sequence number of the step.

Variables entered. The list of variables retained in this step.

Linear discriminant function. (Conditional: only if 2 groups
specified). The constant term and the coefficients of the linear
discriminant function corresponding to the variables already entered.

Classification table for basic sample. Bivariate frequency table
showing the re-distribution of cases between the original groups and
the groups to which they are allocated on the basis of the
discriminant function, followed by the percentage of correctly
classified cases.

Classification table for test sample. As for basic sample.

Case assignment list. (Optional: see the parameter PRINT). The cases
of the three samples are printed here with case identification, case
allocation, and discriminant function value (for 2 groups) or
distances to each group (for more than 2 groups).

Discriminant factor analysis results. (Conditional: only if more than
2 groups specified). Overall discriminant power and the discriminant
power of the first three factors, followed by the values of
discriminant factors for group means. In addition, a graphical
representation of cases and means in the space of the first two
factors is also given.

25.1  General Description
25.2  Standard IDAMS Features
25.3  Printed Output
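The classification table listed among the printed output cross-tabulates the original group of each case against the group to which it is allocated, followed by the percentage of correctly classified cases. A minimal Python sketch of that computation is given here; the function name and the sample group codes are hypothetical, and the layout DISCRAN actually prints is not reproduced.

```python
from collections import Counter

def classification_table(actual, assigned):
    """Cross-tabulate original groups (rows) against allocated groups (columns)."""
    groups = sorted(set(actual) | set(assigned))
    counts = Counter(zip(actual, assigned))
    table = {g: {h: counts[(g, h)] for h in groups} for g in groups}
    # Cases on the diagonal keep their original group, i.e. are correct.
    correct = sum(counts[(g, g)] for g in groups)
    percent_correct = 100.0 * correct / len(actual)
    return table, percent_correct

# Hypothetical group codes for eight cases.
actual = [1, 1, 1, 2, 2, 2, 2, 3]
assigned = [1, 1, 2, 2, 2, 2, 1, 3]
table, pct = classification_table(actual, assigned)
for group, row in table.items():
    print(group, row)
print("correctly classified: %.1f%%" % pct)
```

The same function applies to the test sample, since its group assignment is also known.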
$RUN DISCRAN
$FILES
File definitions
$RECODE (optional)
Recode statements
$SETUP
1. Filter (optional)
2. Label
3. Parameters
$DICT (conditional)
Dictionary
$DATA (conditional)
Data
Files:
DICTxxxx input dictionary (omit if $DICT used)
DATAxxxx input data (omit if $DATA used)
PRINT printed output (default IDAMS.LST)
Refer to "The
IDAMS Setup File" chapter for further descriptions of the program
control statements, items 1-3 below.
BADDATA=STOP /SKIP/MD1/MD2
MAXCASES=n
VARS=(variable list)
MDVALUES=BOTH /MD1/MD2/NONE
MDHANDLING=(SAMPVAR, GROUPVAR, ANALVARS)
WEIGHT=variable number
IDVAR=variable number
STEPMAX=n
PRINT=(CDICT/DICT, DATA, GROUP)
Sample specification

These parameters are optional. If they are not specified, all cases
from the input file are taken for the basic sample. Test and anonymous
samples, if they exist, must always be explicitly defined. The
pair-wise intersection of the samples must be empty. However, they
need not cover the whole input data file. A single value or a range of
values can be used for selecting the cases which belong to the
corresponding sample; m1 and m2 may be integer or decimal values.

SAVAR=variable number
BASA=(m1, m2)
TESA=(m1, m2)
ANSA=(m1, m2)
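The sample rules above can be sketched in Python as a small assignment function. It assumes that a range selects cases with m1 <= value of sample variable < m2, and it represents a single value as a one-element tuple; that representation, the function names and the returned labels are hypothetical choices made for this illustration.

```python
def in_spec(value, spec):
    """spec is (m1,) for a single value or (m1, m2) for m1 <= value < m2."""
    if len(spec) == 1:
        return value == spec[0]
    m1, m2 = spec
    return m1 <= value < m2

def assign_sample(value, basa=None, tesa=None, ansa=None):
    """Return 'basic', 'test', 'anonymous', or None if the case is in no sample."""
    if basa is None and tesa is None and ansa is None:
        return "basic"       # no sample specification: every case is basic
    specs = (("basic", basa), ("test", tesa), ("anonymous", ansa))
    hits = [name for name, spec in specs
            if spec is not None and in_spec(value, spec)]
    if len(hits) > 1:
        # The pair-wise intersection of the samples must be empty.
        raise ValueError("overlapping sample definitions: %s" % hits)
    return hits[0] if hits else None

# Hypothetical sample variable values checked against BASA=(1,5), TESA=(5,8).
for value in (3, 5, 9):
    print(value, assign_sample(value, basa=(1, 5), tesa=(5, 8)))
```

Because the upper limit is exclusive in this sketch, adjacent ranges such as (1, 5) and (5, 8) do not overlap, and a case that falls in no range is simply left out of all samples.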
Basic sample classification

These parameters define the a priori groups used in the discriminant
analysis procedure. All the groups must be defined explicitly and
their pair-wise intersection must be empty. However, they need not
cover the whole basic sample.

GRVAR=variable number
GR01=(m1, m2)
GR02=(m1, m2)
GRnn=(m1, m2)

Note. At least two groups have to be specified.

25.6  Program Control Statements
Example: INCLUDE V3=6 OR V11=99
Example: DISCRIMINANT ANALYSIS ON AGRICULTURAL SURVEY
Example: MDHA=SAMPVAR IDVAR=V4 SAVAR=R5 BASA=(1,5) VARS=(V12-V15)
INFILE=IN /xxxx
or m1 <= value of sample variable < m2