24.1  General Description

The task of discriminant analysis is to find the best linear discriminant function(s) of a set of variables which reproduce(s), as far as possible, an a priori grouping of the cases considered. The program uses a stepwise procedure, i.e. in each step the most powerful variable is entered into the discriminant function. The criterion function for selecting the next variable depends on the number of groups specified (between 2 and 20). For two groups, the Mahalanobis distance is used. For more than two groups, the variable selection criterion is the trace of a product of the covariance matrix for the variables involved and the inter-class covariance matrix at a particular step. This is a generalization of the Mahalanobis distance defined for two groups.

Besides executing the main discriminant analysis steps on a basic sample, there are two optional possibilities: checking the power of the discriminant function(s) with the help of a test sample, in which the group assignment of the cases is known (as in the basic sample) but whose cases were not used in the analysis; and classifying, with the help of the discriminant function(s) provided by the analysis, the cases of an anonymous sample, in which the group assignment is unknown or at least is not used.

24.2  Standard IDAMS Features

Case and variable selection. The standard filter is available to select a subset of cases from the input data. Further subsetting is possible with the sample and group variables. Analysis variables are selected with the VARS parameter.

Transforming data. Recode statements may be used.

Weighting data. A variable can be used to weight the input data; this weight variable may have integer or decimal values. When the value of the weight variable for a case is zero, negative, missing or non-numeric, the case is always skipped; the number of cases so treated is printed.

Treatment of missing data. The MDVALUES parameter is available to indicate which missing data values, if any, are to be used to check for missing data. Cases with missing data in the sample variable, the group variable and/or the analysis variables can be optionally excluded from the analysis.
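The two-group case of the stepwise procedure from the general description can be illustrated with a small sketch. It is not DISCRAN's actual code: it assumes the criterion is the squared Mahalanobis distance between the two group means computed with a pooled within-group covariance matrix, and all function names, the `max_steps` argument and the sample data are hypothetical.

```python
# Illustrative sketch only: forward stepwise selection for two groups,
# assuming the criterion is the squared Mahalanobis distance between the
# group means under a pooled within-group covariance matrix.

def mean_vector(rows, cols):
    """Mean of the selected columns over a list of cases."""
    return [sum(r[c] for r in rows) / len(rows) for c in cols]

def pooled_covariance(g1, g2, cols):
    """Pooled within-group covariance matrix for the selected columns."""
    k = len(cols)
    s = [[0.0] * k for _ in range(k)]
    for rows in (g1, g2):
        m = mean_vector(rows, cols)
        for r in rows:
            d = [r[c] - m[i] for i, c in enumerate(cols)]
            for i in range(k):
                for j in range(k):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def mahalanobis_sq(g1, g2, cols):
    """Squared Mahalanobis distance between the two group means."""
    diff = [a - b for a, b in zip(mean_vector(g1, cols), mean_vector(g2, cols))]
    weights = solve(pooled_covariance(g1, g2, cols), diff)
    return sum(d * w for d, w in zip(diff, weights))

def stepwise_select(g1, g2, n_vars, max_steps):
    """At each step, enter the variable that maximizes the distance."""
    chosen = []
    for _ in range(min(max_steps, n_vars)):
        remaining = [v for v in range(n_vars) if v not in chosen]
        best = max(remaining, key=lambda v: mahalanobis_sq(g1, g2, chosen + [v]))
        chosen.append(best)
    return chosen

# Hypothetical data: three variables per case, two a priori groups.
GROUP_A = [(1.0, 0.0, 5.0), (2.0, 1.0, 3.0), (3.0, 0.5, 4.0), (2.5, 1.5, 6.0)]
GROUP_B = [(1.5, 10.0, 4.0), (2.5, 11.0, 6.0), (3.5, 10.5, 3.0), (2.0, 11.5, 5.0)]

print(stepwise_select(GROUP_A, GROUP_B, n_vars=3, max_steps=2))
```

In this made-up data the second variable (index 1) separates the groups most strongly, so it is the first one entered. The sketch does not handle singular covariance matrices or case weights.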
- the transferred variables,
- the code of the original groups as renumbered by DISCRAN ("Original group"),
- the code of groups assigned to cases at the end ("Assigned group"),
- the "Sample type" (1=basic, 2=test, 3=anonymous) and,
- for analysis with more than 2 original groups, the values of the first two discriminant factors ("Factor-1", "Factor-2").
The variables are renumbered starting from one.
The code of the original groups is set to the first missing data code (999.9999) for cases in the anonymous sample; factors are set to the first missing data code (999.9999) for cases in the test and anonymous samples.
Note: the variable specified in IDVAR is not output automatically, so ID variables should be included in the transfer variable list.
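The rules above for filling an output record can be summarized in a short sketch. It only illustrates the stated rules and is not the program's code: the function name, argument names and field order are assumptions, and the missing data code is the value quoted in the text.

```python
# Illustrative sketch of how one output record might be assembled.
MISSING = 999.9999  # first missing data code quoted in the text
SAMPLE_TYPE = {"basic": 1, "test": 2, "anonymous": 3}

def build_output_record(transfer_values, original_group, assigned_group,
                        sample, factors, n_groups):
    """Return the output values for one case as a list (hypothetical layout)."""
    record = list(transfer_values)
    # The original group is unknown, or not used, in the anonymous sample.
    record.append(MISSING if sample == "anonymous" else original_group)
    record.append(assigned_group)
    record.append(SAMPLE_TYPE[sample])
    # Factors are output only when there are more than 2 original groups,
    # and are set to missing for test and anonymous cases.
    if n_groups > 2:
        if sample == "basic":
            record.extend(factors[:2])
        else:
            record.extend([MISSING, MISSING])
    return record
```

For example, a call for an anonymous case in a three-group analysis would carry the missing data code in the original group field and in both factor fields.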
$RUN DISCRAN
$FILES
File specifications
$RECODE (optional)
Recode statements
$SETUP
1. Filter (optional)
2. Label
3. Parameters
$DICT (conditional)
Dictionary
$DATA (conditional)
Data
Files:
DICTxxxx input dictionary (omit if $DICT used)
DATAxxxx input data (omit if $DATA used)
DICTyyyy output dictionary if WRITE=DATA specified
DATAyyyy output data if WRITE=DATA specified
PRINT results (default IDAMS.LST)
Refer to "The IDAMS Setup File" chapter for further descriptions of the program control statements, items 1-3 below.
BADDATA=STOP /SKIP/MD1/MD2
MAXCASES=n
VARS=(variable list)
MDVALUES=BOTH /MD1/MD2/NONE
MDHANDLING=(SAMPVAR, GROUPVAR, ANALVARS)
WEIGHT=variable number
IDVAR=variable number
STEPMAX=n
MEMORY=20000 /n
WRITE=DATA
OUTFILE=OUT /yyyy
TRANSVARS=(variable list)
PRINT=(CDICT/DICT, OUTCDICT/OUTDICT, DATA, GROUP)
Sample specification. These parameters are optional. If they are not specified, all cases from the input file are taken for the basic sample. Test and anonymous samples, if they exist, must always be explicitly defined. The pair-wise intersection of the samples must be empty. However, they need not cover the whole input data file. A single value or a range of values can be used for selecting the cases which belong to the corresponding sample.

SAVAR=variable number
BASA=(m1, m2)
TESA=(m1, m2)
ANSA=(m1, m2)

where m1 and m2 may be integer or decimal values.
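The sample definitions can be read as a small membership test. The sketch below assumes that a range (m1, m2) selects cases with m1 <= value of sample variable < m2, as the fragment at the end of this section suggests, and that a single value selects cases equal to it. The function names and the dictionary layout are hypothetical, not part of IDAMS.

```python
# Illustrative sketch: assigning cases to samples from SAVAR-style ranges.

def in_spec(value, spec):
    """True if value matches a single value or a half-open (m1, m2) range."""
    if isinstance(spec, tuple):
        m1, m2 = spec
        return m1 <= value < m2
    return value == spec

def overlaps(a, b):
    """True if two sample definitions share at least one possible value."""
    a_range, b_range = isinstance(a, tuple), isinstance(b, tuple)
    if a_range and b_range:
        return max(a[0], b[0]) < min(a[1], b[1])
    if a_range:
        return in_spec(b, a)
    if b_range:
        return in_spec(a, b)
    return a == b

def check_disjoint(samples):
    """Raise ValueError if any two sample definitions intersect."""
    names = list(samples)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if overlaps(samples[first], samples[second]):
                raise ValueError(f"samples {first} and {second} overlap")

def assign_sample(value, samples):
    """Return the name of the sample a case belongs to, or None."""
    for name, spec in samples.items():
        if in_spec(value, spec):
            return name
    return None  # samples need not cover the whole input file

# Hypothetical definitions: basic = [1, 5), test = [5, 7), anonymous = 9.
SAMPLES = {"basic": (1, 5), "test": (5, 7), "anonymous": 9}
check_disjoint(SAMPLES)
```

Under these assumptions a case whose sample variable equals 5 falls into the test sample, not the basic one, because the upper bound is exclusive. The same check would apply to the group ranges, which must also be pair-wise disjoint.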
Basic sample classification. These parameters define the a priori groups used in the discriminant analysis procedure. All the groups must be defined explicitly and their pair-wise intersection must be empty. However, they need not cover the whole basic sample.

GRVAR=variable number
GR01=(m1, m2)
GR02=(m1, m2)
GRnn=(m1, m2)
Note. At least two groups have to be specified.

24.7  Program Control Statements
Example: INCLUDE V3=6 OR V11=99
Example: DISCRIMINANT ANALYSIS ON AGRICULTURAL SURVEY
Example: MDHA=SAMPVAR IDVAR=V4 SAVAR=R5 BASA=(1,5) VARS=(V12-V15)
INFILE=IN /xxxx
or
m1 <= value of sample variable < m2