7.2 Iterative Typology and Ascending Classification (TYPOL)
This program partitions a large data set into a pre-assigned number of
clusters and creates a classification variable that summarizes a large
number of variables. It is a highly versatile program; its most important
features are:
- It can handle variables at different levels of measurement: interval
scale and nominal scale. Nominal (or categorized) variables are treated as
quantitative variables after full dichotomization, i.e. each one is
replaced by binary (0,1) variables, one per modality. The input variables
may therefore be quantitative, qualitative or a mix of the two.
- It can handle active and passive variables. The active variables
are those which are used in the construction of the typology. The passive
variables are those which do not participate in the construction of the
typology, but their main statistics within the typology groups are
computed – mean and standard deviation for quantitative variables
and frequencies for categorical variables. Thus, active variables are used
to construct the typology, whereas passive variables are used to
illustrate the typology.
- The program offers three measures of proximity: city block distance,
Euclidean distance and Benzecri’s chi-square distance. Which one is used
depends upon the measurement scales of the input variables. These metrics
are computed as follows:
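
As a rough illustration only, the following Python sketch shows full dichotomization of a nominal variable and the three distances in their usual textbook forms. The function names are hypothetical, and the formulas are assumed standard definitions; TYPOL's own computations may differ in detail (for example in weighting or standardization).

```python
import math

def dichotomize(value, modalities):
    """Full dichotomization: one binary (0,1) indicator per modality."""
    return [1 if value == m else 0 for m in modalities]

def city_block(x, y):
    """City block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def chi_square(table, i, k):
    """Chi-square distance between rows i and k of a non-negative table.

    Assumed textbook form: squared differences between row profiles,
    each weighted by the inverse of the column's share of the grand total.
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    row_i, row_k = sum(table[i]), sum(table[k])
    d2 = sum(
        (table[i][j] / row_i - table[k][j] / row_k) ** 2 / col_mass[j]
        for j in range(n_cols)
        if col_mass[j] > 0  # skip empty columns to avoid dividing by zero
    )
    return math.sqrt(d2)

# Two cases described by one interval variable and one nominal variable
# with modalities "a", "b", "c" (made-up values for illustration).
case_1 = [30] + dichotomize("b", ["a", "b", "c"])
case_2 = [25] + dichotomize("c", ["a", "b", "c"])
print(city_block(case_1, case_2), euclidean(case_1, case_2))
```

In this sketch the dichotomized indicators are simply appended to the quantitative values, so a mixed set of variables can be compared with a single distance function.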