#### 7.1.6 Monothetic Analysis (Mona)

Mona is a divisive hierarchical clustering method. Unlike Diana, which can process either a dissimilarity matrix or an n × p data matrix of interval-scaled variables, Mona operates on a data matrix of binary variables. Each separation is carried out using a single, well-chosen variable, which is why the algorithm is called monothetic. Most other hierarchical methods, including Agnes and Diana, use all the variables simultaneously and are therefore called polythetic.

The clustering algorithm used in Mona assumes that there are no missing values. In practice, however, some values may be unavailable, so the program has a provision for filling in missing data.

First, all missing values in the binary data matrix (that is, all values other than 0 or 1) are replaced by estimated values, obtained as follows. Suppose that $x_{if}$ is missing. We then consider any other variable $g$ and construct the contingency table of $f$ against $g$:

|         | g = 1    | g = 0    |
|---------|----------|----------|
| f = 1   | $a_{fg}$ | $b_{fg}$ |
| f = 0   | $c_{fg}$ | $d_{fg}$ |

The association between $f$ and $g$ is defined as

$$A_{fg} = \left| a_{fg}\, d_{fg} - b_{fg}\, c_{fg} \right|$$

Note that $a_{fg} + b_{fg} + c_{fg} + d_{fg}$ is the total number of non-missing values of the variable $f$. After calculating $A_{fg}$ for each complete variable $g$, the variable $t$ is determined for which the association with the variable $f$ is maximal:

$$A_{ft} = \max_{g} A_{fg}$$

The missing values of $f$ are then estimated from variable $t$ as follows:

- Put $x_{if} = x_{it}$ when $a_{ft}\, d_{ft} - b_{ft}\, c_{ft} > 0$
- Put $x_{if} = 1 - x_{it}$ when $a_{ft}\, d_{ft} - b_{ft}\, c_{ft} < 0$
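The imputation rule can be sketched in a few lines of Python. This is a minimal illustration and not the program's actual code. It assumes that the association is the absolute cross-product difference |ad - bc| of the 2 × 2 table, that the sign test refers to ad - bc for the chosen variable t, that ties between candidate variables go to the first one, and that a zero cross-product difference counts as "cannot be filled". The function names `contingency` and `impute_variable` are hypothetical.

```python
def contingency(col_f, col_g):
    """Return (a, b, c, d) for two columns, skipping rows where either value is missing."""
    a = b = c = d = 0
    for xf, xg in zip(col_f, col_g):
        if xf not in (0, 1) or xg not in (0, 1):
            continue  # any value other than 0 or 1 counts as missing
        if xf == 1 and xg == 1:
            a += 1
        elif xf == 1:
            b += 1
        elif xg == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def impute_variable(data, f):
    """Return column f of `data` with its missing entries estimated."""
    col_f = [row[f] for row in data]
    if all(x in (0, 1) for x in col_f):
        return col_f  # nothing to fill
    best_t, best_signed = None, 0
    for g in range(len(data[0])):
        if g == f:
            continue
        col_g = [row[g] for row in data]
        if any(v not in (0, 1) for v in col_g):
            continue  # only complete variables are candidates
        a, b, c, d = contingency(col_f, col_g)
        signed = a * d - b * c
        # Strict ">" keeps the first variable on ties (an assumption).
        if best_t is None or abs(signed) > abs(best_signed):
            best_t, best_signed = g, signed
    if best_t is None or best_signed == 0:
        # Assumption: an undetermined sign is treated as "cannot be filled".
        raise ValueError(f"variable {f} cannot be filled")
    return [
        x if x in (0, 1) else (row[best_t] if best_signed > 0 else 1 - row[best_t])
        for x, row in zip(col_f, data)
    ]


# Variable 0 agrees with variable 1 on every complete row, so the gap copies x_it.
data = [[1, 1], [1, 1], [0, 0], [0, 0], [None, 1]]
print(impute_variable(data, 0))  # [1, 1, 0, 0, 1]
```

When the two variables disagree on every complete row, the cross-product difference is negative and the missing entry is filled with the complement of the value in variable t.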

After a complete data matrix has been obtained, Mona starts the actual clustering algorithm. If the data matrix cannot be filled completely, the program stops with a warning message.

The algorithm constructs a clustering hierarchy, starting with one large cluster. At a separation step, it selects one of the variables and divides the set of objects into two groups, one for which the selected variable equals 0 and the other for which it equals 1. The variable used for splitting a cluster is selected as follows. For each variable $f$, the association measures with all other variables are added, giving the total association:

$$A_f = \sum_{g \neq f} A_{fg}$$

The variable $t$ which satisfies $A_t = \max_f A_f$ is selected for splitting the cluster.

However, if the same maximal value is found for several variables, t is chosen as the one appearing first.
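As a small worked example of this selection rule, the sketch below computes the total association of every variable in one cluster and picks the first variable that attains the maximum. It assumes the pairwise association is |ad - bc| computed from the 2 × 2 table of the two variables over the objects in the cluster, and that the data are already complete. The function names are hypothetical.

```python
def association(rows, f, g):
    """|a*d - b*c| for variables f and g over the given rows (assumed measure)."""
    a = sum(1 for r in rows if r[f] == 1 and r[g] == 1)
    b = sum(1 for r in rows if r[f] == 1 and r[g] == 0)
    c = sum(1 for r in rows if r[f] == 0 and r[g] == 1)
    d = sum(1 for r in rows if r[f] == 0 and r[g] == 0)
    return abs(a * d - b * c)


def total_associations(rows):
    """Total association of each variable: sum of its associations with all others."""
    n_vars = len(rows[0])
    return [
        sum(association(rows, f, g) for g in range(n_vars) if g != f)
        for f in range(n_vars)
    ]


def choose_split_variable(rows):
    """Index of the variable with maximal total association (first one on ties)."""
    totals = total_associations(rows)
    return totals.index(max(totals))  # list.index returns the first match


# Variables 0 and 1 are identical; variable 2 is unrelated to both.
cluster = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [0, 0, 1]]
print(total_associations(cluster))     # [4, 4, 0]
print(choose_split_variable(cluster))  # 0
```

Variables 0 and 1 tie with a total association of 4, so the one appearing first (variable 0) is used for the split.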

The process is continued until each cluster consists of objects having identical values for all variables. Such clusters cannot be split any further. A final cluster is therefore either a singleton or an indivisible cluster.
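The complete divisive procedure can be outlined as follows. This is a simplified sketch and not the program's implementation. It assumes a complete binary data matrix, an association of |ad - bc| computed within the cluster being split, and that only variables which still take both values inside the cluster are candidates for splitting it. It returns only the final clusters and does not record the step numbers. The name `mona_sketch` is hypothetical.

```python
def association(rows, f, g):
    """|a*d - b*c| for variables f and g over the given rows (assumed measure)."""
    a = sum(1 for r in rows if r[f] == 1 and r[g] == 1)
    b = sum(1 for r in rows if r[f] == 1 and r[g] == 0)
    c = sum(1 for r in rows if r[f] == 0 and r[g] == 1)
    d = sum(1 for r in rows if r[f] == 0 and r[g] == 0)
    return abs(a * d - b * c)


def mona_sketch(data):
    """Split objects on one variable at a time; return final clusters as index lists."""
    n_vars = len(data[0])
    final = []
    pending = [list(range(len(data)))]  # start with one large cluster
    while pending:
        members = pending.pop()
        rows = [data[i] for i in members]
        # Assumption: a variable that is constant in the cluster cannot split it.
        candidates = [f for f in range(n_vars) if len({r[f] for r in rows}) > 1]
        if not candidates:
            final.append(members)  # singleton or indivisible cluster
            continue
        totals = {
            f: sum(association(rows, f, g) for g in range(n_vars) if g != f)
            for f in candidates
        }
        # max() returns the first maximal candidate, matching the tie rule.
        t = max(candidates, key=lambda f: totals[f])
        pending.append([i for i in members if data[i][t] == 1])
        pending.append([i for i in members if data[i][t] == 0])
    return final


data = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [0, 0, 1], [0, 0, 1]]
print(sorted(mona_sketch(data)))  # [[0], [1], [2], [3, 4]]
```

Objects 3 and 4 have identical values on all variables, so they end up together in an indivisible cluster, while the other objects become singletons.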

#### Graphical display

The clustering hierarchy constructed by Mona can be represented by means of a separation plot (banner). This is a divisive banner similar to that for Diana, except that the length of a row of stars is now proportional to the step number at which the separation was carried out. A row of object identifiers that does not continue to the right-hand side of the banner signals an object that became a singleton cluster at the corresponding step. Rows of identifiers plotted between two rows of stars indicate objects belonging to a cluster that cannot be separated.

#### Clustering Structure

When the Agglomerative Coefficient (AC), Divisive Coefficient (DC), or Silhouette Coefficient (SC) is very small, the corresponding method has not found a natural structure: either no clusters have been found or, in other words, the data consist of one big cluster. If AC, DC, or SC is close to 1, a very clear structure has been identified. This does not, however, guarantee that it is the right clustering structure. Note also that all these coefficients become very large when a bad outlier is added to any data set. The graphical display would, of course, expose the outlier.