4.2 Non-parametric Measures of Bivariate Relationships

Spearman’s Correlation

The Pearson correlation is unduly influenced by outliers, unequal variances, non-normality, and nonlinearity. An important competitor of the Pearson correlation coefficient is Spearman’s rank correlation coefficient. This latter correlation is calculated by applying the Pearson correlation formula to the ranks of the data rather than to the actual data values themselves. In so doing, many of the distortions that plague the Pearson correlation are reduced considerably.

Pearson correlation measures the strength of the linear relationship between X and Y. For nonlinear but monotonic relationships, a useful measure is Spearman’s rank correlation coefficient, Rho (r_s), which is a Pearson-type correlation coefficient computed on the ranks of the X and Y values. When there are no tied ranks, it is computed by the following formula:

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where d_i is the difference between the ranks of X_i and Y_i, and n is the number of paired observations.

r_s = +1 if there is perfect agreement between the two sets of ranks.

r_s = -1 if there is complete disagreement between the two sets of ranks.
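The rank-based definition can be illustrated with a short sketch that ranks each variable and then applies the Pearson formula to the ranks. The function names (`average_ranks`, `pearson`, `spearman_rho`) and the sample data are hypothetical, and giving tied values the mean of their ranks is an assumed convention.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's Rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but nonlinear in x (illustrative data)
rho = spearman_rho(x, y)
```

Because the ranks of x and y agree exactly in this example, the coefficient equals +1 even though the relationship is not linear.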

Kendall’s Tau

This is a measure of correlation between two ordinal-level variables. It is most appropriate for square tables. For any sample of n observations, there are n(n-1)/2 possible comparisons of points (X_i, Y_i) and (X_j, Y_j).

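Let C = number of pairs that are concordant, i.e., pairs for which X_i - X_j and Y_i - Y_j have the same sign.

Let D = number of pairs that are discordant, i.e., pairs for which X_i - X_j and Y_i - Y_j have opposite signs.

$$\text{Kendall's Tau} = \frac{C - D}{n(n-1)/2}$$

Tau therefore has the range -1 ≤ Tau ≤ +1.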

If X_i = X_j, or Y_i = Y_j, or both, the comparison is called a ‘tie’. Ties are not counted as concordant or discordant.

If there are a large number of ties, then the denominator has to be replaced by

$$\sqrt{\left[\frac{n(n-1)}{2} - n_X\right]\left[\frac{n(n-1)}{2} - n_Y\right]}$$

where n_X is the number of ties involving X, and n_Y is the number of ties involving Y.

In large samples, the statistic

$$z = \frac{3\,\text{Tau}\,\sqrt{n(n-1)}}{\sqrt{2(2n+5)}}$$

has approximately a standard normal distribution under the null hypothesis, and therefore can be used as a test statistic for testing the null hypothesis of zero correlation.
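As a rough sketch of the pair-counting logic, the code below counts concordant and discordant pairs, takes Tau as (C - D) divided by the n(n-1)/2 possible comparisons (the usual definition, assumed here), and evaluates the large-sample statistic. The function names and data are hypothetical, and no tie correction is applied.

```python
from itertools import combinations
from math import sqrt

def count_pairs(x, y):
    """Return (C, D); tied pairs are counted as neither."""
    c = d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            c += 1      # same ordering on X and Y: concordant
        elif s < 0:
            d += 1      # opposite ordering: discordant
    return c, d

def kendall_tau(x, y):
    """Tau = (C - D) / [n(n-1)/2], assuming few or no ties."""
    n = len(x)
    c, d = count_pairs(x, y)
    return (c - d) / (n * (n - 1) / 2)

def tau_z(tau, n):
    """Large-sample normal statistic: 3*Tau*sqrt(n(n-1)) / sqrt(2(2n+5))."""
    return 3 * tau * sqrt(n * (n - 1)) / sqrt(2 * (2 * n + 5))

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]          # illustrative data: 8 concordant, 2 discordant pairs
tau = kendall_tau(x, y)      # (8 - 2) / 10 = 0.6
z = tau_z(tau, len(x))
```

With only five observations the normal approximation is not reliable; the example only shows the arithmetic.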

Kendall’s Tau is equivalent to Spearman’s Rho with regard to the underlying assumptions, but the two are not identical in magnitude, since their underlying logic and computational formulae are quite different. The relationship between the two measures is given by

$$-1 \le 3\,(\text{Kendall's Tau}) - 2\,(\text{Spearman's Rho}) \le +1$$

In most cases, these values are very similar, and when discrepancies occur, it is probably safer to interpret the lower value. More importantly, Kendall’s Tau and Spearman’s Rho imply different interpretations. Spearman’s Rho is considered as the regular Pearson’s correlation coefficient in terms of the proportion of variability accounted for, whereas Kendall’s Tau represents a probability, i.e., the difference between the probability that the observed data are in the same order versus the probability that the observed data are not in the same order.

There are two different variants of Tau, viz. Tau b and Tau c. These measures differ only as to how tied ranks are handled.

Kendall's Tau-b

Kendall's Tau-b is a measure of association often used with, but not limited to, 2-by-2 tables. It is computed as the excess of concordant over discordant pairs (C - D), divided by the geometric mean of the number of pairs not tied on Y (C + D + X0) and the number of pairs not tied on X (C + D + Y0), where X0 is the number of pairs tied on X only and Y0 is the number of pairs tied on Y only:

Tau-b = (C - D)/ SQRT [(C + D + X0)(C + D + Y0)]

There is no well-defined intuitive meaning for Tau-b, which is the surplus of concordant over discordant pairs as a percentage of concordant, discordant, and approximately one-half of tied pairs. The rationale for this is that if the direction of causation is unknown, then the surplus of concordant over discordant pairs should be compared with the total of all relevant pairs, where those relevant are the concordant pairs, the discordant pairs, plus either the X-ties or Y-ties but not both, and since direction is not known, the geometric mean is used as an estimate of relevant tied pairs.

Tau-b requires binary or ordinal data. It reaches 1.0 (or -1.0 for negative relationships) only for square tables when all entries are on one diagonal. Tau-b equals 0 under statistical independence for both square and non-square tables. Tau-c is used for non-square tables.


Kendall's Tau-c

Kendall's Tau-c, also called Kendall-Stuart Tau-c, is a variant of Tau-b for larger tables. It equals the excess of concordant over discordant pairs, multiplied by a term representing an adjustment for the size of the table.

Tau-c = (C - D)*[2m/(n^2 (m-1))]

where

m = the number of rows or columns, whichever is smaller

n = the sample size.
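The two variants can be compared with a small sketch. It assumes that X0 and Y0 denote the numbers of pairs tied only on X and only on Y, that Tau-b uses (C + D + X0)(C + D + Y0) in the denominator, and that every category of the table appears in the data, so that m can be taken from the number of distinct values. All names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pair_counts(x, y):
    """Return C, D, X0 (pairs tied on X only), Y0 (pairs tied on Y only)."""
    c = d = x0 = y0 = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj and yi == yj:
            continue                      # tied on both: ignored
        elif xi == xj:
            x0 += 1
        elif yi == yj:
            y0 += 1
        elif (xi - xj) * (yi - yj) > 0:
            c += 1
        else:
            d += 1
    return c, d, x0, y0

def tau_b(x, y):
    c, d, x0, y0 = pair_counts(x, y)
    return (c - d) / sqrt((c + d + x0) * (c + d + y0))

def tau_c(x, y):
    c, d, _, _ = pair_counts(x, y)
    n = len(x)
    m = min(len(set(x)), len(set(y)))     # smaller table dimension (assumed observed)
    return (c - d) * 2 * m / (n ** 2 * (m - 1))

# A 2-by-2 table with all cases on the main diagonal (illustrative data).
x = [0, 0, 1, 1]
y = [0, 0, 1, 1]
b, c_val = tau_b(x, y), tau_c(x, y)
```

For this square table with all entries on one diagonal, both coefficients reach their upper limit.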

Goodman – Kruskal Gamma

Another non-parametric measure of correlation is Goodman – Kruskal Gamma (G), which is based on the difference between concordant pairs (C) and discordant pairs (D). Gamma is computed as follows:

G = (C - D)/(C + D)

Thus, Gamma is the surplus of concordant pairs over discordant pairs, as a percentage of all pairs, ignoring ties. Gamma defines perfect association as weak monotonicity. Under statistical independence, Gamma will be 0, but it can be 0 at other times as well (whenever concordant minus discordant pairs are 0).

Gamma is a symmetric measure and computes the same coefficient value, regardless of which is the independent (column) variable. Its value ranges from -1 to +1.

In terms of the underlying assumptions, Gamma is equivalent to Spearman’s Rho or Kendall’s Tau; but in terms of its interpretation and computation, it is more similar to Kendall’s Tau than Spearman’s Rho. The Gamma statistic is, however, preferable to Spearman’s Rho and Kendall’s Tau when the data contain many tied observations.
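A minimal sketch of the Gamma computation follows; the function name and data are hypothetical. The example is chosen so that several pairs are tied, which Gamma simply ignores.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """G = (C - D) / (C + D); tied pairs are left out entirely."""
    c = d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            c += 1
        elif s < 0:
            d += 1
    return (c - d) / (c + d)

# Weakly monotonic data with ties on both variables (illustrative).
x = [1, 1, 2, 2, 3]
y = [1, 2, 2, 3, 3]
g = goodman_kruskal_gamma(x, y)
```

Here all six untied pairs are concordant, so Gamma reports perfect association even though four of the ten pairs are tied.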

Chi-square (χ²)

Another useful way of looking at the relationship between two nominal (or categorical) variables is to cross-classify the data and get a count of the number of cases sharing a given combination of levels (i.e., categories), and then create a contingency table (cross-tabulation) showing the levels and the counts.

A contingency table lists the frequency of the joint occurrence of two levels (or possible outcomes), one level for each of the two categorical variables. The levels for one of the categorical variables correspond to the columns of the table, and the levels for the other categorical variable correspond to the rows of the table. The primary interest in constructing contingency tables is usually to determine whether there is any association (in terms of statistical dependence) between the two categorical variables, whose counts are displayed in the table. A measure of the global association between the two categorical variables is the Chi-square statistic, which is computed as follows:

Consider a contingency table with k rows and h columns. Let n_ij denote the observed frequency of cell (i, j), and let e_ij denote the expected frequency of that cell under independence. The deviation between the observed and expected frequencies (n_ij - e_ij) characterizes the disagreement between the observation and the hypothesis of independence. The expected frequency for any cell can be calculated by the following formula:

e_ij = (RT × CT) / N

where

e_ij = expected frequency in a given cell (i, j)

RT = row total for the row containing that cell.

CT = column total for the column containing that cell.

N = total number of observations.

All the deviations can be studied by computing the quantity, denoted by χ²:

$$\chi^2 = \sum_{i=1}^{k} \sum_{j=1}^{h} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}$$

This statistic is distributed according to Pearson’s Chi-square law with (k-1)(h-1) degrees of freedom. Thus, the statistical significance of the relationship between two categorical variables is tested by using the χ² test, which essentially finds out whether the observed frequencies in a distribution differ significantly from the frequencies that might be expected according to a certain hypothesis (say, the hypothesis of independence between the two variables).
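The computation can be sketched as follows: expected frequencies are obtained from the row and column totals, and the squared deviations are summed after dividing each by its expected frequency (the usual Pearson form, assumed here). The function name and the table are hypothetical.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for a k-by-h table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # (RT x CT) / N
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

table = [[10, 20],
         [20, 10]]            # illustrative counts; every expected frequency is 15
stat, df = chi_square(table)
```

The statistic would then be compared with the Chi-square distribution with `df` degrees of freedom; the lookup itself is omitted here.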


The χ² test requires that the expected frequencies are not very small. The reason for this assumption is that the Chi-square inherently tests the underlying probabilities in each cell; and when the expected cell frequencies fall, these probabilities cannot be estimated with sufficient precision. Hence, it is essential that the sample size should be large enough to guarantee the similarity between the theoretical and the sampling distribution of the χ² statistic. In the formula for computation of χ², the expected value of the cell frequency is in the denominator. If this value is too small, the χ² value would be overestimated and could result in an unwarranted rejection of the null hypothesis.

To avoid making incorrect inferences from the χ² test, the general rule is that an expected frequency less than 5 in a cell is too small to use. When the contingency table contains more than one cell with an expected frequency < 5, one can combine them to get an expected frequency ≥ 5. However, in doing so, the number of categories would be reduced and one would get less information.

It should be noted that the χ² test is quite sensitive to the sample size. If the sample size is too small, the χ² value is overestimated; if it is too large, the χ² value is underestimated. To overcome this problem, the following measures of association are suggested in the literature: Phi-square (φ²), Cramer’s V and Contingency Coefficient.

Phi-square is computed as follows:

φ² = χ²/N

where N is the total number of observations.

For all contingency tables that are 2 × 2, 2 × k, or 2 × h, Phi-square has a very nice property that its value ranges from 0 (no relationship) to 1 (perfect relationship). However, Phi-square loses this nice property when both dimensions of the table are greater than 2. By a simple manipulation of Phi-square, we get a measure (Cramer’s V), which ranges from 0 to 1 for any size of the contingency table. Cramer’s V is computed as follows:

$$V = \sqrt{\frac{\varphi^2}{L - 1}} = \sqrt{\frac{\chi^2}{N(L - 1)}}$$

where L = min(h, k)

Contingency coefficient

The coefficient of contingency is a Chi-square-based measure of the relation between two categorical variables (proposed by Pearson, the originator of the Chi-square test). It is computed by the following formula:

$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}}$$

where N is the total number of observations.

Its advantage over the ordinary Chi-square is that it is more easily interpreted, since its range is always limited to 0 through 1 (where 0 means complete independence). The disadvantage of this statistic is that its specific upper limit is ‘limited’ by the size of the table; Contingency coefficient can reach the limit of 1, only if the number of categories is unlimited.
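The three Chi-square-based measures can be computed side by side. The sketch assumes the usual definitions: Phi-square as χ²/N, Cramer's V as the square root of Phi-square divided by (L - 1), and the contingency coefficient as the square root of χ²/(χ² + N). The function name and the input values are hypothetical.

```python
from math import sqrt

def association_measures(chi2, n, rows, cols):
    """Return (phi_square, cramers_v, contingency_coefficient)."""
    phi_square = chi2 / n
    smaller = min(rows, cols)                    # L = min(h, k)
    cramers_v = sqrt(phi_square / (smaller - 1))
    contingency = sqrt(chi2 / (chi2 + n))
    return phi_square, cramers_v, contingency

# Illustrative values: a 2-by-2 table with N = 60 and Chi-square = 20/3.
phi2, v, cc = association_measures(20 / 3, 60, 2, 2)
```

For a 2-by-2 table L - 1 equals 1, so Cramer's V is simply the square root of Phi-square.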

Lambda (Λ)

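This is a measure of association for cross tabulations of nominal-level variables. It measures the percentage improvement in predictability of the dependent variable (row variable or column variable), given the value of the other variable (column variable or row variable). With n_ij the count in cell (i, j), n_i. and n_.j the row and column totals, and N the total number of observations, the formulas are:

Lambda-A (rows dependent)

$$\lambda_A = \frac{\sum_{j} \max_i n_{ij} - \max_i n_{i\cdot}}{N - \max_i n_{i\cdot}}$$

Lambda-B (columns dependent)

$$\lambda_B = \frac{\sum_{i} \max_j n_{ij} - \max_j n_{\cdot j}}{N - \max_j n_{\cdot j}}$$

Symmetric Lambda

This is a weighted average of Lambda-A and Lambda-B. The formula is

$$\lambda = \frac{\sum_{j} \max_i n_{ij} + \sum_{i} \max_j n_{ij} - \max_i n_{i\cdot} - \max_j n_{\cdot j}}{2N - \max_i n_{i\cdot} - \max_j n_{\cdot j}}$$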
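As a hedged illustration, the sketch below computes the two asymmetric Lambdas and the symmetric version using the standard Goodman-Kruskal definitions (assumed here): the reduction in prediction errors when the modal category is chosen within each level of the other variable. The function name and the table are hypothetical.

```python
def lambdas(table):
    """Return (lambda_rows_dependent, lambda_cols_dependent, lambda_symmetric)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Correct guesses of the row category when the column is known.
    sum_col_max = sum(max(col) for col in zip(*table))
    # Correct guesses of the column category when the row is known.
    sum_row_max = sum(max(row) for row in table)
    max_row, max_col = max(row_totals), max(col_totals)
    lam_a = (sum_col_max - max_row) / (n - max_row)
    lam_b = (sum_row_max - max_col) / (n - max_col)
    lam_sym = (sum_col_max + sum_row_max - max_row - max_col) / (2 * n - max_row - max_col)
    return lam_a, lam_b, lam_sym

table = [[20, 10, 10],
         [5, 25, 30]]          # illustrative 2-by-3 table, N = 100
lam_a, lam_b, lam_sym = lambdas(table)
```

In this example knowing the column reduces errors in predicting the row by 37.5 percent, while knowing the row helps less in predicting the column; the symmetric value lies between the two.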

Fisher's Exact Test

Fisher's exact test is a test for independence in a 2 × 2 table. This test is designed to test the hypothesis that the two column percentages are equal. It is particularly useful when sample sizes are small (even zero in some cells) and the Chi-square test is not appropriate. The test determines whether the two groups differ in the proportion with which they fall in two classifications. The test is based on the probability of the observed outcome, which is given by the following formula:

$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\,a!\,b!\,c!\,d!}$$

where a, b, c, d represent the frequencies in the four cells, and N = total number of cases.
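The probability of a single observed 2-by-2 table can be sketched with binomial coefficients, which is algebraically the same as the usual factorial (hypergeometric) expression assumed here. The function name and cell counts are hypothetical, and the sketch gives only the probability of the observed table, not a complete p-value.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of the 2-by-2 table [[a, b], [c, d]] with all margins fixed."""
    n = a + b + c + d
    # Equals (a+b)!(c+d)!(a+c)!(b+d)! / (N! a! b! c! d!).
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

p_observed = table_probability(3, 1, 1, 3)   # illustrative counts: 16/70

# Sanity check: probabilities of all tables sharing these margins sum to 1.
total = sum(table_probability(k, 4 - k, 4 - k, k) for k in range(5))
```

A full test would add the probabilities of the tables that are at least as extreme as the observed one.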

Mann-Whitney U Test

This test is the nonparametric substitute for the equal-variance t-test when the assumption of normality is not valid. When in doubt about normality, it is safer to use this test. Two fundamental assumptions of this test are:

This particular test is based on ranks and has good properties (asymptotic relative efficiency) for symmetric distributions.

The Mann-Whitney test statistic, U, is defined as the total number of times a Y precedes an X in the configuration of combined samples. It is directly related to the sum of ranks. This is why this test is sometimes called the Mann-Whitney U test and at other times called the Wilcoxon Rank Sum test. The Mann-Whitney U test calculates U_X and U_Y. The formula for U_X is as follows:

$$U_X = R_X - \frac{n_X (n_X + 1)}{2}$$

where R_X is the sum of the ranks of the X observations in the combined sample and n_X is the number of X observations.

The formula for U_Y is obtained by replacing X by Y in the above formula:

$$U_Y = R_Y - \frac{n_Y (n_Y + 1)}{2}$$

A correction for ties in the standard deviation makes little difference unless there are a lot of ties.
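The counting definition of U and its link to the rank sum can be checked with a small sketch. The rank-sum relation used, U_X = R_X - n_X(n_X + 1)/2, is the usual one and is assumed here. The function names and data are hypothetical, ties are counted as one half in the counting version, and the rank-sum helper assumes no tied values.

```python
def mann_whitney_u(x, y):
    """U_X = number of (X, Y) pairs in which the Y value precedes the X value."""
    ux = sum(1.0 if yj < xi else 0.5 if yj == xi else 0.0
             for xi in x for yj in y)
    uy = len(x) * len(y) - ux            # U_X + U_Y = n_X * n_Y
    return ux, uy

def u_from_rank_sum(x, y):
    """Same U_X via the rank sum of X in the combined sample (no ties assumed)."""
    combined = sorted(x + y)
    rank_sum_x = sum(combined.index(v) + 1 for v in x)
    return rank_sum_x - len(x) * (len(x) + 1) / 2

x = [1.1, 3.4, 5.0]                      # illustrative samples
y = [2.2, 4.8, 6.1, 7.3]
ux, uy = mann_whitney_u(x, y)
ux_ranks = u_from_rank_sum(x, y)
```

Both routes give the same value of U_X, which is why the test appears under both the Mann-Whitney and the Wilcoxon rank-sum names.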

Wilcoxon Signed-Rank Test

This nonparametric test makes use of the sign and the magnitude of the rank of the differences of two related samples, and utilizes information about both the direction and relative magnitude of the differences within pairs of variables.

Sum of Ranks (W)

The basic statistic for this test is the minimum of the sum of the positive ranks (ΣR+) and the sum of the negative ranks (ΣR-). This statistic is called W.

W = Minimum [ΣR+, ΣR-]

Mean of W

This is the mean of the sampling distribution of the sum of ranks for a sample of n items.

$$\mu_W = \frac{n(n+1)}{4}$$

Standard deviation of W

This is the standard deviation of the sampling distribution of the sum of ranks. The formula is:

$$\sigma_W = \sqrt{\frac{n(n+1)(2n+1)}{24} - \frac{\sum_i \left(t_i^3 - t_i\right)}{48}}$$

where t_i represents the number of times the ith value occurs.

Number of Zeros: If there are zero differences, they are thrown out and the number of pairs is reduced by the number of zeros.

Number of Ties: This is the number of sets of ties that occur in the data. The treatment of ties is to assign an average rank to each member of a particular set of ties.

Approximations with (and without) Continuity Correction: Z-Value

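If the sample size is ≥ 15, a normal approximation method may be used to approximate the distribution of the sum of ranks. Although this method does correct for ties, it does not have the continuity correction factor. With the mean and standard deviation of W defined above written as μ_W and σ_W, the z value is as follows:

$$z = \frac{W - \mu_W}{\sigma_W}$$

If the correction factor for continuity is used, the formula becomes:

$$z = \frac{\left|W - \mu_W\right| - 0.5}{\sigma_W}$$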
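The steps of the signed-rank test described above can be sketched as follows: zero differences are dropped, absolute differences are ranked with average ranks for ties, W is the smaller of the two rank sums, and z is formed from the mean and tie-corrected standard deviation of W. The z forms used, (W - mean)/sd and (|W - mean| - 0.5)/sd, are the usual ones and are assumed here. The function name and data are hypothetical.

```python
from collections import Counter
from math import sqrt

def wilcoxon_signed_rank(x, y, continuity=False):
    """Return (W, mean of W, standard deviation of W, z)."""
    diffs = [b - a for a, b in zip(x, y) if b != a]   # zero differences dropped
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(v):
        first = abs_sorted.index(v)       # 0-based position of the first occurrence
        count = abs_sorted.count(v)       # size of the tie group
        return first + (count + 1) / 2    # mean of ranks first+1 .. first+count

    pos = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    neg = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    w = min(pos, neg)
    mean_w = n * (n + 1) / 4
    tie_term = sum(t ** 3 - t for t in Counter(abs_sorted).values()) / 48
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term)
    if continuity:
        z = (abs(w - mean_w) - 0.5) / sd_w
    else:
        z = (w - mean_w) / sd_w
    return w, mean_w, sd_w, z

before = [10, 12, 9, 14, 11, 13]          # illustrative paired data
after = [12, 15, 8, 18, 11, 16]
w, mean_w, sd_w, z = wilcoxon_signed_rank(before, after)
```

The sample here is far too small for the normal approximation to be appropriate; it is used only to make the arithmetic easy to follow (one zero difference is dropped, leaving n = 5, and one pair of tied absolute differences shares rank 3.5).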