There are certain fundamental concepts in correspondence analysis, which are described below.
Primitive matrix
The original data matrix, N(I, J), or contingency table, is called the primitive matrix or primitive table. The elements of this matrix are $n_{ij}$.
Profiles
While interpreting a cross-tabulation, it makes little sense to compare the actual frequencies in each cell. Each row and each column has a different number of respondents, called the base of respondents. For comparison it is essential to reduce either the rows or columns to the same base.
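Consider a contingency table N(I, J) with I rows (i = 1, 2, …, I) and J columns (j = 1, 2, …, J) having frequencies $n_{ij}$. Marginal frequencies are denoted by $n_{i+}$ and $n_{+j}$:
$$n_{i+} = \sum_{j=1}^{J} n_{ij}$$
$$n_{+j} = \sum_{i=1}^{I} n_{ij}$$
Total frequency is given by
$$n = n_{++} = \sum_{i=1}^{I} \sum_{j=1}^{J} n_{ij}$$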
Row profiles
The profile of each row i is a vector of conditional densities:
$$r_{ij} = \frac{n_{ij}}{n_{i+}}, \quad j = 1, 2, \ldots, J$$
$$\sum_{j=1}^{J} r_{ij} = 1$$
The complete set of row profiles may be denoted by the I × J matrix R.
Matrix of Row Profiles
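| Rows | Column 1 | Column 2 | … | Column J | Total |
|---|---|---|---|---|---|
| 1 | $n_{11}/n_{1+}$ | $n_{12}/n_{1+}$ | … | $n_{1J}/n_{1+}$ | 1 |
| ⋮ | ⋮ | ⋮ | | ⋮ | ⋮ |
| I | $n_{I1}/n_{I+}$ | $n_{I2}/n_{I+}$ | … | $n_{IJ}/n_{I+}$ | 1 |
| Column mass | $n_{+1}/n$ | $n_{+2}/n$ | … | $n_{+J}/n$ | 1 |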
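As a minimal sketch of the definitions above, the following Python snippet computes the marginal frequencies and the row profiles. The 3 × 3 table is hypothetical and chosen only so that the arithmetic is easy to follow; the variable names are likewise illustrative.

```python
# Hypothetical 3 x 3 contingency table (rows i, columns j); values are illustrative only.
table = [
    [10, 20, 20],
    [30, 10, 10],
    [20, 30, 50],
]

# Marginal frequencies: row totals n_{i+} and column totals n_{+j}.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)  # grand total

# Row profiles: each row divided by its own total, r_ij = n_ij / n_{i+}.
row_profiles = [
    [cell / total for cell in row]
    for row, total in zip(table, row_totals)
]

for profile in row_profiles:
    print([round(x, 3) for x in profile])
```

Each printed profile sums to 1, so rows with different bases (here 50, 50 and 100 respondents) become directly comparable.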
Column Profiles
The profile of each column j is a vector of conditional densities:
$$c_{ij} = \frac{n_{ij}}{n_{+j}}, \quad i = 1, 2, \ldots, I$$
The complete set of column profiles may be denoted by the I × J matrix C.
Matrix of Column Profiles
| Rows | Column 1 | Column 2 | … | Column J | Row mass |
|---|---|---|---|---|---|
| 1 | $n_{11}/n_{+1}$ | $n_{12}/n_{+2}$ | … | $n_{1J}/n_{+J}$ | $n_{1+}/n$ |
| ⋮ | ⋮ | ⋮ | | ⋮ | ⋮ |
| I | $n_{I1}/n_{+1}$ | $n_{I2}/n_{+2}$ | … | $n_{IJ}/n_{+J}$ | $n_{I+}/n$ |
| Total | 1 | 1 | … | 1 | 1 |
Average row profile = $n_{+j}/n$ (j = 1, 2, …, J)
Average column profile = $n_{i+}/n$ (i = 1, 2, …, I)
Masses
Another fundamental concept in correspondence analysis is the concept of mass. The mass of the ith row is the marginal frequency of the ith row divided by the grand total:
$$n_{i+}/n$$
Similarly, the mass of the jth column is the marginal frequency of the jth column divided by the grand total:
$$n_{+j}/n$$
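The masses also explain why $n_{+j}/n$ is called the average row profile: it is the average of the row profiles when each profile is weighted by its row mass. The sketch below checks this on a hypothetical table (the data and names are assumptions made for illustration).

```python
# Hypothetical contingency table; values are illustrative only.
table = [
    [10, 20, 20],
    [30, 10, 10],
    [20, 30, 50],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Masses: marginal frequency divided by the grand total.
row_masses = [total / n for total in row_totals]
col_masses = [total / n for total in col_totals]

# Row profiles r_ij = n_ij / n_{i+}.
row_profiles = [
    [cell / total for cell in row]
    for row, total in zip(table, row_totals)
]

# Mass-weighted average of the row profiles, one value per column.
average_row_profile = [
    sum(mass * profile[j] for mass, profile in zip(row_masses, row_profiles))
    for j in range(len(col_totals))
]

print(col_masses)
print(average_row_profile)  # agrees with col_masses up to rounding error
```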
Correspondence matrix
The correspondence matrix P is defined as the original table N divided by the grand total n, P = (1/n) N. Thus, each cell of the correspondence matrix is given by the cell frequency divided by the grand total.
The correspondence matrix shows how one unit of mass is distributed across the cells. The row and column totals of the correspondence matrix are the row mass and column mass, respectively.
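A short sketch of this construction, again on a hypothetical table: dividing every cell by the grand total gives P, and its row and column sums reproduce the masses.

```python
# Hypothetical contingency table; values are illustrative only.
table = [
    [10, 20, 20],
    [30, 10, 10],
    [20, 30, 50],
]

n = sum(sum(row) for row in table)  # grand total

# Correspondence matrix P = (1/n) N.
P = [[cell / n for cell in row] for row in table]

# Row and column totals of P are the row and column masses.
row_masses = [sum(row) for row in P]
col_masses = [sum(col) for col in zip(*P)]

# All cells together carry exactly one unit of mass (up to rounding error).
total_mass = sum(row_masses)
print(row_masses, col_masses, total_mass)
```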
Clouds of Points N(I) and N(J)
The cloud of points N(I) is the set of points i ∈ I whose coordinates are the components of the row profile $n_{ij}/n_{i+}$ (j = 1, 2, …, J) and whose mass is $n_{i+}/n$.
The cloud of points N(J) is the set of points j ∈ J whose coordinates are the components of the column profile $n_{ij}/n_{+j}$ (i = 1, 2, …, I) and whose mass is $n_{+j}/n$.
Distances
A variant of Euclidean distance, called the weighted Euclidean distance, is used to measure and thereby depict the distances between profile points. Here, the weighting refers to differential weighting of the dimensions of the space and not to the weighting of the profiles.
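The squared distance between two rows i and i′ is given by
$$d^2(i, i') = \sum_{j=1}^{J} \frac{1}{n_{+j}/n} \left( \frac{n_{ij}}{n_{i+}} - \frac{n_{i'j}}{n_{i'+}} \right)^2$$
In a symmetric fashion, the squared distance between two columns j and j′ is given by
$$d^2(j, j') = \sum_{i=1}^{I} \frac{1}{n_{i+}/n} \left( \frac{n_{ij}}{n_{+j}} - \frac{n_{ij'}}{n_{+j'}} \right)^2$$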
The distance thus obtained is called the Chi-square distance. The Chi-square distance differs from the usual Euclidean distance in that each square is weighted by the inverse of the frequency corresponding to each term.
The division of each squared term by the expected frequency is "variance-standardizing" and compensates for the larger variance in high frequencies and the smaller variance in low frequencies. If no such standardization were performed, the differences between larger proportions would tend to be large and thus dominate the distance calculation, while the differences between the smaller proportions would tend to be swamped. The weighting factors are used to equalize these differences.
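To illustrate the weighting, the sketch below computes the Chi-square distance between two row profiles and compares it with the ordinary Euclidean distance. The table is hypothetical and the function name `chi_square_distance` is an illustrative choice, not a library routine.

```python
import math

def chi_square_distance(profile_a, profile_b, weights):
    """Weighted Euclidean distance: each squared difference is divided
    by the weight (average profile element) of its dimension."""
    return math.sqrt(
        sum((a - b) ** 2 / w for a, b, w in zip(profile_a, profile_b, weights))
    )

# Hypothetical contingency table; values are illustrative only.
table = [
    [10, 20, 20],
    [30, 10, 10],
    [20, 30, 50],
]
n = sum(sum(row) for row in table)
col_masses = [sum(col) / n for col in zip(*table)]   # average row profile
row_profiles = [[cell / sum(row) for cell in row] for row in table]

# Distance between the first two rows, with and without the weighting.
d_chi = chi_square_distance(row_profiles[0], row_profiles[1], col_masses)
d_euclid = math.dist(row_profiles[0], row_profiles[1])
print(d_chi, d_euclid)
```

Because every weight here is less than 1, dividing by it inflates each squared difference, and the inflation is strongest for the columns with the smallest mass.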
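Essentially, the reason for choosing the Chi-square distance is that it satisfies the principle of distributional equivalence, expressed as follows: if two rows with identical profiles are merged into a single row whose frequencies are their sums, the distances between the columns remain unchanged, and the same holds for the rows when two columns with identical profiles are merged.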
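A quick numerical check can make the principle concrete. The sketch below assumes the usual statement of distributional equivalence (merging two rows with identical profiles leaves the distances between columns unchanged) and uses a hypothetical table whose first two rows are proportional; the helper `column_distances` is an illustrative name.

```python
import math

def column_distances(table):
    """Chi-square distances between every pair of columns of a table."""
    n = sum(sum(row) for row in table)
    row_masses = [sum(row) / n for row in table]
    columns = list(zip(*table))
    col_profiles = [[cell / sum(col) for cell in col] for col in columns]
    distances = {}
    for j in range(len(columns)):
        for k in range(j + 1, len(columns)):
            squared = sum(
                (a - b) ** 2 / mass
                for a, b, mass in zip(col_profiles[j], col_profiles[k], row_masses)
            )
            distances[(j, k)] = math.sqrt(squared)
    return distances

# Hypothetical table: rows 1 and 2 have the same profile (row 2 = 2 x row 1).
original = [
    [10, 20, 20],
    [20, 40, 40],
    [30, 10, 10],
]
# The same table with rows 1 and 2 added together.
merged = [
    [30, 60, 60],
    [30, 10, 10],
]

before = column_distances(original)
after = column_distances(merged)
for pair in before:
    print(pair, before[pair], after[pair])  # the two distances agree
```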
Inertia
Inertia is a term borrowed from the "moment of inertia" in mechanics. A physical object has a center of gravity (or centroid). Every particle of the object has a certain mass m and a certain distance d from the centroid. The moment of inertia of the object is the quantity $md^2$ summed over all the particles that constitute the object.
$$\text{Moment of inertia} = \sum m d^2$$
This concept has an analogy in correspondence analysis. There is a cloud of profile points with masses adding up to 1. These points have a centroid (i.e., the average profile), and each point lies at a certain Chi-square distance from that centroid. Each profile point contributes to the inertia of the whole cloud. The inertia of a profile point can be computed by the following formula.
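For the ith row profile,
$$\text{Inertia} = \frac{n_{i+}}{n} \sum_{j=1}^{J} \frac{(r_{ij} - \bar{r}_j)^2}{\bar{r}_j}$$
where $r_{ij}$ is the ratio $n_{ij}/n_{i+}$ and $\bar{r}_j$ is $n_{+j}/n$.
The inertia of the jth column profile is computed similarly.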
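The total inertia of the contingency table is given by:
$$\text{Total inertia} = \frac{1}{n} \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}, \qquad e_{ij} = \frac{n_{i+}\, n_{+j}}{n}$$
which is the Chi-square statistic divided by n.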
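The relation between inertia and the Chi-square statistic can be verified numerically. The sketch below, on a hypothetical table, computes the inertia of each row profile as mass times squared Chi-square distance to the average row profile, sums these, and compares the result with the Pearson Chi-square statistic divided by n.

```python
# Hypothetical contingency table; values are illustrative only.
table = [
    [10, 20, 20],
    [30, 10, 10],
    [20, 30, 50],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Inertia of each row: mass times squared Chi-square distance to the centroid.
row_inertias = []
for row, row_total in zip(table, row_totals):
    mass = row_total / n
    squared_distance = sum(
        (cell / row_total - col_total / n) ** 2 / (col_total / n)
        for cell, col_total in zip(row, col_totals)
    )
    row_inertias.append(mass * squared_distance)

total_inertia = sum(row_inertias)

# Pearson Chi-square statistic with expected frequencies n_{i+} n_{+j} / n.
chi_square = sum(
    (cell - row_total * col_total / n) ** 2 / (row_total * col_total / n)
    for row, row_total in zip(table, row_totals)
    for cell, col_total in zip(row, col_totals)
)

print(total_inertia, chi_square / n)  # the two values agree up to rounding
```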