2.1.2 Box Plots

The box plot shows three main features of a variable: its center, its spread, and its outliers. A box plot is made up of a box (a rectangle) with various lines and points added to it. The box plot yielded by GRAPHID has the following features.

  1. The base of the rectangle is proportional to the number of cases, and the lower and upper boundaries of the box show the lower and upper quartiles respectively. The length of the box is thus equal to the inter-quartile range (IQR), which is a convenient and popular measure of the spread.
  2. For each variable, a set of boxes is plotted, one for each group. For example, with IDAMS dataset ANJU.DAT, we can simultaneously examine the activity patterns of academic scientists in eight activities in four types of institutions: 32 box plots, four in each of eight windows.
  3. The white line inside the box indicates the mean value and the green line indicates the median. The distance between the white line and the green line is an indicator of skewness: the greater the distance between these lines, the greater the skewness. When the mean and median coincide, only the white line is shown, as happens for a perfectly symmetric distribution. The mean and median for all the cases together are shown by the dotted lines.
  4. The left side of the window shows the scale of the variable.
  5. For each selected variable, GRAPHID plots a set of boxes, each corresponding to one group of cases. If no groups are specified, box plots of eight variables are plotted instead. The box plots can be zoomed, one at a time, to visualize their features more clearly.
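
The quantities that a box plot displays for each group can be sketched in a few lines of Python. This is only an illustration, not GRAPHID's own computation: the function name box_stats, the sample groups, and the quartile convention (the "inclusive" method of Python's statistics module) are all assumptions, and GRAPHID may compute quartiles differently.

```python
import statistics


def box_stats(values):
    """Return the summary values a box plot displays for one group of cases."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; GRAPHID's convention may differ.
    q1, _q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.mean(data)
    median = statistics.median(data)
    return {
        "n": len(data),                    # number of cases (base of the box)
        "q1": q1,                          # lower boundary of the box
        "q3": q3,                          # upper boundary of the box
        "iqr": q3 - q1,                    # length of the box
        "mean": mean,                      # white line
        "median": median,                  # green line
        "mean_median_gap": mean - median,  # rough indicator of skewness
    }


# Hypothetical data: one symmetric group and one right-skewed group.
groups = {
    "symmetric": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "skewed": [1, 2, 3, 4, 100],
}

for name, values in groups.items():
    print(name, box_stats(values))
```

In the symmetric group the mean and median coincide, so the gap is zero. In the skewed group a single large value pulls the mean well above the median, which on the plot would appear as a wide separation between the white and green lines.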