Bivariate statistics. For bivariate analyses, the following statistics can be requested:
- t-tests of means (assumes independent populations) between pairs of rows,
- chi-square, contingency coefficient and Cramer's V,
- Kendall's Taus, Gamma, Lambdas,
- S (numerator of the tau statistics and of gamma), its standard and normal deviations, and its variance,
- Spearman rho,
- non-parametric tests: Wilcoxon, Mann-Whitney and Fisher.
Note. The following options are available to control the appearance of the printout:
- Percentages and mean values, if requested, may be printed in separate tables.
- The grid can be suppressed.
- Rows which have no entries in a particular section of a large frequency table can be printed. Tables with more than ten columns are printed in sections, and this "zero rows" option ensures that all sections have the same number of rows (which is important if they are to be cut and pasted together).
Other tables are printed in the order of the table specifications except for tables for which only univariate statistics are requested; these are always grouped together and printed last.
Bivariate tables. Each bivariate table starts on a new page; a large table may take more than one page. Tables are printed with up to 10 columns and up to 16 rows per page depending on the number of items in each cell. Columns and rows are printed only for codes which actually appear in the data. Row and column totals, and cumulative marginal frequencies and percentages if requested, are printed around the edges of the table.
A large table is printed in vertical strips. For example, a table with 40 row codes and 40 column codes would normally be printed on 12 pages as indicated in the following diagram, where the numbers in the cells show the order in which the pages are printed:
                1st 10   2nd 10   3rd 10   4th 10
                codes    codes    codes    codes
1st 16 codes      1        4        7       10
2nd 16 codes      2        5        8       11
last 8 codes      3        6        9       12

Bivariate statistics. (Optional: see the table parameter STATS).
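The strip-by-strip page order described above for large bivariate tables can be sketched in a few lines of Python. This is only an illustration: the function name `page_order` is hypothetical, and it assumes the maximum of 16 rows and 10 columns per page, whereas the program may print fewer rows per page when each cell holds several items.

```python
from math import ceil

def page_order(n_rows, n_cols, rows_per_page=16, cols_per_page=10):
    """Return a grid of page numbers: one row per block of row codes,
    one column per vertical strip of column codes."""
    row_blocks = ceil(n_rows / rows_per_page)
    col_strips = ceil(n_cols / cols_per_page)
    # Pages run down each vertical strip before moving on to the next strip.
    return [
        [strip * row_blocks + block + 1 for strip in range(col_strips)]
        for block in range(row_blocks)
    ]

# The 40 x 40 example from the text: 3 row blocks x 4 strips = 12 pages.
for row in page_order(40, 40):
    print(row)
```

For the 40 by 40 case this reproduces the numbering in the diagram: pages 1 to 3 make up the first strip of ten column codes, pages 4 to 6 the second, and so on.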
t-tests. (Optional: see the table parameter STATS). If t-tests were requested, they and the means and standard deviations of the column variable for each row are printed on a separate page.
Matrices of bivariate statistics. (Optional: see the table parameter PRINT). The lower-left corner of the matrix is printed. Eight columns and 25 rows are printed per page.
Matrix of Ns. (Optional: see the table parameter PRINT). This is printed in the same format as the corresponding statistical matrix.
Univariate tables. (Optional: see the table parameter CELLS). Normally each univariate table is printed beginning on a new page. Frequencies, percentages and mean values of a variable, if requested, are printed across the page, ten codes at a time.
Univariate statistics. (Optional: see the table parameter USTATS).
Quantiles. (Optional: see the table parameter NTILE). N-1 points are printed; e.g. if quartiles are requested, the parameter NTILE is set to 4 and 3 breakpoints will be printed.
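The relation between NTILE and the number of breakpoints can be illustrated with Python's standard library, whose `statistics.quantiles` function also returns n-1 cut points. This is a sketch only: the helper name `ntile_breakpoints` and the sample data are made up, and the interpolation rule is Python's default, which is not necessarily the one TABLES applies.

```python
import statistics

def ntile_breakpoints(values, ntile):
    """Return the ntile-1 cut points that split values into ntile groups."""
    # statistics.quantiles returns n-1 cut points for n groups; the exact
    # values depend on Python's interpolation method (an assumption here).
    return statistics.quantiles(values, n=ntile)

scores = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]  # illustrative data
quartiles = ntile_breakpoints(scores, 4)     # NTILE=4
print(len(quartiles), "breakpoints")
```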
Page numbers. These are of the form: ttt.rr.ppp where
ttt = table number
rr  = repetition number (00 if no repetition used)
ppp = page number within the table.
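As a small illustration of the ttt.rr.ppp layout, the following Python sketch builds such a page number. The function name `page_id` is hypothetical, and zero-padding of all three fields is an assumption suggested by the "00" shown for the repetition number.

```python
def page_id(table, repetition, page):
    """Format a page number as ttt.rr.ppp (zero-padded; padding is assumed)."""
    return f"{table:03d}.{repetition:02d}.{page:03d}"

# Page 1 of table 5, printed without a repetition factor.
print(page_id(5, 0, 1))
# Page 12 of table 6, second table generated by a repetition factor.
print(page_id(6, 2, 12))
```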
Variable identification records (#R and #C) contain code values and code labels for the row and the column variable respectively.
The statistics are written as 80 character records according to a 7F10.2 Fortran format. Columns 73-80 contain an ID as follows:
Note that the missing data codes are not included in the matrix.
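The 80-character statistics record described above (values in 7F10.2 format, with an ID in columns 73-80) can be mimicked in Python as follows. This is a sketch under stated assumptions: the function name `format_record` and the ID text are hypothetical, and the actual content of the ID field is not reproduced here.

```python
def format_record(values, record_id):
    """Build one 80-character record: up to 7 values in F10.2, ID in columns 73-80."""
    if len(values) > 7:
        raise ValueError("a 7F10.2 record holds at most 7 values")
    if len(record_id) > 8:
        raise ValueError("the ID field is 8 columns wide")
    fields = [f"{v:10.2f}" for v in values]
    if any(len(field) > 10 for field in fields):
        raise ValueError("value does not fit in an F10.2 field")
    # The values occupy columns 1-70; pad to column 72, then append the ID.
    return "".join(fields).ljust(72) + record_id.ljust(8)

record = format_record([1.0, -0.25, 3.14159], "EXAMPLE1")  # hypothetical ID
print(repr(record))
```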
Note. If only ROWVARS is provided, dummy mean and standard deviation records are written, 2 records per 60 variables. The second format (#F) record in the dictionary specifies a format of 60I1 for these dummy records. This is so that the matrix conforms to the format of an IDAMS square matrix.
The input is a data file described by an IDAMS dictionary. All variables referenced must be numeric.
$RUN TABLES
$FILES
File definitions
$RECODE (optional)
Recode statements
$SETUP
1. Filter (optional)
2. Label
3. Parameters
4. Subset specifications (optional)
5. TABLES
6. Table specifications (repeated as required)
$DICT (conditional)
Dictionary
$DATA (conditional)
Data

Files:
FT02      output tables/matrices
DICTxxxx  input dictionary (omit if $DICT used)
DATAxxxx  input data (omit if $DATA used)
PRINT     printed output (default IDAMS.LST)
Example: INCLUDE V3=6
Example: FREQUENCY TABLES
Example: BADDATA=SKIP
INFILE=IN /xxxx
BADDATA=STOP /SKIP/MD1/MD2
MAXCASES=n
MDVALUES=BOTH /MD1/MD2/NONE
PRINT=(CDICT/DICT, TIME)
Example: CLASS INCLUDE V8=1,2,3,-7,9

There are two types of subset specifications: local filters and repetition factors. Each has a different function, but their formats are very similar. One specification may be used as a local filter for one or more tables and as a repetition factor for other tables. The format of these specifications is a standard IDAMS filter statement preceded by a subset name beginning with an alphabetic character in columns 1-8. This name must match exactly the name to be used on the table specifications; leading or embedded blanks are not ignored. It is always changed to upper case.
It is recommended that all names be left-justified. The filter statement must start in column 9 or beyond.
For repetition factors, only one variable may be specified in the expression.
The way local filters and repetition factors work is described below.
Local filters. A subset specification is identified as a local filter for a table or set of tables by specifying the subset name with the FILTER parameter. The local filter operates in the same manner as the standard filter except that it applies only to the table specification(s) in which it is referenced.
Example: EDUCATN       INCLUDE V4=0-4,9 AND V5=1
         (subset name) (expression)

In the example above, if EDUCATN is designated as a local filter on the table specification, the table would be produced including only cases coded 0, 1, 2, 3, 4 or 9 for V4 and 1 for V5.
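The effect of the EDUCATN local filter above can be expressed as a simple predicate over cases. The Python sketch below is illustrative only: cases are represented as dictionaries keyed by variable name, which is an assumption of this example and not how IDAMS stores data.

```python
def educatn(case):
    """Local filter EDUCATN: INCLUDE V4=0-4,9 AND V5=1."""
    v4_ok = 0 <= case["V4"] <= 4 or case["V4"] == 9
    return v4_ok and case["V5"] == 1

cases = [  # illustrative data
    {"V4": 3, "V5": 1},  # kept
    {"V4": 9, "V5": 1},  # kept
    {"V4": 5, "V5": 1},  # dropped: V4 outside 0-4 and not 9
    {"V4": 2, "V5": 2},  # dropped: V5 is not 1
]
kept = [case for case in cases if educatn(case)]
print(len(kept), "cases enter the table")
```

Only tables whose specification names EDUCATN with the FILTER parameter would be restricted in this way; other tables in the same run still see all cases.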
Repetition factors. A subset specification is identified as a repetition factor for a table or set of tables by specifying the subset name with the REPE parameter. Only one variable may be given on a subset specification to be used as a repetition factor. Repetition factors permit the generation of 3-way tables where the variable used in the repetition factor can be considered as the control or panel variable. Using a repetition factor and a filter, 4-way tables may be produced.
INCLUDE expressions cause tables to be produced including cases for each value or range of values of the control variable used in the expression. Commas separate the values or ranges. Thus if there are n commas in the expression, n+1 tables will be produced.
Example: EDUCATN       INCLUDE V4=0-4,9
         (subset name) (expression)

In the above example, if EDUCATN is designated as a repetition factor, two tables will result: one including cases coded 0-4 for variable 4, and another including cases coded 9 for variable 4.
EXCLUDE may be used to produce tables with all values except those specified.
Example: EDUCATN       EXCLUDE V1=1,4
         (subset name) (expression)

In the above example, if EDUCATN is designated as a repetition factor, two tables will result: one including all values except 1 and another including all values except 4.
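How a repetition factor expands into n+1 tables, for both INCLUDE and EXCLUDE, can be sketched as follows. The function names and the dictionary representation of cases are assumptions made for this illustration; negative codes and missing-data handling are deliberately left out.

```python
def parse_groups(value_list):
    """Split '0-4,9' into [(0, 4), (9, 9)]: one (low, high) pair per comma-separated item."""
    groups = []
    for item in value_list.split(","):
        low, sep, high = item.strip().partition("-")
        groups.append((int(low), int(high if sep else low)))
    return groups

def repetition_tables(cases, var, value_list, exclude=False):
    """Return one list of cases per table generated by the repetition factor."""
    tables = []
    for low, high in parse_groups(value_list):
        # INCLUDE keeps cases inside the group; EXCLUDE keeps all the others.
        tables.append([c for c in cases if (low <= c[var] <= high) != exclude])
    return tables

cases = [{"V1": v1, "V4": v4} for v1, v4 in [(1, 0), (2, 3), (4, 4), (1, 9), (3, 9), (2, 7)]]

include_tables = repetition_tables(cases, "V4", "0-4,9")            # INCLUDE V4=0-4,9
exclude_tables = repetition_tables(cases, "V1", "1,4", exclude=True)  # EXCLUDE V1=1,4
print([len(t) for t in include_tables])
print([len(t) for t in exclude_tables])
```

Each expression contains one comma, so each call yields two case lists, matching the two tables described in the examples above.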
Examples:
R=(V6,1,8) CELLS=FREQS                  (One univariate table).
R=(V6,1,8) C=(V9,0,4) -                 (One bivariate table with repetition
   REPE=SEX CELLS=(ROWP,FREQS)           factor, i.e. 3-way table).
ROWV=(V5-V9) CELLS=FREQS USTA=MEAN      (Set of univariate tables).
ROWV=(V3,V5) COLV=(V21-V31) -           (Set of bivariate tables).
   R=(0,1,8) C=(0,1,99)

ROWVARS=(variable list)
COLVARS=(variable list)
R=(var, rmin, rmax)
C=(var, cmin, cmax)
TITLE=table title
CELLS=(ROWPCT, COLPCT, TOTPCT, FREQS /NOFREQS, UNWFREQS, MEAN)
VARCELL=variable number
MDHANDLING=ALL /R/C/NONE
WEIGHT=variable number
FILTER=xxxxxxxx
REPE=xxxxxxxx
USTATS=(MEANSD, MEDMOD)
NTILE=n
STATS=(CHI, CV, CC, LRD, LCD, LSYM, SPMR, GAMMA, TAUA, TAUB, TAUC, WILC, MW, FISHER, T)
DECPCT=2 /n
DECSTATS=2 /n
WRITE=MATRIX/TABLES
PRINT=(TABLES/NOTABLES, SEPARATE, ZEROS, CUM, GRID/NOGRID, N, WTDN, MATRIX)
In the example below, the following tables are requested:
$RUN TABLES
$FILES
PRINT = TABLES.LST
FT02 = TREE.MAT        matrices of statistics
DICTIN = TREE.DIC      input dictionary file
DATAIN = TREE.DAT      input data file
$RECODE
R7=BRAC(V7,0-15=1,16-25=2,26-35=3,36-45=4,46-98=5,99=9)
NAME R7'GROUPED V7'
$SETUP
TABLE EXAMPLES
BADDATA=MD1
MALE     INCLUDE V10=1
SEX      INCLUDE V10=1,2
REGION   INCLUDE V3=1-2,3-4,5
MD       EXCLUDE V19=9 OR V52=9
TABLES
1. ROWV=(V201-V220) TITLE='Frequency counts'
2. ROWV=(V54-V62,V64) USTATS=MEANSD PRINT=NOTABLES DECSTAT=1
3. ROWV=(V25-V30,R7) USTATS=MEDMOD CELLS=(FREQS,UNWFREQS,ROWP) -
   WEIGHT=V9 PRINT=CUM MDHAND=NONE
4. R=(V201,1,3) CELLS=(FREQS,MEAN) VARCELL=V54
5. ROWV=(V25-V28) COLV=(V29-V30) -
   CELLS=(FREQS,ROWP,COLP,TOTP) STATS=(CHI,TAUA) REPE=SEX
6. ROWV=(V201-V203) COLV=V206 -
   CELLS=(FREQS,MEAN) VARCELL=V54 REPE=REGION FILT=MALE
7. R=V19 C=V52 WEIGHT=V9 FILT=MD
8. ROWV=(V54-V62) STATS=(TAUA,GAMMA) PRINT=(MATRIX,N) WRITE=MATRIX