|
Research Question |
: |
What is the pattern of relationships among the eleven scientific fields of India's cooperation links with foreign countries? |
|
Methodology |
: |
Pearson correlation |
|
Dataset |
: |
COOP.DAT |
$RUN PEARSON $FILES PRINT = PEARSON.LST DICTIN = COOP.DIC DATAIN =COOP.DAT $SETUP PROTOTYPE FOR PEARSON PROGRAM BADDATA=MD1 - MDHANDLING=CASE - ROWVARS=(V1-V11) - PRINT=(DICT,COVA,PAIR,XPRODUCTS) WRITE=(CORR)
|
After filtering 105 cases read from the input data file 1 cases contained illegal characters and were treated according to BADDATA specification Number of processed cases: 104 |
|
|
Variable  Adjusted   Mean      S. D.     Mean      S. D.     T-test    Correlation coeff.
Pair      Wt. sum    X         X         Y         Y         T         R(i,j)
1 - 2     104.       2.365     10.650    32.769    102.245   17.469    .8657
1 - 3     104.       2.365     10.650     8.317     32.037   20.092    .8935

Unpaired means and standard deviations ***
Variable  Variable         Adjusted                                   Mean      S. D.
Name      No.       N      Wt. sum   Sum X           Sum X2           X         X
v1        1         104    104       2.4600000E+02   1.2264000E+04     2.365     10.650
v2        2         104    104       3.4080000E+03   1.1884340E+06    32.769    102.245
Correlation matrix ***
|
Cross Products Matrix ***
|
|
Covariance Matrix (with diagonal) ***
|
|
IDAMS reports that 105 cases were read; one case which had illegal characters was treated as bad data. |
||
|
|
Descriptive statistics for all pairs of variables, with a t-test of each correlation coefficient, followed by descriptive statistics of the single variables. |
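The T values in the paired listing appear consistent with the usual t statistic for testing whether a correlation is zero, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is only a check under that assumption (the function name is hypothetical and is not part of IDAMS); it uses the r values and n = 104 from the listing.

```python
import math


def t_for_correlation(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# r values and case count taken from the paired-statistics listing.
t_12 = t_for_correlation(0.8657, 104)
t_13 = t_for_correlation(0.8935, 104)

# Small differences from the printed T values come from r being rounded
# to four decimals in the listing.
print(round(t_12, 2), round(t_13, 2))
```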
|
| Correlation matrix | ||
|
Matrix of cross products The elements of the matrix are computed by the following formula:
|
||
|
Covariance Matrix The elements of the matrix are computed by the following formula: $\mathrm{Covariance}(X,Y) = \sum (X - \bar{X})(Y - \bar{Y}) / (n - 1)$. The Pearson correlation, covariance and cross-products measures are related. If each entry of the cross-products matrix is divided by n - 1, the result is the covariance matrix. If each entry of the covariance matrix is divided by the product of the standard deviations of the two variables, the result is the correlation matrix. |
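The chain from cross products to covariance to correlation can be sketched on a small made-up data set. This assumes the cross products are taken on deviations from the mean, which is what the stated division by n - 1 implies; the data and function name are hypothetical.

```python
import math


def cross_product(x, y):
    """Sum of products of deviations from the means of x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))


# Hypothetical data, not taken from COOP.DAT.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

xprod = cross_product(x, y)                      # cross-products entry
cov = xprod / (n - 1)                            # covariance entry
sd_x = math.sqrt(cross_product(x, x) / (n - 1))  # standard deviation of x
sd_y = math.sqrt(cross_product(y, y) / (n - 1))  # standard deviation of y
corr = cov / (sd_x * sd_y)                       # correlation entry

print(xprod, cov, round(corr, 4))
```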
|
Research Question |
: |
Are there any differences in the distribution of academics of different ranks across different types of institutions? In other words, is there any association between the rank of an academic and the type of institution? |
|
Methodology |
: |
Chi-square |
|
Dataset |
: |
ANJU.DAT |
$RUN TABLES $FILES PRINT = MYTAB.LST DICTIN = ANJU.DIC DATAIN = ANJU.DAT $SETUP EXAMPLES OF TABLES PRINT=DICT TABLE SR=V14 C=V9 CELL=(FREQ,ROWP,COLP,TOTP)- STAT=(CHI,CV) MDHANDL=ALL
|
The data matrix is 2 variables and 1073 cases Row Variable number: 14 Column Variable number: 9 sv:inst type v204:rank |
||
|
|
        |       1|       2|       3|       9|
        |prof    |reader  |lecturer|        |   Total Revised
________|________|________|________|________|
       1|        |        |        |        |
type1   |      97|     126|     151|       3|     377     374
   Row %|   25.94|   33.69|   40.37|     .00|  100.00
   Col %|   26.43|   33.51|   48.40|     .00|   35.45
   Tot %|    9.19|   11.94|   14.31|     .00|   35.45
________|________|________|________|________|
       2|        |        |        |        |
type2   |     147|      96|      42|      12|     297     285
   Row %|   51.58|   33.68|   14.74|     .00|  100.00
   Col %|   40.05|   25.53|   13.46|     .00|   27.01
   Tot %|   13.93|    9.10|    3.98|     .00|   27.01
________|________|________|________|________|
       3|        |        |        |        |
type3   |      41|      17|       2|       2|      62      60
   Row %|   68.33|   28.33|    3.33|     .00|  100.00
   Col %|   11.17|    4.52|     .64|     .00|    5.69
   Tot %|    3.89|    1.61|     .19|     .00|    5.69
________|________|________|________|________|
       4|        |        |        |        |
type4   |      82|     137|     117|       1|     337     336
   Row %|   24.40|   40.77|   34.82|     .00|  100.00
   Col %|   22.34|   36.44|   37.50|     .00|   31.85
   Tot %|    7.77|   12.99|   11.09|     .00|   31.85
________|________|________|________|________|
Totals        367      376      312       18    1073
Col %      100.00   100.00   100.00      .00
Tot %       34.79    35.64    29.57      .00  100.00
Revised       367      376      312        0    1055
Column 9 is missing data and was deleted
|
|
| Chi square 118.50 Cramer's V .24 Contingency coefficient .32 Degrees of freedom 6 Adjusted n 1055 |
|
IDAMS reports that there are two variables and 1073 cases. Row variable is type of institution and column variable is rank. |
||
|
|
Cross tabulation of ranks of academics and types of institutions. | |
| The value of Chi-square is statistically highly significant (p < .001), which means that the association between categories of rank and type of institution is not random. |
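The reported statistics can be reproduced from the cell frequencies with a few lines of arithmetic. The sketch below is not IDAMS code; it assumes the standard Pearson chi-square on the table with the missing-data column (code 9) removed, and it uses 151 for the type1/lecturer cell, which is the count implied by the printed row total and percentages.

```python
import math

# Observed frequencies: rows = institution types 1-4,
# columns = prof, reader, lecturer (missing-data column 9 excluded).
observed = [
    [97, 126, 151],
    [147, 96, 42],
    [41, 17, 2],
    [82, 137, 117],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

n_rows, n_cols = len(observed), len(observed[0])
dof = (n_rows - 1) * (n_cols - 1)
cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
contingency = math.sqrt(chi2 / (chi2 + n))

print(round(chi2, 2), dof, round(cramers_v, 2), round(contingency, 2))
```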
|
Research Question |
: |
How does the (time) involvement of an academic scientist in teaching vary with his rank? |
|
Methodology |
: |
Oneway Analysis of Variance |
|
Dataset |
: |
ANJU.DAT |
$RUN ONEWAY $FILES PRINT = ONE_WAY.LST DICTIN = ANJU.DIC DATAIN = ANJU.DAT $SETUP EFFECT OF RANK ON INVOLVEMENT IN ADMINISTRATIVE WOR BADDATA=MD1 - PRINT=CDICT DEPVARS=(V2) CONVARS=(V9)
|
After filtering 1073 cases read from the input data file 3 cases contained illegal characters and were treated according to BADDATA specification |
|
|
Control variable = var 9 v204:rank Depend. variable = var 2 v262:teaching |
|
| IDAMS reports that 1073 cases were read, out of which 1038 cases were used in the analysis (3 cases with illegal characters and 32 cases with missing data were treated as bad data). | ||
|
|
Specification: Dependent variable = Time spent on teaching; Control variable = Rank (3 categories: Professor, Reader, Lecturer). | |
| Descriptive statistics | ||
|
Eta indicates the strength of the relationship between the dependent variable and the control variable (Eta = 1 signifies a perfect relationship and Eta = 0 signifies no relationship). Eta adjusted is Eta adjusted for degrees of freedom. The F ratio is statistically highly significant (p < .005), so we can conclude that the involvement of an academic scientist in teaching varies with his rank. |
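How Eta and the F ratio come out of the between-group and within-group sums of squares can be sketched on made-up data. The group values below are hypothetical and are not taken from ANJU.DAT. The adjusted value uses one common degrees-of-freedom correction, which is an assumption and may differ from the exact formula ONEWAY applies.

```python
import math

# Hypothetical teaching-time values for three rank groups.
groups = {
    "prof": [10.0, 12.0, 14.0],
    "reader": [20.0, 22.0, 24.0],
    "lecturer": [30.0, 32.0, 34.0],
}

values = [v for g in groups.values() for v in g]
n_total = len(values)
k = len(groups)
grand_mean = sum(values) / n_total

# Between-group and within-group sums of squares.
ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
)
ss_within = sum(
    (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
)
ss_total = ss_between + ss_within

eta_squared = ss_between / ss_total
eta = math.sqrt(eta_squared)
f_ratio = (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Assumed adjustment: replace sums of squares by mean squares.
eta_sq_adjusted = 1.0 - (ss_within / (n_total - k)) / (ss_total / (n_total - 1))

print(round(eta, 4), round(f_ratio, 2), round(eta_sq_adjusted, 4))
```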
|
Research Question |
: |
How does the involvement of an academic scientist in teaching affect his involvement in research? |
|
Methodology |
: |
Simple linear regression |
|
Dataset |
: |
ANJU.DAT |
$RUN REGRESSN $FILES PRINT = ANJU.LST DICTIN = ANJU.DIC DATAIN = ANJU.DAT $SETUP REGRESSN OF ACADEMIC INVOLVEMENT BADDATA=MD1 - MDHANDLING=50 - PRINT=(DICT,MATRIX) DEPVAR=V3 - VARS=(V2)
|
After filtering 1073 cases read from the input data file |
|
|
specification Number of variables = 2 Number of cases = 1055 |
|
General statistics
|
|
Total correlation matrix,R(i,j)
|
|
Dependent variable is V 3 v263:research
|
| IDAMS reports that 1073 cases were read, out of which 1055 cases were used in the analysis; 3 cases with illegal characters and 15 cases with missing data were excluded. | ||
|
|
Specification: Number of variables = 2; Dependent variable = Time on research; Independent variable = Time on teaching |
|
| Descriptive statistics of both dependent and independent variables. | ||
| Correlation matrix shows that the two variables are correlated negatively. | ||
|
Standard error of the estimate is a measure of the reliability of the estimating equation. It indicates the variability of the observed points around the regression line, in other words, the extent to which the observed values differ from their predicted values on the regression line.
F ratio in the analysis of variance table is used to test the hypothesis that the slope (b) of the regression line is 0. The F ratio is large when the independent variable explains the variation in the dependent variable. There is a significant negative linear relationship between time spent on research and time spent on teaching (F ratio = 142.153; degrees of freedom = 1, 1053; p < .001).
Multiple correlation coefficient (Multiple R) is the correlation between the dependent variable (time spent on research) and the predicted value. The greater the value of Multiple R, the greater the agreement between the predicted and observed values.
Fraction of explained variance (RSQD) can be interpreted as the proportion of the variation in the dependent variable explained by the regression line. It is also called the coefficient of determination. Both Multiple R and the coefficient of determination are indicators of the overall effectiveness of the linear regression. If R² = 1, the regression line is a perfect estimator. If R² = 0, there is no linear relationship between X and Y.
Determinant of the correlation matrix is the determinant of the correlation matrix of the predictors. It represents, as a single number, the generalized variance in a set of variables, and varies from 0 to 1. However, it has no meaning in the case of simple linear regression.
Residual degrees of freedom: If the constant is not constrained to be zero, df = N - p - 1, where N is the total number of observations and p is the number of predictors.
Constant term: This is the constant in the regression equation. |
||
|
B is the regression coefficient, i.e. the slope of the regression line.
Sigma B is the standard error of the regression coefficient, which is a measure of the variability of the sample regression coefficient around the population regression coefficient. It is an indicator of the reliability of the coefficient. Smaller values indicate greater reliability.
Beta is the standardized regression coefficient, which is independent of the scale of measurement. In the case of simple regression, Beta is equal to the correlation between the two variables, so its absolute value is equal to Multiple R.
Sigma Beta is the standard error of Beta.
RSQD is the fraction of the explained variance.
Marginal RSQD: Since there is only one predictor, Marginal RSQD (.1183) is equal to RSQD (.1183).
T ratio is used to test the hypothesis that B = 0. T ratio = B / Sigma B. Its significance can be tested from the table of t with n - p - 1 degrees of freedom. Here, the value of t = 11.005, df = 1053, which is highly significant (p < .0001).
Covariance ratio of a variable is equal to the square of the multiple correlation coefficient with the other independent variables in the regression equation. It has no meaning in the case of simple linear regression. |
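The quantities described above can be computed by hand for a simple regression. The sketch below uses made-up data (not ANJU.DAT) with hypothetical variable names, and shows that Beta equals the correlation and that the squared T ratio equals the F ratio when there is one predictor.

```python
import math

# Hypothetical data: x = time on teaching, y = time on research.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 3.0, 4.0, 2.0, 1.0]
n = len(x)
p = 1  # number of predictors

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

b = sxy / sxx                      # B: slope of the regression line
constant = mean_y - b * mean_x     # constant term
r = sxy / math.sqrt(sxx * syy)     # correlation between x and y
beta = b * math.sqrt(sxx / syy)    # standardized coefficient
rsqd = r * r                       # fraction of explained variance

ss_resid = syy - b * sxy           # residual sum of squares
df_resid = n - p - 1
std_err_estimate = math.sqrt(ss_resid / df_resid)
sigma_b = std_err_estimate / math.sqrt(sxx)  # standard error of B

t_ratio = b / sigma_b
f_ratio = (syy - ss_resid) / p / (ss_resid / df_resid)

print(round(b, 3), round(beta, 3), round(rsqd, 3), round(t_ratio, 3))
```

With a negative slope the T ratio is negative, while Multiple R, being a correlation between observed and predicted values, is reported as a positive number.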
|
Research Question |
: |
What is the effect of the status of a scientist on the time devoted to administration? |
|
Methodology |
: |
Simple linear regression |
|
Dataset |
: |
ICSOPRU (R2CM.DAT) |
$RUN REGRESSN $FILES PRINT = DUM.LST DICTIN = R2R3CM.DIC DATAIN = R2CM.DAT $SETUP INCLUDE V1=360 DUMMY REGRESSION ICSOPRU DATA BADDATA=MD1 - MDHANDLING=50 - CATE - PRINT=(DICT,MATRIX) V201(1,2) DEPVAR=V222 - VARS=(V201)
|
After filtering 1151 cases read from the input data file Number of variables = 3 Number of cases = 1149 |
|
|
General statistics
|
|
Total correlation matrix, R(i,j)
|
|
IDAMS reports that 1151 cases were taken after filtering, out of which 1149 cases were used for regression analysis. Two cases with missing data were excluded. The independent variable Rank is categorized into two dummy variables (Rank_1 = Head, Rank_2 = Scientist). Thus, the total number of variables = 3 (one dependent variable and two independent dummy variables). The dependent variable is the percentage of work time spent on administrative work. |
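Dummy coding of this kind can be sketched as follows. The codes 1 and 2 follow the V201(1,2) specification in the setup. Treating every other code as a reference category with both dummies set to 0 is an assumption made for illustration, and the function name and sample codes are hypothetical.

```python
def dummy_code(rank_code):
    """Return (Rank_1, Rank_2) indicator values for one case.

    Assumed mapping: code 1 -> Rank_1 (Head), code 2 -> Rank_2 (Scientist),
    any other code -> reference category (both dummies 0).
    """
    rank_1 = 1 if rank_code == 1 else 0
    rank_2 = 1 if rank_code == 2 else 0
    return rank_1, rank_2


# Hypothetical rank codes for five cases.
codes = [1, 2, 3, 2, 1]
rows = [dummy_code(c) for c in codes]
print(rows)
```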
||
|
|
Descriptive statistics of both dependent and independent variables. | |
| Total Correlation Matrix: The elements of this matrix are computed directly from the matrix of residual sums of squares and cross products. | ||
|
Standard error of estimate is the standard deviation of the residuals.
F ratio in the analysis of variance table is used to test the hypothesis b1 = b2 = 0. The F ratio is large when the independent variables explain the variation in the dependent variable. There is a significant linear relationship between the rank of a scientist and the time devoted to administrative work (F ratio = 216.336; degrees of freedom = 2, 1146; p < .001). This implies that rank does affect the time devoted by a scientist to administrative work.
Multiple correlation coefficient (Multiple R) is the correlation between the dependent variable (time spent on administrative work) and the predicted value. The greater the value of Multiple R, the greater the agreement between the predicted and observed values. Here, the value of Multiple R is sufficiently large.
Fraction of explained variance (RSQD) can be interpreted as the proportion of the variation in the dependent variable explained by the predictor variables. It is also called the coefficient of determination. It is equal to the square of Multiple R.
Adjusted squared Multiple R (adjusted fraction of the variance explained) = R² - (p - 1)(1 - R²)/(n - p), where p is the number of predictors. Both Multiple R and the coefficient of determination are indicators of the overall effectiveness of the linear regression.
Determinant of the correlation matrix is the determinant of the correlation matrix of the predictors. It represents, as a single number, the generalized variance in a set of variables, and varies from 0 to 1. Determinants near zero indicate that some or all predictors are highly correlated. Here, the determinant of the correlation matrix (.70791) is quite large, which indicates that the predictor variables (i.e. the categories of Rank) are not highly correlated. Note that a high correlation among the predictors can threaten computational accuracy, since it inflates the standard errors of the regression coefficients, which in turn attenuates the associated F statistics.
Residual degrees of freedom: If the constant is not constrained to be zero, df = N - p - 1, where N is the total number of observations and p is the number of predictors. |
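With exactly two predictors, the determinant of their correlation matrix is 1 - r², where r is the correlation between them. The sketch below applies this identity to the reported determinant to recover the size of the correlation between the two dummy variables. The sign cannot be recovered from the determinant, and the function name is hypothetical.

```python
import math


def det_2x2_correlation(r):
    """Determinant of the correlation matrix [[1, r], [r, 1]]."""
    return 1.0 - r * r


reported_det = 0.70791  # value quoted in the text
abs_r = math.sqrt(1.0 - reported_det)  # magnitude of the correlation

# A determinant of 1 means uncorrelated predictors; 0 means perfect collinearity.
print(round(abs_r, 4), det_2x2_correlation(0.0), det_2x2_correlation(1.0))
```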
||
|
The regression coefficient for Rank_1 is statistically highly significant (t = 18.66, df = 1146, p < .001).
Partial R squared (RSQD) is the squared partial correlation between the predictor (Rank_1) and the dependent variable, with the influence of the other variable (Rank_2) eliminated. The squared partial correlation coefficient is a measure of that part of the variance in the dependent variable that is not explained by the other predictors. Here, 23.31 % of the variance in the dependent variable is explained by the dummy variable Rank_1.
The regression coefficient for the dummy variable Rank_2 is also statistically significant (t = 7.33, df = 1146, p < .001). The value of the squared partial correlation indicates that the dummy variable Rank_2 explains only 4.48 % of the variance. |
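The T ratio of a coefficient and its squared partial correlation are linked by the standard identity partial r² = t² / (t² + df), where df is the residual degrees of freedom. The sketch below applies that identity to the figures quoted above; the function names are hypothetical and the results are rounded checks, not IDAMS output.

```python
import math


def partial_r2_from_t(t, df_resid):
    """Squared partial correlation implied by a coefficient's T ratio."""
    return t * t / (t * t + df_resid)


def abs_t_from_partial_r2(partial_r2, df_resid):
    """Magnitude of the T ratio implied by a squared partial correlation."""
    return math.sqrt(partial_r2 / (1.0 - partial_r2) * df_resid)


df_resid = 1146

# Rank_1: t = 18.66 corresponds to a partial RSQD of about .2331.
rank_1_partial = partial_r2_from_t(18.66, df_resid)

# Rank_2: a partial RSQD of .0448 corresponds to |t| of about 7.33.
rank_2_abs_t = abs_t_from_partial_r2(0.0448, df_resid)

print(round(rank_1_partial, 4), round(rank_2_abs_t, 2))
```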