Chapter 6: Analysing the Data Part III: Common Statistical Tests

# Example 2

Chi-square can also be used to check whether your sample contains too many males or too many females and is therefore unrepresentative of the general population. For this exercise we will take the data from Figure 6.12 and the information in Output 6.7 for gender.

| | Males | Females |
|---|---|---|
| Observed | 11 | 9 |
| Expected | 10 | 10 |

Note the expected value in each case is 10 because we would expect 50% of our total sample to be male and 50% to be female.

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(11 - 10)^2}{10} + \frac{(9 - 10)^2}{10} = 0.2$$

$\chi^2(1) = 0.20$ is well below the critical value of 3.84 and so is not significant. We therefore have no reason to suspect that our sample is unbalanced with regard to gender representation.
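The arithmetic of this example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `chi_square_goodness_of_fit` and the variable names are illustrative assumptions, not part of the chapter. It assumes the 50/50 expected split described above and takes the critical value of 3.84 from the text.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts from the table: 11 males, 9 females.
observed = [11, 9]

# Expected counts under the assumption of a 50/50 split of the sample.
total = sum(observed)
expected = [total * 0.5, total * 0.5]

chi_sq = chi_square_goodness_of_fit(observed, expected)

# Degrees of freedom for a goodness-of-fit test: categories minus one.
df = len(observed) - 1

# Critical value quoted in the text for df = 1.
CRITICAL_VALUE_DF1 = 3.84
significant = chi_sq > CRITICAL_VALUE_DF1

print(f"chi-square({df}) = {chi_sq:.2f}, significant: {significant}")
# prints: chi-square(1) = 0.20, significant: False
```

The critical value is hard-coded here because it is the one given in the text; for other degrees of freedom it would have to be looked up in a chi-square table.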

# Cross-tabulation and contingency tables

Chi-square tests for contingency tables are extremely useful statistical procedures for determining whether two categorical measures are related. If one of the variables is group membership and the other a dependent variable, the test may be used to analyse data from a simple randomised design, and the research may be either experimental or quasi-experimental.

The data are organised into a row × column table, and the test determines whether classification on the column variable is independent of classification on the row variable. For example, suppose that the column variable classifies the subjects by political affiliation while the row variable classifies them by religion. The chi-square test is then used to determine whether there is an association between religion and political affiliation.

When the chi-square statistic is used to analyse data in a contingency table, there is no restriction on the number of categories in either the row or the column variable. There are, however, restrictions on sample size similar to those encountered in the chi-square tests of goodness of fit (i.e., the expected value in every cell should be greater than 5).

The expected frequencies are derived from the marginal frequencies. These expected frequencies may be calculated from the formula:

$$E_{ij} = \frac{R_i C_j}{N}$$

where

$E_{ij}$ = the expected frequency for the cell in row $i$, column $j$

$R_i$ = the sum of the frequencies in row $i$

$C_j$ = the sum of the frequencies in column $j$

$N$ = the sum of the frequencies for all cells
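As an illustration of how expected frequencies follow from the marginal totals, the sketch below computes them for a 2 × 2 table. The counts are hypothetical and chosen only to keep the arithmetic easy to follow; the function name `expected_frequencies` is likewise an assumption made for this example. It also checks the sample-size restriction mentioned earlier (every expected value greater than 5).

```python
def expected_frequencies(table):
    """Return expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (not taken from the chapter's data).
observed = [
    [30, 10],
    [20, 40],
]

expected = expected_frequencies(observed)

# Row totals are 40 and 60, column totals are 50 and 50, N = 100,
# so the first cell is 40 * 50 / 100 = 20.
for row in expected:
    print(row)

# Sample-size restriction: every expected value should exceed 5.
all_large_enough = all(e > 5 for row in expected for e in row)
print("All expected values > 5:", all_large_enough)
```

Note that the expected frequencies in each row and column add up to the same marginal totals as the observed frequencies.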

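The chi-square statistic is calculated by summing over all cells:

$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed frequency for the cell in row $i$, column $j$.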

The degrees of freedom associated with the contingency table chi-square are given by (r − 1)(c − 1), that is, the number of rows minus one multiplied by the number of columns minus one. In a 2 × 2 contingency table, df = 1.
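Putting the pieces together, a contingency table chi-square and its degrees of freedom might be computed as in the sketch below. The 2 × 3 table of counts is hypothetical, and the function name `chi_square_contingency` is an assumption made for this illustration, not something defined in the chapter.

```python
def chi_square_contingency(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency from the marginal totals.
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected

    # (rows - 1) * (columns - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# Hypothetical counts: 2 rows by 3 columns.
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

chi_sq, df = chi_square_contingency(observed)
print(f"chi-square({df}) = {chi_sq:.2f}")
# prints: chi-square(2) = 5.33
```

The resulting statistic would then be compared with the critical value for the corresponding degrees of freedom, just as in the goodness-of-fit example.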