Cell Contents
|-------------------------|
| Count |
| Expected Values |
| Chi-square contribution |
| Std Residual |
|-------------------------|
Total Observations in Table: 200
|
| [,1] | [,2] | Row Total |
-------------|-----------|-----------|-----------|
[1,] | 40 | 60 | 100 |
| 45.000 | 55.000 | |
| 0.556 | 0.455 | |
| -0.745 | 0.674 | |
-------------|-----------|-----------|-----------|
[2,] | 50 | 50 | 100 |
| 45.000 | 55.000 | |
| 0.556 | 0.455 | |
| 0.745 | -0.674 | |
-------------|-----------|-----------|-----------|
Column Total | 90 | 110 | 200 |
-------------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 2.020202 d.f. = 1 p = 0.1552185
Pearson's Chi-squared test with Yates' continuity correction
------------------------------------------------------------
Chi^2 = 1.636364 d.f. = 1 p = 0.2008251
Minimum expected frequency: 45
21 Chi-square
This chapter will cover the chi-square test, a statistical method used to examine the relationship between categorical variables. Unlike regression, which focuses on continuous dependent variables, the chi-square test assesses whether there is an association between two categorical variables. But why do researchers need to examine associations between categorical variables?
Understanding relationships between categorical variables is essential in many fields of research. Real-world behaviors, traits, and classifications are often categorical—such as gender, education level, voting preferences, or disease status. The chi-square test allows researchers to determine whether observed frequencies in different categories differ significantly from what would be expected by chance. By doing so, we can identify patterns and relationships that might not be immediately apparent.
In short, it’s another tool to add to your statistical toolbox.
Note that the chi-square test can be applied to more than two categorical variables. However, in this chapter we will primarily deal with two variables.
21.1 Some Additional Details
The chi-square test is particularly useful when researchers want to examine whether two categorical variables are independent or related. For example, a researcher might investigate whether gender is associated with voting preference or whether treatment group membership affects recovery rates.
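The general form of the chi-square test statistic is:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where:
- $O_{ij}$ represents the observed frequency for cell $(i, j)$, that is, row $i$ and column $j$ of the table (actual counts in each category),
- $E_{ij}$ represents the expected frequency for cell $(i, j)$ (counts that would occur under the assumption of independence),
- $\chi^2$ is the chi-square test statistic, which approximately follows a chi-square distribution when the null hypothesis is true.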
21.1.1 Key Assumptions
Like all of our analyses thus far, a chi-square test is valid only under certain assumptions, some of which we have already explored:
1. Independence of Observations
Each observation should belong to only one category, and observations should not be related to one another.
2. Expected Frequency Rule
Expected counts in each category should generally be 5 or more for the chi-square approximation to be valid. When expected counts are low, alternative methods (e.g., Fisher’s Exact Test) may be needed.
3. Large Sample Size
The chi-square test performs best with a sufficiently large sample, as small sample sizes may produce unreliable results.
4. Categorical Data
Both variables should be measured at the categorical level (e.g., nominal or ordinal scales) rather than continuous.
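As a minimal sketch of how the expected frequency rule might be checked in R, the block below computes expected counts for a 2x2 table and falls back to Fisher's exact test only when an expected count is below 5. The counts are the voting example used later in this chapter; the variable names and the simple if/else rule are illustrative assumptions, not a prescribed workflow.

```r
# Counts from the chapter's voting example (rows: gender, columns: candidate)
observed <- matrix(c(40, 60,
                     50, 50),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence: (row total * column total) / N
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
min_expected <- min(expected)

# Use the chi-square approximation only when every expected count is >= 5;
# otherwise fall back to Fisher's exact test
if (min_expected >= 5) {
  result <- chisq.test(observed, correct = FALSE)
} else {
  result <- fisher.test(observed)
}

min_expected
result$p.value
```

Here the smallest expected count is 45, so the chi-square approximation is used.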
21.2 Contingency Tables and Expected Frequencies
Before conducting a chi-square test, it is important to organize the data into a contingency table. A contingency table, also known as a cross-tabulation or crosstab, displays the frequencies of observations of the two categorical variables. This table allows researchers to compare observed frequencies with expected frequencies under the assumption of independence.
A simple contingency table for two categorical variables (e.g., Gender and Voting Preference) might look like this:
| | Candidate A ($j = 1$) | Candidate B ($j = 2$) | |
|---|---|---|---|
| Male ($i = 1$) | 40 | 60 | Row total: 100 |
| Female ($i = 2$) | 50 | 50 | Row total: 100 |
| | Column total: 90 | Column total: 110 | Total sample size: 200 |
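While a contingency table may only display the actual frequencies in each cell (block), it is helpful to also write the row, column, and grand totals, as in the table above. It is also helpful to think of each row ($i$) and column ($j$) as an index, so that $O_{ij}$ is the observed frequency in row $i$ and column $j$.
What is the cell frequency for the cell in the second row, first column ($O_{21}$)? Reading from the table, it is 50.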
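Continuing, to determine whether the variables are independent, we need to calculate the expected frequency for each cell using the formula:
$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}$$
where $N$ is the total sample size. For example, the expected frequency for Male/Candidate A would be:
$$E_{11} = \frac{100 \times 90}{200} = 45$$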
We need to do this for each cell in our contingency table. Doing so, we would get the following. This first table represents the observed frequencies:
Observed Frequencies
| | Candidate A ($j = 1$) | Candidate B ($j = 2$) |
|---|---|---|
| Male ($i = 1$) | 40 | 60 |
| Female ($i = 2$) | 50 | 50 |

This second table represents the expected frequencies:
Expected Frequencies
| | Candidate A ($j = 1$) | Candidate B ($j = 2$) |
|---|---|---|
| Male ($i = 1$) | 45 | 55 |
| Female ($i = 2$) | 45 | 55 |
Comparing these expected frequencies with the observed counts allows us to determine whether any differences are statistically significant.
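The expected frequencies can also be computed for every cell at once. The following is a minimal R sketch, assuming the counts are stored in a matrix named `observed` (a name chosen here for illustration); `outer()` multiplies each row total by each column total.

```r
# Observed counts from the Gender x Voting Preference example
observed <- matrix(c(40, 60,
                     50, 50),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Gender = c("Male", "Female"),
                                   Candidate = c("A", "B")))

n <- sum(observed)  # grand total

# Expected count for each cell: (row total * column total) / N
expected <- outer(rowSums(observed), colSums(observed)) / n
expected
```

The result has 45 in the Candidate A column and 55 in the Candidate B column for both rows, matching the expected frequencies table.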
The next step is to compute the chi-square test statistic and assess its significance using the chi-square distribution.
21.3 Calculating the Chi-Square Test Statistic
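After obtaining the observed and expected frequencies, we compute the chi-square test statistic using the formula:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
For our example, the chi-square test statistic is calculated as follows:
$$\chi^2 = \frac{(40 - 45)^2}{45} + \frac{(60 - 55)^2}{55} + \frac{(50 - 45)^2}{45} + \frac{(50 - 55)^2}{55}$$
Computing each term:
$$\chi^2 = 0.556 + 0.455 + 0.556 + 0.455 \approx 2.02$$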
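The same hand calculation can be sketched in R. The block below builds the per-cell contributions and sums them, then cross-checks the total against base R's `chisq.test()` with the continuity correction turned off. The variable names are illustrative.

```r
# Observed counts from the voting example
observed <- matrix(c(40, 60,
                     50, 50),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Per-cell contribution (O - E)^2 / E, then the chi-square statistic
contributions <- (observed - expected)^2 / expected
chi_sq <- sum(contributions)

# Cross-check with the built-in test (no Yates correction)
chi_sq_builtin <- unname(chisq.test(observed, correct = FALSE)$statistic)

round(contributions, 3)
chi_sq
```

For a 2x2 table, `chisq.test()` applies Yates' continuity correction by default, which is why `correct = FALSE` is needed to reproduce the uncorrected value of about 2.02.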
21.4 Determining Statistical Significance
Once we calculate the chi-square test statistic, we compare it to the critical value from the chi-square distribution table, or we compute a p-value.
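The degrees of freedom (df) for a chi-square test are calculated as:
$$df = (r - 1)(c - 1)$$
where $r$ is the number of rows and $c$ is the number of columns. For our example:
$$df = (2 - 1)(2 - 1) = 1$$
Using a chi-square table or statistical software, we determine the critical value for our chosen significance level (e.g., $\alpha = .05$). With $df = 1$ and $\alpha = .05$, the critical value is 3.841. Our test statistic ($\chi^2 = 2.02$) does not exceed it, so we fail to reject the null hypothesis of independence.
You can find critical chi-square tables online. Additionally, there are websites that can calculate an exact p-value for a given $\chi^2$ value and degrees of freedom.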
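Both the critical value and the exact p-value can be obtained from base R instead of a printed table. This is a small sketch using `qchisq()` and `pchisq()`; the test statistic is typed in from the voting example and the variable names are illustrative.

```r
chi_sq <- 2.020202           # test statistic from the voting example
df <- (2 - 1) * (2 - 1)      # (rows - 1) * (columns - 1)
alpha <- 0.05

# Critical value: the point with alpha of the distribution to its right
critical_value <- qchisq(1 - alpha, df = df)

# p-value: area to the right of the observed statistic
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

critical_value
p_value
chi_sq > critical_value      # FALSE here, so we fail to reject the null
```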
21.5 Effect Size
It’s important to assess the strength of the association between the variables. One common measure of effect size for chi-square tests is Cramér’s V. Cramér’s V provides a standardized measure of association and is calculated as:
$$V = \sqrt{\frac{\chi^2}{N \times \min(r - 1,\, c - 1)}}$$
Where:
- $\chi^2$ is the chi-square statistic,
- $N$ is the total sample size,
- $r$ is the number of rows in the contingency table,
- $c$ is the number of columns in the contingency table.

For example, for our 2x2 table, the effect size can be computed as follows:
$$V = \sqrt{\frac{2.0202}{200 \times \min(2 - 1,\, 2 - 1)}} = \sqrt{\frac{2.0202}{200}} \approx 0.101$$
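Interpretation of Cramér’s V (rules of thumb following Cohen's conventions, which apply when $\min(r - 1, c - 1) = 1$):
- Small effect: $V \approx 0.10$
- Medium effect: $V \approx 0.30$
- Large effect: $V \approx 0.50$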
In this case, the effect size of 0.101 suggests a small association between the variables.
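As a rough illustration, Cramér's V can be wrapped in a small R helper. The function name `cramers_v` is hypothetical (it is not from a package); it relies only on base R's `chisq.test()` for the uncorrected chi-square statistic.

```r
# Hypothetical helper: Cramér's V for a two-way table of counts
cramers_v <- function(tab) {
  chi_sq <- unname(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)                            # total sample size
  k <- min(nrow(tab) - 1, ncol(tab) - 1)   # min(r - 1, c - 1)
  sqrt(chi_sq / (n * k))
}

observed <- matrix(c(40, 60,
                     50, 50),
                   nrow = 2, byrow = TRUE)

v <- cramers_v(observed)
v
```

For the voting example this gives roughly 0.1005, which rounds to the 0.101 reported above.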
21.6 Post-hoc Analyses: Residuals
Residuals in a chi-square test help us understand the magnitude of discrepancies between observed and expected frequencies. They are calculated as:
$$\text{Standardized residual}_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}$$
The residuals give us an indication of how much each observed frequency deviates from its expected frequency in terms of standard deviations. For each cell, a large residual indicates a large difference between observed and expected frequencies, which could be important for identifying patterns in the data.
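For our example:
- For Male/Candidate A: $\frac{40 - 45}{\sqrt{45}} = -0.745$
- For Male/Candidate B: $\frac{60 - 55}{\sqrt{55}} = 0.674$
- For Female/Candidate A: $\frac{50 - 45}{\sqrt{45}} = 0.745$
- For Female/Candidate B: $\frac{50 - 55}{\sqrt{55}} = -0.674$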
These residuals can help us determine which specific categories contribute to the overall chi-square statistic.
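The residuals for all four cells can be computed together in R. This sketch applies the residual formula directly and compares the result with the `residuals` element returned by base R's `chisq.test()`; variable names are illustrative.

```r
observed <- matrix(c(40, 60,
                     50, 50),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Gender = c("Male", "Female"),
                                   Candidate = c("A", "B")))

expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Residual for each cell: (O - E) / sqrt(E)
std_resid <- (observed - expected) / sqrt(expected)

# The same quantity as returned by chisq.test()
builtin_resid <- chisq.test(observed, correct = FALSE)$residuals

round(std_resid, 3)
```

One caution: `chisq.test()` also returns an element called `stdres`, which divides by a different (adjusted) standard error and so gives different values from the ones shown in this chapter.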
Our formal test would result in $\chi^2(1) = 2.02$, $p = .155$ (with Yates' continuity correction: $\chi^2(1) = 1.64$, $p = .201$), and for Cramér’s V:
Two-sided 95% chi-squared confidence interval for the population
Cramer's V
Sample estimate: 0.1005038
Confidence interval:
2.5% 97.5%
0.0000000 0.2493303
Let’s now explore a full example relevant to the study of psychology.
21.7 Predominantly effective?: Another Example
You want to investigate whether teenagers with different ADHD subtypes will prefer various forms of treatment. You have reason to believe, based on a review of the literature, that individuals may prefer psychosocial treatments as opposed to medication treatments; however, results are mixed (e.g., Schatz et al., 2015). You decide to formally investigate the topic.
21.8 Step 1. Generate Hypotheses
The main null and alternative hypotheses for this chi-square test can be stated as follows:
- Null Hypothesis ($H_0$):
  - Therapy preference is independent of ADHD subtype.
  - In other words, there is no relationship between ADHD subtype and therapy preference.
- Alternative Hypothesis ($H_1$):
  - Therapy preference is dependent on ADHD subtype.
  - That is, different ADHD subtypes are associated with different therapy preferences.
Any post-hoc analyses will use standardized residuals.
21.9 Step 2. Designing the Study
You and your team plan a research study. The method follows:
Participants:
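A power analysis using an effect size of $w = 0.2$, $\alpha = .05$, and power of .80 was conducted to determine the required sample size for a 3x3 table ($df = 4$).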
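Power analysis can be completed in R. The `pwr.chisq.test()` function from the `pwr` package is a sound method. It does, however, require Cohen’s $w$, which is calculated as:
$$w = \sqrt{\sum_{k=1}^{m} \frac{(p_{1k} - p_{0k})^2}{p_{0k}}}$$
Where:
- $p_{0k}$ is the proportion in cell $k$ under the null hypothesis,
- $p_{1k}$ is the proportion in cell $k$ under the alternative hypothesis,
- $m$ is the number of cells.

A major difference between this and the typical analyses we have been doing is that these are proportions, not frequencies.
This may seem taxing, particularly because you don’t have proportions. Well, we can approximate $w$ from a chi-square statistic and its sample size (for example, values reported in prior research) using:
$$w = \sqrt{\frac{\chi^2}{N}}$$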
We can then use R to compute our power analysis:
Chi squared power calculation
w = 0.2
N = 298.3821
df = 4
sig.level = 0.05
power = 0.8
NOTE: N is the number of observations
This suggests a sample of at least 299 participants ($N = 298.38$, rounded up).
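The required sample size can also be cross-checked without the `pwr` package. The sketch below is a base R approximation of the same idea, assuming the noncentral chi-square formulation in which the noncentrality parameter is $N \times w^2$; the helper name `power_at` is hypothetical.

```r
w <- 0.2              # Cohen's w (assumed effect size)
df <- 4               # (3 - 1) * (3 - 1) for a 3x3 table
alpha <- 0.05
target_power <- 0.80

# Critical value of the central chi-square distribution
crit <- qchisq(alpha, df = df, lower.tail = FALSE)

# Power for a given N: probability that a noncentral chi-square
# (noncentrality = N * w^2) exceeds the critical value
power_at <- function(n) {
  pchisq(crit, df = df, ncp = n * w^2, lower.tail = FALSE)
}

# Solve for the N at which power reaches the target
required_n <- uniroot(function(n) power_at(n) - target_power,
                      interval = c(10, 5000), tol = 1e-8)$root

required_n
ceiling(required_n)   # round up to whole participants
```

With the `pwr` package installed, `pwr.chisq.test(w = 0.2, df = 4, sig.level = 0.05, power = 0.80)` is the call that corresponds to the output shown above.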
Participants were recruited from local ADHD support groups and clinical settings. Flyers and online advertisements were used to reach individuals diagnosed with ADHD. Eligible participants were required to have a confirmed ADHD diagnosis of one of the three subtypes: Predominantly Inattentive (PI), Predominantly Hyperactive-Impulsive (PHI), or Combined Type (CT). A total of 350 participants were surveyed.
Materials:
A structured questionnaire was used to collect self-reported therapy preferences. Participants selected their preferred treatment from three options: Cognitive Behavioral Therapy (CBT), Behavioral Therapy, or Medication.
Procedure:
Participants completed an online survey that collected demographic information, ADHD subtype (based on a clinical diagnosis), and their preferred therapy type. Informed consent was obtained before participation. The ethics review board at Grenfell Campus reviewed and approved the study.
21.10 Step 3. Conducting the Study
The study was completed as described, and a total of 350 participants provided data. The responses were summarized in the following contingency table:
ADHD Subtype | CBT | Behavioral Therapy | Medication | Total |
---|---|---|---|---|
PI | 50 | 30 | 20 | 100 |
PHI | 30 | 50 | 70 | 150 |
CT | 20 | 40 | 40 | 100 |
Total | 100 | 120 | 130 | 350 |
21.11 Step 4. Analysing the Data
A chi-square test of independence was conducted to determine whether there was a significant relationship between ADHD subtype and therapy preference. The results are as follows:
Cell Contents
|-------------------------|
| Count |
| Chi-square contribution |
| Std Residual |
|-------------------------|
Total Observations in Table: 350
|
| CBT | BT | Med | Row Total |
-------------|-----------|-----------|-----------|-----------|
PI | 50 | 30 | 20 | 100 |
| 16.071 | 0.536 | 7.912 | |
| 4.009 | -0.732 | -2.813 | |
-------------|-----------|-----------|-----------|-----------|
PHI | 30 | 50 | 70 | 150 |
| 3.857 | 0.040 | 3.663 | |
| -1.964 | -0.199 | 1.914 | |
-------------|-----------|-----------|-----------|-----------|
CT | 20 | 40 | 40 | 100 |
| 2.571 | 0.952 | 0.220 | |
| -1.604 | 0.976 | 0.469 | |
-------------|-----------|-----------|-----------|-----------|
Column Total | 100 | 120 | 130 | 350 |
-------------|-----------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 35.82265 d.f. = 4 p = 3.147259e-07
Minimum expected frequency: 28.57143
Two-sided 95% chi-squared confidence interval for the population
Cramer's V
Sample estimate: 0.2262194
Confidence interval:
2.5% 97.5%
0.1592337 0.3015782
Our overall chi-square test was statistically significant, indicating that the observed frequencies would be unlikely if ADHD subtype and therapy preference were independent. We can further explore which cells seem to be driving our results by inspecting the standardized residuals. In our results, two cells seem particularly influential: individuals with predominantly inattentive type (PI) prefer CBT much more than expected (standardized residual = 4.009) and prefer medication much less than expected (standardized residual = -2.813).
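For readers who want to reproduce these numbers, the following is a minimal base R sketch of the analysis. It is not necessarily the code used to generate the output above (that output appears to come from a cross-tabulation package); the object names `adhd` and `test` are illustrative.

```r
# Observed counts: ADHD subtype (rows) by preferred therapy (columns)
adhd <- matrix(c(50, 30, 20,
                 30, 50, 70,
                 20, 40, 40),
               nrow = 3, byrow = TRUE,
               dimnames = list(Subtype = c("PI", "PHI", "CT"),
                               Therapy = c("CBT", "BT", "Med")))

test <- chisq.test(adhd)

test$statistic              # chi-square statistic
test$parameter              # degrees of freedom
test$p.value                # p-value
round(test$expected, 3)     # expected counts under independence
round(test$residuals, 3)    # (O - E) / sqrt(E) for each cell

# Cramér's V: sqrt(chi-square / (N * min(r - 1, c - 1)))
v <- sqrt(unname(test$statistic) / (sum(adhd) * min(dim(adhd) - 1)))
v
```

The largest residuals in absolute value sit in the PI row (CBT and Med), which is what the interpretation above relies on.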
21.12 Step 5: Write up your results
A chi-square test of independence was conducted to examine the relationship between ADHD subtype (PI, PHI, CT) and therapy type (CBT, Behavioral Therapy, Medication). The results of the chi-square test were statistically significant, $\chi^2(4, N = 350) = 35.82$, $p < .001$, Cramér’s $V = .23$, 95% CI [.16, .30].
To further explore these results, we examined the standardized residuals for each cell. The standardized residuals indicated that individuals with predominantly inattentive type (PI) were more likely to prefer CBT (standardized residual = 4.01) and less likely to prefer medication (standardized residual = -2.81) than would be expected if subtype and preference were independent.
These findings suggest a strong preference for CBT among individuals with PI. Further research may be necessary to explore the underlying factors contributing to these preferences.
21.13 Conclusion
The chi-square test is a powerful tool for analyzing relationships between categorical variables. By comparing observed and expected frequencies, we can determine whether a meaningful association exists. While straightforward to compute, the test has key assumptions that must be met for valid results. Understanding and applying the chi-square test correctly is an essential skill for researchers working with categorical data.