The ANOVA

What it’s for:

The ANOVA, or analysis of variance, is used to compare more than two means, testing the null hypothesis that all the means are equal (e.g. Mean A = Mean B = Mean C). A significant result in an ANOVA does not tell you which means are significantly different (e.g. Is Mean A different from Mean B?). Additional pairwise tests (not discussed here) must be performed if this kind of detailed analysis is needed. The ANOVA can also be used to compare just two means, in which case it returns a result exactly equivalent to a t-test.

Assumptions/Cautions:

Use only with data which are normally distributed within each treatment group (parametric test).
Data points must be independent.
Sample sizes should be as similar as possible (Zar 1996).

How to use it:

Term    Definition
ni      The sample size within each treatment group.
N       The total sample size (sum for all treatment groups).
k       The number of treatment groups.

Calculation of the test statistic used in an ANOVA (called F) involves some fairly complicated calculations. First, note the definitions in the table above.
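
Written out with these definitions, the quantities computed in the steps below take the following form. This is a reconstruction from the verbal descriptions in the steps, not a copy of the original formula box. The symbol x_ij (the j-th measurement in treatment group i) is notation assumed here, and DFG = k - 1 and DFE = N - k are the standard one-way ANOVA degrees of freedom.

```latex
\begin{aligned}
C   &= \frac{\Bigl(\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}\Bigr)^{2}}{N} \\
SSG &= \sum_{i=1}^{k}\frac{\Bigl(\sum_{j=1}^{n_i} x_{ij}\Bigr)^{2}}{n_i} - C,
      \qquad DFG = k - 1 \\
SSE &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^{2}
       - \sum_{i=1}^{k}\frac{\Bigl(\sum_{j=1}^{n_i} x_{ij}\Bigr)^{2}}{n_i},
      \qquad DFE = N - k \\
MSG &= \frac{SSG}{DFG}, \qquad MSE = \frac{SSE}{DFE}, \qquad F = \frac{MSG}{MSE}
\end{aligned}
```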

1) Calculate SSG (the groups sum of squares) and DFG (the groups degrees of freedom) using the formula in the box at right. Keep track of the intermediate calculation before subtracting C (the correction factor) because you will use this number again later. (All formulae used are from Zar 1996).

The sums involved look complicated, but if you do them step by step, they are easier to follow and understand. For SSG, start with your first treatment group: add up all the values, square this sum, and then divide by the number of measurements (ni). Now repeat this for each of your other treatment groups, and add the results for each treatment together to get a grand total.

To calculate C, simply add up all your measurements in all treatment groups, square this number, and divide by the total number of measurements in all treatment groups (N).
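
Step 1 can be sketched in a few lines of Python. The data below are made-up values for three hypothetical treatment groups, and the variable names are only illustrative. DFG = k - 1 is the standard one-way ANOVA formula rather than something quoted from the text above.

```python
# Hypothetical measurements for three treatment groups (made-up data).
groups = {
    "A": [4, 5, 6, 5, 5],
    "B": [6, 7, 8, 7, 7],
    "C": [9, 8, 10, 9, 9],
}

# For each group: add up the values, square the sum, divide by n_i.
# The total over all groups is the intermediate value to keep for step 2.
groups_term = sum(sum(values) ** 2 / len(values) for values in groups.values())

# Correction factor C: square the sum of all measurements and divide by N.
N = sum(len(values) for values in groups.values())
grand_sum = sum(sum(values) for values in groups.values())
C = grand_sum ** 2 / N

SSG = groups_term - C    # groups sum of squares
DFG = len(groups) - 1    # groups degrees of freedom (k - 1)

print(groups_term, C, SSG, DFG)
```

With these made-up data the intermediate value is 775, C is 735, and SSG is 40 with 2 degrees of freedom.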

2) Calculate SSE (the error sum of squares) and DFE (the error degrees of freedom), using the formula in the box at right.

Once again the complicated formula is easier to handle if you do it in steps. For the first term in SSE, square all your measurements (in all treatments). Now add all of these squared values together. The second term in the equation (after the minus sign) is the same as the first part of the calculation of SSG, so you should already have this number ready at hand.
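
A minimal Python sketch of step 2, again using made-up data for three hypothetical groups. The intermediate groups term from step 1 is recomputed here so the example runs on its own. DFE = N - k is the standard formula, assumed rather than quoted from the text.

```python
# Hypothetical measurements for three treatment groups (made-up data).
groups = {
    "A": [4, 5, 6, 5, 5],
    "B": [6, 7, 8, 7, 7],
    "C": [9, 8, 10, 9, 9],
}

# First term: square every measurement in every treatment, then add them up.
all_values = [x for values in groups.values() for x in values]
sum_of_squares = sum(x ** 2 for x in all_values)

# Second term: the intermediate value from step 1 (before subtracting C).
groups_term = sum(sum(values) ** 2 / len(values) for values in groups.values())

SSE = sum_of_squares - groups_term     # error sum of squares
DFE = len(all_values) - len(groups)    # error degrees of freedom (N - k)

print(SSE, DFE)
```

For these data the sum of squared measurements is 781, so SSE is 6 with 12 degrees of freedom.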

3) Calculate MSG (groups mean square) and MSE (error mean square), using the formulae in the box to the right.

Finally, you are ready to calculate your test statistic, F, using the formula in the bottom of the box at right.
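
Steps 1 to 3 can be gathered into one small function. This is a sketch under the same assumptions as the descriptions above: the function name `f_statistic` and the data are hypothetical, and the degrees of freedom use the standard k - 1 and N - k.

```python
def f_statistic(groups):
    """Return (F, DFG, DFE) for a list of treatment groups (lists of numbers)."""
    all_values = [x for values in groups for x in values]
    N = len(all_values)
    k = len(groups)

    # Intermediate groups term and correction factor C (step 1).
    groups_term = sum(sum(values) ** 2 / len(values) for values in groups)
    C = sum(all_values) ** 2 / N

    SSG = groups_term - C                                  # step 1
    SSE = sum(x ** 2 for x in all_values) - groups_term    # step 2
    DFG = k - 1
    DFE = N - k

    MSG = SSG / DFG    # groups mean square (step 3)
    MSE = SSE / DFE    # error mean square (step 3)
    return MSG / MSE, DFG, DFE


# Made-up data for three hypothetical treatment groups.
data = [[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [9, 8, 10, 9, 9]]
F, DFG, DFE = f_statistic(data)
print(F, DFG, DFE)
```

Here MSG is 40 / 2 = 20 and MSE is 6 / 12 = 0.5, giving F = 40 on 2 and 12 degrees of freedom.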

4) Now you must use your F statistic and the two degrees of freedom you have calculated to estimate a p-value. Your groups degrees of freedom (DFG) is your numerator degrees of freedom, and your error degrees of freedom (DFE) is your denominator degrees of freedom. You can estimate your p-value using a computer program or a table of critical values. Whichever method you choose, keep in mind the ANOVA uses a one-tailed probability, not the two-tailed probabilities used by the F test.
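
If no statistics package or table is at hand, the one-tailed p-value can be sketched in plain Python. The example below is one possible approach, not the method used in the original text: it evaluates the upper tail of the F distribution through the regularized incomplete beta function, using the well-known continued-fraction expansion. The function names are hypothetical. In practice a statistics library would normally do this for you.

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=3e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction.
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for coeff in (even, odd):
            d = 1.0 + coeff * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coeff / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_p_value(F, dfg, dfe):
    """One-tailed (upper tail) probability of an F at least this large."""
    if F <= 0:
        return 1.0
    x = dfe / (dfe + dfg * F)
    return _reg_inc_beta(dfe / 2.0, dfg / 2.0, x)


# Hypothetical example: F = 40 with 2 numerator and 12 denominator df.
p = f_p_value(40.0, 2, 12)
print(p)
```

The result is the upper-tail (one-tailed) probability, which is the one the ANOVA uses. With 2 numerator degrees of freedom the tail probability also has a simple closed form, (1 + 2F/DFE) raised to the power -DFE/2, which is a convenient check on the sketch.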

5) Draw a conclusion based on the p-value from step 4. See also Types of Error.

MS Excel Tips:

Spreadsheets such as MS Excel make it much easier to calculate sums of values and sums of squared values (these are the built-in Excel functions SUM and SUMSQ, respectively). You can also carry out a full ANOVA using the Data Analysis Tools (found under TOOLS, DATA ANALYSIS) by choosing ANOVA: Single Factor. These tools are included in the version of Excel installed in student computer labs at The University of Lethbridge, but are not included in the default installation of Excel; a custom installation is required. Use of the ANOVA in Excel is fairly straightforward. Numerator degrees of freedom is reported as “Between groups df” and denominator degrees of freedom is reported as “Within groups df”. Details of sums of squares and mean squares are also reported, but these are not normally included in a research paper.
