ANOVA stands for ‘analysis of variance’. One-way ANOVA tests whether there is evidence that the means of several populations are not all equal. It does so by comparing the variability between the groups to the variability within the groups.
One-way ANOVA visualized
Below is an example of how a one-way ANOVA can be visualized. Both sets of groups, A, B, C and D, E, F, have the means 6, 5 and 7. But the variance within each group is greater for groups D, E and F than for groups A, B and C.
Say we have conducted a one-way ANOVA test for the blue groups (A, B, C) and it returns a ‘very low’ p-value.
We would then say that there is strong evidence against our null hypothesis that the means for groups A, B and C are all equal. That is, there would be ‘very strong’ evidence that these three samples come from populations that do not all have the same mean.
The opposite goes for groups D, E and F. Their sample means are spread just as far apart as those of A, B and C, but their ANOVA test shows no evidence against the null hypothesis that the means are equal. This is due to the greater variability within these groups: the differences between the sample means are small relative to the scatter inside each group. Note that this does not prove that the population means are equal, only that the data give no evidence that they differ.
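The effect of within-group spread can be sketched in a few lines of Python. The numbers below are hypothetical (they are not the data behind the figure); they are only chosen so that both sets of groups have the means 6, 5 and 7 while the second set is far more spread out. The helper names `mean` and `f_statistic` are illustrative as well.

```python
# Hypothetical data: two sets of three groups with the same means (6, 5, 7)
# but very different spread inside each group.
tight = {"A": [5.8, 6.0, 6.2], "B": [4.8, 5.0, 5.2], "C": [6.8, 7.0, 7.2]}
wide = {"D": [2.0, 6.0, 10.0], "E": [1.0, 5.0, 9.0], "F": [3.0, 7.0, 11.0]}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def f_statistic(groups):
    """Return the one-way ANOVA F-statistic for a dict of samples."""
    samples = list(groups.values())
    k = len(samples)                      # number of groups
    n = sum(len(s) for s in samples)      # total number of observations
    grand_mean = mean([x for s in samples for x in s])
    # Variability of the group means around the grand mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variability of the observations around their own group mean
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_tight = f_statistic(tight)
f_wide = f_statistic(wide)
print(f"F for A, B, C (small spread): {f_tight:.2f}")
print(f"F for D, E, F (large spread): {f_wide:.2f}")
```

The between-groups part is identical for both sets, so the whole difference in the F-statistic comes from the variability within the groups.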
The one-way ANOVA is closely related to the pooled-variance t-test for the difference between two means; with only two groups the two tests are equivalent. And so, the assumptions of the one-way ANOVA are identical to those of the pooled-variance two-sample t-test:
- The samples are independent simple random samples
- The populations are normally distributed
- The population variances are equal
One-way ANOVA table and notations
The one-way ANOVA test often displays the results in an ANOVA table like the one below. In this table I have inserted the different terms that are being used. For example, for the upper source these three terms can be seen: ‘Treatment’, ‘Groups’ and ‘Between’.
The figure shows some of the different names and labels that are used for the calculations. Here is an overview of the different names that I have been able to find:
Different notations in the ANOVA table: the number of groups is written as either m or k, and the sample size as n.
Formulas and more notations
Let’s work out the values of the table above. First, some notation:
- k represents the number of groups
- Xij represents the jth observation in the ith group
- X̄i represents the mean of the ith group
- X̄ represents the overall mean, also called the grand mean
- si represents the standard deviation of the ith group
- ni represents the number of observations in the ith group
- n = n1 + n2 + … + nk represents the total number of observations
To get the total sum of squares, SS(Total), we subtract the overall mean (X̄) from each observation (Xij), square this difference and sum up all these squared differences.
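Written out in the notation listed above, the total sum of squares can be sketched as follows (the subscript label "Total" is just a typesetting choice):

```latex
\[
SS_{\text{Total}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X} \right)^2
\]
```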
SS(Total) is the sum of two components: the sum of squares within groups, SS(Within), and the sum of squares between groups, SS(Between).
So, we get that: SS(Total) = SS(Within) + SS(Between)
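In the same notation, the two components can be written as below. The second form of the within-groups sum assumes that si is the usual sample standard deviation, computed with ni − 1 in the denominator.

```latex
\[
SS_{\text{Within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X}_i \right)^2
                   = \sum_{i=1}^{k} (n_i - 1)\, s_i^2
\]
\[
SS_{\text{Between}} = \sum_{i=1}^{k} n_i \left( \bar{X}_i - \bar{X} \right)^2
\]
```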
The mean square is the sum of squares divided by its degrees of freedom: k − 1 for the between-groups part and n − k for the within-groups part.
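As a sketch, using MST for the mean square between groups ("treatment") and MSE for the mean square within groups ("error"), the two mean squares are:

```latex
\[
MST = \frac{SS_{\text{Between}}}{k - 1},
\qquad
MSE = \frac{SS_{\text{Within}}}{n - k}
\]
```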
An F-distributed variable is the ratio of two independent chi-square variables, each divided by its degrees of freedom. In the case of one-way ANOVA, the F-statistic compares the variability between groups to the variability within groups.
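With MST denoting the mean square between groups and MSE the mean square within groups, the test statistic can be written as below. The distributional statement holds under the null hypothesis, provided the assumptions listed earlier are met.

```latex
\[
F = \frac{MST}{MSE}
  = \frac{SS_{\text{Between}} / (k - 1)}{SS_{\text{Within}} / (n - k)}
  \;\sim\; F_{k-1,\; n-k} \quad \text{under } H_0
\]
```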
So, if H0 is false, the mean square between groups (MST, for ‘treatment’) will tend to be greater than the mean square within groups (MSE, for ‘error’), and thus the F-statistic will tend to be larger.
The F-distribution is built into statistical software. In case you wish to look up critical F-values in an F-table instead, note that there is a different F-table for each significance level (α).
One-way ANOVA worked example
The following example is from the Khan Academy video ‘ANOVA 1: Calculating SST’. Say we run the three treatments A, B and C. We wish to test if all three means are the same.
The means of the three samples we have taken are 2, 4 and 6. Is there sufficient evidence to reject the hypothesis that all three population means are equal? That’s what this ANOVA test can answer.
Let’s start by calculating the grand mean, the sum of squares within and the sum of squares between.
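These first steps can be sketched in Python. The observations below are an assumption: they are illustrative values that reproduce the sample means 2, 4 and 6 of this example, so treat them as a stand-in for the data table rather than a quotation of it. The variable names are illustrative as well.

```python
# Assumed observations for treatments A, B and C (illustrative values that
# reproduce the sample means 2, 4 and 6 used in this example).
groups = {"A": [3, 2, 1], "B": [5, 3, 4], "C": [5, 6, 7]}

all_values = [x for sample in groups.values() for x in sample]
grand_mean = sum(all_values) / len(all_values)
group_means = {name: sum(s) / len(s) for name, s in groups.items()}

# Squared distances of each observation from its own group mean
ss_within = sum(
    (x - group_means[name]) ** 2 for name, s in groups.items() for x in s
)
# Squared distances of each group mean from the grand mean, weighted by group size
ss_between = sum(
    len(s) * (group_means[name] - grand_mean) ** 2 for name, s in groups.items()
)
# Squared distances of each observation from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

print("Group means:", group_means)
print("Grand mean:", grand_mean)
print("SS within:", ss_within)
print("SS between:", ss_between)
print("SS total:", ss_total)
```

Computing the total sum of squares separately is a handy check, since it should equal the sum of the within and between parts.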
Then we calculate the F-statistic:
The ANOVA table for this calculation gives:
As shown, the critical F-value at a significance level (α) of 0.05 is 5.143, so our F-statistic of 12, with a p-value of 0.008, shows strong evidence against the null hypothesis that all means are equal.
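The figures quoted above can be checked with a short Python sketch. It assumes a sum of squares between of 24 and a sum of squares within of 6, with 3 groups and 9 observations; these inputs are assumptions chosen to be consistent with the F-statistic of 12. The p-value and critical value use a closed-form expression for the F-distribution that is valid only when the numerator has exactly 2 degrees of freedom, as here. For any other design, statistical software would be the usual route.

```python
# Assumed inputs, consistent with the F-statistic of 12 quoted in the text.
ss_between = 24.0
ss_within = 6.0
k = 3   # number of groups
n = 9   # total number of observations
alpha = 0.05

df_between = k - 1
df_within = n - k
ms_between = ss_between / df_between   # MST
ms_within = ss_within / df_within      # MSE
f_stat = ms_between / ms_within

# Closed form for the upper tail of F(2, d2): P(F > f) = (1 + 2f/d2)^(-d2/2).
# This shortcut only holds when the numerator degrees of freedom equal 2.
assert df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

# Inverting the same expression gives the critical value at level alpha.
f_critical = (df_within / 2) * (alpha ** (-2 / df_within) - 1)

print(f"{'Source':<10}{'SS':>6}{'df':>4}{'MS':>6}{'F':>6}")
print(f"{'Between':<10}{ss_between:>6.1f}{df_between:>4}{ms_between:>6.1f}{f_stat:>6.1f}")
print(f"{'Within':<10}{ss_within:>6.1f}{df_within:>4}{ms_within:>6.1f}")
print(f"{'Total':<10}{ss_between + ss_within:>6.1f}{n - 1:>4}")
print(f"p-value: {p_value:.3f}, critical F at alpha={alpha}: {f_critical:.3f}")
```

Since the F-statistic lies above the critical value, the p-value lies below α, which is the same conclusion reached from two directions.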
Box plot of ANOVA test conclusion
Say that we have run a one-way ANOVA test for our box plot example above. We could display the boxplot like this:
Caution when interpreting
That we reject the null hypothesis only means that we have evidence that not ALL means are equal. Some of them may still be equal to each other; we don’t know at this point. To find out which means differ, we would conduct a multiple comparison test after the ANOVA.
One-way ANOVA in MS Excel
One-way ANOVA can be carried out in MS Excel via Data >> Data Analysis >> ANOVA Single Factor:
One-way ANOVA in R statistical programming
On its way…
- Khan Academy:
- Video (7:38): ANOVA 1: Calculating SST (total sum of squares)
- Video (13:19): ANOVA 2: Calculating SSW and SSB (total sum of squares within and between)
- Video (10:14): ANOVA 3: Hypothesis test with F-statistic