# One-way ANOVA

ANOVA stands for ‘analysis of variance’. One-way ANOVA **tests the null hypothesis that the means of several populations are all equal**. To do so, the variability *between* the groups is compared to the variability *within* the groups.


## One-way ANOVA visualized

Below is an example of how a one-way ANOVA can be visualized. Both sets of groups, A, B, C and D, E, F, have the means 6, 5 and 7. But the variability within each group is greater for groups D, E and F than for groups A, B and C.

Say we have conducted a one-way ANOVA test for the blue groups (A, B, C) and it returns a ‘very low’ p-value:

We would then say that there is strong evidence against our null hypothesis that the means for groups A, B and C are all equal. **There would be ‘very strong’ evidence that these three samples come from populations that do not all have the same population mean.**

**The opposite goes for the groups D, E, F.** Their sample means are spread out exactly like those of A, B and C, but their ANOVA test shows **no evidence** against the null hypothesis that the means are equal. So, **although these three groups have the same spread in means as A, B and C, we cannot conclude that their population means differ**. This is **due to the greater variability** within these groups compared to A, B and C.
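The effect can be sketched numerically. The data below are hypothetical (they are not the values behind the figure); both sets have group means 6, 5 and 7, and only the spread within the groups differs. The helper `one_way_f` is an illustrative name and uses the F-statistic described further down this page.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F-statistic for a list of samples."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data: both sets have group means 6, 5 and 7.
tight = [[5.8, 6.0, 6.2], [4.8, 5.0, 5.2], [6.8, 7.0, 7.2]]  # small spread
wide = [[2, 6, 10], [1, 5, 9], [3, 7, 11]]                   # large spread

f_tight = one_way_f(tight)
f_wide = one_way_f(wide)
print(f"F with small within-group spread: {f_tight:.2f}")
print(f"F with large within-group spread: {f_wide:.4f}")
```

With identical group means, the F-statistic is large when the within-group spread is small and drops sharply when the within-group spread is large.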

## Assumptions

The one-way ANOVA is closely related to the pooled-variance t-test for the difference between two means; it extends that test to more than two groups. Accordingly, the assumptions of the one-way ANOVA are identical to those of the pooled-variance two-sample t-test:

- The samples are independent simple random samples
- The populations are normally distributed
- The population variances are equal

## One-way ANOVA table and notations

The results of a one-way ANOVA test are often displayed in an ANOVA table like the one below. In this table I have inserted the different terms that are in use. For example, the source in the top row goes by three names: ‘Treatment’, ‘Groups’ and ‘Between’.

The figure shows some of the different names and labels that are used for the calculations. Here is an overview of the different notations I have been able to find for the ANOVA table: the **number of groups** is written as *m* or *k*, and the sample sizes as *n*.


## Formulas and more notations

Let’s work out the values of the table above:

Let:

- $k$ represent the number of groups
- $X_{ij}$ represent the $j$-th observation in the $i$-th group
- $\bar{X}_i$ represent the mean of the $i$-th group
- $\bar{X}$ represent the overall mean, also called the grand mean
- $s_i$ represent the standard deviation of the $i$-th group
- $n_i$ represent the number of observations in the $i$-th group
- $n = n_1 + n_2 + \dots + n_k$ represent the total number of observations

To get the total sum of squares, SS(Total), we subtract the overall mean ($\bar{X}$) from each observation ($X_{ij}$), square each difference and sum up all these squared differences:
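Written in the notation defined above, the total sum of squares is (reconstructed from the verbal description, so treat it as a sketch):

$$
SS_{\text{Total}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X} \right)^2
$$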

SS(Total) is the sum of two components, the sum of squares within groups ($SS_{\text{Within}}$) and the sum of squares between groups ($SS_{\text{Between}}$), for which the formulas are:
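In the same notation, the two components can be written as follows (the standard textbook forms, given here as a sketch; the second expression for the within-group sum of squares assumes $s_i$ is the sample standard deviation with $n_i - 1$ in the denominator):

$$
SS_{\text{Within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X}_i \right)^2 = \sum_{i=1}^{k} (n_i - 1)\, s_i^2
$$

$$
SS_{\text{Between}} = \sum_{i=1}^{k} n_i \left( \bar{X}_i - \bar{X} \right)^2
$$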

So, we get that: $SS_{\text{Total}} = SS_{\text{Within}} + SS_{\text{Between}}$
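This decomposition can be checked numerically. The sketch below uses made-up samples with unequal group sizes (hypothetical data, chosen only for illustration) and computes the three sums of squares directly from their definitions.

```python
# Hypothetical samples with unequal group sizes
groups = [[4.0, 5.0, 6.0, 7.0], [1.0, 3.0], [8.0, 9.0, 10.0]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Squared distance of every observation from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

# Squared distance of every observation from its own group mean
ss_within = sum(
    (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
)

# Squared distance of each group mean from the grand mean, weighted by group size
ss_between = sum(
    len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
)

print(f"SS(Total)            = {ss_total:.4f}")
print(f"SS(Within) + SS(Between) = {ss_within + ss_between:.4f}")
```

Both printed values agree up to floating-point rounding.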

A mean square is a sum of squares divided by its degrees of freedom:
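With $k$ groups and $n$ observations in total, the between-groups sum of squares has $k - 1$ degrees of freedom and the within-groups sum of squares has $n - k$. As a sketch in the notation above:

$$
MS_{\text{Between}} = \frac{SS_{\text{Between}}}{k - 1}, \qquad MS_{\text{Within}} = \frac{SS_{\text{Within}}}{n - k}
$$

$MS_{\text{Between}}$ is also called the mean square for treatments (MST), and $MS_{\text{Within}}$ the mean square error (MSE).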

**F-statistic**

The F distribution is the distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom. In the case of one-way ANOVA, the **F-statistic compares the variability between groups to the variability within groups**:
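As a sketch in the notation used on this page, the test statistic is the ratio of the two mean squares:

$$
F = \frac{MS_{\text{Between}}}{MS_{\text{Within}}} = \frac{MST}{MSE}
$$

Under the null hypothesis and the assumptions listed earlier, this statistic follows an F distribution with $k - 1$ and $n - k$ degrees of freedom.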

So, if $H_0$ is false, MST (the mean square for treatments, i.e. between groups) will tend to be greater than MSE (the mean square error, i.e. within groups), and thus the F-statistic will tend to be larger.

F-tables are built into statistical software. In case you wish to look up the values in a printed F-table, note that there is a separate table for each significance level (α).

## One-way ANOVA worked example

The following example is from the Khan Academy video: ANOVA 1: Calculating SST. Say we run the three treatments A, B and C. We wish to test if all three means are the same:

As we see, the means for the three samples we have taken are 2, 4 and 6. Now we ask the question: *Is there sufficient evidence to reject the hypothesis that all three means are equal?*

Let’s conduct the hypothesis test for one-way ANOVA to find the answer.
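Stated formally, the hypotheses for this test can be written as follows (the symbols $\mu_A$, $\mu_B$ and $\mu_C$ for the three population means are assumed here for illustration):

$$
H_0: \mu_A = \mu_B = \mu_C \qquad \text{versus} \qquad H_a: \text{not all of } \mu_A, \mu_B, \mu_C \text{ are equal}
$$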

Let’s start by calculating the **grand mean**, **Sum of Squares Within** and **Sum of Squares Between**
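These calculations can be sketched in a few lines of Python. The data table is not reproduced in the text here, so the values below are an assumption: they are taken to be the nine observations used in the referenced video, and they do reproduce the sample means of 2, 4 and 6.

```python
# Assumed data (three observations per treatment), not shown in the text above
treatments = {"A": [3, 2, 1], "B": [5, 3, 4], "C": [5, 6, 7]}

all_obs = [x for obs in treatments.values() for x in obs]
grand_mean = sum(all_obs) / len(all_obs)

# Mean of each treatment group
group_means = {name: sum(obs) / len(obs) for name, obs in treatments.items()}

# Sum of squares within: each observation against its own group mean
ss_within = sum(
    (x - group_means[name]) ** 2
    for name, obs in treatments.items()
    for x in obs
)

# Sum of squares between: each group mean against the grand mean,
# weighted by the number of observations in the group
ss_between = sum(
    len(obs) * (group_means[name] - grand_mean) ** 2
    for name, obs in treatments.items()
)

print("Group means:", group_means)
print("Grand mean: ", grand_mean)
print("SS within:  ", ss_within)
print("SS between: ", ss_between)
```

For these assumed values the grand mean is 4, the sum of squares within is 6 and the sum of squares between is 24.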

Then we calculate the **F-statistic**:
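As a sketch of the arithmetic, assume the sums of squares for this example are $SS_{\text{Between}} = 24$ and $SS_{\text{Within}} = 6$ (values consistent with the F-statistic of 12 reported on this page). With $k = 3$ groups and $n = 9$ observations, the degrees of freedom are $k - 1 = 2$ and $n - k = 6$:

$$
F = \frac{SS_{\text{Between}} / (k - 1)}{SS_{\text{Within}} / (n - k)} = \frac{24 / 2}{6 / 6} = \frac{12}{1} = 12
$$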

The ANOVA table for this calculation gives:

As shown, the critical F-value at a significance level (α) of 0.05 is **5.143**. Our F-statistic of **12** exceeds it, and the p-value of 0.008 is well below 0.05, so there is **strong evidence against the null hypothesis** that all means are equal.
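Both numbers can be reproduced without an F-table. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom (three groups), the upper tail probability of the F distribution has the closed form $(1 + 2f/d_2)^{-d_2/2}$, where $d_2$ is the denominator degrees of freedom. The function names are made up for this illustration, and for any other number of groups statistical software should be used instead.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Only valid when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Value f with P(F > f) = alpha, for 2 and df2 degrees of freedom."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

# Worked example: F = 12 with 2 and 6 degrees of freedom
p_value = f_upper_tail_df1_2(12, 6)
critical = f_critical_df1_2(0.05, 6)

print(f"p-value:    {p_value:.3f}")
print(f"critical F: {critical:.3f}")
```

This gives a p-value of 0.008 and a critical value of about 5.143, matching the figures quoted above.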

## Box plot of ANOVA test conclusion

Say that we have run a one-way ANOVA test for our box plot example above. We could display the boxplot like this:

## Caution when interpreting

Rejecting the null hypothesis only means that we have evidence that not *ALL* means are equal. So, still **some of them can be equal**. **We don’t know at this point.** To find out which means differ, we would conduct a multiple comparison test after the ANOVA.

## One-way ANOVA in MS Excel

One-way ANOVA can be carried out in MS Excel via **Data >> Data Analysis >> Anova: Single Factor**:


## One-way ANOVA in R statistical programming

On its way…

## Learning statistics

- JBstatistics:
  - Video (5:43): Introduction to one-way ANOVA
  - Video (9:06): One-way ANOVA: The formulas
  - Video (5:25): A one-way ANOVA example
- Khan Academy:
  - Video (7:38): ANOVA 1: Calculating SST (total sum of squares)
  - Video (13:19): ANOVA 2: Calculating SSW and SSB (total sum of squares within and between)
  - Video (10:14): ANOVA 3: Hypothesis test with F-statistic

#### Carsten Grube

Freelance Data Analyst
