Statistical power, also called power of test, is the probability of rejecting the null hypothesis (H0) when it is false, that is, the probability of correctly rejecting H0. Statistical power can therefore only “live” in a world where H0 is false. It is a conditional probability and is tied to the probabilities of the so-called Type I and Type II Errors.
Statistical power & sample space
Statistical power, or power of test, is the probability that we reject a H0 that is, in fact, false. It is a conditional probability because it expresses the probability of one event occurring given that another event has already occurred:
P (event A|event B)
When we conduct a hypothesis test, we can either reject or fail to reject the H0, and the H0 can either be true or false. The scenario, or sample space, therefore, consists of these four possible outcomes:
As shown, the Power of Test expresses the correct decision of rejecting a false H0.
We only know that we don’t know…
We might never know about reality. We might never know if H0 is true or false. That’s important to have in mind. That’s why we are estimating and testing – because we don’t know. We intend to approximate what can be expected to be the true world.
We test our estimates to see how likely they are to approximate the true parameters of some population that we cannot measure. During an election period, for example, it is impossible to collect the opinion of every citizen in the whole country and to gather, analyze and conclude on the data before the actual election takes place. We therefore take a sample in an attempt to estimate the voters’ opinions.
We take a sample in order to estimate the parameters of an immeasurable population. So, we don’t know if the null hypothesis is true or false. We only control the decision we take about the unknown situation: whether to reject or fail to reject H0.
Type I & II Errors
A Type I Error is rejecting H0 when, in reality, it is true. A Type II Error is failing to reject H0 when, in reality, it is false.
Say that a statistician is conducting the following hypothesis test:
H0: µ = 5
HA: µ ≠ 5
Scenario A: We reject H0, so one of these two scenarios has occurred:
- H0 is false => Right decision (power of test)
- H0 is true => Wrong decision (Type I error)
Scenario B: We fail to reject H0, so one of these two scenarios has occurred:
- H0 is true => Right decision
- H0 is false => Wrong decision (Type II error)
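The two scenarios above can be checked with a small simulation. What follows is only a minimal Python sketch, not part of the original analysis: it assumes a two-sided z-test with a known standard deviation and a 5% rejection threshold, and the values sigma = 2, n = 30, the true mean of 6 used for the "H0 is false" world, and the helper name rejection_rate are all illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mu, mu0=5.0, sigma=2.0, n=30, alpha=0.05,
                   trials=5000, seed=1):
    """Share of simulated samples in which a two-sided z-test rejects H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    se = sigma / n ** 0.5                         # standard error of the sample mean
    rejections = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mu, sigma) for _ in range(n))
        if abs(sample_mean - mu0) / se > z_crit:
            rejections += 1
    return rejections / trials


# World 1: H0 is true (mu really is 5). Every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=5.0)

# World 2: H0 is false (mu is assumed to be 6). Every rejection is a right decision.
power = rejection_rate(true_mu=6.0)
type_ii_rate = 1 - power  # failing to reject a false H0

print(f"Type I error rate (H0 true):   {type_i_rate:.3f}")
print(f"Power (H0 false):              {power:.3f}")
print(f"Type II error rate (H0 false): {type_ii_rate:.3f}")
```

In a real study we only ever see one sample and never learn which of the two worlds we are in. The simulation can report both rates only because we chose the true mean ourselves.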
Example: Bob and the deputy
Say that in a criminal trial Bob claims his innocence, and with the benefit of the doubt the following hypotheses are stated:
H0: Bob did not shoot the deputy
HA: Bob did shoot the deputy
Now, Bob can be either convicted (if we reject H0), or he can get acquitted (if we don’t reject H0):
If we transpose the events, the same situation can be visualized like this:
So, the Type I & II Errors in Bob’s case are:
- Type I Error = Convicting Bob when he, in reality, did not commit the crime.
- Type II Error = Acquitting Bob when he, in reality, did commit the crime.
The Type I Error seems unbearable: sending our Bob to jail when he really is innocent. So, we are interested in reducing the probability of a Type I Error. On the density curve, this probability is the rejection area under the H0 distribution, and it equals the significance level (α), which can be denoted:
P (Type I Error) = P (Reject H0 | H0 is true) = α
The probability of a Type II Error is denoted beta (β) and is the probability of failing to reject a false H0. A Type II Error and statistical power are the two events that make up the sample space in the world where H0 is false.
Having defined α (the probability of a Type I Error) and β (the probability of a Type II Error), we can now denote the following calculation for the power of test:
Power of test = 1 – P (Type II error) = 1-β
This holds because, when H0 is in fact false, exactly one of two events occurs: we either reject H0 (power) or fail to reject it (Type II Error).
Relationship between α, β and Power
If we don’t like the α (the probability of committing a Type I Error, sending our innocent Bob to jail), why not minimize it as much as possible? Because of its relation to β. With everything else held fixed, decreasing α increases β. By decreasing the probability of Bob being falsely convicted, we increase the probability of acquitting him, which is undesirable if he is guilty (as in the case of Sheriff John Brown).
In other words, the smaller the α, the smaller the chance of rejecting H0, which increases the probability of a Type II Error.
Let’s visualize the relationship between α, β and the Power of test in a so-called power curve:
The lower the α, the greater the β, and the lower the Power.
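To put numbers on this trade-off, here is a small Python sketch. It is an illustration under assumed values rather than a reproduction of the original curve: it uses a two-sided z-test of H0: µ = 5 with known sigma = 2, n = 30 and an assumed true mean of 6, and the function name beta_two_sided is made up for this example.

```python
from statistics import NormalDist


def beta_two_sided(alpha, mu0=5.0, true_mu=6.0, sigma=2.0, n=30):
    """P(fail to reject H0 | true mean is true_mu) for a two-sided z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Distance between the true mean and the H0 value, in standard errors.
    shift = (true_mu - mu0) / (sigma / n ** 0.5)
    # We fail to reject when the z statistic lands between -z_crit and +z_crit;
    # under the true mean, that statistic is normal with mean `shift` and sd 1.
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


for alpha in (0.10, 0.05, 0.01, 0.001):
    beta = beta_two_sided(alpha)
    print(f"alpha={alpha:<6} beta={beta:.3f} power={1 - beta:.3f}")
```

Reading the printed rows from top to bottom, α shrinks while β grows and the power falls, with the sample size, the standard deviation and the true mean all held fixed.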
Visualizing alpha, beta and Power in bell curves:
How statistical power is increased
The statistical power increases when:
- Sample size (n) is increased (controlled by us)
- Significance level (α) is increased (controlled by us)
- Variance and the standard deviation decrease (not controlled by us)
- The true parameter is further away from the value stated in H0 (not controlled by us)
We control the size of the sample (n) and the significance level (α). Increasing the sample size is always a good idea, if the conditions allow for it (time, money, data accessibility, etc.). Increasing α, however, also increases the probability of committing a Type I Error. The α is therefore usually kept relatively low (typically from 0.01 to 0.05 and not above 0.1).
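The four factors listed above can be explored one at a time with a short Python sketch. It is a hedged illustration only: it assumes a two-sided z-test of H0: µ = 5 with a known standard deviation, the baseline values (n = 30, α = 0.05, sigma = 2, true mean 5.5) are invented for the example, and power_two_sided is a hypothetical helper name.

```python
from statistics import NormalDist


def power_two_sided(n, alpha=0.05, mu0=5.0, true_mu=5.5, sigma=2.0):
    """P(reject H0 | true mean is true_mu) for a two-sided z-test with known sigma."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = (true_mu - mu0) / (sigma / n ** 0.5)  # effect size in standard errors
    # Reject when the z statistic falls above +z_crit or below -z_crit.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


baseline = power_two_sided(n=30)
print(f"baseline (n=30, alpha=0.05, sigma=2, true mean 5.5): {baseline:.3f}")
print(f"larger sample (n=100):                 {power_two_sided(n=100):.3f}")
print(f"higher alpha (0.10):                   {power_two_sided(n=30, alpha=0.10):.3f}")
print(f"smaller standard deviation (sigma=1):  {power_two_sided(n=30, sigma=1.0):.3f}")
print(f"true mean further from H0 (6.0):       {power_two_sided(n=30, true_mu=6.0):.3f}")
```

Each of the four changes raises the power relative to the baseline, but only the first two are choices we can make when designing the study.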
Statistical power in Excel
The =NORM.INV function returns the critical value, and =NORM.DIST returns the beta (β); the power is then 1-β.
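For readers without Excel, the same two steps can be sketched in Python with the standard library, where NormalDist(mean, sd).inv_cdf(p) plays the role of =NORM.INV(p, mean, sd) and NormalDist(mean, sd).cdf(x) the role of =NORM.DIST(x, mean, sd, TRUE). Since the worksheet itself is not reproduced here, the setup is an assumption: an upper-tailed z-test of H0: µ = 5 against an assumed true mean of 6, with sigma = 2, n = 30 and α = 0.05 chosen purely for illustration.

```python
from statistics import NormalDist

# Illustrative inputs (assumptions, not values from the original worksheet).
mu0, true_mu = 5.0, 6.0
sigma, n, alpha = 2.0, 30, 0.05
se = sigma / n ** 0.5  # standard error of the sample mean

# Excel: =NORM.INV(1 - alpha, mu0, se)
# Sample means above this value lead us to reject H0.
critical_value = NormalDist(mu0, se).inv_cdf(1 - alpha)

# Excel: =NORM.DIST(critical_value, true_mu, se, TRUE)
# Probability of landing below the critical value when the true mean is true_mu.
beta = NormalDist(true_mu, se).cdf(critical_value)

power = 1 - beta

print(f"critical value: {critical_value:.4f}")
print(f"beta:           {beta:.4f}")
print(f"power:          {power:.4f}")
```

For a two-sided test, the critical value would use α/2 in each tail, and β would be the area between the lower and upper critical values under the true distribution.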
I find these tutorials on statistical power/power of test very useful:
- Khan Academy (video 9:44): Introduction to power in significance tests
- JBstatistics (video 8:10): Type I Errors, Type II Errors, and the Power of Test
- Statistics How To (text page): Statistical power: What it is. How to calculate it.