Confidence intervals for proportions
A confidence interval for a proportion is a range of proportions within which we have a certain degree of confidence, usually 90%, 95% or 99%, that the true population proportion lies.
When working with confidence intervals for proportions we use z-statistics, because we can calculate the standard deviation of the sampling distribution of the sample proportion: σ of p̂ = √(p(1 − p)/n). Because the true p is unknown, p̂ is used in its place in practice.
Confidence intervals vs point estimate
Say that HR surveys 18 employees out of a total of 200. A point estimate from the sample might be that 33% of the employees are unsatisfied, but HR wishes to attach some degree of uncertainty to this percentage and therefore decides to calculate a confidence interval.
For example, they can choose a 95% confidence interval, which returns a range within which they can be 95% confident that the true proportion lies.
Proportion estimate = p̂
With a confidence interval for a population proportion, we are trying to estimate a parameter of a population that is typically too large to measure. So we estimate; we don't measure. The estimate of the population proportion (p) is denoted p̂ and follows the sampling distribution of p̂. With this sample statistic we can calculate estimates for the true population proportion.
Confidence intervals for proportions: Conditions
Three conditions must be met in order to estimate a population proportion:
- The sample must be random.
- The sample size (n) must be large enough that np̂ ≥ 10 and n(1 − p̂) ≥ 10, so that the normal distribution can be applied as an approximation to the binomial distribution. In other words, there must be at least 10 expected successes and at least 10 expected failures in the sample.
- Independence condition, also called the '10% rule': if we sample without replacement, the sample size (n) must be smaller than 10% of the population (N): n < 0.1N.
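The two numeric conditions can be checked with a few lines of Python. This is only a minimal sketch: the function name `check_conditions` and its parameters are made up for illustration, and the count of 6 unsatisfied employees is an assumption (roughly 33% of the 18 surveyed). Randomness of the sample cannot be verified from the counts alone.

```python
def check_conditions(successes, n, population_size, min_count=10):
    """Check the success/failure condition and the 10% rule.

    successes equals n * p_hat, failures equals n * (1 - p_hat).
    """
    failures = n - successes
    # At least min_count successes AND at least min_count failures
    success_failure = successes >= min_count and failures >= min_count
    # Sampling without replacement: sample must be under 10% of the population
    ten_percent_rule = n < 0.1 * population_size
    return {"success_failure": success_failure, "ten_percent_rule": ten_percent_rule}


# HR example: 18 of 200 employees surveyed; 6 unsatisfied is an assumed count
hr_check = check_conditions(successes=6, n=18, population_size=200)
print(hr_check)
```

Under these assumed counts the sample passes the 10% rule (18 is less than 20) but has fewer than 10 successes, so the normal approximation would be questionable for a sample this small.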
Formula components: Confidence intervals for proportions
Similar to how we calculate the confidence interval for a population mean with known σ, the confidence interval for a proportion is the point estimate plus or minus a margin of error: p̂ ± margin of error.
Margin of error
Alpha (α) and z-score
As described in Confidence intervals, alpha (α) = 1 – the confidence level. If we want a 99% confidence interval, alpha is 0.01. Alpha is the area outside of the confidence interval.
z-score: Because we can calculate the standard deviation of the sampling distribution of the sample proportion (σ of p̂), as described above, we use the z-table.
A confidence interval has two limits, an upper and a lower, so alpha is divided by 2. Therefore, for a confidence level of 95%, alpha is 0.05, with 0.05/2 = 0.025 in the upper tail and 0.025 in the lower tail. The z-scores for these tail areas of α/2 = 0.025 are +1.96 and -1.96.
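Instead of reading a z-table, the z-score for a given confidence level can be looked up with Python's standard library. The sketch below assumes a two-sided interval, and the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist


def critical_z(confidence_level):
    """Return the positive z-score that leaves alpha/2 in the upper tail."""
    alpha = 1 - confidence_level
    # inv_cdf gives the z-value with the requested area to its left
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = +/-{critical_z(level):.3f}")
```

By symmetry of the standard normal distribution, the lower limit uses the same value with a negative sign.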
Just as we defined the confidence interval for a population mean with known σ, we now have the formula for the confidence interval of a population proportion: p̂ ± z · √(p̂(1 − p̂)/n), where z is the z-score for α/2.
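Understanding the components of the formula:
- p̂: the sample proportion (the point estimate).
- z: the z-score for the chosen confidence level, for example 1.96 for 95%.
- n: the sample size.
- √(p̂(1 − p̂)/n): the estimated standard deviation of the sampling distribution of p̂.
- z · √(p̂(1 − p̂)/n): the margin of error.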
Confidence intervals in MS Excel
Confidence intervals for proportions can be calculated in Excel. The =NORM.S.INV function calculates the z-score, for example =NORM.S.INV(0.975) for a 95% confidence level.
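For readers who prefer code to a spreadsheet, the same calculation can be sketched in Python. This is an illustration, not a reproduction of the Excel sheet: the function name and the data (60 unsatisfied out of 200 surveyed) are hypothetical, and `NormalDist().inv_cdf` from the standard library plays the role of NORM.S.INV.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Return (lower, upper) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n                            # point estimate
    alpha = 1 - confidence_level                     # area outside the interval
    z = NormalDist().inv_cdf(1 - alpha / 2)          # counterpart of NORM.S.INV
    standard_error = sqrt(p_hat * (1 - p_hat) / n)   # estimated sigma of p_hat
    margin_of_error = z * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


# Hypothetical survey: 60 of 200 respondents are unsatisfied
lower, upper = proportion_confidence_interval(successes=60, n=200)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

The conditions for the normal approximation still need to be checked separately before the interval is trusted.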
My favorite resources for learning about confidence intervals:
- Jbstatistics video: Introduction to confidence intervals
- Khan Academy video on confidence intervals for proportions: Confidence interval example
- Khan Academy video: Example constructing and interpreting a confidence interval for p
- Wolfram Mathematica: Short demo of their confidence interval simulator with link to the simulator: Confidence Intervals: Confidence Level, Sample Size, and Margin of Error