+34 616 71 29 85 carsten@dataz4s.com

Confidence intervals for proportions

A confidence interval for a proportion is a range of values within which we can be confident, to a chosen degree (usually 90%, 95% or 99%), that the true population proportion lies.

 

 

Z-statistic

When working with confidence intervals for proportions we work with z-statistics because we can calculate the standard deviation of the sampling distribution of the sample proportion:

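$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Since the true proportion p is unknown, the sample proportion p̂ is used in its place.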
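As a minimal sketch of this calculation, the standard deviation of the sampling distribution of p̂ can be estimated from the sample proportion and the sample size. The function name `standard_error` is my own choice for illustration, and the numbers (p̂ = 0.33, n = 18) are borrowed from the HR example used in this article.

```python
import math


def standard_error(p_hat: float, n: int) -> float:
    """Estimated standard deviation of the sampling distribution of p-hat."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


# HR example from this article: 33% unsatisfied in a sample of 18 employees
se = standard_error(0.33, 18)
print(round(se, 4))  # 0.1108
```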

 

Confidence intervals vs point estimate

Say that HR surveys 18 employees out of a total of 200. A point estimate from the sample could be that 33% of the employees are unsatisfied, but HR wishes to attach some degree of uncertainty to this percentage and therefore decides to compute a confidence interval.

For example, they can choose a 95% confidence interval, which returns a range within which they can be 95% confident that the true proportion lies.

 

Proportion estimate = p̂

With a confidence interval for a population proportion, we are trying to estimate a parameter of a population that is typically too large to measure. So, we estimate. We don’t measure. The estimate for the population proportion (p) is denoted p̂ and follows the sampling distribution of p̂. With this sample statistic we can calculate estimates for the true population proportion.

 

Confidence intervals for proportions: Conditions

Three conditions must be met in order to build a confidence interval for a population proportion:

  • The sample must be random.
  • The sample size (n) must be large enough that np̂ ≥ 10 and n(1 − p̂) ≥ 10, so that the normal distribution can be applied as an approximation to the binomial distribution. In other words, there must be at least 10 expected successes and at least 10 expected failures in the sample.
  • The independence condition, also called ‘The 10% Rule’: If we sample without replacement, the sample size (n) must be smaller than 10% of the population (N): n < 0.1N.
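The two numerical conditions above can be checked with a few lines of code. This is only a sketch: the function name `check_conditions` and its return format are assumptions made for illustration, and randomness of the sample cannot be verified from the numbers, so it is left out.

```python
def check_conditions(p_hat: float, n: int, population_size: int,
                     min_count: float = 10) -> dict:
    """Check the sample-size condition and the 10% rule for a proportion."""
    expected_successes = n * p_hat
    expected_failures = n * (1 - p_hat)
    return {
        # at least 10 expected successes and at least 10 expected failures
        "large_enough_sample": (expected_successes >= min_count
                                and expected_failures >= min_count),
        # sample smaller than 10% of the population
        "ten_percent_rule": n < 0.1 * population_size,
    }


# HR example from this article: 18 of 200 employees surveyed, 33% unsatisfied
print(check_conditions(0.33, 18, 200))
```

For the HR numbers the 10% rule holds (18 < 20), but the expected number of successes is only about 6, so the sample-size condition is not met and the normal approximation is rough for that example.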

 

Formula components: Confidence intervals for proportions

Similar to how we calculate the confidence interval for a population mean with known σ, the formula for proportions has the structure:

$$\text{confidence interval} = \hat{p} \pm \text{margin of error}$$

 

Margin of error

The margin of error (ME) is the z-score multiplied by the standard deviation of the sampling distribution of p̂:

$$ME = z_{\alpha/2} \cdot \sigma_{\hat{p}}$$

Alpha (α) and z-score

As described in Confidence intervals, alpha (α) = 1 − the confidence level. If we want a 99% confidence interval, alpha is 0.01. Alpha is the area outside of the confidence interval.

z-score: Because, as described above, we can calculate the standard deviation of the sampling distribution of the sample proportion (σ of p̂), we use the z-table.

A confidence interval has two limits, an upper and a lower, so alpha is divided by 2. Therefore, for a confidence level of 95%, alpha is 0.05, which is split into 0.05/2 = 0.025 in the upper tail and 0.025 in the lower tail. The z-scores for these α/2 = 0.025 tails are +1.96 and −1.96:

[Figure: z-table look-up]
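Instead of a printed z-table, the same look-up can be done in code. The sketch below uses `NormalDist().inv_cdf` from Python's standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Positive z-score that leaves alpha/2 in the upper tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

For a 95% confidence level this returns approximately 1.96, matching the z-table value above; the lower limit is the same value with a negative sign.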

 

The formula

Just as we defined the confidence interval for a population mean with known σ, we now have the formula for the confidence interval of a population proportion:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

 

Understanding the components of the formula:

 

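  • p̂: the sample proportion, which is the point estimate
  • z-score: the critical value for the chosen confidence level, for example 1.96 for 95%
  • √(p̂(1 − p̂)/n): the standard deviation of the sampling distribution of p̂, where n is the sample size
  • z-score × standard deviation: the margin of error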
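To see the components working together, here is a worked calculation using the HR example from earlier in this article (p̂ = 0.33, n = 18) and the 95% z-score of 1.96. It is meant as an illustration of the arithmetic only: with roughly 6 expected successes, this small sample falls short of the sample-size condition, so the resulting interval is approximate.

```latex
\begin{aligned}
\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
  &= 0.33 \pm 1.96\sqrt{\frac{0.33 \times 0.67}{18}} \\
  &\approx 0.33 \pm 1.96 \times 0.1108 \\
  &\approx 0.33 \pm 0.217 \\
  &\Rightarrow (0.113,\ 0.547)
\end{aligned}
```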

 

 

Confidence intervals in MS Excel

 

The screenshot below shows how confidence intervals for proportions can be calculated in Excel. The =NORM.S.INV function returns the z-score:

[Screenshot: confidence interval for a proportion calculated in Excel]
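For readers without Excel, the same calculation can be sketched in Python. Here `NormalDist().inv_cdf` plays the role of =NORM.S.INV; the function name `proportion_confidence_interval` and the input values (taken from the HR example in this article) are assumptions for illustration and do not reproduce the exact spreadsheet layout.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat: float, n: int,
                                   confidence_level: float = 0.95) -> tuple:
    """Return (lower, upper) limits of a z-based interval for a proportion."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # comparable to =NORM.S.INV(1 - alpha/2)
    margin_of_error = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error


# HR example: 33% unsatisfied in a sample of 18 employees, 95% confidence
low, high = proportion_confidence_interval(0.33, 18)
print(f"{low:.3f} to {high:.3f}")  # 0.113 to 0.547
```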

 


Carsten Grube

Freelance Data Analyst
