Census and sampling
Observing every element of a population is called a census. Often, however, the population is too large for us to contact and observe every element, so we draw a sample and use it to estimate the population parameter. Loosely speaking, a census is to sampling what counting is to estimating.
In an opinion poll ahead of an election, we want to know which party is likely to win, so we are interested in the voters’ opinions. But knocking on every door in the country and interviewing every citizen is not realistic. There are too many doors and citizens, and it would take too much time and cost too much.
With a census, the results would be obsolete by the time everything had been collected and processed. Therefore, we draw a sample, ideally a representative subset of the population. In this case, it would be a representative subset of all voters.
Statistics offers a framework for calculating sample estimates. But estimates can be biased, incorrectly calculated and/or improperly interpreted.
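The poll example can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the electorate of 100,000 voters, the 40% support level and the helper name `estimate_share` are all hypothetical, not real polling data.

```python
import random

def estimate_share(population, sample_size, seed=0):
    """Estimate the share of 1s in `population` from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # drawn without replacement
    return sum(sample) / sample_size

# Hypothetical electorate: 1 = supports party A, 0 = does not.
rng = random.Random(42)
voters = [1 if rng.random() < 0.4 else 0 for _ in range(100_000)]

census_share = sum(voters) / len(voters)      # counting every voter
sample_share = estimate_share(voters, 1_000)  # asking only 1,000 voters

print(f"census: {census_share:.3f}, sample of 1,000: {sample_share:.3f}")
```

The census figure is a count of everyone, while the sample figure is an estimate that will usually land close to it at a fraction of the effort.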
Example where sampling is preferred over a census
How much force can a certain product take before it breaks? If we tested the entire population, we would have no products left, as they would all be broken. Instead, we test samples and use them to estimate the true value for the whole population of this product.
What is the probability over an infinite number of trials?
This is interesting to know if we wish to approximate the true parameters of a population. But we cannot run an infinite number of trials.
Say we wish to know the probability of getting heads when flipping a fair coin. We cannot flip it an infinite number of times, so we take a sample of 10 coin flips. This sample of 10 flips might not give the 0.5 that we, in this case, know we are supposed to get. The difference is called the sampling error, and as we increase the sample size, the sample proportion tends to get closer to the true proportion of 0.5.
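The coin example can be tried out with a small simulation. The sketch below assumes a fair coin modelled with Python's `random` module; the helper name `proportion_heads`, the seed and the chosen sample sizes are illustrative choices only.

```python
import random

def proportion_heads(n_flips, rng):
    """Flip a fair coin n_flips times and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {}
for n in (10, 100, 1_000, 10_000, 100_000):
    p_hat = proportion_heads(n, rng)
    results[n] = p_hat
    print(f"{n:>7} flips: proportion = {p_hat:.4f}, error = {abs(p_hat - 0.5):.4f}")
```

The error does not necessarily shrink at every single step, since each run is still random, but it tends to get smaller as the number of flips grows.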
More on samples in Sampling distributions.
There are 2 overall sampling methods:
- Probability sampling: Population elements are selected at random according to a pre-defined, formal selection framework. This procedure gives every population element a known, non-zero probability of being included in the sample; in simple random sampling, that probability is equal for all elements.
- Non-probability sampling: This way of sampling relies on the analyst’s judgement rather than on random selection. In other words, it is up to the individual analyst which population elements she/he will select. This can be the easiest, fastest and most economical approach, but it also opens the door to bias, as not all population elements have a known or equal probability of being chosen.
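The difference between the two methods can be illustrated with a small sketch. It assumes an invented population of 1,000 people listed by age, and uses a simple random sample and a convenience sample as one example of each method; other designs exist within both families.

```python
import random

# Hypothetical population: 1,000 people listed by age, youngest first.
rng = random.Random(7)
ages = sorted(rng.randint(18, 90) for _ in range(1_000))
true_mean = sum(ages) / len(ages)

# Probability sampling (simple random sample): every person has the
# same chance of being selected.
random_sample = rng.sample(ages, 50)
random_mean = sum(random_sample) / len(random_sample)

# Non-probability sampling (convenience sample): the analyst simply takes
# the first 50 people on the list, who here happen to be the youngest.
convenience_sample = ages[:50]
convenience_mean = sum(convenience_sample) / len(convenience_sample)

print(f"true mean age:        {true_mean:.1f}")
print(f"random sample mean:   {random_mean:.1f}")
print(f"convenience sample:   {convenience_mean:.1f}")
```

Because the convenience sample only contains the youngest people on the list, its mean age is biased downwards, while the random sample tends to land near the true mean.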
The following chart is inspired by QuestionPro’s page on sampling methods: