+34 616 71 29 85 carsten@dataz4s.com

Chi-square test

The Chi-square test compares how well the observed sample data fit the data expected under a hypothesis, which is why one common form of it is called the “goodness-of-fit” test. Another form, the test of independence, analyzes the dependence between categorical variables. Example: Do more women than men vote for a given political party?



Key points for chi-square test

  • The Chi-square distribution is applied when testing for dependence between categorical variables, or for goodness of fit against expected counts
  • It works with discrete and mutually exclusive data


Chi-square test worked example

Say that an HR department, as part of an ongoing training program, periodically tests the employees on their product know-how. A multiple-choice test is used for the purpose. Each question has four possible answers: A, B, C and D. The test producers claim that the correct answer is equally likely to be any of the four.

Some of the HR staff get curious and wish to test this claim: Is the probability of being the correct answer really equal for A, B, C and D?

We would define the following chi-square hypotheses:

H0: The correct choices are equally distributed (A: 25%, B: 25%, C: 25%, D: 25%)

H1: The correct choices are not equally distributed

Let’s set a significance level (α) of 0.05.


Purpose of the chi-square hypothesis test

The Chi-square test is a hypothesis test and follows the same procedure and concepts as other hypothesis tests, so we will reject the null hypothesis if our p-value is lower than our significance level (α). Rejecting the null hypothesis, in our case, would mean that we reject the claim that the correct answers are equally distributed between the four options A, B, C and D.


Expected vs observed data

HR takes a sample of 100 tests randomly selected from the past years of testing with this method. As the null hypothesis says that the correct answers are equally distributed, we expect 25 correct answers for each of the four options. These expected values are compared to the ones observed in the sample, and we can make the following contingency table:

Figure: Chi-square test example (table of observed and expected counts)
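As a minimal sketch of where the expected counts come from, the sample size can be multiplied by the proportion that the null hypothesis assigns to each option. The variable names are illustrative and not from the original article.

```python
# Expected counts under H0: each option is the correct answer 25% of the time.
sample_size = 100  # size of the HR sample, as stated in the text
null_proportions = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# Expected count = sample size x hypothesized proportion
expected = {option: sample_size * p for option, p in null_proportions.items()}

print(expected)  # {'A': 25.0, 'B': 25.0, 'C': 25.0, 'D': 25.0}
```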

Now, how can we determine whether the result from our sample is more extreme than what our significance level allows for? How can we know if our sample result is “statistically significant”? We apply the Chi-square distribution:


Chi-square distribution

With the Chi-square test we can test for dependence or independence between different categories, in our case between A, B, C and D. It works with mutually exclusive data, meaning that if the correct choice for a question is A, then it cannot be D at the same time.

The data in a Chi-square test are counts and therefore discrete: we can count each question and choice as a whole number.

The Chi-square statistic is denoted with the Greek letter chi squared: χ². To calculate it, we take the squared difference between each observed and expected value, divide it by the expected value, and sum the results. Thus, we get the formula for the Chi-square statistic:

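Chi-square test statistic formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is an observed count and E is the corresponding expected count.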


The Chi-square formula explained

To find the distance between the observed and the expected, we subtract the expected value from the observed. This difference is also called the “residual”.

The differences are squared in order to obtain only positive values and are divided by the expected value in order to normalize independently of the number of counts. Otherwise, the Chi-square statistic would increase with the number of counts, so for large datasets we would get large statistics. This is the idea behind normalizing or standardizing (see the Z-score chapter).

The operation is carried out for each category, in our situation for each row of the table, and the results are then added up. The adding up is expressed by the capital sigma (Σ) in front of the formula:

Figure: Chi-square test statistic calculation
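The calculation can be sketched in Python as follows. The original table is not reproduced in this text, so the observed counts below are hypothetical: they are chosen only to be consistent with the figures the article states (D = 20, a total of 100, and a test statistic of 6.0). The function name is also illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for A, B, C, D (assumption, see lead-in).
observed = [35, 25, 20, 20]
# Expected counts under H0: 25 per option.
expected = [25, 25, 25, 25]

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 6.0
```

Each category contributes its own (O - E)² / E term, here 4.0, 0.0, 1.0 and 1.0, which add up to 6.0.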

Our calculated value, or test statistic, is 6.0. It is now compared with the corresponding critical value from the chi-square table, which is found by looking up the row for our degrees of freedom:


Degrees of freedom

The degrees of freedom (df) is the number of category values, or cells in our table, that are free to vary. If we know the total and have four values that add up to it, filling in 3 cells tells us what the fourth value is.

For example, in our table of observed data, we know that A+B+C+D = 100. Once the values for A, B and C are known, D must be 20. So D, in this case, is not free to vary. A degree of freedom is lost.

Figure: Degrees of freedom

So, the values for A, B and C can be any values, but D must be the missing piece of the puzzle that makes all four add up to 100. In mathematical terms, A, B and C are free to vary, and the number of such values is the degrees of freedom.

For a goodness-of-fit test like our example, the degrees of freedom are the number of categories minus one: 4 - 1 = 3. For a contingency table in a test of independence, the degrees of freedom are (Rows - 1) × (Columns - 1), often written (r-1)(c-1).
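A small sketch of both ideas: the last cell is fixed by the total, and the degrees of freedom follow from the number of cells that remain free. The function names are illustrative, and the values for A, B and C are hypothetical, since the original table is not reproduced here; only D = 20 and the total of 100 come from the text.

```python
def goodness_of_fit_df(num_categories):
    """Degrees of freedom for a single set of categories: k - 1."""
    return num_categories - 1


def independence_df(num_rows, num_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    return (num_rows - 1) * (num_cols - 1)


total = 100
a, b, c = 35, 25, 20       # hypothetical values, free to vary
d = total - (a + b + c)    # the last value is determined by the total

print(d)                      # 20
print(goodness_of_fit_df(4))  # 3
print(independence_df(4, 2))  # 3
```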

With the degrees of freedom and the significance level, you can look up the critical value in the Chi-square table.


Looking up in the Chi-square table

To look up the critical value in the Chi-square distribution table, we go to the row for degrees of freedom (df) = 3 and follow it to the column that corresponds to our significance level (α) of 5%.


Figure: Lookup in Chi-square table


At df=3 and α=0.05, we find a critical value of 7.81. The Chi-square probability density curve for df=3 shows this critical value next to our test statistic of 6.0:


Figure: Chi-square test statistic visualized in density curve
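The table value can be cross-checked numerically. The sketch below is one possible approach, not the method used in the article: it relies on the closed-form cumulative distribution function that exists for df = 3 and finds, by bisection, the point where the cumulative probability reaches 1 - α. The function names are illustrative.

```python
import math


def chi2_cdf_df3(x):
    """Cumulative distribution function of the chi-square distribution, df = 3 only."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def critical_value_df3(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Find x such that P(X <= x) = 1 - alpha, by bisection on the CDF."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_df3(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(critical_value_df3(0.05), 2))  # 7.81
```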


Chi-square test conclusion

We find a critical value of 7.81, which is greater than our test statistic of 6.00. So we fail to reject the null hypothesis: based on our sample results, we cannot reject that the choices are equally distributed.
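The decision rule can be written as a short sketch, using the test statistic and critical value from the text. The variable names are illustrative.

```python
alpha = 0.05            # significance level chosen in the text
test_statistic = 6.0    # calculated chi-square statistic
critical_value = 7.81   # table value for df = 3 and alpha = 0.05

# Reject H0 only if the statistic falls beyond the critical value.
reject_h0 = test_statistic > critical_value

if reject_h0:
    print("Reject H0: the choices do not appear equally distributed")
else:
    print("Fail to reject H0: the data are compatible with equal distribution")
```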

We recall that failing to reject H0 does not mean that H0 is true. It only means that we cannot rule it out based on this sample. In fact, our 6.00 is fairly close to the critical value of 7.81. And from the table, we can read that the p-value for 6.00 is a little more than 10%, because the 0.1 column at df=3 returns 6.25.

So, we get a p-value of a little more than 10%. This means that, if the null hypothesis is true, there is a little more than 10% probability of getting a result at least as extreme as our 6.0. Or expressed as: “There is a 10%+ chance of getting 6.00 or more”.
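Instead of reading an approximate p-value from the table, it can be computed directly. This is a minimal sketch that only covers df = 3, where the upper-tail probability has a closed form; the function name is illustrative.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X >= x) of the chi-square distribution, df = 3 only."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(6.0)
print(round(p_value, 4))  # 0.1116
```

The result of about 11% agrees with the table reading of "a little more than 10%" and is above the significance level of 0.05.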


Visualizing multiple chi-square distributions

The following graph shows multiple chi-square distributions, each with a different number of degrees of freedom:


Figure: Chi-square probability density curves
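To see how the shape changes with the degrees of freedom without a plot, the density can be evaluated at a few points. The sketch below uses the standard chi-square density formula; the function name and the chosen x values and degrees of freedom are illustrative.

```python
import math


def chi2_pdf(x, df):
    """Chi-square probability density at x > 0 for df degrees of freedom."""
    if x <= 0:
        raise ValueError("this sketch only evaluates the density for x > 0")
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


# Print one row per df, one column per x value.
x_values = [0.5, 1, 2, 4, 8]
for df in (1, 2, 3, 5, 10):
    row = "  ".join(f"{chi2_pdf(x, df):.4f}" for x in x_values)
    print(f"df={df:>2}: {row}")
```

For small df most of the density sits close to zero; as df grows, the peak moves to the right and the curve flattens.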



Chi-square test with MS Excel

The CHISQ.TEST, CHISQ.INV and CHISQ.DIST functions in Excel return values from the Chi-square distribution and are available in Excel 2010 and later.



The Excel function CHISQ.TEST conducts a Chi-square test on the array of observed values and the array of expected frequencies. It returns the p-value: the probability of getting a test statistic at least as extreme as ours if the null hypothesis is true.

Figure: CHISQ.TEST function in Excel



The CHISQ.INV function returns the inverse of the left-tailed probability. Called with a probability of 1 - α (here 0.95) and the degrees of freedom, it returns the critical value:

Figure: CHISQ.INV function in Excel



The Excel function CHISQ.DIST with the arguments (x,df,cumulative=TRUE) returns the cumulative distribution function.

Figure: CHISQ.DIST function in Excel (cumulative = TRUE)



When cumulative is set to FALSE (x,df,cumulative=FALSE), it returns the probability density function. ‘x’ is the calculated test statistic, which for the Chi-square test is ∑(O-E)²/E.

Figure: CHISQ.DIST function in Excel (cumulative = FALSE)
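For readers without Excel, the cumulative probability and the test's p-value can be approximated in plain Python. The sketch below is not Excel's implementation: the function names are illustrative, the cumulative distribution is computed with a power series for the regularized incomplete gamma function, and the p-value assumes df = number of categories - 1, as in the worked example.

```python
import math


def regularized_lower_gamma(a, x, max_terms=10_000, tol=1e-15):
    """P(a, x) by power series; adequate for the moderate x values used here."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_cdf(x, df):
    """Cumulative probability P(X <= x), a rough analogue of CHISQ.DIST(x, df, TRUE)."""
    return regularized_lower_gamma(df / 2, x / 2)


def chi_square_p_value(observed, expected):
    """Upper-tail p-value of the statistic, assuming df = number of categories - 1."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return 1.0 - chi2_cdf(statistic, len(observed) - 1)


print(round(chi2_cdf(6.0, 3), 4))  # 0.8884
# Hypothetical observed counts that give the article's statistic of 6.0
print(round(chi_square_p_value([35, 25, 20, 20], [25, 25, 25, 25]), 4))  # 0.1116
```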



Learning statistics


Carsten Grube


Freelance Data Analyst

