+34 616 71 29 85 carsten@dataz4s.com

Mean and single response intervals

Mean and single response intervals are also known as the confidence interval and the prediction interval in simple linear regression. A mean response interval is a confidence interval for the mean of all Y’s at a given X value. A single response interval is a prediction interval for one single Y at a given X value.

 

 

Key points for mean and single response intervals

  • A mean response interval is a confidence interval for the mean of all Y values at a given X value. It can be denoted µ̂(Y|X).
  • A single response interval is a prediction interval for one single Y value at a given X value. It can be denoted Y(pred).
  • Y(pred) has a wider interval than µ̂(Y|X).

 

 

Overall on µ̂(Y|X) and Y(pred)

Jeremy Balka, in his video Intervals for the mean response…, gives the following example, which I find helps to understand the two concepts and to distinguish between them:

Say a power plant wishes to estimate the mean daily power consumption (Y) for a given temperature (X). Or it wishes to predict tomorrow’s power consumption (Y) because the weather forecast says, “hot day tomorrow”. Say the temperature for tomorrow is expected to be X degrees. These are the questions that we would work to answer:

  • What consumption can the power plant expect to have for tomorrow (single Y value)?
  • What is the mean consumption on a day with X degrees (mean of all Y’s)?

 

Calculating the point estimates

Let’s take another example. Here, I will work through the values by hand, with a few pauses to reflect on and analyze the general concepts and formulas. The values are from Rick Vaughn’s video ‘Prediction Interval…’.

 

We have the following observational data:

Mean response and single response intervals_Excel table

 

The regression model for this relationship is:

Mean response and single response intervals_regression equation

 

We assume that there is sufficient evidence that the conditions of the LINER model are met and that we can therefore proceed with inferential statistics. First, let’s see what the predicted temperature is at 6327 ft:

Mean response and single response intervals_calculation

 

The predicted temperature at 6327 feet is Y(pred) = 48 degrees Fahrenheit.

This result is the same whether we estimate the mean of Y at this X value or predict a single Y value at this X value:

Point estimates
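The point estimate comes from plugging the given X into the fitted regression line. Here is a minimal sketch in Python, assuming an ordinary least-squares fit. The altitude and temperature values are invented for illustration and are not the data from the table above, and the function names are my own.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def point_estimate(b0, b1, x_star):
    """Point estimate at x_star. The same number serves as the estimate of the
    mean of Y and as the prediction of one single Y at x_star."""
    return b0 + b1 * x_star


# Hypothetical data (NOT the article's table): altitude in feet, temperature in °F.
altitude = [1000, 2000, 3000, 4000, 5000]
temperature = [58, 55, 52, 49, 46]

b0, b1 = fit_line(altitude, temperature)
y_hat = point_estimate(b0, b1, 3500)
print(round(b0, 3), round(b1, 5), round(y_hat, 3))
```

Both questions, the mean of Y and the single Y, share this one number. They only start to differ once we attach an interval to it.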

 

But these are only point estimates. Let’s see what happens when calculating the corresponding intervals:

 

Estimating and predicting the intervals

We should be suspicious about point estimates and ask what degree of uncertainty is associated with them. How sure can we be that 48 is a good estimate, and within what range can we expect our value to lie?

In statistics we wish to attach a level of uncertainty to point estimates. For example, can we be only 30% confident that 48 is close to the true mean, or can we be 90% confident? What is the uncertainty associated with the point estimate? Below, we will calculate this, but let’s first understand why the names differ: confidence vs. prediction.

 

Why the names ‘confidence’ and ‘prediction’ intervals?

Confidence intervals are constructed for parameters. The mean of Y at a given X is a parameter, so its interval is a confidence interval. A single Y is not a parameter but one individual observation, so its interval is called a prediction interval. Both formulas are similar in structure to the confidence interval formulas that we know from other statistical disciplines:

Mean response and single response intervals_formulas

 

Standard error of the estimated mean of Y:

As in confidence interval calculations, the standard error (SE) is one of the components of these two formulas. Strictly speaking, however, standard error is the term used for estimators of parameters. And, as mentioned above, a single Y (the Y(pred)) is one individual observation and not a parameter.

But both formulas are typically presented as standard errors (SE), and this helps to see their similarity to the “usual” confidence interval formulas.

Let’s see the two SE formulas:

SE formula for interval calculation

 

Where X* is the given X for which we are to estimate the Y. The component (X* – X̄)² indicates that the further the given X value is from the mean of X (X̄), the larger the value of this formula, and so the wider the interval around the regression line.
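Written out in text form, the standard error of the estimated mean of Y in simple linear regression is commonly given as below. The notation is my assumption of what the formula image shows: s is the residual standard error of the regression and n is the number of observations.

```latex
SE\big(\hat{\mu}_{Y|X^*}\big)
  = s \sqrt{\frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
```

At X* = X̄ the second term vanishes and the SE is at its smallest, s/√n.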

 

Standard error for the single predicted value of Y:

The SE calculation for a single predicted value of Y is similar to the one above for the SE calculation of estimated mean Y:

Prediction interval standard error formula

 

Like the SE of the mean Y, the SE(Ypred) formula for a single predicted Y value results in a greater value the further the given X (X*) is from the mean of X (X̄).

The formula differs from the one for the mean of Y by an added σ², the variance of a single observation around its mean. This + σ² is what produces the “1” in the SE formula. It gives a greater margin of error (ME) when we predict a single Y value than when we estimate the mean of Y for that given X.
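The step from the + σ² to the “1” can be written out as follows. This is a sketch in standard textbook notation, assumed here rather than taken from the formula image: σ² is estimated by s², the squared residual standard error, and n is the number of observations.

```latex
\begin{aligned}
\operatorname{Var}\big(Y_{pred} - \hat{Y}\big)
  &= \sigma^2 + \sigma^2\left(\frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}\right) \\
  &= \sigma^2\left(1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}\right) \\[4pt]
SE\big(Y_{pred}\big)
  &= s \sqrt{1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
\end{aligned}
```

The first σ² is the scatter of a single observation around the true line. The bracketed part is the uncertainty in the estimated line itself.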

 

Comparing the two interval formulas

When comparing the interval formula for µ̂(Y|X) with the prediction interval formula for Ŷ(Pred), we see that the only difference between them, as mentioned above, is the extra “1” in the formula for Ŷ(Pred). This “1” leads to a wider interval for the single Y value than for the mean of all Y values at that given X:

 

Comparing SE-formulas for confidence and prediction intervals

 

 

 

Let’s see this difference in the example listed above, where our point estimate was 48 for both µ̂(Y|X) and Ŷ(Pred). We plug the values into the formulas and get a wider interval for (Ypred) than for µ̂(Y|X):

Standard error calculations for confidence and prediction intervals
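The two SE formulas can also be sketched as small Python functions that work from summary statistics. The function names and the numbers below are hypothetical, chosen only to show the behaviour. They are not the values from the calculation above.

```python
import math


def se_mean_response(s, n, x_star, x_bar, sxx):
    """Standard error of the estimated mean of Y at x_star.

    s   : residual standard error of the regression
    n   : number of observations
    sxx : sum of squared deviations of X from its mean
    """
    return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)


def se_single_response(s, n, x_star, x_bar, sxx):
    """Standard error for predicting one single Y at x_star (note the extra 1)."""
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)


# Hypothetical summary statistics (illustration only).
s, n, x_bar, sxx = 3.0, 7, 5000.0, 2.8e7

se_mean_near = se_mean_response(s, n, 5000.0, x_bar, sxx)    # X* at the mean of X
se_mean_far = se_mean_response(s, n, 9000.0, x_bar, sxx)     # X* far from the mean
se_pred_near = se_single_response(s, n, 5000.0, x_bar, sxx)

print(round(se_mean_near, 3), round(se_mean_far, 3), round(se_pred_near, 3))
```

The SE grows as X* moves away from X̄, and at the same X* the single-response SE is always the larger of the two because of the extra “1”.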

 

 

 

The final interval calculations

Let’s see what happens when we apply this to the example from above with the temperatures at different altitudes during flights:

Excel table for Mean response and single response intervals

 

As we recall from above, the point estimates are the same:

Confidence and prediction interval calculations for given X

As in confidence intervals of other statistical disciplines, we add and subtract the margin of error (ME = t-crit × SE) to and from the point estimate. Here we will run 95% intervals at df=5:

Confidence and prediction interval calculations for given X

 

So, we see that (Ypred) has a wider interval than µ̂(Y|X). The confidence interval for the mean of Y values at 6327 ft is [43 to 54], whereas the prediction interval for a single Y value at 6327 ft is [38 to 59].
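The last step, the point estimate plus and minus the margin of error, can be sketched in Python as below. The critical value 2.571 is the two-sided 95% t value for df=5 from a t table. The two SE values are hypothetical stand-ins, not the ones from the calculation above, so the resulting intervals only resemble the article’s.

```python
def interval(point_estimate, se, t_crit):
    """Return (lower, upper) = point estimate -/+ t_crit * SE."""
    me = t_crit * se  # margin of error
    return point_estimate - me, point_estimate + me


T_CRIT = 2.571  # two-sided 95% critical t value for df = 5 (from a t table)

# Hypothetical SE values for illustration only.
ci = interval(48.0, 2.0, T_CRIT)  # mean response: confidence interval
pi = interval(48.0, 4.0, T_CRIT)  # single response: prediction interval

print([round(v, 3) for v in ci], [round(v, 3) for v in pi])
```

Both intervals are centred on the same point estimate. Only the SE, and therefore the width, differs.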

 

Conditions for inference

For these interval calculations to be valid the conditions for inference must be met. As described in The LINER model and Residual plots, these conditions are:

 

LINER model

 

 

 

Visualizing µ̂(Y|X) and (Ypred)

Due to the “1” in the formula, (Ypred) has a wider interval than µ̂(Y|X). This also makes intuitive sense: there is more uncertainty in predicting one single Y value than in estimating the mean of all the Y values at the given X level.

Mean response and single response intervals graphed

 

 

Mean and single response intervals – shorter example

One more example, just in short:

Say that Supplier A, which markets Service 1, suspects that Supplier B is following its prices with the competing Service 2. Supplier A wishes to examine the relationship over its past 12 price changes. Has the competitor changed the price of Service 2 in line with the new price of Service 1? Supplier A checks the price movements of Service 2 up to 24 hours after each price change in Service 1. This gives the 12 observations shown below.

First, Supplier A calculates the regression line and the coefficient of determination (r²), displayed below. She checks the LINER model and the residual plots and finds that the conditions for inference are met.

She could then test the slope through a confidence interval and hypothesis testing. In this case she finds an extremely low p-value (0.000008) for the slope, which is strong evidence of a linear relationship between X and Y.

 

Question 1

What is the Service 2 predicted price (Y) if the Service 1 price (X) is set to 13.0 at a 95% confidence level? This is the prediction interval for the single Y value: Y(pred)

 

Question 2

What is the confidence interval for the Service 2 mean price (Y), if the Service 1 price (X) is set to 13.0 at a 95% confidence level? This is the mean of all Y’s: µ̂(Y|X)

 

Excel calculation for mean response and single response intervals

 

Answer to Question 1

The predicted price interval for Service 2 (Y), at a 95% confidence level, if the Service 1 price (X) is set to 13.0 is 11.98 to 12.99. In other words, the prediction interval for the single Y value at a 95% confidence level = [11.98; 12.99].

 

Answer to Question 2

The confidence interval for the Service 2 mean price (Y), at a 95% confidence level, if the Service 1 price (X) is set to 13.0 is [12.33; 12.63].
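As a rough consistency check, the two reported intervals can be worked backwards to their margins of error and standard errors. This sketch assumes df = 12 − 2 = 10, and so a two-sided 95% critical t value of 2.228 from a t table. It also relies on the standard relation SE(pred)² = s² + SE(mean)². Because the reported interval limits are rounded, the results are approximate.

```python
import math

T_CRIT = 2.228  # assumed: two-sided 95% critical t value for df = 12 - 2 = 10

pred_low, pred_high = 11.98, 12.99  # reported prediction interval (Question 1)
mean_low, mean_high = 12.33, 12.63  # reported confidence interval (Question 2)

# The margin of error is half the interval width.
me_pred = (pred_high - pred_low) / 2
me_mean = (mean_high - mean_low) / 2

# ME = t-crit * SE, so SE = ME / t-crit.
se_pred = me_pred / T_CRIT
se_mean = me_mean / T_CRIT

# SE(pred)^2 = s^2 + SE(mean)^2, so the residual standard error s is implied.
s_implied = math.sqrt(se_pred ** 2 - se_mean ** 2)

print(round(me_pred, 3), round(me_mean, 3), round(s_implied, 3))
```

The prediction interval’s margin of error is more than three times that of the confidence interval, and almost all of the difference comes from the scatter of single observations around the line (the implied s).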

 

Mean and single response intervals in Excel

There is no Excel function for calculating the SE values or the intervals themselves. That means that the SE formulas must be typed in by hand, which is a source of errors. As such, I believe that these calculations should be done with statistical software or with a statistical add-in for Excel.

However, I have run the examples above in Excel. First, I calculated sum, mean, stdev, etc. in the data columns, as you can see. Second, I ran Data >> Data Analysis >> Regression, mainly to use its ‘Standard Error’ as the ‘s’ in the SE formulas, but it is also nice to have the regression equation, the coefficient of determination (r²) and the p-value at hand.

In the example above, we calculated the confidence interval and the prediction interval for X=13. In Excel we might as well calculate them for the whole range of X values, at whatever spacing you find suitable. From that we get the regression line together with the four interval lines.

Mean response and single response intervals in Excel

Confidence interval and prediction interval graph
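Outside Excel, the same idea of calculating the regression line and the four interval lines over a range of X values can be sketched in plain Python. The function name and the six price pairs are hypothetical; they are not the article’s 12 observations. The critical value 2.776 is the two-sided 95% t value for df = 6 − 2 = 4 from a t table.

```python
import math


def interval_bands(x, y, grid, t_crit):
    """For each x_star in grid return
    (x_star, y_hat, ci_low, ci_high, pi_low, pi_high)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    # Residual standard error s with n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    rows = []
    for x_star in grid:
        y_hat = b0 + b1 * x_star
        core = 1 / n + (x_star - x_bar) ** 2 / sxx
        me_mean = t_crit * s * math.sqrt(core)      # confidence interval
        me_pred = t_crit * s * math.sqrt(1 + core)  # prediction interval
        rows.append((x_star, y_hat,
                     y_hat - me_mean, y_hat + me_mean,
                     y_hat - me_pred, y_hat + me_pred))
    return rows


# Hypothetical prices (NOT the article's 12 observations).
x = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
y = [9.8, 10.9, 11.7, 12.6, 13.9, 14.6]
T_CRIT = 2.776  # two-sided 95% critical t value for df = 6 - 2 = 4

bands = interval_bands(x, y, grid=[10.0, 12.5, 15.0], t_crit=T_CRIT)
for row in bands:
    print(" ".join(f"{v:7.2f}" for v in row))
```

Plotting the columns against X gives the regression line with the narrower confidence band inside the wider prediction band, both at their narrowest at the mean of X.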

 

I might do a short video to show how, but in short: insert a Line chart, right-click the chart and click ‘Select data’. Here, you add the series under Legend Entries (Series) and make sure that the horizontal axis labels are your X values. Maybe this screenshot can be of use:

Mean response and single response intervals in Excel

 

Learning statistics

Some of my preferred materials for learning about mean and single response intervals. I find JBstatistics great!

 

Carsten Grube

Freelance Data Analyst
