+34 616 71 29 85 carsten@dataz4s.com

# Squared errors of line

The squared errors of line are the squared vertical distances between each datapoint in the line fit plot and the estimated regression line. The unsquared distances are the errors, also known as residuals, and they are crucial in the process of making inference in regression analysis. For this inference process the errors are squared, which gives the squared errors of line.

## The errors = distance from sampled points to the line

The regression line is only a “best possible fit” to the actual observed datapoints. It will not go through every datapoint and it might not go through any of the datapoints.

The regression line goes through (X,Y)-coordinates that generally are not equal to the (X,Y)-coordinates of the sampled observations. The vertical distance between the two sets of coordinates is called the error, or the residual.

So, the errors are what the regression line calculated “erroneously”: the distance from reality (the observed datapoint) to the predicted line.

## Calculating the squared errors of line

Let’s apply the 4-datapoint mini example from the chapter Regression line to calculate the squared errors of line.

To calculate the total error, we first find the distance from each datapoint to the line. That distance is the y-value of the datapoint minus the y-value of the line for the given X-value. So, the error is the vertical distance from the point to the line.

As mentioned, each error is the vertical distance from the estimated line to the datapoint: the observed value (the datapoint) minus the value on the line. This can also be expressed as yi – (m·xi + b), where yi is the observed datapoint and m·xi + b is the estimated line at xi. Each error is then squared.
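The expression yi – (m·xi + b) can be sketched in a few lines of Python. The datapoints and the line below (slope 0.8, intercept 1.5) are hypothetical illustration values, not the numbers from the mini example on this page, and the function name `errors_of_line` is also just an assumption for this sketch.

```python
def errors_of_line(xs, ys, m, b):
    """Return the error (residual) of each datapoint: yi - (m*xi + b)."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical datapoints and line, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 4]
m, b = 0.8, 1.5

errors = errors_of_line(xs, ys, m, b)
squared_errors = [e ** 2 for e in errors]

for x, y, e, sq in zip(xs, ys, errors, squared_errors):
    print(f"x={x}  y={y}  error={e:.2f}  squared error={sq:.2f}")
```

Note the parentheses around m·xi + b: the whole value of the line is subtracted from the observed value, so a point below the line gets a negative error, and squaring makes every contribution positive.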

Visualized in graph:

Finally, we sum all the squared errors. This total is known as the sum of the squared errors of line, often shortened to SELine.

## Sum of the squared errors of line (SELine)

The sum of the squared errors of line (SELine) plays an important role in statistical inference in regression analysis and is, among other things, used to calculate the coefficient of determination (r²). SELine is the sum of the squared errors, which in our example becomes: (0.6)² + (-0.8)² + (0.4)² + (-0.2)² = 1.2.
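As a quick check of this arithmetic, here is a minimal Python sketch that squares and sums the four errors from the example. The variable names are only illustrative.

```python
# The four errors from the mini example on this page.
errors = [0.6, -0.8, 0.4, -0.2]

# Square each error, then add them up to get SELine.
squared_errors = [e ** 2 for e in errors]
se_line = sum(squared_errors)

print(f"SELine = {se_line:.1f}")
```

The squared terms are 0.36, 0.64, 0.16 and 0.04, which add up to 1.2.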

## Sum of the squared errors of the mean (SEӯ)

Another component that is central in making inferences in regression is the squared error from the mean of y (SEӯ). Each error here is the distance from ӯ to a datapoint (yi), and each of these distances is squared.

The squared distances are then added up to give the sum of the squared errors of the mean of y (SEӯ).
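The sketch below shows how SEӯ and SELine might be computed side by side and combined into the coefficient of determination using the standard relation r² = 1 – SELine / SEӯ. The datapoints are hypothetical illustration values (not the mini example from this page), and the line is fitted with the ordinary least squares formulas for slope and intercept.

```python
# Hypothetical datapoints, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 4]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Ordinary least squares slope (m) and intercept (b).
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
m = s_xy / s_xx
b = y_mean - m * x_mean

# Sum of squared errors of the line: (yi - (m*xi + b))^2 added up.
se_line = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Sum of squared errors of the mean of y: (yi - y_mean)^2 added up.
se_mean = sum((y - y_mean) ** 2 for y in ys)

# Share of the variation in y that the line accounts for.
r_squared = 1 - se_line / se_mean

print(f"SELine = {se_line:.2f}, SE(mean of y) = {se_mean:.2f}, r^2 = {r_squared:.2f}")
```

The smaller SELine is compared with SEӯ, the closer r² gets to 1, meaning the line describes the datapoints much better than the mean of y alone.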

## Squared errors of line in MS Excel

Regression analysis in Excel can be run following the path Data >> Data Analysis >> Regression:

From the resulting output table, you can add columns with the squared values, etc. Another option is to build the spreadsheet tables yourself, like in the examples above on this page.


#### Carsten Grube

Freelance Data Analyst


Spain: Ctra. 404, km 2, 29100 Coín, Malaga


Denmark: c/o Musvitvej 4, 3660 Stenløse
