Least-Squares Method

Learning project summary

Content summary

A brief introduction to the Least-Squares Method and its statistical meaning.

Goals

This learning project offers learning activities and some applications of the Least-Squares Method. With this project, one should understand the intention of the Least-Squares Method and what it means. Moreover, one should be able to apply some simple Least-Squares methods to find a good approximation for any function. For more mathematical explanation, see the page "Least squares".

Learning materials

Texts

[1] Numerical Mathematics and Computing Chapter 12.1

[2] Numerical Method for Engineers: With Software and Programming Applications Chapter 17.3

[3] Statistics for Management and Economics Chapter 17.1

[4] T. Strutz: Data Fitting and Uncertainty. A Practical Introduction to Weighted Least Squares and Beyond. 2nd edition, Springer Vieweg, 2016, ISBN 978-3-658-11455-8.

Lessons

Lesson 1: Introduction to Least-Squares Method

The goal of the Least-Squares Method is to find a good estimate of the parameters of a function, f(x), that fits a set of data $(x_i, y_i)$. The Least-Squares Method requires that the estimated function deviate as little as possible from the data in the sense of the 2-norm. Generally speaking, Least-Squares Methods fall into two categories, linear and non-linear. They can be classified further: ordinary least squares (OLS), weighted least squares (WLS), alternating least squares (ALS), and partial least squares (PLS).


To fit a set of data best, the least-squares method minimizes the sum of squared residuals (also called the Sum of Squared Errors, SSE),

$S = \sum_{i=1}^{m} r_i^2,$

with $r_i$, the residual, the difference between the actual data point and the value predicted by the regression, defined as

$r_i = y_i - f(x_i),$

where the m data pairs are $(x_i, y_i),\ i = 1, \dots, m$, and the model function is $f(x)$.

Here, we can choose n different parameters for f(x), so that the approximated function can best fit the data set.

For example, in the graph on the right, $R_3 = y_3 - f(x_3)$, and $\sum_{i=1}^{m} R_i^2$, the sum of the squares of each red line's length, is what we want to minimize.
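As a quick illustration of the quantity being minimized, the following Python sketch computes the residuals and their sum of squares; both the data and the candidate model here are made up purely for illustration.

```python
# Sketch: evaluating the sum of squared residuals (SSE) for a candidate
# model against a small data set. Data and parameters are hypothetical.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # hypothetical (x_i, y_i) pairs

def f(x):
    # an arbitrary candidate model f(x) = 2x; least squares would tune this
    return 2.0 * x

residuals = [y - f(x) for x, y in data]   # r_i = y_i - f(x_i)
sse = sum(r * r for r in residuals)       # S = sum of r_i^2
print(round(sse, 4))  # → 0.06
```

A different choice of model parameters would give a different SSE; the least-squares fit is the choice that makes this number as small as possible.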

Lesson 2: Linear Least-Squares Method

The Linear Least-Squares (LLS) Method assumes that the data set falls on a straight line. Therefore, $f(x) = ax + b$, where a and b are constants. However, due to experimental error, some data points might not lie exactly on the line; there is an error (residual) between the estimated function and the real data. The Linear Least-Squares Method (or $\ell_2$ approximation) defines the best-fit function as the one that minimizes

$S = \sum_{i=1}^{n} \left(y_i - (a x_i + b)\right)^2.$

The advantages of LLS:

1. If we assume that the errors have a normal probability distribution, then minimizing S gives us the best approximation of a and b.

2. We can easily use calculus to determine the approximate values of a and b.

To minimize S, the following conditions must be satisfied: $\frac{\partial S}{\partial a} = 0$ and $\frac{\partial S}{\partial b} = 0$.

Taking the partial derivatives, we obtain $\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{n} x_i \left(y_i - a x_i - b\right) = 0$, and $\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} \left(y_i - a x_i - b\right) = 0$.

This system consists of two simultaneous linear equations in the two unknowns a and b. (These two equations are the so-called normal equations.)

Based on simple manipulation of the summations, we can easily find that

$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i$

$a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i$

Thus, the best estimated function for the data set $(x_i, y_i)$, $1 \le i \le n$, is

$f(x) = a x + b$, where $a = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ and $b = \bar{y} - a \bar{x}$.
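The closed-form slope and intercept formulas can be checked numerically. This Python sketch applies them to a small made-up data set lying exactly on y = 2x + 1:

```python
# Sketch of the closed-form least-squares slope/intercept formulas,
# applied to a small hypothetical data set.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # lies exactly on y = 2x + 1

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# a = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2),  b = y_bar - a*x_bar
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
a = s_xy / s_xx
b = y_bar - a * x_bar
print(a, b)  # → 2.0 1.0
```

Because the data are exactly collinear here, the fit recovers the line y = 2x + 1 with zero residual.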

Lesson 3: Linear Least-Squares Method in matrix form

We can also represent the estimated linear function with the following model: $y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + r$.

It can also be represented in matrix form: $\{Y\} = [X]\{A\} + \{R\}$, where [X] is a matrix containing coefficients derived from the data set (it might not be a square matrix, depending on the number of variables, m, and data points, n); the vector $\{Y\}$ contains the values of the dependent variable, $\{y_1, y_2, \dots, y_n\}^T$; the vector $\{A\}$ contains the unknown coefficients that we would like to solve for, $\{a_0, a_1, \dots, a_m\}^T$; and the vector $\{R\}$ contains the residuals, $\{r_1, r_2, \dots, r_n\}^T$.

To minimize $S = \sum_{i=1}^{n} r_i^2 = \{R\}^T \{R\}$, we follow the same method as in Lesson 2, taking the partial derivative with respect to each coefficient and setting it equal to zero. As a result, we have a system of normal equations, which can be represented in the following matrix form: $[X]^T [X] \{A\} = [X]^T \{Y\}$.

To solve the system, we have many options, such as the LU method, the Cholesky method, the inverse matrix, and Gauss-Seidel. (Generally, the equations might not result in diagonally dominant matrices, so the Gauss-Seidel method is not recommended.)
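As a sketch of the matrix route, the following Python code forms the normal equations $[X]^T[X]\{A\} = [X]^T\{Y\}$ for the straight-line model $y = a_0 + a_1 x$ and solves the resulting 2×2 system by elimination; the data are made up, and plain lists are used where a numerical library would normally be preferred.

```python
# Sketch: forming and solving the normal equations for y = a0 + a1*x
# with plain Python lists; the data are hypothetical.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.9, 5.1, 7.0]   # roughly y = 2x + 1

X = [[1.0, x] for x in xs]  # design matrix: one row [1, x_i] per data point
XtX = [[sum(X[k][i] * X[k][j] for k in range(len(X))) for j in range(2)]
       for i in range(2)]
XtY = [sum(X[k][i] * ys[k] for k in range(len(X))) for i in range(2)]

# Solve the 2x2 system [X]^T[X]{A} = [X]^T{Y} by Gaussian elimination.
m = XtX[1][0] / XtX[0][0]
XtX[1][1] -= m * XtX[0][1]
XtY[1] -= m * XtY[0]
a1 = XtY[1] / XtX[1][1]
a0 = (XtY[0] - XtX[0][1] * a1) / XtX[0][0]
print(round(a0, 3), round(a1, 3))  # → 0.97 2.02
```

The fitted coefficients land close to the slope 2 and intercept 1 used to generate the noisy data.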

Lesson 4: Least-Squares Method in statistical view

From the equation $[X]^T [X] \{A\} = [X]^T \{Y\}$, we can derive the following equation: $\{A\} = \left([X]^T [X]\right)^{-1} [X]^T \{Y\}$.

From this equation, we can determine not only the coefficients, but also their statistical properties.

Using calculus, the following formulas for the coefficients of the straight-line case, $y = a_0 + a_1 x$, can be obtained:

$a_1 = \dfrac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$

$a_0 = \bar{y} - a_1 \bar{x}$
Moreover, the diagonal and off-diagonal values of the matrix $\left([X]^T [X]\right)^{-1}$, scaled by the error variance $s_{y/x}^2$, represent the variances and the covariances of the coefficients $a_i$, respectively.

Assume the i-th diagonal value of $\left([X]^T [X]\right)^{-1}$ is $x_{ii}^{-1}$ and the corresponding coefficient is $a_{i-1}$; then

$s(a_{i-1}) = \sqrt{x_{ii}^{-1}\, s_{y/x}^2}$

where $s_{y/x}$ is called the standard error of the estimate, and $s_{y/x} = \sqrt{\dfrac{S_r}{n - (m + 1)}}$, with $S_r$ the sum of squared residuals.

(Here, the subscript y/x indicates that the error refers to the spread of the y values about the regression line for a given x.)

These two pieces of information have many applications. For example, we can derive upper and lower confidence bounds for the intercept and the slope.
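As an illustration of these statistics, this Python sketch fits a straight line to made-up data and then computes the standard error of the estimate and the standard errors of the slope and intercept; the two closed-form expressions for s(a) and s(b) below are the straight-line specialization of the diagonal entries of $([X]^T[X])^{-1}$ scaled by $s_{y/x}^2$.

```python
# Sketch: standard error of the estimate and coefficient standard
# errors for a straight-line fit y = a*x + b (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
b = y_bar - a * x_bar

# S_r: sum of squared residuals about the fitted line
S_r = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
s_yx = (S_r / (n - 2)) ** 0.5      # standard error of the estimate (m = 1)

# straight-line specialization of the diagonal of (X^T X)^{-1} * s_yx^2
s_a = s_yx / s_xx ** 0.5                            # std. error of slope
s_b = s_yx * (1.0 / n + x_bar ** 2 / s_xx) ** 0.5   # std. error of intercept
print(round(s_yx, 3), round(s_a, 3), round(s_b, 3))
```

Small standard errors here reflect how closely the made-up data hug the fitted line; they are the ingredients for the confidence bounds mentioned above.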

Assignments

To better understand the application of the Least-Squares Method, the first question will be solved by applying the LLS equations, and the second one will be solved with a Matlab program.

Question 1: Linear Least-Squares Example

The following are 8 data points that show the relationship between the number of fishermen and the amount of fish (in thousand pounds) they can catch in a day.

Number of Fishermen Fish Caught
18 39
14 9
9 9
10 7
5 8
22 35
14 36
12 22

According to this data set, what is the function relating the number of fishermen to the amount of fish caught? Hint: let the number of fishermen be x and the amount of fish caught be y, and use LLS to find the coefficients.

Calculation

By simple calculation and basic statistics, we can easily find:

  1. $\bar{x} = 13$
  2. $\bar{y} = 20.625$, and
  3. the following chart

X Y $x_i - \bar{x}$ $y_i - \bar{y}$ $(x_i - \bar{x})(y_i - \bar{y})$ $(x_i - \bar{x})^2$
18 39 5 18.375 91.875 25
14 9 1 -11.625 -11.625 1
9 9 -4 -11.625 46.5 16
10 7 -3 -13.625 40.875 9
5 8 -8 -12.625 101 64
22 35 9 14.375 129.375 81
14 36 1 15.375 15.375 1
12 22 -1 1.375 -1.375 1

Thus, we have $\sum (x_i - \bar{x})(y_i - \bar{y}) = 412$ and $\sum (x_i - \bar{x})^2 = 198$, so the slope is $a = 412 / 198 \approx 2.0808$.

And last, the intercept is $b = \bar{y} - a\bar{x} = 20.625 - 2.0808 \times 13 \approx -6.4255$.

Therefore, the linear least-squares line is $y = 2.0808x - 6.4255$.
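The hand calculation for Question 1 can be verified with a short Python sketch that applies the Lesson 2 formulas directly to the fishermen data:

```python
# Check of the Question 1 calculation: slope and intercept for the
# fishermen / fish-caught data via the centered least-squares formulas.
xs = [18, 14, 9, 10, 5, 22, 14, 12]
ys = [39, 9, 9, 7, 8, 35, 36, 22]

n = len(xs)
x_bar = sum(xs) / n      # 13.0
y_bar = sum(ys) / n      # 20.625
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
a = s_xy / s_xx          # slope
b = y_bar - a * x_bar    # intercept
print(round(s_xy, 1), round(s_xx, 1), round(a, 4), round(b, 4))
# → 412.0 198.0 2.0808 -6.4255
```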

Question 2: Nonpolynomial Example

We want to fit the following data $(x_i, y_i)$, where $1 \le i \le 10$, by a nonpolynomial function of a given form.

x 0.23 0.66 0.93 1.25 1.75 2.03 2.24 2.57 2.87 2.98
y 0.25       0.28 0.13   0.26 0.58 1.03

Write a Matlab program that uses the Least-Squares Method to obtain the estimated function. Hint: input the data in matrix form and solve the system to obtain the coefficients.
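The exact basis functions for this question are not reproduced above, so the following Python sketch illustrates the recipe on a hypothetical basis {1, x, x²} with made-up data instead; a Matlab solution would build the same design matrix and solve the system, e.g. with the backslash operator.

```python
# Sketch: general linear least squares for an arbitrary set of basis
# functions, solved via the normal equations. Basis and data are
# hypothetical, chosen so the exact answer is known.
def lstsq_fit(basis, xs, ys):
    """Solve the normal equations [X]^T[X]{A} = [X]^T{Y} by Gaussian elimination."""
    m = len(basis)
    n = len(xs)
    X = [[f(x) for f in basis] for x in xs]   # design matrix
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(m)]
         for i in range(m)]
    rhs = [sum(X[k][i] * ys[k] for k in range(n)) for i in range(m)]
    for col in range(m):                      # forward elimination
        for row in range(col + 1, m):
            factor = A[row][col] / A[col][col]
            for j in range(col, m):
                A[row][j] -= factor * A[col][j]
            rhs[row] -= factor * rhs[col]
    coeffs = [0.0] * m                        # back substitution
    for i in range(m - 1, -1, -1):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (rhs[i] - s) / A[i][i]
    return coeffs

basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]  # hypothetical basis
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]                       # exactly y = x^2 + 1
coeffs = lstsq_fit(basis, xs, ys)
print([round(c, 6) for c in coeffs])  # → [1.0, 0.0, 1.0]
```

Swapping in the actual basis functions and the data from the table above yields the coefficients the question asks for.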


The blue dots are the data; the green dots are the estimated nonpolynomial function.

References

[1] Cheney, Ward and Kincaid, David. Numerical Mathematics and Computing Fifth Edition. Belmont: Thomson Learning, 2004

[2] Chapra, Steven and Canale, Raymond. Numerical Method for Engineers: With Software and Programming Applications Fourth Edition. McGraw-Hill, 2005

[3] Keller, Gerald. Statistics for Management and Economics Seventh Edition. Thomson Higher Education, 2005

External Links

[1] Least Squares Fitting at MathWorld

[2] Least Squares at Wikipedia
