
This chapter is about Gaussian elimination, which is a method for solving systems of linear equations. Such systems are often encountered when dealing with real problems, such as the following computer vision problem: given a number of images of an object, calculate a 3D model of the object. Figure 5.1 shows four images of Örebro Castle. Using those, and 667 other similar images, it is possible to calculate the point cloud shown in Figure 5.2. These calculations include solving many systems of equations, and Gaussian elimination is a very good way to learn about such systems.

One way to familiarize oneself with systems of equations is to investigate what happens when two lines cross.

To describe a line in the two-dimensional plane, one of the first notations that many people encounter is

\begin{equation} y = kx + m. \end{equation} | (5.1) |

Another common way to describe a line is the implicit form

\begin{equation} ax + by + c = 0. \end{equation} | (5.2) |

Two such line equations together form a system of linear equations, for example

\begin{equation} \begin{cases} 2 x + 3 y = -5,\\ 5 x + 2 y = 4. \end{cases} \end{equation} | (5.3) |

Interactive Illustration 5.4 shows how the point of intersection between two lines, i.e., the solution to systems of equations such as (5.3), changes interactively when the two line equations are changed.

This chapter presents a systematic way, called Gaussian elimination, of solving such systems of linear equations.

Example 5.1:
Video Compression

Each pixel in a video frame is displayed on a computer screen by lighting up three subpixels, one for red, one for green, and one for blue. In the top row of Figure 5.5, these color components are shown independently to the right. As can be seen, these color channels (RGB) are quite similar to each other, which is due to the fact that all three are affected by luminance, i.e., the overall brightness per pixel. Stating the same thing three times is inefficient, and therefore video coding schemes such as H.264 do not store the pixels as $(R, G, B)$. Instead $(Y, Cb, Cr)$ is stored, where

\begin{equation} \begin{cases} Y = 0.299 R + 0.587 G + 0.114 B,\\ Cb = -0.169 R - 0.331 G + 0.500 B + 128,\\ Cr = 0.500 R - 0.419 G - 0.081 B + 128. \end{cases} \end{equation} | (5.4) |

As can be seen in the lower row of Figure 5.5, the $Y$, $Cb$, and $Cr$ channels are now less similar to each other. The last two channels also seem more washed out, and it turns out that this means fewer bits can be used to represent the image. On top of this, the human visual system is much more sensitive to errors in luminance ($Y$) than in the two other channels. This is also exploited by representing the chrominance channels less accurately, which results in even fewer bits.

During decompression, we need to get back from $(Y, Cb, Cr)$ to $(R, G, B)$ in order to display the pixel on the screen. Assume we have computed that the color in the $(Y, Cb, Cr)$-space for a pixel is $Y=180$, $Cb=80$, and $Cr=150$. What $(R, G, B)$ color does that represent? The answer can be obtained by inserting the values for $Y$, $Cb$, and $Cr$ in the equations above and solving the resulting linear system of equations, i.e.,

\begin{equation} \begin{cases} 0.299 R + 0.587 G + 0.114 B = 180,\\ -0.169 R - 0.331 G + 0.500 B = -48,\\ 0.500 R - 0.419 G - 0.081 B = 22. \end{cases} \end{equation} | (5.5) |
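As a check, the system (5.5) can be solved numerically. The sketch below is plain Python; the `gauss_solve` helper is written just for this illustration (it is not part of the text) and performs Gaussian elimination with partial pivoting.

```python
def gauss_solve(A, b):
    """Solve A x = b using Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # choose a pivot row
        M[c], M[p] = M[p], M[c]                           # rule 1: swap rows
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):                     # rule 3: subtract a multiple
                M[r][k] -= f * M[c][k]                    # of the pivot row
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                        # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# The coefficients of (5.5); the right-hand side is Y, Cb - 128, Cr - 128.
A = [[ 0.299,  0.587,  0.114],
     [-0.169, -0.331,  0.500],
     [ 0.500, -0.419, -0.081]]
b = [180, 80 - 128, 150 - 128]
R, G, B = gauss_solve(A, b)
print(round(R), round(G), round(B))  # approximately 211 181 95
```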

Example 5.2:
Electronic Circuit

In the electronic circuit below, we would like to calculate the currents $I_1$, $I_2$, and $I_3$. Kirchhoff's voltage law states that the sum of the voltages equals zero when going around a closed loop in the circuit. If we apply this to the left part of the circuit we get $5\,\mathrm{V} - I_2 \cdot 1\,\mathrm{k}\Omega - I_2 \cdot 2\,\mathrm{k}\Omega = 0$. If we apply it to the outer loop of the circuit we get $5\,\mathrm{V} - I_3 \cdot 4\,\mathrm{k}\Omega = 0$. Finally, Kirchhoff's current law can be used at the red point in the circuit to get a third equation, $I_1 = I_2 + I_3$. In summary, this yields the following system of equations

\begin{equation} \begin{cases} 1000 I_2 + 2000 I_2 = 5,\\ 4000 I_3 = 5,\\ I_1 - I_2 - I_3 = 0, \end{cases} \end{equation} | (5.6) |

where the units have been removed. This can be solved to recover $I_1$, $I_2$, and $I_3$. Solving such systems of equations is the topic of the rest of this chapter.
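Because the first two equations each contain a single unknown, this particular system can be solved by direct substitution. A minimal Python sketch (written for this illustration):

```python
# The system (5.6): 1000*I2 + 2000*I2 = 5,  4000*I3 = 5,  I1 - I2 - I3 = 0.
# The first two equations each contain one unknown, so substitution suffices.
I2 = 5 / (1000 + 2000)   # left loop: both resistors carry I2
I3 = 5 / 4000            # outer loop
I1 = I2 + I3             # current law at the junction
print(I1, I2, I3)        # currents in amperes
```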

Example 5.3:
Fitting a Circle to Three Points

There are some problems that, although non-linear, can be changed into a system of linear equations after a change of variables. Here we illustrate this with the problem of fitting a circle to three points. A circle is given by its center point $(u,v)$ and its radius $r$. A point $(x_1,y_1)$ lies on this circle if

\begin{equation} (x_1-u)^2 + (y_1-v)^2 = r^2 . \end{equation} | (5.7) |

Since there are $3$ unknowns ($u$, $v$, and $r$), one might expect that we need $3$ equations. So maybe three points ($(x_1,y_1)$, $(x_2,y_2)$, and $(x_3,y_3)$) are needed to specify a circle. What is the circle that goes through the three points $(-2,3)$, $(-1,1)$, and $(-1,2)$? This gives a system of 3 non-linear equations

\begin{equation} \begin{cases} (-2-u)^2 + (3-v)^2 = r^2 ,\\ (-1-u)^2 + (1-v)^2 = r^2 ,\\ (-1-u)^2 + (2-v)^2 = r^2 . \end{cases} \end{equation} | (5.8) |

By expanding the squares in the equations we get

\begin{equation} \begin{cases} 4+4u+u^2 + 9-6v+v^2 = r^2 ,\\ 1+2u+u^2 + 1-2v+v^2 = r^2 ,\\ 1+2u+u^2 + 4-4v+v^2 = r^2 . \end{cases} \end{equation} | (5.9) |

After rearrangement of the terms we have

\begin{equation} \begin{cases} (r^2 - u^2 - v^2) -4u +6v = 13 ,\\ (r^2 - u^2 - v^2) -2u +2v = 2 ,\\ (r^2 - u^2 - v^2) -2u +4v = 5 . \end{cases} \end{equation} | (5.10) |

By introducing $w = r^2-u^2-v^2$ we obtain a system of linear equations

\begin{equation} \begin{cases} w -4u +6v = 13 ,\\ w -2u +2v = 2 ,\\ w -2u +4v = 5 . \end{cases} \end{equation} | (5.11) |

Again, note that this seemingly non-linear problem could be converted into a system of linear equations. When this has been solved, we obtain values for $w$, $u$, and $v$, which can then be used to find out what the radius is, using $w = r^2-u^2-v^2$, i.e., $r^2 = w+u^2+v^2$.

The example above is shown in Interactive Illustration 5.7.
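The linear system (5.11) can be solved numerically, after which the radius follows from $r^2 = w + u^2 + v^2$. The sketch below is plain Python; the `gauss_solve` helper is written just for this illustration.

```python
def gauss_solve(A, b):
    """Solve A x = b using Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# System (5.11) in the unknowns (w, u, v).
w, u, v = gauss_solve([[1, -4, 6],
                       [1, -2, 2],
                       [1, -2, 4]], [13, 2, 5])
r = (w + u * u + v * v) ** 0.5  # r^2 = w + u^2 + v^2
print((u, v), r)  # center approximately (-2.5, 1.5), radius approximately 1.58
```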

Example 5.4:
A 4000-Year-Old Babylonian Problem

Linear systems of equations were described on Babylonian clay tablets. One such problem is the following:

"There are two fields whose total area is 1800 square yards. One produces grain at the rate of 2/3 of a bushel per square yard while the other produces grain at the rate of 1/2 a bushel per square yard. If the total yield is 1100 bushels, what is the size of each field?"

Introduce a variable for the size of each field, i.e., $x_1$ square yards for the first and $x_2$ square yards for the second. Then the problem can be reformulated as

\begin{equation} \begin{cases} \frac{2}{3} x_1 + \frac{1}{2} x_2 = 1100, \\ x_1 + x_2 = 1800. \end{cases} \end{equation} | (5.12) |

This can be solved to recover $x_1$ and $x_2$. Solving such systems of equations is the topic of the rest of this chapter.
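The system (5.12) is small enough to solve by substitution; the sketch below (plain Python, written for this illustration) eliminates $x_2$ using the second equation.

```python
# From the second equation of (5.12): x2 = 1800 - x1.
# Substituting into the first: (2/3 - 1/2) * x1 = 1100 - 1800/2.
x1 = (1100 - 1800 / 2) / (2 / 3 - 1 / 2)
x2 = 1800 - x1
print(x1, x2)  # approximately 1200 and 600 square yards
```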

There are many possible ways to solve systems of equations, and you have most likely studied some of them before. However, what Gaussian elimination provides is a way that works for any number of equations and unknowns, and that can be carried out in a completely systematic manner.

$\begin{cases} 6x + y - 4z = 4 \\ 2y + 3z = -5 \\ 11z = -33 \end{cases}$

Such a triangular system is easy to solve using back substitution: the last equation gives $z = -3$, inserting this into the second equation gives $y = 2$, and the first equation then gives $x = -\frac{5}{3}$.

The goal of Gaussian elimination is to start with any linear system of equations, and then transform it into a triangular system. This is done using only three simple rules, as shown in the following theorem.

Theorem 5.1:
Gaussian Elimination Rules

The solutions to a linear system of equations do not change if we

- swap the order of two rows,
- multiply a row with a constant $\neq 0$, or
- add a multiple of another row to a row.

At this point, we defer the proof of this theorem until Section 5.7.1, where we also repeat the theorem for the sake of completeness.

As an example, assume we have the following system of equations,

\begin{equation} \begin{cases} -x + y = 8,\\ 2 x + 4 y = -10. \end{cases} \end{equation} | (5.13) |

Adding $2$ times the first equation to the second eliminates $x$ and gives

\begin{equation} \begin{cases} -x + y = 8,\\ 6 y = 6, \end{cases} \end{equation} | (5.14) |

from which we see that $y = 1$, and the first equation then gives $x = y - 8 = -7$.
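The effect of the three rules can also be checked numerically. The following sketch (plain Python; `solve2` is a small helper written only for this illustration) solves (5.13) directly and then confirms that applying each rule leaves the solution $(x, y) = (-7, 1)$ unchanged.

```python
def solve2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by elimination.

    Assumes a1 != 0 and that the system has exactly one solution."""
    f = a2 / a1
    b2p, c2p = b2 - f * b1, c2 - f * c1  # eliminate x from the second row
    y = c2p / b2p
    x = (c1 - b1 * y) / a1
    return x, y

# System (5.13): -x + y = 8,  2x + 4y = -10.
base = solve2(-1, 1, 8, 2, 4, -10)

# Rule 1: swap the two rows.
swapped = solve2(2, 4, -10, -1, 1, 8)
# Rule 2: multiply the first row by a non-zero constant, here 3.
scaled = solve2(-3, 3, 24, 2, 4, -10)
# Rule 3: add 2 times the first row to the second (the step giving 6y = 6).
added = solve2(-1, 1, 8, 0, 6, 6)

print(base, swapped, scaled, added)  # all approximately (-7, 1)
```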

The example above was easy to solve since we could use an integer multiple. In general, this is not always possible. Consider for instance,

\begin{equation} \begin{cases} 2 x + 3 y = -5, \\ 5 x + 2 y = 4. \end{cases} \end{equation} | (5.15) |

To eliminate $x$ from (5.15), multiply the first equation by $5$ and the second equation by $2$, so that $x$ gets the same coefficient in both:

$\Big\{\begin{array}{l} 5\,(2 x + 3 y) = 5\,(-5), \\ 2\,(5 x + 2 y) = 2\,(4). \end{array}$

Subtracting the second equation from the first then eliminates $x$:

$(5 \cdot 2 - 2 \cdot 5)\, x + (5 \cdot 3 - 2 \cdot 2)\, y = 5\,(-5) - 2\,(4) \Leftrightarrow 11 y = -33.$

Keeping the first equation unchanged, the system becomes

$\Big\{\begin{array}{l} 2 x + 3 y = -5, \\ 11 y = -33. \end{array}$

The last equation gives $y = -3$, and inserting this into the first equation gives $2x - 9 = -5$, i.e., $x = 2$.

Sometimes it is not possible to eliminate the first variable in the lower rows, since it is not present in the first row. An example is

\begin{equation} \begin{cases} 3 x_2 - 6 x_3 = -21, \\ 2 x_1 + 4 x_2 - 2 x_3 = 16, \\ -x_1 - 7 x_2 + 2 x_3 = -27. \end{cases} \end{equation} | (5.16) |

Using the first rule, we can reorder the equations so that an equation containing $x_1$ comes first,

\begin{equation} \begin{cases} 2 x_1 + 4 x_2 - 2 x_3 = 16, \\ -x_1 - 7 x_2 + 2 x_3 = -27, \\ 3 x_2 - 6 x_3 = -21. \end{cases} \end{equation} | (5.17) |

Adding the first equation to $2$ times the second eliminates $x_1$ from the second equation,

\begin{equation} \begin{cases} 2 x_1 + 4 x_2 - 2 x_3 = 16, \\ -10 x_2 + 2 x_3 = -38, \\ 3 x_2 - 6 x_3 = -21. \end{cases} \end{equation} | (5.18) |

and adding $3$ times the second equation to $10$ times the third eliminates $x_2$ from the third equation, making the system triangular,

\begin{equation} \begin{cases} 2 x_1 + 4 x_2 - 2 x_3 = 16, \\ -10 x_2 + 2 x_3 = -38, \\ -54 x_3 = -324. \end{cases} \end{equation} | (5.19) |

Back substitution now gives $x_3 = 6$, then $-10 x_2 + 2 \cdot 6 = -38$, i.e., $x_2 = 5$, and finally $2 x_1 + 4 \cdot 5 - 2 \cdot 6 = 16$, i.e., $x_1 = 4$.
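These elimination and back-substitution steps can be mirrored in code. The sketch below (plain Python; `gauss_solve` is a helper written for this illustration) uses partial pivoting, a common pivot-selection strategy that also handles a zero leading coefficient, and recovers $(x_1, x_2, x_3) = (4, 5, 6)$.

```python
def gauss_solve(A, b):
    """Solve A x = b using Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivoting also handles
        M[c], M[p] = M[p], M[c]                           # a zero leading coefficient
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# The system (5.16); note the zero coefficient for x1 in the third row.
x = gauss_solve([[ 2,  4, -2],
                 [-1, -7,  2],
                 [ 0,  3, -6]], [16, -27, -21])
print(x)  # approximately [4.0, 5.0, 6.0]
```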

So far, we have only encountered systems of equations with exactly one solution. This is not always the case, as is demonstrated by the following system,

\begin{equation} \begin{cases} 5 x + 2 y = 4,\\ 10 x + 4 y = 13. \end{cases} \end{equation} | (5.20) |

Adding $-2$ times the first equation to the second gives

\begin{equation} \begin{cases} 5 x + 2 y = 4,\\ 0 = 5. \end{cases} \end{equation} | (5.21) |

The last equation, $0 = 5$, can never be satisfied, so the system has no solution. Another way to see this is to multiply the first equation by $2$,

\begin{equation} \begin{cases} 10 x + 4 y = 8,\\ 10 x + 4 y = 13. \end{cases} \end{equation} | (5.22) |

The left-hand sides are now identical, but the right-hand sides differ, so the two equations describe two parallel lines that never intersect.

Having no solution is not the only special case. As an example, assume we have the following system of equations,

\begin{equation} \begin{cases} -x + y = 8,\\ 3 x - 3 y = -24. \end{cases} \end{equation} | (5.23) |

Adding $3$ times the first equation to the second gives

\begin{equation} \begin{cases} -x + y = 8,\\ 0 = 0. \end{cases} \end{equation} | (5.24) |

The equation $0 = 0$ is always true and carries no information, so only one equation remains for the two unknowns. One of the variables can then be chosen freely; setting $y = t$ gives

\begin{equation} \begin{cases} -x + t = 8,\\ y = t, \end{cases} \end{equation} | (5.25) |

i.e., for every value of the parameter $t$ there is a solution

\begin{equation} \begin{cases} x = -8 + t,\\ y = 0 + t. \end{cases} \end{equation} | (5.26) |

The solutions thus form a line $P(t) = S + t\,\mathrm{\bf d}$ through the point $S = \begin{pmatrix} -8\\ 0 \end{pmatrix}$ with direction $\mathrm{\bf d} = \begin{pmatrix} 1\\ 1 \end{pmatrix}$. Both of the original equations, $-x + y = 8$ and $3x - 3y = -24$, describe this very line.

Another way to look at this is to go back to the original system of equations,

\begin{equation} \begin{cases} -x + y = 8,\\ 3 x - 3 y = -24. \end{cases} \end{equation} | (5.27) |

Dividing the second equation by $-3$ shows that the two equations describe the same line,

\begin{equation} \begin{cases} -x + y = 8,\\ -x + y = 8. \end{cases} \end{equation} | (5.28) |

Now, consider a system with three equations and three unknowns, such as

\begin{equation} \begin{cases} 2 x + 4 y - 2 z = 16, \\ -x - 7 y + 2 z = -27, \\ 3 y - 6 z = -21. \end{cases} \end{equation} | (5.29) |

In other cases, two of the equations represent the same plane. This may be because they have the same coefficients, or because one equation equals the other multiplied by a constant. Then there is an entire plane that satisfies these two equations, as illustrated by the dashed plane in the third step of Interactive Illustration 5.12. The solution to the entire system is then the intersection between this (double) plane and the third plane. This amounts to an entire line of solutions, as illustrated by the solid black line in the figure.

The last case when the system has a solution is when all three equations represent the same plane, such as the system

\begin{equation} \begin{cases} x + y + z = 1,\\ 2 x + 2 y + 2 z = 2,\\ 3 x + 3 y + 3 z = 3. \end{cases} \end{equation} | (5.30) |

For a system of two equations and two unknowns not to have a solution, the two lines representing the two equations must be parallel, as was shown in Interactive Illustration 5.10. In three dimensions, however, the planes need not be parallel for the system to lack a solution. Consider, for instance, the following system of equations.

\begin{equation} \begin{cases} 3 \sqrt{6}\, x + 3 y + 3 z = \sqrt{6},\\ -3 \sqrt{6}\, x + 3 y + 3 z = \sqrt{6},\\ -6 y - 6 z = \sqrt{6}. \end{cases} \end{equation} | (5.31) |

Adding the first equation to the second gives

\begin{equation} \begin{cases} 3 \sqrt{6}\, x + 3 y + 3 z = \sqrt{6},\\ 6 y + 6 z = 2\sqrt{6},\\ -6 y - 6 z = \sqrt{6}, \end{cases} \end{equation} | (5.32) |

and then adding the second equation to the third gives

\begin{equation} \begin{cases} 3 \sqrt{6}\, x + 3 y + 3 z = \sqrt{6},\\ 6 y + 6 z = 2\sqrt{6},\\ 0 = 3\sqrt{6}. \end{cases} \end{equation} | (5.33) |

The last equation can never be satisfied, so the system has no solution, even though no two of the three planes are parallel.

A special case of this is when two equations represent the same plane, but the third equation represents a different plane parallel to, but not coinciding with, these. This example is shown in the third step of Figure 5.13, where the dashed plane satisfies two of the equations and the other plane satisfies the third equation. Again, no solution to this system of equations exists.

Finally, the last case when no solution exists is when two of the planes are parallel to each other (and different from each other). The third plane then intersects these two parallel planes along two lines (dashed in step 4 in Interactive Illustration 5.13), but there is not a single point that belongs to all three planes, and hence there exists no solution to the system of equations.

In summary, we can see that a system of linear equations has either

- exactly one solution,
- no solution, or
- an infinite number of solutions.

A special case is when the right-hand sides of all the equations are zero, as in the following system.

\begin{equation} \begin{cases} 2 x_1 + 5 x_2 + 3 x_3 = 0, \\ 4 x_1 + 2 x_2 + x_3 = 0, \\ 2 x_1 - 3 x_2 - 2 x_3 = 0. \end{cases} \end{equation} | (5.34) |

Eliminating $x_1$ from the second and third equations (and changing their signs) gives

\begin{equation} \begin{cases} 2 x_1 + 5 x_2 + 3 x_3 = 0, \\ 8 x_2 + 5 x_3 = 0, \\ 8 x_2 + 5 x_3 = 0. \end{cases} \end{equation} | (5.35) |

Subtracting the second equation from the third gives

\begin{equation} \begin{cases} 2 x_1 + 5 x_2 + 3 x_3 = 0, \\ 8 x_2 + 5 x_3 = 0, \\ 0 = 0. \end{cases} \end{equation} | (5.36) |

The equation $0 = 0$ carries no information, so one of the variables can be chosen freely. Setting $x_3 = t$ gives

\begin{equation} \begin{cases} 2 x_1 + 5 x_2 + 3 t = 0, \\ 8 x_2 + 5 t = 0, \\ x_3 = t, \end{cases} \end{equation} | (5.37) |

and back substitution yields

\begin{equation} \begin{cases} x_1 = \frac{1}{16} t, \\ x_2 = -\frac{5}{8} t, \\ x_3 = t, \end{cases} \end{equation} | (5.38) |

so there is an infinite number of solutions. In particular, setting $t = 0$ gives $x_1 = x_2 = x_3 = 0$, which always satisfies a system where all right-hand sides are zero.
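That (5.38) really describes a solution for every value of $t$ can be verified numerically; the short Python sketch below (written for this illustration) inserts the parametric solution into the original system (5.34) and checks that all residuals vanish.

```python
# The solution family of (5.34): x1 = t/16, x2 = -5t/8, x3 = t.
rows = [(2, 5, 3), (4, 2, 1), (2, -3, -2)]  # coefficients of (5.34)
for t in (-3.0, 0.0, 1.0, 16.0):
    x1, x2, x3 = t / 16, -5 * t / 8, t
    for a1, a2, a3 in rows:
        residual = a1 * x1 + a2 * x2 + a3 * x3  # right-hand sides are all zero
        assert abs(residual) < 1e-12
print("every t gives a valid solution")
```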

As we have seen in Section 3.6, planes and lines can be written in both implicit and explicit form. Using Gaussian elimination, it is possible to convert between these two types. Recall that $P(t_1, t_2) = S + t_1 \vc{d}_1 + t_2 \vc{d}_2$ is a plane in explicit form, while $\vc{n}\cdot(P-S)=0$ is a plane in implicit form. Assume we have the following plane in explicit form,

\begin{equation} \begin{cases} x = \frac{3}{2} t_1, \\ y = t_1 + t_2, \\ z = 1 - t_1 + 2 t_2. \end{cases} \end{equation} | (5.39) |

To find the implicit form, we regard this as a system of equations in the unknowns $t_1$ and $t_2$,

\begin{equation} \begin{cases} \frac{3}{2} t_1 = x,\\ t_1 + t_2 = y,\\ -t_1 + 2 t_2 = z - 1. \end{cases} \end{equation} | (5.40) |

Reordering the equations gives

\begin{equation} \begin{cases} t_1 + t_2 = y,\\ -t_1 + 2 t_2 = z - 1,\\ \frac{3}{2} t_1 = x, \end{cases} \end{equation} | (5.41) |

and eliminating $t_1$ from the last two equations gives

\begin{equation} \begin{cases} t_1 + t_2 = y,\\ 3 t_2 = y + z - 1,\\ \frac{3}{2} t_2 = \frac{3}{2} y - x. \end{cases} \end{equation} | (5.42) |

Finally, subtracting $2$ times the third equation from the second produces an equation without $t_1$ and $t_2$,

\begin{equation} \begin{cases} t_1 + t_2 = y,\\ 3 t_2 = y + z - 1,\\ 0 = y + z - 1 - 2(\tfrac{3}{2} y - x). \end{cases} \end{equation} | (5.43) |

The last equation simplifies to $2x - 2y + z - 1 = 0$, which is the implicit form of the plane. To go in the other direction, from implicit form to explicit form, we can instead set $y = t_1$ and $z = t_2$,

\begin{equation} \begin{cases} 2 x - 2 t_1 + t_2 - 1 = 0,\\ y = t_1,\\ z = t_2, \end{cases} \end{equation} | (5.44) |

and solve the first equation for $x$, which gives the explicit form

\begin{equation} \begin{cases} x = \frac{1}{2} + t_1 - \frac{1}{2} t_2, \\ y = t_1, \\ z = t_2. \end{cases} \end{equation} | (5.45) |
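The normal $\vc{n}$ of the plane can also be obtained directly as the cross product $\vc{d}_1 \times \vc{d}_2$ of the two direction vectors, which gives an independent check of the implicit form above. A plain Python sketch (written for this illustration):

```python
# Explicit form (5.39): P(t1, t2) = S + t1*d1 + t2*d2.
S  = (0.0, 0.0, 1.0)
d1 = (1.5, 1.0, -1.0)
d2 = (0.0, 1.0, 2.0)

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

n = cross(d1, d2)  # (3.0, -3.0, 1.5), i.e., parallel to (2, -2, 1)

# Every point of the explicit form satisfies n . (P - S) = 0, and hence
# the implicit form 2x - 2y + z - 1 = 0.
for t1, t2 in ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-2.0, 3.5)):
    P = tuple(S[i] + t1 * d1[i] + t2 * d2[i] for i in range(3))
    assert abs(sum(n[i] * (P[i] - S[i]) for i in range(3))) < 1e-12
    assert abs(2 * P[0] - 2 * P[1] + P[2] - 1) < 1e-12
print("normal:", n)
```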

The sections above have given examples of Gaussian elimination along with geometrical interpretations where possible. In the following section, we will go through the theory of why Gaussian elimination works.

Let us revisit our example system of equations (5.16), that is,

\begin{equation} \begin{cases} 3 x_2 - 6 x_3 = -21, \\ 2 x_1 + 4 x_2 - 2 x_3 = 16, \\ -x_1 - 7 x_2 + 2 x_3 = -27. \end{cases} \end{equation} | (5.46) |

We found the solution $x_1 = 4$, $x_2 = 5$, $x_3 = 6$, which can be verified by inserting these values into the equations,

\begin{equation} \begin{cases} 3 \cdot 5 - 6 \cdot 6 = 15 - 36 = -21, \\ 2 \cdot 4 + 4 \cdot 5 - 2 \cdot 6 = 8 + 20 - 12 = 16, \\ -4 - 7 \cdot 5 + 2 \cdot 6 = -4 - 35 + 12 = -27. \end{cases} \end{equation} | (5.47) |

Here it is useful to introduce the notion of implication $\Rightarrow$ and equivalence $\Leftrightarrow$. We say that one system of equations implies a second system of equations if every solution to the first system is also a solution to the second system. When this works both ways we say that the two systems of equations are equivalent. Thus we would write

\begin{equation} \begin{cases} x_1 + 2 x_2 = 1, \\ 2 x_1 + 5 x_2 = 7, \end{cases} \Leftrightarrow \begin{cases} x_1 + 2 x_2 = 1, \\ x_2 = 5. \end{cases} \end{equation} | (5.48) |

Theorem 5.2:
Gaussian Elimination Rules

The solutions to a linear system of equations do not change if we

- swap the order of two equations,
- multiply an equation with a constant $\neq 0$, or
- add a multiple of another equation to an equation.

The first rule of Gaussian elimination in Theorem 5.2 says it is allowed to swap the order of two equations. It should be readily appreciated that simple reordering of the equations does not change the solution of the system of equations, i.e.,

\begin{equation} \begin{cases} a_{11} x + a_{12} y + a_{13} z = b_1 , \\ a_{21} x + a_{22} y + a_{23} z = b_2 , \end{cases} \Leftrightarrow \begin{cases} a_{21} x + a_{22} y + a_{23} z = b_2 , \\ a_{11} x + a_{12} y + a_{13} z = b_1 . \end{cases} \end{equation} | (5.49) |

The next rule allows us to multiply an equation with a constant different from $0$. From algebra, it should be clear that multiplying both sides of an equation by a non-zero constant does not change the solution to the equation, and hence does not affect the solution to the system of equations.

\begin{equation} \begin{cases} a_{11} x + a_{12} y + a_{13} z = b_1 , & (i) \\ a_{21} x + a_{22} y + a_{23} z = b_2 , & (ii) \end{cases} \Leftrightarrow \begin{cases} k a_{11} x + k a_{12} y + k a_{13} z = k b_1 , & (i') = k\,(i) \\ a_{21} x + a_{22} y + a_{23} z = b_2 . & (ii') = (ii) \end{cases} \end{equation} | (5.50) |

Multiplying by zero, on the other hand, can change the solution set. As an example, $(x, y) = (2, 3)$ does not satisfy the equation

\begin{equation} x + y = 1, \quad (i) \end{equation} | (5.51) |

but it does satisfy the equation obtained by multiplying both sides by zero,

\begin{equation} 0 \cdot (x + y) = 0 \cdot 1. \quad (i') = 0\,(i) \end{equation} | (5.52) |

Hence the requirement that the constant is non-zero.

The third rule allows us to add a multiple of another row to a row. Assume we have a solution $(x, y, z)$ that satisfies the system

\begin{equation} \begin{cases} a_{11} x + a_{12} y + a_{13} z = b_1, & (i) \\ a_{21} x + a_{22} y + a_{23} z = b_2, & (ii) \\ a_{31} x + a_{32} y + a_{33} z = b_3. & (iii) \end{cases} \end{equation} | (5.53) |

Adding $k$ times equation $(i)$ to equation $(ii)$ gives the system

\begin{equation} \begin{cases} a_{11} x + a_{12} y + a_{13} z = b_1, & (i') = (i) \\ (a_{21} + k a_{11}) x + (a_{22} + k a_{12}) y + (a_{23} + k a_{13}) z = b_2 + k b_1, & (ii') = (ii) + k\,(i) \\ a_{31} x + a_{32} y + a_{33} z = b_3. & (iii') = (iii) \end{cases} \end{equation} | (5.54) |

Since $(x, y, z)$ satisfies both $(i)$ and $(ii)$, it also satisfies the combination $(ii')$. Conversely, $(ii)$ can be recovered as $(ii') - k\,(i')$, so any solution to the new system is also a solution to the original one.

$\square$

This chapter has given examples of systems of linear equations that have either no solution, exactly one solution, or an infinite number of solutions. This section will go through the general case and formally prove that these are the only possibilities.

Theorem 5.3:
Solutions to a System of Linear Equations

A system of linear equations has either

- no solution,
- exactly one solution, or
- an infinite number of solutions.

A general linear system looks like

\begin{equation} \begin{cases} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + \ldots + a_{1M} x_M = b_1,\\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + \ldots + a_{2M} x_M = b_2,\\ a_{31} x_1 + a_{32} x_2 + a_{33} x_3 + \ldots + a_{3M} x_M = b_3,\\ \vdots\\ a_{N1} x_1 + a_{N2} x_2 + a_{N3} x_3 + \ldots + a_{NM} x_M = b_N. \end{cases} \end{equation} | (5.55) |

If all of the coefficients $a_{11}, a_{21}, \ldots, a_{N1}$ are zero, the variable $x_1$ does not appear in the system at all. Otherwise, we can reorder the equations so that $a_{11} \neq 0$, and then eliminate $x_1$ from all but the first equation by adding suitable multiples of the first equation, giving

\begin{equation} \begin{cases} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + \ldots + a_{1M} x_M = b_1,\\ a'_{22} x_2 + a'_{23} x_3 + \ldots + a'_{2M} x_M = b'_2,\\ a'_{32} x_2 + a'_{33} x_3 + \ldots + a'_{3M} x_M = b'_3,\\ \vdots\\ a'_{N2} x_2 + a'_{N3} x_3 + \ldots + a'_{NM} x_M = b'_N. \end{cases} \end{equation} | (5.56) |

Now the same procedure can be applied to the smaller system of primed constants, namely, all rows in (5.56) that do not involve $x_1$. Again there are two cases. Either all of these equations have a coefficient of zero for the variable $x_2$ that we are trying to eliminate, or there is at least one equation where there is a non-zero coefficient for $x_2$. After reordering and elimination, either all equations have zero coefficient for the variable $x_2$, or there is exactly one equation where there is a non-zero coefficient for $x_2$.

This process is continued until there are no more variables left to eliminate or no more equations. In the end, there might be equations left of the type $0 = 0$. These can simply be removed. There are three possible outcomes.

The first case is if there is at least one equation of type $0 = \hat{b}_{i} \neq 0$, in which case there is no solution to the system. An example of this is shown in the Systems of Equations (5.33). For the continued discussion we assume that this is not the case.

The second case is when the resulting system is triangular, i.e.,

\begin{equation} \begin{cases} \hat{a}_{11} x_1 + \hat{a}_{12} x_2 + \hat{a}_{13} x_3 + \ldots + \hat{a}_{1M} x_M = \hat{b}_1,\\ \hat{a}_{22} x_2 + \hat{a}_{23} x_3 + \ldots + \hat{a}_{2M} x_M = \hat{b}_2,\\ \hat{a}_{33} x_3 + \ldots + \hat{a}_{3M} x_M = \hat{b}_3,\\ \vdots\\ \hat{a}_{MM} x_M = \hat{b}_M, \end{cases} \end{equation} | (5.57) |

where the diagonal coefficients $\hat{a}_{ii}$ are non-zero. Then $x_M$ is determined by the last equation, $x_{M-1}$ by the second to last, and so on, so the system has exactly one solution.

The third possibility is that there are fewer equations than unknowns after elimination, e.g.,

\begin{equation} \begin{cases} \hat{a}_{11} x_1 + \hat{a}_{12} x_2 + \hat{a}_{13} x_3 + \hat{a}_{14} x_4 + \hat{a}_{15} x_5 = \hat{b}_1,\\ \hat{a}_{23} x_3 + \hat{a}_{24} x_4 + \hat{a}_{25} x_5 = \hat{b}_2,\\ \hat{a}_{34} x_4 + \hat{a}_{35} x_5 = \hat{b}_3. \end{cases} \end{equation} | (5.58) |

Here the variables that do not start any row, in this example $x_2$ and $x_5$, can be chosen freely, say $x_2 = t_1$ and $x_5 = t_2$. The last equation then gives

\begin{equation} x_4 = \frac{1}{\hat{a}_{34}}\left(\hat{b}_3 - \hat{a}_{35} t_{2}\right), \end{equation} | (5.59) |

after which $x_3$ and $x_1$ follow by back substitution. Since $t_1$ and $t_2$ can be chosen arbitrarily, the system has an infinite number of solutions.

Thus, the system has either no solution, one solution, or an infinite number of solutions.

$\square$

Note that in the case of an infinite number of solutions, it is also interesting to understand how many free variables there are in the solution set, or, phrased differently, what the dimension of the solution set is. This is captured by the concept of $\rank$, which will be explained in Chapter 8.

Example 5.5:
Vanishing Coefficients

As mentioned in the proof above, sometimes when performing Gaussian elimination, a certain variable may get a zero coefficient in all remaining equations. In this case one may simply need to skip one step, as is illustrated in this example,

We eliminate $x_1$ in the second line by adding $-2$ times the first equation to the second equation, and we eliminate $x_1$ in the third line by adding $-3$ times the first equation to the last equation. We then get

Normally after having eliminated $x_1$, the next step would
be to try to eliminate $x_2$. But in this case the
coefficient in front of $x_2$ is zero both for the second and third
line. We then simply skip this and go to the next variable
$x_3$. The same happens here, so we skip to the last variable,
$x_4$. Eliminating that just gives the equation $0=0$ so we can
rewrite the system as

From the last equation we can see that $x_4 = -1$. We then set $x_2 = s$ and $x_3 = t$ and can then insert these into the top equation,

from which we can gather that

In summary we therefore have


\begin{equation} \begin{cases} \begin{array}{rrrrrrrl} x_1 & \bfm + & \bfm x_2 & \bfm + & \bfm 2 x_3 & \bfm + & \bfm 3 x_4 & \bfm = 1, \\ 2 x_1 & \bfm + & \bfm 2 x_2 & \bfm + & \bfm 4 x_3 & \bfm + & \bfm 7 x_4 & \bfm = 1, \\ 3 x_1 & \bfm + & \bfm 3 x_2 & \bfm + & \bfm 6 x_3 & \bfm + & \bfm 10 x_4 & \bfm = 2. \\ \end{array} \end{cases} \end{equation} | (5.60) |

\begin{equation} \begin{cases} \begin{array}{rrrrrrrl} x_1 & \bfm + & \bfm x_2 & \bfm + & \bfm 2 x_3 & \bfm + & \bfm 3 x_4 & \bfm = 1, \\ & & & & & \bfm & \bfm x_4 & \bfm = -1, \\ & & & & & \bfm & \bfm x_4 & \bfm = -1. \\ \end{array} \end{cases} \end{equation} | (5.61) |

\begin{equation} \begin{cases} \begin{array}{rrrrrrrl} x_1 & \bfm + & \bfm x_2 & \bfm + & \bfm 2 x_3 & \bfm + & \bfm 3 x_4 & \bfm = 1, \\ & & & & & \bfm & \bfm x_4 & \bfm = -1. \\ \end{array} \end{cases} \end{equation} | (5.62) |

\begin{equation} x_1 + s + 2t + 3(-1) = 1, \end{equation} | (5.63) |

\begin{equation} x_1 = -s - 2t + 4. \end{equation} | (5.64) |

\begin{equation} \begin{cases} \begin{array}{rllll} x_1 =& \bstre - & \bs s -2 \bs & t \bs & +4,\\ x_2 =& \bstre & \bs s, \bs & \bs & \\ x_3 =& \bstre & \bs \bs & t, \bs & \\ x_4 =& \bstre & \bs \bs & \bs & -1.\\ \end{array} \end{cases} \end{equation} | (5.65) |
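As a sanity check, the parametric solution (5.65) can be verified by substituting it back into the original system (5.60). A small Python sketch (the variable and function names are ours):

```python
def residuals(s, t):
    """Evaluate the left-hand sides of system (5.60), minus the
    right-hand sides, at x1 = -s - 2t + 4, x2 = s, x3 = t, x4 = -1."""
    x1, x2, x3, x4 = -s - 2*t + 4, s, t, -1
    return (
        x1 + x2 + 2*x3 + 3*x4 - 1,       # first equation
        2*x1 + 2*x2 + 4*x3 + 7*x4 - 1,   # second equation
        3*x1 + 3*x2 + 6*x3 + 10*x4 - 2,  # third equation
    )

# Every choice of the free parameters s and t solves the system.
for s, t in [(0, 0), (1, 0), (0, 1), (-2.5, 3)]:
    assert residuals(s, t) == (0, 0, 0)
```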

In this section, we will introduce linear dependence and linear independence, which are two important concepts in linear algebra. However, we start with

Definition 5.1:
Linear Combination

The vector, $\vc{u}$, is a linear combination of the vectors, $\vc{v}_1,\dots,\vc{v}_n$ when $\vc{u}$ is expressed as

where $k_1, k_2, \dots, k_n$ are scalar values.

For the column vector notation
(Definition 2.5), we saw, for example,
that $\vc{w}= w_x \vc{e}_1+w_y \vc{e}_2+w_z \vc{e}_3$. That is, the
three-dimensional vector, $\vc{w}$, is a linear combination of the
basis vectors, $\vc{e}_1$, $\vc{e}_2$, and $\vc{e}_3$, and the scalar
values are $w_x$, $w_y$, and $w_z$.

\begin{equation} \vc{u} = k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_n \vc{v}_n = \sum_{i=1}^{n} k_i \vc{v}_i, \end{equation} | (5.66) |

Using the definition of a linear combination, we will now look into sets of vectors that are linearly dependent and linearly independent.

Definition 5.2:
Linear Independence and Dependence

The set of vectors, $\vc{v}_1,\dots,\vc{v}_n$, is said to be linearly independent, if the equation

*only* has a single solution which is

If there is at least one other solution, then the set of vectors is linearly dependent.


\begin{equation} k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_n \vc{v}_n = \vc{0}, \end{equation} | (5.67) |

\begin{equation} k_1 = k_2 = \dots = k_n =0. \end{equation} | (5.68) |
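Definition 5.2 can also be tested numerically: the homogeneous equation (5.67) has only the trivial solution exactly when the matrix whose columns are the vectors has full column rank. The sketch below uses NumPy's `matrix_rank`; the notion of rank itself is formalized only in Chapter 8, and the function name is ours.

```python
import numpy as np

def is_linearly_independent(vectors):
    """k1*v1 + ... + kn*vn = 0 has only the trivial solution exactly
    when the matrix with the vi as columns has full column rank."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == A.shape[1]

print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # True
print(is_linearly_independent([(1, 2, 3), (2, 4, 6)]))             # False
```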

Example 5.6:
Standard Basis

The standard basis in three dimensions consists of the basis vectors

To determine whether they are linearly independent or not, one can set
up Equation (5.67) for this situation,

Now, if we instead write this in column vector notation (Definition 2.5),
we get

So, we arrive at $(k_1, k_2, k_3) = (0,0,0)$, which proves that the standard basis
is linearly independent.


\begin{align} \vc{e}_x & = (1,0,0), \\ \vc{e}_y & = (0,1,0), \\ \vc{e}_z & = (0,0,1). \end{align} | (5.69) |

\begin{equation} k_1\vc{e}_x + k_2\vc{e}_y + k_3\vc{e}_z = \vc{0}. \end{equation} | (5.70) |

\begin{align} \vc{0} &= k_1\vc{e}_x + k_2\vc{e}_y + k_3\vc{e}_z = k_1 \begin{pmatrix}1\\ 0 \\ 0\end{pmatrix} + k_2 \begin{pmatrix}0\\ 1 \\ 0\end{pmatrix} + k_3 \begin{pmatrix}0\\ 0 \\ 1\end{pmatrix} \\ &= \begin{pmatrix}k_1\\ 0 \\ 0\end{pmatrix} + \begin{pmatrix}0\\ k_2 \\ 0\end{pmatrix} + \begin{pmatrix}0\\ 0 \\ k_3\end{pmatrix} = \begin{pmatrix}k_1\\ k_2 \\ k_3\end{pmatrix}. \end{align} | (5.71) |

Theorem 5.4:
Linear combination and linear dependence

One of the vectors $\vc{v}_1, \vc{v}_2, \dots, \vc{v}_n$ can be written as a linear combination of the others *if and only if* the vectors $\vc{v}_1, \vc{v}_2, \dots, \vc{v}_n$ are linearly dependent.


We know from the definition of linear independence (Definition 5.2) that if the vectors are linearly dependent, then the equation

\begin{equation} k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_n \vc{v}_n = \vc{0} \end{equation} | (5.72) |
has at least one solution where not all of the coefficients $k_1, k_2, \dots, k_n$ are zero. After renumbering the vectors, we may assume that $k_n \neq 0$. Moving the last term to the right-hand side then gives

\begin{equation} k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_{n-1} \vc{v}_{n-1} = - k_n \vc{v}_n, \end{equation} | (5.73) |
Dividing both sides by $-k_n$, which is possible since $k_n \neq 0$, gives

\begin{equation} \vc{v}_n = -\frac{k_1}{k_n}\vc{v}_1 - \dots - \frac{k_{n-1}}{k_n} \vc{v}_{n-1}. \end{equation} | (5.74) |

The proof in the other direction is simpler. Assume that one of the vectors can be written as a linear combination of the others. After renumbering, assume it is vector $\vc{v}_n$. We then get

\begin{equation} \vc{v}_n = a_1\vc{v}_1 + a_2\vc{v}_2 + \dots + a_{n-1} \vc{v}_{n-1}. \end{equation} | (5.75) |

\begin{equation} \vc{0} = a_1\vc{v}_1 + a_2\vc{v}_2 + \dots + a_{n-1} \vc{v}_{n-1} - 1\vc{v}_n, \end{equation} | (5.76) |
which is an equation of the form

\begin{equation} k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_{n-1} \vc{v}_{n-1} - k_n\vc{v}_n = \vc{0}, \end{equation} | (5.77) |
where not all coefficients are zero (in particular, $k_n = 1$), so the vectors are linearly dependent according to Definition 5.2.

$\square$

Example 5.7:

The two vectors $\vc{u} = (1, 2, 3)$ and $\vc{v} = (2, 4, 6)$ are linearly dependent. This is true since $\vc{v}$ can be written as a linear combination of $\vc{u}$ according to $\vc{v} = 2\vc{u}$.


Example 5.8:

The three vectors $\vc{u} = (1, 2, 3)$, $\vc{v} = (2, 4, 6)$, and $\vc{w} = (1, 0, 7)$ are linearly dependent. This is true since $\vc{v}$ can be written as a linear combination of $\vc{u}$ and $\vc{w}$ according to $\vc{v} = 2\vc{u} + 0\vc{w}$. From this one can draw the conclusion that adding a vector to a set of vectors that are already linearly dependent can never make the new set linearly independent.


Example 5.9:

The single vector $\vc{v} = (0, 0, 0)$ is linearly dependent. This follows from Definition 5.2 since $k_1\vc{v} = \vc{0}$ for non-zero values of $k_1$, for instance $k_1 = 5$. For the same reason a set that consists of a single nonzero vector is linearly independent.


Example 5.10:

Examine if the three vectors $\vc{u} = (1, 2, 3)$, $\vc{v} = (2, 3, 4)$, $\vc{w} = (-1, 0, 1)$ are linearly independent.

There is no obvious way to see that the vectors are linearly dependent. For example, none of the vectors is equal to $(0, 0, 0)$, and no vector is a simple multiple of another vector. We go back to the definition, which states that the vectors are linearly independent if and only if

Equation (5.78) only has the solution $(k_1, k_2, k_3) = (0, 0, 0)$. Each of the three components now gives rise to an equation, resulting in the system

Eliminating $k_1$ from the second and third equation gives

Now if we try to eliminate $k_2$ from the third equation we get

The system of equations is underdetermined and there is an infinite number of solutions. Setting $k_3 = t$ and inserting in the second equation gives $k_2 = 2t$, and inserting these two values in the first equation gives

which can be simplified to $k_1 = -3t$. Since $t$ can be non-zero, we have that the three vectors are linearly dependent. As an example, using $t = 1$, we get $k_1 = -3$, $k_2 = 2$, and $k_3 = 1$, which can be inserted into Equation (5.78), yielding

and by moving the first two vectors over to the right-hand side we see that $\vc{w} = 3\vc{u} - 2\vc{v}$.

If we instead change the last vector to $\vc{w}' = (-1, 0, 2)$, we get the following system of equations

Eliminating $k_1$ gives

The last equation gives $k_3 = 0$, which inserted into the second equation gives $k_2 = 0$, and inserting these two values into the top equation gives $k_1=0$. We thus have that $(k_1, k_2, k_3) = (0, 0, 0)$ is the only solution and the vectors $\vc{u}$, $\vc{v}$, and $\vc{w}'$ are therefore linearly independent.


\begin{equation} k_1 \begin{pmatrix}1\\ 2 \\ 3\end{pmatrix} + k_2 \begin{pmatrix}2\\ 3 \\ 4\end{pmatrix} + k_3 \begin{pmatrix}-1\\ 0 \\ 1\end{pmatrix} = \begin{pmatrix}0\\ 0 \\ 0\end{pmatrix} \end{equation} | (5.78) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ k_3 \bsfem & = 0,\\ 2 k_1 \bsfem & + \ 3 k_2 \bsfem & & = 0,\\ 3 k_1 \bsfem & + \ 4 k_2 \bsfem & + \ k_3 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.79) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = 0,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 0,\\ & 2 k_2 \bsfem & - \ 4 k_3 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.80) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = 0,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 0,\\ & & 0 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.81) |

\begin{equation} k_1 + \ 2 \cdot 2 t - \ t = 0, \end{equation} | (5.82) |

\begin{equation} -3 \begin{pmatrix}1\\ 2 \\ 3\end{pmatrix} + 2 \begin{pmatrix}2\\ 3 \\ 4\end{pmatrix} + 1 \begin{pmatrix}-1\\ 0 \\ 1\end{pmatrix} = \begin{pmatrix}0\\ 0 \\ 0\end{pmatrix}, \end{equation} | (5.83) |


\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \hid{2}k_3 \bsfem & = 0,\\ 2 k_1 \bsfem & + \ 3 k_2 \bsfem & & = 0,\\ 3 k_1 \bsfem & + \ 4 k_2 \bsfem & + 2k_3 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.84) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = 0,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 0,\\ & 2 k_2 \bsfem & - \ 5 k_3 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.85) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = 0,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 0,\\ & & k_3 \bsfem & = 0.\\ \end{array} \end{cases} \end{equation} | (5.86) |
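The conclusions of Example 5.10 can be double-checked numerically. The sketch below uses NumPy's `matrix_rank` (rank is formalized in Chapter 8): the columns $\vc{u}$, $\vc{v}$, $\vc{w}$ give rank 2, i.e., a linearly dependent set, while replacing $\vc{w}$ with $\vc{w}'$ gives full rank.

```python
import numpy as np

# Columns of A are the vectors u, v, w from Example 5.10.
A = np.column_stack([(1, 2, 3), (2, 3, 4), (-1, 0, 1)])
print(np.linalg.matrix_rank(A))   # 2: only two independent columns

# Replacing w with w' = (-1, 0, 2) makes the set linearly independent.
B = np.column_stack([(1, 2, 3), (2, 3, 4), (-1, 0, 2)])
print(np.linalg.matrix_rank(B))   # 3: full rank
```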

Theorem 5.5:
The Basis Theorem

- Two vectors $\vc{u}$ and $\vc{v}$ in the plane constitute a basis if and only if they are linearly independent.
- Three vectors $\vc{u}$, $\vc{v}$, and $\vc{w}$ in 3D space constitute a basis if and only if they are linearly independent.
- Three or more vectors in the plane are always linearly dependent.
- Four or more vectors in 3D space are always linearly dependent.

The first two statements go in both directions, so we mark the proof in the forward direction with $\Rightarrow$ and in the backward direction with $\Leftarrow$.

1 $\Rightarrow$: Assume $\vc{u}$ and $\vc{v}$ are linearly independent. Theorem 5.4 then gives that $\vc{u}$ cannot be written as a linear combination of $\vc{v}$, i.e., $\vc{u} \neq \lambda \vc{v}$ and likewise $\vc{v} \neq \lambda \vc{u}$. This means that $\vc{u}$ and $\vc{v}$ are not parallel and Theorem 2.4 then gives that they constitute a basis in the plane.

1 $\Leftarrow$: Assume $\vc{u}$ and $\vc{v}$ constitute a basis in the plane. We will now prove that this means that $\vc{u}$ and $\vc{v}$ are linearly independent. Assume instead that they are linearly dependent, but still constitute a basis, i.e., the expression $x_u \vc{u} + x_v \vc{v}$ can reach every point in the plane. Since they are linearly dependent, Theorem 5.4 states that we can write $\vc{u} = \lambda \vc{v}$ or $\vc{v} = \lambda\vc{u}$. Assume further that we can write $\vc{u} = \lambda \vc{v}$ (this can always be arranged by switching the labels of $\vc{u}$ and $\vc{v}$). Now describe a vector $\vc{w}$ in this basis using $\vc{w} = x_u \vc{u} + x_v \vc{v}$. This equals $\vc{w} = x_u \lambda \vc{v} + x_v \vc{v} = (x_u \lambda + x_v) \vc{v} = \mu \vc{v}$, where $\mu$ is the scalar $x_u \lambda + x_v$. This means that only vectors $\vc{w}$ parallel to $\vc{v}$ can be reached by $x_u \vc{u} + x_v \vc{v}$, and hence $\vc{u}$ and $\vc{v}$ cannot be a basis for the plane. This contradiction shows that $\vc{u}$ and $\vc{v}$ must be linearly independent.

2 $\Rightarrow$: Theorem 2.5 states that if there is no plane that is parallel to $\vc{u}$, $\vc{v}$, and $\vc{w}$, then the three vectors constitute a basis in 3D space. Assume that there is a plane $\pi$ parallel to all three vectors. Since it is parallel to $\vc{u}$ and $\vc{v}$, it can be written as

\begin{equation} \pi: \vc{p} + s\vc{u} + t\vc{v}, \end{equation} | (5.87) |
where $\vc{p}$ is a point in the plane. Since $\pi$ is also parallel to $\vc{w}$, a point $\vc{q}$ in the plane can be written as

\begin{equation} \vc{q} = \vc{p} + s_1\vc{u} + t_1\vc{v}, \end{equation} | (5.88) |
and the point $\vc{q} + \vc{w}$ also lies in the plane, so

\begin{equation} \vc{q} + \vc{w} = \vc{p} + s_2\vc{u} + t_2\vc{v}. \end{equation} | (5.89) |
Subtracting the former equation from the latter gives

\begin{equation} \vc{w} = (s_2-s_1)\vc{u} + (t_2-t_1)\vc{v}. \end{equation} | (5.90) |
Hence $\vc{w}$ is a linear combination of $\vc{u}$ and $\vc{v}$, and by Theorem 5.4 the three vectors are linearly dependent. Thus, if $\vc{u}$, $\vc{v}$, and $\vc{w}$ are linearly independent, no plane parallel to all three can exist, and they constitute a basis according to Theorem 2.5.

2 $\Leftarrow$: Assume $\vc{u}$, $\vc{v}$, and $\vc{w}$ constitute a basis in 3D space. We will now prove that this means that $\vc{u}$, $\vc{v}$, and $\vc{w}$ are linearly independent. Assume instead that they are linearly dependent, but still constitute a basis, i.e., the expression $x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$ can reach every point in space. Since they are linearly dependent, Theorem 5.4 says that we can write at least one of the vectors as a linear combination of the two others. Assume that this vector is $\vc{u}$ (otherwise rename the vectors), so that $\vc{u} = \lambda_v \vc{v} + \lambda_w \vc{w}$. Now we want to describe a vector $\vc{q}$ in this basis using $\vc{q} = x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$. Inserting the expression for $\vc{u}$ gives $\vc{q} = x_u (\lambda_v \vc{v} + \lambda_w \vc{w}) + x_v \vc{v} + x_w \vc{w} = (x_u \lambda_v + x_v) \vc{v} + (x_u \lambda_w + x_w) \vc{w} = \mu_v \vc{v} + \mu_w \vc{w}$, where $\mu_v$ and $\mu_w$ are scalars. This means that only vectors $\vc{q}$ parallel to the plane spanned by $\vc{v}$ and $\vc{w}$ can be described using $x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$, and hence $\vc{u}$, $\vc{v}$, $\vc{w}$ cannot be a basis for the full 3D space. Hence, they must be linearly independent.

3: Assume $\vc{u}$, $\vc{v}$, and $\vc{w}$ are vectors in the plane. First study $\vc{u}$ and $\vc{v}$. If they are linearly dependent, i.e., parallel, then the entire set of vectors $\vc{u}$, $\vc{v}$, $\vc{w}$ is also linearly dependent and we are finished. However, if $\vc{u}$ and $\vc{v}$ are linearly independent, then according to (1) they constitute a basis in the plane. This means that $\vc{w}$ can be written as a linear combination

4: The proof for 4 follows that of 3.

\begin{equation} \vc{w} = \lambda_1 \vc{u} + \lambda_2 \vc{v} \end{equation} | (5.91) |
which can be rewritten as

\begin{equation} \lambda_1 \vc{u} + \lambda_2 \vc{v} + (-1)\vc{w} = \vc{0} \end{equation} | (5.92) |
where the coefficient in front of $\vc{w}$ is non-zero, so the three vectors are linearly dependent according to Definition 5.2.

$\square$

In this section, the concept of span will be introduced. It is useful for describing how much of a vector space can be reached by linear combinations of a given set of vectors. We start with the definition.

Definition 5.3:
Span

The set of vectors $\{\vc{v}_1,\dots,\vc{v}_q\}$ in $\R^n$ is said to span $\R^n$ if the equation

has at least one solution for *every* vector $\vc{u}$.


\begin{equation} k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_q \vc{v}_q = \vc{u}, \end{equation} | (5.93) |
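Definition 5.3 can be probed numerically: a vector $\vc{u}$ lies in the span of a set of vectors exactly when Equation (5.93) has a solution, which a least-squares solve can detect. A sketch using NumPy; the function name `in_span` and the tolerance are ours.

```python
import numpy as np

def in_span(vectors, u, tol=1e-9):
    """Check whether u is a linear combination of the given vectors,
    i.e., whether Equation (5.93) has a solution, via least squares."""
    A = np.column_stack(vectors)
    k, *_ = np.linalg.lstsq(A, np.asarray(u, dtype=float), rcond=None)
    # If the best least-squares fit reproduces u, then u is in the span.
    return np.allclose(A @ k, u, atol=tol)

# (0, 0, 1) is not in the span of the first two standard basis vectors,
# while any vector with a zero third component is.
print(in_span([(1, 0, 0), (0, 1, 0)], (0, 0, 1)))   # False
print(in_span([(1, 0, 0), (0, 1, 0)], (3, -2, 0)))  # True
```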

Example 5.11:

Examine if the three vectors $\vc{u} = (1, 2, 3)$, $\vc{v} = (2, 3, 4)$, and $\vc{w} = (-1, 0, 1)$ span $\R^3$.

We use the definition and check whether

Equation (5.94) has at least one solution for every $\vc{u}= (u_1, u_2, u_3)$.
Each of the three components now gives rise to an equation, resulting in the system

Eliminating $k_1$ from the second and third equation gives

Now if we try to eliminate $k_2$ from the third equation we get

The system of equations does not have a solution for every $\vc{u} = (u_1, u_2, u_3)$. In fact, there is only a solution if the vector lies in the plane defined by $u_1-2 u_2+u_3 = 0$. So the three vectors do not span $\R^3$.

If we instead change the last vector to $\vc{w}' = (-1, 0, 2)$, we get the following system of equations

Eliminating $k_1$ gives

This system is solvable for all $\vc{u} = (u_1, u_2, u_3)$, so the three vectors do span $\R^3$.
The last equation gives $k_3 = u_1-2 u_2+u_3$, which inserted into the second equation gives $k_2 = 4u_1-5u_2+2u_3$, and inserting these two values into the top equation gives $k_1=-6 u_1 + 8 u_2 - 3 u_3$. We thus have a unique solution

for every $\vc{u} = (u_1, u_2, u_3)$.


\begin{equation} k_1 \begin{pmatrix}1\\ 2 \\ 3\end{pmatrix} + k_2 \begin{pmatrix}2\\ 3 \\ 4\end{pmatrix} + k_3 \begin{pmatrix}-1\\ 0 \\ 1\end{pmatrix} = \begin{pmatrix}u_1\\ u_2 \\ u_3\end{pmatrix} \end{equation} | (5.94) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ k_3 \bsfem & = u_1,\\ 2 k_1 \bsfem & + \ 3 k_2 \bsfem & & = u_2,\\ 3 k_1 \bsfem & + \ 4 k_2 \bsfem & + \ k_3 \bsfem & = u_3.\\ \end{array} \end{cases} \end{equation} | (5.95) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = u_1,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 2u_1-u_2,\\ & 2 k_2 \bsfem & - \ 4 k_3 \bsfem & = 3 u_1-u_3.\\ \end{array} \end{cases} \end{equation} | (5.96) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = u_1,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 2 u_1-u_2,\\ & & 0 \bsfem & = u_1-2 u_2+u_3.\\ \end{array} \end{cases} \end{equation} | (5.97) |


\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \hid{2}k_3 \bsfem & = u_1,\\ 2 k_1 \bsfem & + \ 3 k_2 \bsfem & & = u_2,\\ 3 k_1 \bsfem & + \ 4 k_2 \bsfem & + 2k_3 \bsfem & = u_3.\\ \end{array} \end{cases} \end{equation} | (5.98) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = u_1,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 2 u_1-u_2,\\ & 2 k_2 \bsfem & - \ 5 k_3 \bsfem & = 3 u_1-u_3.\\ \end{array} \end{cases} \end{equation} | (5.99) |

\begin{equation} \begin{cases} \begin{array}{rrrl} k_1 \bsfem & + \ 2 k_2 \bsfem & - \ \hid{2} k_3 \bsfem & = u_1,\\ & k_2 \bsfem & - \ 2 k_3 \bsfem & = 2 u_1-u_2,\\ & & k_3 \bsfem & = u_1-2 u_2+u_3.\\ \end{array} \end{cases} \end{equation} | (5.100) |

\begin{equation} \begin{cases} \begin{array}{lrrr} k_1 & = -6 u_1 \bsfem & + \ 8 u_2 \bsfem & - \ 3 u_3 & \bsfem\\ k_2 & = 4 u_1 \bsfem & - \ 5 u_2 \bsfem & + \ 2 u_3 & \bsfem\\ k_3 & = \hid{1} u_1 \bsfem & - \ 2 u_2 \bsfem & + \ \hid{1} u_3 & \bsfem\\ \end{array} \end{cases} \end{equation} | (5.101) |
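The closed-form solution (5.101) can be cross-checked against a numeric solver. The sketch below picks an arbitrary right-hand side (our choice) and compares NumPy's solution of the system (5.98) with the formulas in (5.101).

```python
import numpy as np

# Columns are u, v, w' from Example 5.11, with w' = (-1, 0, 2).
A = np.column_stack([(1, 2, 3), (2, 3, 4), (-1, 0, 2)])

def coefficients(u1, u2, u3):
    """The closed-form solution (5.101) for (k1, k2, k3)."""
    return (-6*u1 + 8*u2 - 3*u3,
            4*u1 - 5*u2 + 2*u3,
            u1 - 2*u2 + u3)

u = (7.0, -1.0, 4.0)              # arbitrary right-hand side
k = np.linalg.solve(A, u)         # numeric solution of (5.98)
print(np.allclose(k, coefficients(*u)))  # True
```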

Theorem 5.6:
Requirements of a basis

A set of vectors $\{\vc{u}_1, \vc{u}_2, \ldots, \vc{u}_k\}$ is a basis in $\R^n$ if and only if the vectors are linearly independent and span $\R^n$.


Assume that the vectors $\vc{u}_1, \vc{u}_2, \ldots, \vc{u}_q$ are a basis of $\R^n$. Then by Definition 2.10, there is a solution to $k_1\vc{u}_1 + k_2 \vc{u}_2 + \dots + k_q \vc{u}_q= \vc{w}$ for every $\vc{w}$, so the vectors span $\R^n$ according to Definition 5.3. To see that the vectors are linearly independent, we note that the equation $ \sum_{i=1}^q k_i \vc{u}_i = \vc{0}$ has the solution $k_1=0, k_2=0, \ldots, k_q=0$. According to the definition of a basis (Definition 2.10), there is only one solution. This proves that the vectors are linearly independent.

To complete the proof in the other direction, we assume that the vectors $\vc{u}_1, \vc{u}_2, \ldots, \vc{u}_q$ span $\R^n$ and are linearly independent. We need to show that $\sum_{i=1}^q k_i \vc{u}_i = \vc{w}$ has exactly one solution for every $\vc{w}$. Since the vectors span $\R^n$, there is at least one solution for every $\vc{w}$. Assume that there are two solutions, $(k_1, k_2, \ldots, k_q)$ and $(k^\prime_1, k^\prime_2, \ldots, k^\prime_q)$. If we subtract the equation $\sum_{i=1}^q k^\prime_i \vc{u}_i = \vc{w}$ from $\sum_{i=1}^q k_i \vc{u}_i = \vc{w}$, we get $\sum_{i=1}^q (k_i-k^\prime_i) \vc{u}_i = \vc{0}$. Since the vectors are linearly independent (Definition 5.2), we have $k_i-k^\prime_i = 0$, which gives $k_i=k^\prime_i$, so the two solutions are the same.

$\square$

Assume we have a basis in the plane, i.e., two basis vectors, $\vc{e}_1$ and $\vc{e}_2$, that are non-parallel. In addition, assume that we have another pair of non-parallel vectors, $\hat{\vc{e}}_1$ and $\hat{\vc{e}}_2$, that also constitute a basis. As we have seen above, these two vectors can be described using the first two as

\begin{gather} \hat{\vc{e}}_1 = x_{11} \vc{e}_1 + x_{21} \vc{e}_2,\\ \hat{\vc{e}}_2 = x_{12} \vc{e}_1 + x_{22} \vc{e}_2. \end{gather} | (5.102) |
Now assume that a vector $\vc{u}$ has the coordinates $(u_1, u_2)$ in the first basis, i.e.,

\begin{gather} \vc{u} = u_1 \vc{e}_1 + u_2 \vc{e}_2 .\\ \end{gather} | (5.103) |
and the coordinates $(\hat{u}_1, \hat{u}_2)$ in the second basis, i.e.,

\begin{gather} \vc{u} = \hat{u}_1 \hat{\vc{e}}_1 + \hat{u}_2 \hat{\vc{e}}_2.\\ \end{gather} | (5.104) |
Inserting Equation (5.102) into Equation (5.104) gives

\begin{gather} \vc{u} = \hat{u}_1 (x_{11} \vc{e}_1 + x_{21} \vc{e}_2) + \hat{u}_2 (x_{12} \vc{e}_1 + x_{22} \vc{e}_2), \end{gather} | (5.105) |
which can be rearranged into

\begin{gather} \vc{u} = (x_{11} \hat{u}_1 + x_{12} \hat{u}_2 ) \vc{e}_1 + (x_{21}\hat{u}_1 + x_{22}\hat{u}_2 ) \vc{e}_2. \end{gather} | (5.106) |
Since the coordinates of a vector in a given basis are unique, identifying coefficients with Equation (5.103) gives

\begin{gather} u_1 = x_{11} \hat{u}_1 + x_{12} \hat{u}_2, \\ u_2 = x_{21}\hat{u}_1 + x_{22}\hat{u}_2. \end{gather} | (5.107) |
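Equation (5.107) expresses the old coordinates in terms of the new ones. A small Python sketch with hypothetical numbers: we choose $\hat{\vc{e}}_1 = \vc{e}_1 + \vc{e}_2$ and $\hat{\vc{e}}_2 = -\vc{e}_1 + 2\vc{e}_2$, i.e., $x_{11} = x_{21} = 1$, $x_{12} = -1$, $x_{22} = 2$ (these particular values are ours, not from the text).

```python
# Hypothetical second basis expressed in the first, as in (5.102):
# e^_1 = e_1 + e_2 and e^_2 = -e_1 + 2 e_2.
x11, x21 = 1.0, 1.0
x12, x22 = -1.0, 2.0

def old_coordinates(u1_hat, u2_hat):
    """Equation (5.107): coordinates (u1, u2) in the first basis from
    the coordinates (u1_hat, u2_hat) in the second basis."""
    return (x11*u1_hat + x12*u2_hat,
            x21*u1_hat + x22*u2_hat)

# The vector with new coordinates (2, 3) is 2(e1+e2) + 3(-e1+2e2) = -e1 + 8e2.
print(old_coordinates(2.0, 3.0))  # (-1.0, 8.0)
```

In matrix language (Chapter 6), the four numbers $x_{ij}$ form the change-of-basis matrix, and (5.107) is a matrix-vector product.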

We will return to change of basis in Chapter 6, which covers the topic of matrices.