This chapter is about Gaussian elimination, which is a method for solving systems of linear equations. Such systems are often encountered when dealing with real problems, such as this computer vision problem: given a number of images of an object, calculate a 3D model of the object.
Figure 5.1:
Four out of 671 pictures of Örebro Castle. Images courtesy of Carl Olsson at Lund University.
Figure 5.1 shows four images of Örebro Castle. Using those, and 667 other similar images, it is possible to calculate the point cloud shown in Figure 5.2. These calculations include solving many systems of equations. Gaussian Elimination is a very good way to learn about such systems. For more information about this example see this YouTube clip.
Interactive Illustration 5.2:
Point cloud calculated from 671 images. Press the rotation icon in the top left corner to rotate the point cloud, or right-click and drag to see the model from any orientation. Try looking at the castle from above to see the geometry!
One way to familiarize oneself with systems of equations is to investigate what happens when two lines cross.
To describe a two-dimensional line in the plane, one of the first
notations that many people encounter is
\begin{equation}
y = kx + m,
\end{equation}
(5.1)
where $k$ denotes the slope of the line and $m$ its intersection with
the $y$-axis. This notation works well for most lines, but not for
all. No matter what value is used for $k$, it will not be possible to
represent a completely vertical line ($x=4$, for example). Therefore,
the mathematically more convenient notation
\begin{equation}
ax + by + c = 0
\end{equation}
(5.2)
is typically preferred. This works well for all lines of the form $y = kx + m$: just set $a=-k$, $b=1$, and $c=-m$. In addition, it also works
for vertical lines. For instance, setting $a=1$, $b=0$, and $c=-4$
represents the line $x=4$, which goes through the point $(4,0)$ and is
parallel to the $y$-axis. In Figure 5.3, we
show two lines represented by the equations $2x + 3y + 5 = 0$ and
$5x+2y-4=0$.
Interactive Illustration 5.3:
An example of two lines described in the form $ax+by+c=0$. These lines intersect at $(2,-3)$.
The line $2x+3y+5=0$ consists of all the points $(x,y)$ for which
$2x+3y+5=0$ holds. Likewise, the other line consists of all the
$(x,y)$ pairs for which $5x+2y-4=0$ is satisfied. The intersection
point of the two lines must be a coordinate $(x,y)$ which satisfies
both equations. Hence to find the intersection point, we need to solve
the system of equations
\begin{equation}
\begin{cases}
2 x + 3 y = -5,\\
5 x + 2 y = 4.
\end{cases}
\end{equation}
(5.3)
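Such a small system can also be solved numerically. Below is a minimal sketch using NumPy, purely as a cross-check of the intersection point (this is our own illustration, not part of the method taught in this chapter):

```python
import numpy as np

# Coefficient matrix and right-hand side of System (5.3):
#   2x + 3y = -5
#   5x + 2y =  4
A = np.array([[2.0, 3.0],
              [5.0, 2.0]])
b = np.array([-5.0, 4.0])

# The solution is the intersection point of the two lines.
x, y = np.linalg.solve(A, b)
print(x, y)  # 2.0 -3.0
```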
The system of equations described above is called linear, since it
only contains polynomial terms of the first degree such as $2 x$ and
$3 y$, and not any higher order terms such as $x^2$, $xy$ or $y^3$ or any other non-linear terms such as $\cos x$ or $e^x$. Such
systems have the nice property that they are easy to investigate. For
instance, if there is a solution, it is always possible to find it. In
case there is no solution, such as in the case when the two lines are
parallel and do not overlap, the investigation will show
this. Finally, in the case when there are many solutions, such as
when the two lines are parallel and overlapping, the
investigation will also reveal this.
Interactive Illustration 5.4 shows how the point of intersection between two lines, i.e., the solution
to systems of equations such as (5.3), changes interactively when the two line equations are changed.
Interactive Illustration 5.4:
The reader may move the red and blue points on the lines in order to change the corresponding
line equation. The illustration then uses the techniques in this chapter to find the point
of intersection (green) between the two lines.
This chapter presents a systematic way, called Gaussian
elimination, of analyzing and solving such systems. However,
first we will start out with a few more examples of linear equations.
Example 5.1: Video Compression Each pixel in a video frame is displayed on a computer screen by
lighting up three subpixels, one for red, one for green, and one
for blue. In the top row of Figure 5.5, these
color components are shown independently to the right.
Figure 5.5:
Top: the red, green, and blue components of an image shown as gray scale values. Bottom: the RGB color components
have been transformed to $Y$ (roughly luminance) and $Cb$ & $Cr$ (chrominance), and also shown as gray scale values.
As can be seen, these color channels (RGB) are quite similar to
each other, which is due to the fact that all three are affected
by luminance, i.e., the overall brightness per pixel. Stating the
same thing three times is inefficient, and therefore video coding
schemes such as H.264 do not store the pixels as $(R, G, B)$. Instead $(Y, Cb, Cr)$ is stored, where
\begin{equation}
\begin{cases}
\begin{array}{rrrrl}
Y = & \bs 0.299 & \bs R + 0.587 & \bs G + 0.114 & \bs B,\\
Cb = & \bs -0.169 & \bs R - 0.331 & \bs G + 0.500 & \bs B + 128,\\
Cr = & \bs 0.500 & \bs R - 0.419 & \bs G - 0.081 & \bs B + 128.
\end{array}
\end{cases}
\end{equation}
(5.4)
As can be seen in the lower row of Figure 5.5,
the $Y$, $Cb$ and $Cr$ channels are now less similar to each
other. The last two channels also seem more washed-out, which,
it turns out, means that fewer bits can be used to
represent the image. On top of this, the human visual system is
much more sensitive to errors in luminance (roughly $Y$) than in the two
other channels. This is also exploited by representing the chrominance channels less
accurately, which results in even fewer bits.
During decompression, we need to get back from $(Y, Cb, Cr)$ to
$(R, G, B)$ in order to display the pixel on the screen. Assume we
have computed that the color in the $(Y, Cb, Cr)$-space for a
pixel is $Y=180$, $Cb=80$, and $Cr=150$. What $(R, G, B)$ color
does that represent? The answer can be given by inserting the
values for $Y$, $Cb$ and $Cr$ in the equations above and solving
the resulting linear system of equations, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rrrrl}
0.299 &\bs R + 0.587 &\bs G + 0.114 &\bs B = &\bs 180,\\
-0.169 &\bs R - 0.331 &\bs G + 0.500 &\bs B = &\bs -48,\\
0.500 &\bs R - 0.419 &\bs G - 0.081 &\bs B = &\bs 22.
\end{array}
\end{cases}
\end{equation}
(5.5)
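For readers who want to verify the decompression numerically, here is a small NumPy sketch (our own illustration, not code from the H.264 standard) that solves System (5.5) by moving the offsets of $128$ to the right-hand side:

```python
import numpy as np

# Forward transform from Equation (5.4): (R, G, B) -> (Y, Cb, Cr).
M = np.array([[ 0.299,  0.587,  0.114],
              [-0.169, -0.331,  0.500],
              [ 0.500, -0.419, -0.081]])
offset = np.array([0.0, 128.0, 128.0])

# Given (Y, Cb, Cr) = (180, 80, 150), solving M @ rgb = ycbcr - offset
# is exactly System (5.5).
ycbcr = np.array([180.0, 80.0, 150.0])
rgb = np.linalg.solve(M, ycbcr - offset)
print(np.round(rgb))  # approximately [211. 181.  95.]
```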
Example 5.2: Electronic Circuit In the electronic circuit below, we would like to calculate the
currents $I_1$, $I_2$, and $I_3$.
Figure 5.6:
An example circuit. The currents $I_1$, $I_2$ and $I_3$ are unknown.
Kirchhoff's voltage law states that the sum of the voltages equals
zero when going in a closed loop in the circuit. If we apply this
to the left part of the circuit we get $5\,\mathrm{V} - I_2 \cdot 1\,\mathrm{k}\Omega - I_2 \cdot 2\,\mathrm{k}\Omega = 0$. If we apply it to the outer loop of
the circuit we get $5\,\mathrm{V} - I_3 \cdot 4\,\mathrm{k}\Omega = 0$. Finally,
Kirchhoff's current law can be used at the red point in the circuit
to get a third equation, $I_1 = I_2 + I_3$. In summary, this
yields the following system of equations
\begin{equation}
\begin{cases}
5 - 3 I_2 = 0,\\
5 - 4 I_3 = 0,\\
I_1 - I_2 - I_3 = 0,
\end{cases}
\end{equation}
where the units have been removed. This can be solved to recover
$I_1$, $I_2$, and $I_3$. Solving such systems of equations is the topic of the rest of this
chapter.
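As a numerical check of this example, the three equations can be assembled and solved with NumPy. This is our own sketch; the resistances are expressed in $\mathrm{k}\Omega$ and the currents in mA so that the units cancel:

```python
import numpy as np

# I1 - I2 - I3 = 0   (Kirchhoff's current law at the red point)
# 3*I2        = 5    (left loop: 5 - 1*I2 - 2*I2 = 0)
# 4*I3        = 5    (outer loop: 5 - 4*I3 = 0)
A = np.array([[1.0, -1.0, -1.0],
              [0.0,  3.0,  0.0],
              [0.0,  0.0,  4.0]])
b = np.array([0.0, 5.0, 5.0])

I1, I2, I3 = np.linalg.solve(A, b)
print(I1, I2, I3)  # I2 = 5/3 mA, I3 = 5/4 mA, and I1 = I2 + I3
```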
Example 5.3: Fitting a Circle to Three Points There are some problems that, although non-linear, can be changed into a system of linear equations after a change of variables. Here we illustrate this with the problem of fitting a circle to three points.
A circle is given by its center point $(u,v)$ and its radius $r$. A point $(x_1,y_1)$ lies on this circle
if
\begin{equation}
(x_1-u)^2 + (y_1-v)^2 = r^2.
\end{equation}
Since there are $3$ unknowns ($u$, $v$ and $r$) one might expect that we need $3$ equations.
So maybe three points ($(x_1,y_1)$, $(x_2,y_2)$, and $(x_3,y_3)$) are needed to specify a circle.
What is the circle that goes through the three points $(-2,3)$, $(-1,1)$, and $(-1,2)$?
This gives a system of 3 non-linear equations
\begin{equation}
\begin{cases}
(-2-u)^2 + (3-v)^2 = r^2,\\
(-1-u)^2 + (1-v)^2 = r^2,\\
(-1-u)^2 + (2-v)^2 = r^2.
\end{cases}
\end{equation}
By introducing $w = r^2-u^2-v^2$ we obtain a system of linear equations
\begin{equation}
\begin{cases}
w -4u +6v = 13 ,\\
w -2u +2v = 2 ,\\
w -2u +4v = 5 .\\
\end{cases}
\end{equation}
(5.11)
Again note that this seemingly non-linear problem could be converted into a system of linear equations.
When this has been solved, we obtain values for $w$, $u$, and $v$, which can then be used to
find out what the radius is, using $w = r^2-u^2-v^2$, i.e., $r^2 = w+u^2+v^2$.
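The whole circle fit can also be carried out numerically. The sketch below (our own illustration) solves System (5.11) and then recovers the radius from the substitution:

```python
import numpy as np

# System (5.11) in the unknowns (w, u, v), where w = r^2 - u^2 - v^2.
A = np.array([[1.0, -4.0, 6.0],
              [1.0, -2.0, 2.0],
              [1.0, -2.0, 4.0]])
b = np.array([13.0, 2.0, 5.0])
w, u, v = np.linalg.solve(A, b)

# Recover the radius from r^2 = w + u^2 + v^2.
r = np.sqrt(w + u**2 + v**2)
print(u, v, r)  # center (-2.5, 1.5), radius sqrt(2.5) ≈ 1.58
```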
Interactive Illustration 5.7:
Given the three points (red, green, blue) denoted by $(x_i,y_i)$, $i\in\{1,2,3\}$,
we can calculate the center, $(u,v)$, of the circle and its radius $r$. The reader
may move the red, green, and blue points.
Example 5.4: 4000-Year-Old Babylonian Problem Linear systems of equations were described on Babylonian clay tablets. One such problem is the following:
"There are two fields whose total area is 1800 square yards. One produces grain at the rate of 2/3 of a bushel per square yard while the other produces grain at the rate of 1/2 a bushel per square yard. If the total yield is 1100 bushels, what is the size of each field?"
Introduce a variable for the size of each field, i.e., $x_1$ square yards of the first and $x_2$ square yards of the second. Then the problem can be reformulated as
\begin{equation}
\begin{cases}
x_1 + x_2 = 1800,\\
\frac{2}{3} x_1 + \frac{1}{2} x_2 = 1100.
\end{cases}
\end{equation}
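A quick numerical check of the Babylonian problem (our own sketch):

```python
import numpy as np

# x1 + x2 = 1800 (total area), (2/3) x1 + (1/2) x2 = 1100 (total yield).
A = np.array([[1.0, 1.0],
              [2.0 / 3.0, 0.5]])
b = np.array([1800.0, 1100.0])

x1, x2 = np.linalg.solve(A, b)
print(x1, x2)  # 1200.0 600.0 square yards
```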
There are many possible ways to solve systems of equations, and you
have most likely studied some of them before. However, what Gaussian
elimination provides is a way that always works. You could
think of it as a cookbook recipe that will always give you the same
tasty dish if you just follow the steps in the recipe. In our case the
end result is not edible, but a system of equations that has a
triangular shape, and is hence called a triangular system of
equations. This can be seen in
Figure 5.8.
Interactive Illustration 5.8:
Here a system of equations is shown. Press the Forward button to proceed.
Interactive Illustration 5.8:
A triangle is used to highlight the triangular structure of the systems of equations.
Although the system in
Figure 5.8 has three
equations and three unknowns, it is very simple to solve. The reason
is that the bottom-most equation is so simple that $z$ can be recovered by
dividing both sides by $11$, and thus $z=\frac{-33}{11} = -3$. This
value of $z$ can then be inserted into the second to last equation,
yielding $2y + 3(-3) = -5$, or $2y = 4$. Again, it is simple to solve
for $y=2$, and both $y$ and $z$ can now be inserted into the top equation to
generate a simple equation for $x$.
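This bottom-up procedure is known as back substitution, and it is short enough to express in a few lines of code. The sketch below is our own generic formulation (the function name is ours); it assumes an upper-triangular coefficient matrix with non-zero diagonal entries:

```python
import numpy as np

def back_substitute(U, b):
    """Solve U x = b where U is upper triangular with non-zero diagonal."""
    n = len(b)
    x = np.zeros(n)
    # Work from the bottom equation upwards, just as in the text.
    for i in range(n - 1, -1, -1):
        # Subtract the terms with already-known variables, then divide by the pivot.
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The triangular system solved later in this chapter, with solution (4, 5, 6):
U = np.array([[2.0,   4.0, -2.0],
              [0.0, -10.0,  2.0],
              [0.0,   0.0, -54.0]])
print(back_substitute(U, np.array([16.0, -38.0, -324.0])))  # [4. 5. 6.]
```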
The goal of Gaussian elimination is to start with any linear system of
equations, and then transform it into a triangular system. This is
done using only three simple rules, as shown in the following
theorem.
Theorem 5.1: Gaussian Elimination Rules
The solutions to a linear system of equations do not change if we
swap the order of two rows,
multiply a row by a constant $\neq 0$, or
add a multiple of another row to a row.
At this point, we defer the proof of this theorem until Section 5.7.1,
where we also repeat this theorem for the sake of completeness.
As an example, assume we have the following system of equations,
\begin{equation}
\begin{cases}
\begin{array}{rl}
-x + \hid{4} y =& \!\!\!\! 8,\\
2 x + 4y =& \!\!\!\!-10.
\end{array}
\end{cases}
\end{equation}
(5.13)
We can now use the second and third rules and add two times the first
row to the last. This brings us to the system
\begin{equation}
\begin{cases}
\begin{array}{rl}
-x + \hid{4} y =& \!\!\!\! 8,\\
\hid{-x +} 6 y =& \!\!\!\! 6,
\end{array}
\end{cases}
\end{equation}
which is a triangular system, and those can easily be solved, as we
have seen earlier in this chapter. From the last row, we can see that
$y=1$, which inserted into the top equation gives us $-x + 1 = 8$,
i.e., $x=-7$.
The example above was easy to solve since we could use
an integer multiple. In general, this is not always possible. Consider for
instance,
\begin{equation}
\begin{cases}
2 x + 3 y = -5, \\
5 x + 2 y = 4.
\end{cases}
\end{equation}
(5.15)
To eliminate the $x$-variable we would need to multiply the top row by
$-\frac{5}{2}$ before adding it to the bottom row. Since it is easier
to do calculations with integers than with fractions, it is simpler to
multiply the top equation by $5$ and the bottom equation by $2$ and
then subtract them. This is explained in detail in
Interactive Illustration 5.9.
Interactive Illustration 5.9:
Typical example of Gaussian elimination. The first step is simply to copy the first equation, which can be seen when clicking/pressing Forward.
Interactive Illustration 5.9:
We start by working on the bottom equation. We now want to eliminate the first variable, $x$ in this case.
To do so, we show some scrap calculations to the right. We copy the first equation to the scrap.
Interactive Illustration 5.9:
Now we multiply it by the coefficient in front of the $x$ in the second equation, here shown in red.
Interactive Illustration 5.9:
Both the left hand side...
Interactive Illustration 5.9:
... and on the right hand side. Next, we copy the second equation to the scrap.
Interactive Illustration 5.9:
This equation should be multiplied by the coefficient shown in green.
Interactive Illustration 5.9:
Now these two lines are subtracted from each other.
Interactive Illustration 5.9:
As can be seen, the coefficient in front of $x$, equals $(\textcolor{#00dd00}{5}\cdot 2 - \textcolor{red}{2} \cdot 5)$, which is $0$. Thus we can rewrite the equation (next step).
Interactive Illustration 5.9:
The rewritten equation is then moved back from the scrap.
Interactive Illustration 5.9:
We have done the first step in Gaussian Elimination.
Interactive Illustration 5.9:
Typically the scrap equations to the right are calculated in one's head, so this final rendering shows what the result would be when performing Gaussian elimination on paper.
The scrap calculations in Interactive Illustration 5.9 can be summarized as follows. Starting from
$\Big\{\begin{array}{l} 2 x + 3 y = -5 \\ 5 x + 2 y = 4 \end{array}$
the first equation is multiplied by $\textcolor{red}{5}$ and the second by $\textcolor{#00dd00}{2}$,
$\textcolor{red}{5} (2 x + 3 y) = \textcolor{red}{5} (-5)$
$\textcolor{#00dd00}{2} (5 x + 2 y) = \textcolor{#00dd00}{2} (4)$
and the second line is subtracted from the first,
$(\textcolor{red}{5}\cdot 2 - \textcolor{#00dd00}{2}\cdot 5) x + (\textcolor{red}{5} \cdot 3 - \textcolor{#00dd00}{2} \cdot 2) y = (\textcolor{red}{5})(-5) - \textcolor{#00dd00}{2}(4)$
$\Leftrightarrow$
$11 y = -33,$
which gives the triangular system
$\Big\{\begin{array}{rl} 2 x + 3 y \!\!\!&= -5 \\ 11 y \!\!\!&= -33 \end{array}$
Note that the step from the original equation system to the triangular
equation system in Interactive Illustration 5.9 is not a direct
application of one of the rules. For example, we add a multiple of one
row to a multiple of another row, and there is no such rule. However,
this can be seen as applying two rules in succession, namely, first
multiplying the second row by $-2$ (rule 2), then adding $5$ times the
first row to this (rule 3). This means that the operations in
Interactive Illustration 5.9
are legal after all.
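The full procedure (eliminate downwards, then back substitute) can be put together into a short program. The following NumPy sketch is our own formulation, not code from the text; it always swaps in the row with the largest leading coefficient (an extended use of rule 1, known as partial pivoting) and assumes the system has a unique solution:

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve A x = b using the three rules of Theorem 5.1."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for col in range(n):
        # Rule 1: swap in the row with the largest coefficient in this column.
        p = col + int(np.argmax(np.abs(A[col:, col])))
        A[[col, p]] = A[[p, col]]
        b[[col, p]] = b[[p, col]]
        # Rule 3: add a multiple of the pivot row to each row below it,
        # eliminating the current variable from those rows.
        for row in range(col + 1, n):
            k = A[row, col] / A[col, col]
            A[row] -= k * A[col]
            b[row] -= k * b[col]
    # The system is now triangular; solve it by back substitution.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# System (5.15) from the text:
print(gaussian_elimination(np.array([[2, 3], [5, 2]]),
                           np.array([-5, 4])))  # [ 2. -3.]
```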
Sometimes it is not possible to eliminate the first variable in the
lower rows, since it is not present in the first row. An example is
The system is now triangular and it is simple to calculate $x_3 =\frac{-324}{-54} = 6$. Inserting this in the second row yields $-10x_2 + 12 = -38$, which is equivalent to $-10 x_2 = -50$, i.e., $x_2 =5$. Finally, $x_2 = 5$ and $x_3 = 6$ are inserted into the top equation,
which gives $2 x_1 + 20 - 12 = 16$, which can be simplified to $2 x_1= 8$, or simply $x_1 = 4$. The system is now solved.
So far, we have only encountered systems of equations which have
exactly one solution. This is not always the case, as is demonstrated
in
\begin{equation}
\begin{cases}
\begin{array}{rl}
5 &\bs x + 2 y = 4,\\
10 &\bs x + 4 y = 13.
\end{array}
\end{cases}
\end{equation}
(5.20)
A normal attempt to solve this system would be to multiply the top
equation by $-2$ and add it to the second equation. This gives
\begin{equation}
\begin{cases}
\begin{array}{rl}
5 x + 2 &\bs y = 4,\\
&\bs 0 = 5.
\end{array}
\end{cases}
\end{equation}
(5.21)
Zero can never be equal to five, and yet we have followed the Gaussian
elimination rules (Theorem 5.1)
exactly. This is because this system of equations has no
solution. It is easy to see why if we instead multiply the top
equation by 2, i.e.,
\begin{equation}
\begin{cases}
10 x + 4 y = 8,\\
10 x + 4 y = 13.
\end{cases}
\end{equation}
(5.22)
These two equations are simply contradicting each other. The left hand
sides are exactly the same, and yet the right hand sides are
different. Hence this system does not have a solution. Doing Gaussian
elimination on such a system will result in contradictions of the type
$0=5$, which we encountered above. When this happens, it is safe to
say that the system has no solution.
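Numerical libraries report the same failure: a solver cannot return an answer for System (5.20). A small NumPy sketch of this (our own illustration; note that a singular coefficient matrix by itself does not distinguish between no solution and infinitely many solutions):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [10.0, 4.0]])
b = np.array([4.0, 13.0])

try:
    np.linalg.solve(A, b)
except np.linalg.LinAlgError:
    # The rows of A are proportional, so elimination produces the
    # contradiction 0 = 5: System (5.20) has no solution.
    print("singular coefficient matrix")
```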
By interpreting the two equations in System of Equations (5.20) as two lines, it is possible to see
what is going on graphically. The top equation, $5x + 2y = 4$,
represents the left line in Figure 5.10.
The second equation, $10x + 4y = 13$, represents the rightmost line. As
can be seen in the figure, these lines are parallel and will thus
never cross. Since the crossing represents the solution, two
parallel lines in the plane that are separated from each other are equivalent to
a system of equations with no solution.
Having no solution is not the only special case. As an example, assume
we have the following system of equations,
\begin{equation}
\begin{cases}
\begin{array}{rll}
- &\bs x + &\bs y = \hid{-}8,\\
3 &\bs x - 3 &\bs y = -24.
\end{array}
\end{cases}
\end{equation}
(5.23)
Performing Gaussian elimination on the first variable can be done by
adding three times the first row to the second row. This gives
\begin{equation}
\begin{cases}
\begin{array}{rl}
-x + y =& \!\!\!\! 8,\\
0 =& \!\!\!\! 0.
\end{array}
\end{cases}
\end{equation}
The second row thus collapses to $0=0$, which is always true, and no
information about the $y$-variable is obtained. When this happens,
there is not only one pair, $(x,y)$, that satisfies the system, but
many. To express all such pairs, set $y = t$. This gives
\begin{equation}
\begin{cases}
\begin{array}{rl}
-x + t &\bsfyra = 8,\\
y &\bsfyra = t,\\
\end{array}
\end{cases}
\end{equation}
(5.25)
and if $x$ is solved for, we get
\begin{equation}
\begin{cases}
x = -8 + t,\\
y = \class{hidden}{-}0 + t.\\
\end{cases}
\end{equation}
Thus the vector $\begin{pmatrix} x\\ y \end{pmatrix}$ can be written
as $\begin{pmatrix} -8\\ 0 \end{pmatrix} + \begin{pmatrix} 1\\ 1 \end{pmatrix}t$. As we have seen in
Section 3.6.1, this is a line of the form $P(t) = S+ t\mathrm{\bf d}$, where $S = \begin{pmatrix} -8\\ 0 \end{pmatrix}$
and $\mathrm{\bf d} = \begin{pmatrix} 1\\ 1 \end{pmatrix}$. In other
words, there is an entire line of solutions to this system of
equations. This is shown in
Illustration 5.11, where the green dot
equals the point $S$, the red arrow shows the direction vector
$\mathrm{\bf d}$, and the resulting line is shown in black.
Another way to look at this is to go back to the original equation,
\begin{equation}
\begin{cases}
\begin{array}{rll}
- &\bs x + &\bs y = \hid{-}8,\\
3 &\bs x - 3 &\bs y = -24.
\end{array}
\end{cases}
\end{equation}
(5.27)
Now, if the bottom equation is multiplied by $-\frac{1}{3}$, we get
\begin{equation}
\begin{cases}
-x + y = 8,\\
-x + y = 8.
\end{cases}
\end{equation}
(5.28)
In other words, we get the same equation twice. The second equation
can be seen as just repeating what we already knew, i.e., no new
information is obtained. The two lines $-x + y = 8$ and $3x - 3y =-24$ are lying on top of each other, i.e., they are the same
line. This is shown in Interactive Illustration 5.11 if you press
the forward button below that illustration.
In fact, it is the same line, only that $-x +y = 8$ and $3x-3y=-24$
are written in implicit form, whereas $P(t) = S + t\mathrm{\bf d}$ is
written in explicit form.
Now, consider a system with three equations and three unknowns, such
as
Just as in the case with the smaller two-equation systems, these
systems can be interpreted geometrically. However, in this case every
equation becomes a plane (see Section 3.6.2) in three-dimensional space. The solution to
the system of equations is the point where all three planes intersect
each other. This is shown in the first step of Interactive Illustration 5.12, where each plane
represents one equation, and they all intersect in a single point,
which is colored black in the figure. Note that you can change the viewpoint of
the interactive illustration by holding the right mouse button down
while moving the mouse (Ctrl plus mouse button on a Mac, or pressing down with two fingers on a Mac laptop).
On an iPad, change the camera position by dragging using two fingers.
Interactive Illustration 5.12:One solution example:
when all three planes intersect in one point, we have exactly one solution.
Interactive Illustration 5.12:Several solutions example: if the three planes intersect along the same line, then every point along this line is a valid solution.
The line of solutions is marked with solid black.
Interactive Illustration 5.12:Several solutions example: this is another example where there are infinitely many solutions along a line,
because there are two equations representing the same plane (dashed) and the third plane crosses this plane, which
results in a line of intersection that represents the solutions.
Interactive Illustration 5.12:Several solutions example: finally, if all equations represent the same plane, then every point on that plane is a solution.
Sometimes however, the equations represent planes that all include the
same line, as shown in the second step in Interactive Illustration 5.12. Then there is not only
one solution, but infinitely many solutions to the system of
equations. These solutions are equal to the line shown in black.
In other cases, two equations represent the same plane. This may be
because they have the same coefficients, or one equation is equal to
the other times a constant. Then there is an entire plane that satisfies
these two equations, as illustrated by the dashed plane in the third
step in Interactive Illustration 5.12. The solution to the
entire system is then the intersection between this (double) plane and the
third plane. This amounts to an entire line of solutions, as
illustrated by the solid black line in the figure.
The last case when the system has a solution is when all three
equations represent the same plane, such as the system
\begin{equation}
\begin{cases}
\begin{array}{rrrl}
& \bs x + \!\!\!\!&\bs y + &\bs z = 1,\\
2&\bs x + 2&\bs y + 2&\bs z = 2,\\
3&\bs x + 3&\bs y + 3&\bs z = 3.\\
\end{array}
\end{cases}
\end{equation}
(5.30)
If a point $(x,y,z)$ satisfies the first equation, it also satisfies
the second and third equations since they are just the first equation
multiplied by $2$ or $3$. Therefore this point is also a solution to
the entire system of equations, and the solution is visualized
as the entire black plane shown in the last step of
Figure 5.12.
For a system of two equations and two unknowns not to have a solution,
the two lines representing the two equations must be parallel as was
shown in Interactive Illustration 5.10. This
is not the case in three dimensions: there, a system can lack solutions
even when none of the planes are parallel. Consider, for instance, the
following system of equations.
\begin{equation}
\begin{cases}
\begin{array}{rrrl}
3 \sqrt{6}&\bs x + 3&\bs y + 3&\bs z = \sqrt{6},\\
-3 \sqrt{6}&\bs x + 3&\bs y + 3&\bs z = \sqrt{6},\\
& -6&\bs y - 6&\bs z = \sqrt{6}.\\
\end{array}
\end{cases}
\end{equation}
(5.31)
It is easy to see that none of these planes are parallel, since their
normals differ (and cannot be made equal by multiplying with a
constant). The three normals are $(3\sqrt{6}, 3, 3)$, $(-3\sqrt{6},3, 3)$, and $(0, -6, -6)$. Yet there exists no solution as is easily
seen when performing Gaussian elimination. Adding the first and second
rows eliminates $x$ from the second equation, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rrrl}
3 \sqrt{6}&\bs x + 3&\bs y + 3&\bs z = \sqrt{6},\\
& 6&\bs y + 6&\bs z = 2\sqrt{6},\\
& -6&\bs y - 6&\bs z = \sqrt{6}.\\
\end{array}
\end{cases}
\end{equation}
(5.32)
Now to eliminate $y$, we add the two last equations together, which
results in
\begin{equation}
\begin{cases}
\begin{array}{rrrl}
3 \sqrt{6}&\bs x + 3&\bs y + 3&\bs z = \sqrt{6},\\
& 6&\bs y + 6&\bs z = 2\sqrt{6},\\
& & &\bs 0 = 3\sqrt{6}.\\
\end{array}
\end{cases}
\end{equation}
(5.33)
This also eliminates $z$ and ends up in the contradiction $0 =3\sqrt{6}$, which means that no solutions exist. How this looks
geometrically is shown in the first step in
Figure 5.13. There are places where
two of the equations are satisfied, but not the third, and these
places are marked by dotted lines. Note that there is no point which
is in all three planes, and thus no solution to the system of
equations exists, although the planes are not parallel.
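This geometric situation can also be detected numerically: a solution exists exactly when appending the right-hand side to the coefficient matrix does not increase its rank. Rank is introduced properly in Chapter 8; here we only use NumPy's matrix_rank as a cross-check on System (5.31), in a sketch of our own:

```python
import numpy as np

s6 = np.sqrt(6.0)
A = np.array([[ 3 * s6,  3.0,  3.0],
              [-3 * s6,  3.0,  3.0],
              [ 0.0,    -6.0, -6.0]])
b = np.array([s6, s6, s6])

# rank(A) = 2 but rank([A | b]) = 3, so the system is inconsistent.
print(np.linalg.matrix_rank(A),
      np.linalg.matrix_rank(np.column_stack([A, b])))  # 2 3
```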
Interactive Illustration 5.13:No solution example:
three planes that intersect each other pairwise, but
all three never intersect in a single point. The dashed lines are the
intersections between two planes, and represent positions where
two of the equations are satisfied but not the third.
Interactive Illustration 5.13:No solution example:
this is another case where there exists no solution since all three
planes are parallel to each other.
Interactive Illustration 5.13:No solution example:
in this case, two of the planes are identical. The corresponding
area is dashed to illustrate that two of the equations are
satisfied here. However, since no point satisfies also the third
equation, there exists no solution to the system of equations.
Interactive Illustration 5.13:No solution example:
even if only two of the planes are parallel to each other, there
exists no solution to the system of equations. The dashed lines
show where two of the equations are satisfied, while the third
equation is represented by the other parallel plane.
If all three planes are parallel and non-coinciding, then there is no
solution. This is shown in the second step of
Figure 5.13.
A special case of this is when two equations represent the same plane,
but the third equation represents a different plane parallel to, but not coinciding with, these.
This example is shown in the third step of
Figure 5.13, where the dashed plane
satisfies two of the equations and the other plane satisfies the third
equation. Again, no solution to this system of equations exists.
Finally, the last case when no solution exists is when two of the
planes are parallel to each other (and different from each other).
The third plane then intersects these two parallel planes along two
lines (dashed in step 4 in Interactive Illustration 5.13), but there is not a single
point that belongs to all three planes, and hence there exists no
solution to the system of equations.
In summary, we can see that a system of equations either has no solution, exactly one solution, or infinitely many solutions.
Such a system of equations is called homogeneous, and it always has
the solution $x_1 = x_2 = x_3 = 0$. Since this solution is so easy to find, it is often called the trivial solution. Sometimes, however, the system of equations has more
than the trivial solution. We use Gaussian elimination to see if this
is the case. The first step is to eliminate $x_1$ from the two
lower-most equations. For the middle equation we take two times the
first equation minus the second equation. For the last equation we
take the first equation minus the last equation, resulting in
Solving for $x_2$ gives $x_2 = -\frac{5}{8}t$, and inserting this in
the first equation gives $2 x_1 + 5 (-\frac{5}{8})t + 3 t = 0$.
This can be simplified to $x_1 = \frac{1}{16}t$. We have
As we have seen in Section 3.6, planes
and lines can be written in both implicit and explicit form. Using
Gaussian elimination, it is possible to convert between these two
forms. Recall that the plane $P(t_1, t_2) = S + t_1 \vc{d}_1 + t_2 \vc{d}_2$ is
in explicit form, while $\vc{n}\cdot(P-S)=0$ is a plane in implicit form.
Assume we have the following plane
\begin{equation}
\begin{cases}
x &= \class{hidden}{0 +} \frac{3}{2} t_1, \\
y &= \class{hidden}{0 + \frac{3}{2}} t_1 + t_2, \\
z &= 1 \class{hidden}{\frac{3}{2}}-t_1 + 2 t_2,
\end{cases}
\end{equation}
(5.39)
where $S = (0, 0, 1)$, $\vc{d}_1 = (\frac{3}{2}, 1, -1)$, and $\vc{d}_2 = (0, 1, 2)$.
This can be transformed to implicit form by using Gaussian
elimination to remove the variables $t_1$ and $t_2$. To see how this
can be done, the system of equations is first rewritten so that the
variables to be eliminated are in the leftmost position, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rl}
t_1 + t_2 & \bsfem = y,\\
-t_1 + 2 t_2 & \bsfem = z - 1,\\
\frac{3}{2} t_1 & \bsfem = x.
\end{array}
\end{cases}
\end{equation}
To eliminate $t_1$ in the second row, the two top rows are added. To
eliminate $t_1$ in the last row, the first row is multiplied by
$\frac{3}{2}$ and then the last row is subtracted, resulting in
\begin{equation}
\begin{cases}
\begin{array}{rl}
t_1 + t_2 & \bsfem = y,\\
3 t_2 & \bsfem = y + z - 1,\\
\frac{3}{2} t_2 & \bsfem = \frac{3}{2} y - x.
\end{array}
\end{cases}
\end{equation}
It is now possible to eliminate $t_2$ from the last equation by
subtracting two times the last row from the middle row, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rl}
t_1 \class{hidden}{2}+ t_2 & \bsfem = y,\\
3 t_2 & \bsfem = y + z - 1,\\
0 & \bsfem= y + z - 1 - 2(\frac{3}{2}y - x).\\
\end{array}
\end{cases}
\end{equation}
(5.43)
The last row simplifies to $2x - 2y +z - 1 = 0$, which is the implicit
form of the plane equation. To go from implicit to explicit form is
easier. Setting $y=t_1$ and $z = t_2$ gives the equation system
\begin{equation}
\begin{cases}
\begin{array}{rlll}
x &\bsfem = \frac{1}{2} + t_1 -\frac{1}{2}t_2, \\
y &\bsfem = \class{hidden}{\frac{1}{2} } +t_1, \\
z &\bsfem = \class{hidden}{\frac{1}{2} + t1 \frac{1}{2}} + t_2.
\end{array}
\end{cases}
\end{equation}
(5.45)
This is now in explicit form, i.e., $P(t_1, t_2) = \hat{S} + t_1\hat{\vc{d}}_1 + t_2 \hat{\vc{d}}_2$, where
$\hat{S}=(\frac{1}{2},0,0)$, $\hat{\vc{d}}_1 = (1, 1, 0)$,
and $\hat{\vc{d}}_2 = (-\frac{1}{2}, 0, 1)$. Note that this explicit
form does not look like the one
(Equation (5.39)) we started out with, i.e.,
$\hat{S} \neq S$, $\hat{\vc{d}}_1 \neq \vc{d}_1$, and $\hat{\vc{d}}_2 \neq \vc{d}_2$. Yet these two represent the same plane. The
explanation is simply that there exist many different
parameterizations for a particular plane. This is shown in
Figure 5.14, where the point $S$, the vectors
$\vc{d}_1$ and $\vc{d}_2$ are shown in red and represent the plane
used in our example. However, one may also use another point
$\hat{S}$ and two other vectors $\hat{\vc{d}}_1$, $\hat{\vc{d}}_2$,
shown in blue, to represent the same plane.
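One way to convince oneself that the two parameterizations describe the same plane is to compare them numerically: both pairs of direction vectors must give parallel normals (via the cross product), and both support points must satisfy the implicit equation. A sketch of this check (our own illustration):

```python
import numpy as np

# Original parameterization (5.39) and the recomputed one (5.45).
S,  d1,  d2  = np.array([0., 0., 1.]), np.array([1.5, 1., -1.]), np.array([0., 1., 2.])
Sh, d1h, d2h = np.array([.5, 0., 0.]), np.array([1., 1., 0.]), np.array([-.5, 0., 1.])

# Both pairs of direction vectors give parallel normals...
print(np.cross(d1, d2), np.cross(d1h, d2h))  # [ 3. -3.  1.5] = 3 * [ 1. -1.  0.5]

# ...and both support points satisfy the implicit form 2x - 2y + z - 1 = 0.
n = np.array([2., -2., 1.])
print(n @ S - 1, n @ Sh - 1)  # 0.0 0.0
```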
The sections above have given examples of Gaussian elimination along
with geometrical interpretations where this was possible. In the
following section, we will go through the theory of why Gaussian
elimination works.
Using Gaussian elimination, we arrived at a triangular system in
(5.19), which had the solution $\vc{x} = (x_1,x_2,x_3) = (4, 5, 6)$. But how can we be sure that this is a
solution also to the original system above? It is of course possible
to try the solution. Inserting the solution, $\vc{x}$, into the above system gives
which matches exactly. However, even though this particular case has been
verified, it would be desirable to have a proof that Gaussian elimination
will never change the solutions of a system of equations.
Here it is useful to introduce the notion of implication $\Rightarrow$ and equivalence $\Leftrightarrow$.
We say that one system of equations implies a second system of equations if every solution
to the first system is also a solution to the second system. When this works both ways we say that
the two systems of equations are equivalent.
Thus we would write
to say that the set of solutions to the second system is exactly the same as the solutions to the first system.
But how can we be sure that the solutions to the systems of equations do not change as we are manipulating them?
This is captured by the following theorem, which is a repetition
of Theorem 5.1 for the reader's convenience.
Theorem 5.2: Gaussian Elimination Rules
The solutions to a linear system of equations do not change if we
swap the order of two equations,
multiply an equation by a constant $\neq 0$, or
add a multiple of another equation to an equation.
The first rule of Gaussian elimination in
Theorem 5.2 says it is allowed to swap the
order of two equations. It should be readily appreciated that simple
reordering of the equations does not change the solution of the system
of equations, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rrrl}
a_{11} x & \bsfem \, + a_{12} y & \bsfem \, + a_{13} z & \bsfem \, = b_1 , \\
a_{21} x & \bsfem \, + a_{22} y & \bsfem \, + a_{23} z & \bsfem \, = b_2 ,
\end{array}
\end{cases}
\Leftrightarrow
\begin{cases}
\begin{array}{rrrl}
a_{21} x & \bsfem \, + a_{22} y & \bsfem \, + a_{23} z & \bsfem \, = b_2 , \\
a_{11} x & \bsfem \, + a_{12} y & \bsfem \, + a_{13} z & \bsfem \, = b_1 .
\end{array}
\end{cases}
\end{equation}
(5.49)
The next rule allows us to multiply an equation with a constant different
from $0$. From algebra, it should be clear that multiplying both sides of
an equation by a non-zero constant does not change the solution to the
equation, and hence does not affect the solution to the system of
equations.
\begin{equation}
\begin{cases}
\begin{array}{rrrll}
a_{11} x & \bsfem \, + a_{12} y & \bsfem \, + a_{13} z & \bsfem \, = b_1 , & \bsfem \, \, \, (i) \\
a_{21} x & \bsfem \, + a_{22} y & \bsfem \, + a_{23} z & \bsfem \, = b_2 , & \bsfem \, \, \, (ii)
\end{array}
\end{cases}
\Leftrightarrow
\begin{cases}
\begin{array}{rrrll}
ka_{11} x & \bsfem \, + ka_{12} y & \bsfem \, + ka_{13} z & \bsfem \, = kb_1 , & \bsfem \, \, \, (i') = k (i)\\
a_{21} x & \bsfem \, + a_{22} y & \bsfem \, + a_{23} z & \bsfem \, = b_2 . & \bsfem \, \, \, (ii') = (ii)
\end{array}
\end{cases}
\end{equation}
(5.50)
Notice that we can obtain the original system of equations by dividing $(i')$ by $k$.
Thus every solution to the equations $(i), (ii)$ is a solution to $(i'), (ii')$ and vice versa.
Multiplying by zero on the other hand can change the
solution. As an example, $(x,y) = (2, 3)$ does not satisfy the
equation
\begin{equation}
x + y = 1, \, \, \, (i)
\end{equation}
but it does satisfy the equation obtained by multiplying $(i)$ by zero,
\begin{equation}
0 \cdot (x + y) = 0 \cdot 1. \, \, \, (i') = 0 \cdot (i)
\end{equation}
Here every solution to $(i)$ is a solution to $(i')$, but there might be solutions to $(i')$ that are not solutions to $(i)$.
We cannot retrieve the equation $(i)$ from $(i')$.
The third rule allows us to add a multiple of another row to a
row. Assume we have a solution $(x, y, z)$ that satisfies the system
\begin{equation}
\begin{cases}
\begin{array}{rrrll}
a_{11} x & \bsfem \, + a_{12} y & \bsfem \, + a_{13} z & \bsfem \, = b_1, & \bsfem \, \, \, (i)\\
a_{21} x & \bsfem \, + a_{22} y & \bsfem \, + a_{23} z & \bsfem \, = b_2, & \bsfem \, \, \, (ii)\\
a_{31} x & \bsfem \, + a_{32} y & \bsfem \, + a_{33} z & \bsfem \, = b_3. & \bsfem \, \, \, (iii)\\
\end{array}
\end{cases}
\end{equation}
(5.53)
By adding $k$ times the first row to the
second, we obtain
\begin{equation}
\begin{cases}
\begin{array}{rrrrll}
a_{11} x & \bsfem \, + a_{12} y & \bsfem \, + a_{13} z & \bsfyra = & \bsfem b_1, & \bsfem \, \, \, (i')=(i) \\
(a_{21}+ka_{11}) x & \bsfem \, + (a_{22}+ka_{12}) y & \bsfem \, + (a_{23}+ka_{13}) z & \bsfyra = & \bsfem b_2 + k b_1, & \bsfem \, \, \, (ii') = (ii)+k(i)\\
a_{31} x & \bsfem \, + a_{32} y & \bsfem \, + a_{33} z & \bsfyra = & \bsfem b_3. & \bsfem \, \, \, (iii') = (iii) \\
\end{array}
\end{cases}
\end{equation}
(5.54)
Since only standard algebra rules have been used here (e.g., adding a constant to both sides of an equation, which does not change
the equation itself),
a solution $(x,y,z)$ to the original System (5.53) must
also be a solution to
System (5.54) above.
It is important to notice that we can obtain the original
System (5.53) from the new System (5.54) by
adding $-k$ times the first row $(i')$ to the second row $(ii')$.
Thus every solution to System (5.54) must
also be a solution to
System (5.53) above.
Hence we have proved that the two systems are equivalent.
This chapter has given examples of systems of linear equations that
either have no solution, one solution, or an infinite number of
solutions. This section will go through the general case and formally
prove that this is the case.
Theorem 5.3: Solutions to a System of Linear Equations
A system of linear equations has either no solution, exactly one solution, or an infinite number of solutions.
To prove this, consider a general system of linear equations,
\begin{equation}
\begin{cases}
a_{11} x_1 + a_{12} x_2 + \dots + a_{1M} x_M = b_1,\\
a_{21} x_1 + a_{22} x_2 + \dots + a_{2M} x_M = b_2,\\
\quad \vdots\\
a_{N1} x_1 + a_{N2} x_2 + \dots + a_{NM} x_M = b_N,
\end{cases}
\end{equation}
where $a_{kl}$ and $b_k$ are known constants and where $x_k$ are the
unknown variables. We will first work with the first variable $x_1$. Two things can happen. Either all of the coefficients $a_{i1}$ are zero. Then there is nothing more we can do with this variable. In Example 5.5 we describe this case in more detail. The more typical case is that at least one of the coefficients $a_{i1}$ is non-zero. In this case it is
always possible to reorder the equations, using the first rule of Gaussian elimination, to a system such as the above
where $a_{11}\neq 0$. It is then possible to add $-(a_{21}/a_{11})$
times the first row to the second row, and $-(a_{31}/a_{11})$ times the
first row to the third row and so forth, and hence eliminate $x_1$
from all equations except the first. Calling the new constants
$a'_{kl}$ and $b'_k$ gives
After this step, the coefficient in front of $x_1$ is non-zero in the first equation only, or all of the coefficients in front of $x_1$ are zero.
Now the same procedure can be applied to the smaller system of primed
constants, namely, all rows in (5.56) that do not involve $x_1$. Again there are two cases. Either all of these equations have a coefficient of zero for the variable $x_2$ that we are trying to eliminate, or there is at least one equation where there is a non-zero coefficient for $x_2$. After reordering and elimination, either all equations have zero coefficient for the variable $x_2$, or there is exactly one equation where there is a non-zero coefficient for $x_2$.
This process is continued until there are either no more equations or no more unknown variables left to eliminate.
In the end, there might be equations left of the type $0 = 0$. These can just be removed.
There are three possible outcomes.
The first case is if there is at least one equation of type $0 = \hat{b}_{i} \neq 0$,
in which case there is no solution to the system. An example of this is shown in the System of Equations (5.33).
For the continued discussion we assume that this is not the case.
The second case is when the resulting system is triangular, i.e.,
\begin{equation}
\begin{cases}
\begin{array}{rl}
\hat{a}_{11} x_1 + \hat{a}_{12} x_2 + \dots + \hat{a}_{1M} x_M & \bsfem = \hat{b}_1,\\
\hat{a}_{22} x_2 + \dots + \hat{a}_{2M} x_M & \bsfem = \hat{b}_2,\\
\ddots \quad\quad\vdots\; & \\
\hat{a}_{MM} x_M & \bsfem = \hat{b}_M,
\end{array}
\end{cases}
\end{equation}
where $\hat{a}_{kl}$ and $\hat{b}_k$ are the constants after a full pass of Gaussian
elimination. In this case, the variable $x_M$ can be exactly recovered, and
this result can be used to recover $x_{M-1}$ from the second last
equation and so forth. Thus the system has one unique solution.
The third possibility is that there are fewer equations than unknowns after elimination, e.g.,
As can be seen in this example, there are $3$ equations with non-zero leading coefficients in front of the $3$ variables $x_1$, $x_3$, and $x_4$.
There are in fact no constraints on the remaining two variables, i.e., $x_2$ and $x_5$.
One can then set $x_2 = t_1$, $x_5 = t_2$ and solve for the remaining variables. In this case
After this, the remaining variables (in this case $x_1$ and $x_3$) can
be solved for. Since the solutions depend on a set of variables $\{t_1, t_2\}$, which can take on any values,
we have an infinite number of
solutions.
Thus, the system has either no solution, one solution, or an infinite number of
solutions.
$\square$
Note that for the case of an infinite number of solutions, it is also interesting to understand how many free variables there are in the
solution set, or, phrased differently, what the dimension of the solution set is. This is captured by the concept of $\rank$,
which will be explained in Chapter 8.
Example 5.5: Vanishing Coefficients As mentioned in the proof above, sometimes when performing Gaussian elimination, a certain variable
may get a zero coefficient in all remaining equations. In this case
one may simply need to skip one step, as is illustrated in this
example,
We eliminate $x_1$ in the second line by adding $-2$ times the first equation to the second equation, and we eliminate $x_1$ in the third line by adding $-3$ times the first equation to the last equation. We then get
Normally after having eliminated $x_1$, the next step would
be to try to eliminate $x_2$. But in this case the
coefficient in front of $x_2$ is zero both for the second and third
line. We then simply skip this and go to the next variable
$x_3$. The same happens here, so we skip to the last variable,
$x_4$. Eliminating that just gives the equation $0=0$ so we can
rewrite the system as
In this section, we will introduce linear dependence and linear independence, which are two important
concepts in linear algebra. However, we start with
the linear combination, which is another concept that is also often used. The definition follows below.
Definition 5.1: Linear Combination
The vector, $\vc{u}$, is a linear combination of the vectors, $\vc{v}_1,\dots,\vc{v}_n$, when
$\vc{u}$ is expressed as
\begin{equation}
\vc{u} = k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_n \vc{v}_n,
\end{equation}
where $k_1,\dots,k_n$ are scalar values.
For the column vector notation
(Definition 2.5), we saw, for example,
that $\vc{w} = w_x \vc{e}_1 + w_y \vc{e}_2 + w_z \vc{e}_3$. That is, the
three-dimensional vector, $\vc{w}$, is a linear combination of the
basis vectors, $\vc{e}_1$, $\vc{e}_2$, and $\vc{e}_3$, and the scalar
values are $w_x$, $w_y$, and $w_z$.
Using the definition of the linear combination, we will now look into
sets of vectors that are linearly dependent and linearly independent.
Definition 5.2: Linear Independence and Dependence
The set of vectors, $\vc{v}_1,\dots,\vc{v}_n$, is said to be linearly independent, if the equation
\begin{equation}
k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_n \vc{v}_n = \vc{0}
\end{equation}
only has a single solution, which is
\begin{equation}
k_1 = k_2 = \dots = k_n = 0.
\end{equation}
If there is at least one other solution, then the set of vectors is linearly dependent.
So, we arrive at $(k_1, k_2, k_3) = (0,0,0)$, which proves that the standard basis
is linearly independent.
Theorem 5.4: Linear combination and linear dependence
One of the vectors $\vc{v}_1, \vc{v}_2, \dots, \vc{v}_n$ can be written as a linear combination of the others if and only if the vectors $\vc{v}_1, \vc{v}_2, \dots, \vc{v}_n$ are linearly dependent.
Assume first that the vectors are linearly dependent. Then, by Definition 5.2, the equation
\begin{equation}
k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_n \vc{v}_n = \vc{0}
\end{equation}
has a solution where at least one of $k_1, k_2, \dots, k_n$ is not
zero. Assume that the last value $k_n$ is such a non-zero
value. (We can always reorder the indices so that this is true.)
We can then move the $n$th term to the other side,
\begin{equation}
\vc{v}_n = -\frac{k_1}{k_n}\vc{v}_1 - \frac{k_2}{k_n}\vc{v}_2 - \dots - \frac{k_{n-1}}{k_n}\vc{v}_{n-1}.
\end{equation}
As we can see, we have now written the vector $\vc{v}_n$ as a linear combination of the other vectors.
The proof in the other direction is simpler. Assume that one of the vectors can be written as a linear combination of the others. After renumbering, assume it is vector $\vc{v}_n$. We then get
\begin{equation}
\vc{v}_n = k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_{n-1}\vc{v}_{n-1}
\quad\Leftrightarrow\quad
k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_{n-1}\vc{v}_{n-1} + k_n\vc{v}_n = \vc{0},
\end{equation}
where not all $k_j$ are zero, since $k_n$ equals $-1$. This means that the vectors $\vc{v}_1, \vc{v}_2, \dots \vc{v}_n$ are linearly dependent.
$\square$
Example 5.7: The two vectors $\vc{u} = (1, 2, 3)$ and $\vc{v} = (2, 4, 6)$ are linearly dependent. This is true since $\vc{v}$ can be written as a linear combination of $\vc{u}$ according to $\vc{v} = 2\vc{u}$.
Example 5.8: The three vectors $\vc{u} = (1, 2, 3)$, $\vc{v} = (2, 4, 6)$, and $\vc{w} = (1, 0, 7)$ are linearly dependent. This is true since $\vc{v}$ can be written as a linear combination of $\vc{u}$ and $\vc{w}$ according to $\vc{v} = 2\vc{u} + 0\vc{w}$. From this one can draw the conclusion that adding a vector to a set of vectors that are already linearly dependent can never make the new set linearly independent.
Example 5.9: The single vector $\vc{v} = (0, 0, 0)$ is linearly dependent. This follows from Definition 5.2 since $k_1\vc{v} = \vc{0}$ for non-zero values of $k_1$, for instance $k_1 = 5$. By the same definition, a set that consists of a single nonzero vector is linearly independent.
Example 5.10: Examine if the three vectors $\vc{u} = (1, 2, 3)$, $\vc{v} = (2, 3, 4)$, $\vc{w} = (-1, 0, 1)$ are linearly independent.
There is no obvious way to see that the vectors are linearly dependent. As an example, none of the vectors are equal to $(0, 0, 0)$, and no vector is a simple multiple of another vector. We go back to the definition, which states that the vectors are linearly independent if and only if
\begin{equation}
k_1\vc{u} + k_2\vc{v} + k_3\vc{w} = \vc{0}
\end{equation}
(5.78)
only has the solution $k_1 = k_2 = k_3 = 0$.
The system of equations is underdetermined and there is an infinite number of solutions. Setting $k_3 = t$ and inserting in the second equation gives $k_2 = 2t$, and inserting these two values in the first equation gives
\begin{equation}
k_1 + 4 t - t = 0,
\end{equation}
(5.82)
which can be simplified to $k_1 = -3t$. Since $t$ can be non-zero, we have that the three vectors are linearly dependent. As an example, using $t = 1$, we get $k_1 = -3$, $k_2 = 2$ and $k_3 = 1$, which can be inserted into Equation (5.78) yielding
\begin{equation}
-3\vc{u} + 2\vc{v} + \vc{w} = \vc{0}.
\end{equation}
The last equation gives $k_3 = 0$, which inserted into the second equation gives $k_2 = 0$, and inserting these two values into the top equation gives $k_1=0$. We thus have that $(k_1, k_2, k_3) = (0, 0, 0)$ is the only solution and the vectors $\vc{u}$, $\vc{v}$, and $\vc{w}'$ are therefore linearly independent.
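These independence checks can be mechanized: the vectors are linearly independent exactly when the matrix with the vectors as columns has full rank, since a rank deficit means the homogeneous system has non-trivial solutions. Here is a NumPy sketch of our own (rank is treated formally in Chapter 8), using $\vc{w}' = (-1, 0, 2)$ as in the span example further below:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([2.0, 3.0, 4.0])
w = np.array([-1.0, 0.0, 1.0])
w2 = np.array([-1.0, 0.0, 2.0])  # the modified vector w'

# Full rank (here 3) means only the trivial solution exists.
print(np.linalg.matrix_rank(np.column_stack([u, v, w])))   # 2 -> dependent
print(np.linalg.matrix_rank(np.column_stack([u, v, w2])))  # 3 -> independent
```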
Theorem 5.5: The Basis Theorem
Two vectors $\vc{u}$ and $\vc{v}$ in the plane constitute a basis if and only if they are linearly independent.
Three vectors $\vc{u}$, $\vc{v}$, and $\vc{w}$ in 3D space constitute a basis if and only if they are linearly independent.
Three or more vectors in the plane are always linearly dependent.
Four or more vectors in 3D space are always linearly dependent.
The first two statements go in both directions, so we mark the proof in the forward direction with $\Rightarrow$ and in the backward direction with $\Leftarrow$.
1 $\Rightarrow$: Assume $\vc{u}$ and $\vc{v}$ are linearly independent. Theorem 5.4 then gives that $\vc{u}$ cannot be written as a linear combination of $\vc{v}$, i.e., $\vc{u} \neq \lambda \vc{v}$ and likewise $\vc{v} \neq \lambda \vc{u}$. This means that $\vc{u}$ and $\vc{v}$ are not parallel and Theorem 2.4 then gives that they constitute a basis in the plane.
1 $\Leftarrow$: Assume $\vc{u}$ and $\vc{v}$ constitute a basis in the plane. We will now prove that this means that $\vc{u}$ and $\vc{v}$ are linearly independent. Assume they are not linearly independent, but anyway constitute a basis, i.e., the expression $x_u \vc{u} + x_v \vc{v}$ can reach every point in the plane. Since they are linearly dependent, Theorem 5.4 states that we can write $\vc{u} = \lambda \vc{v}$ or $\vc{v} = \lambda\vc{u}$. Now, assume further that we can write $\vc{u}$ as $\vc{u} = \lambda \vc{v}$ (this can always be done by switching labels of $\vc{u}$ and $\vc{v}$). Assume we want to describe a new vector $\vc{w}$ in this basis using $\vc{w} = x_u \vc{u} + x_v \vc{v}$. This equals $\vc{w} = x_u \lambda \vc{v} + x_v \vc{v} = (x_u \lambda + x_v) \vc{v} = \mu \vc{v}$, where $\mu$ is a scalar equal to $(x_u \lambda + x_v)$. This means that only vectors $\vc{w}$ parallel to $\vc{v}$ can be described using $x_u \vc{u} + x_v \vc{v}$, and hence $\vc{u}$, $\vc{v}$ cannot be a basis for the full plane. This is a contradiction, so $\vc{u}$ and $\vc{v}$ must be linearly independent.
2 $\Rightarrow$: Theorem 2.5 states that if there is no plane that is parallel to $\vc{u}$, $\vc{v}$ and $\vc{w}$, then the three vectors constitute a basis in three-dimensional space.
Assume that there is a plane $\pi$ parallel to all three vectors. Since it is parallel to $\vc{u}$ and $\vc{v}$, it can be written as
\begin{equation}
\vc{p} + s\vc{u} + t\vc{v},
\end{equation}
where $\vc{p}$ is a point in the plane and $s$ and $t$ are scalars. If $\pi$ is parallel to $\vc{w}$, it means that if we have a point $\vc{q}$ in the plane, then the point $\vc{q} + \vc{w}$ will also be in the plane. That $\vc{q}$ is in the plane means that there are two scalars $s_1$ and $t_1$ such that
\begin{equation}
\vc{q} = \vc{p} + s_1\vc{u} + t_1\vc{v},
\end{equation}
and that $\vc{q} + \vc{w}$ is in the plane means that there are scalars $s_2$ and $t_2$ such that $\vc{q} + \vc{w} = \vc{p} + s_2\vc{u} + t_2\vc{v}$. Subtracting the first expression from the second gives $\vc{w} = (s_2 - s_1)\vc{u} + (t_2 - t_1)\vc{v}$.
Thus $\vc{w}$ can be written as a linear combination of $\vc{u}$ and $\vc{v}$, and according to Theorem 5.4 this means that the vectors are not linearly independent. Hence there cannot be a plane that is parallel to all three vectors $\vc{u}$, $\vc{v}$ and $\vc{w}$ if they are linearly independent. And thus, by Theorem 2.5, three linearly independent vectors $\vc{u}$, $\vc{v}$, $\vc{w}$ constitute a basis in three-dimensional space.
2 $\Leftarrow$: Assume $\vc{u}$, $\vc{v}$ and $\vc{w}$ constitute a basis in three-dimensional space. We will now prove that this means that $\vc{u}$, $\vc{v}$ and $\vc{w}$ are linearly independent. Assume they are not linearly independent, but anyway constitute a basis, i.e., the expression $x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$ can reach every point in space. Since they are linearly dependent, Theorem 5.4 says that we can write at least one of the vectors as a linear combination of the two others. Assume that one of them is $\vc{u}$ (otherwise rename the vectors); we can now write $\vc{u}$ as $\vc{u} = \lambda_v \vc{v} + \lambda_w \vc{w}$. Now we want to describe a vector $\vc{q}$ in this basis using $\vc{q} = x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$. Inserting the linear expression for $\vc{u}$ gives $\vc{q} = x_u (\lambda_v \vc{v} + \lambda_w \vc{w}) + x_v \vc{v} + x_w \vc{w} = (x_u \lambda_v + x_v) \vc{v} + (x_u \lambda_w + x_w) \vc{w} = \mu_v \vc{v} + \mu_w \vc{w}$, where $\mu_v$ and $\mu_w$ are scalars. This means that only vectors $\vc{q}$ parallel to the plane spanned by $\vc{v}$ and $\vc{w}$ can be described using $x_u \vc{u} + x_v \vc{v} + x_w \vc{w}$, and hence $\vc{u}$, $\vc{v}$, $\vc{w}$ cannot be a basis for the full three-dimensional space. This is a contradiction.
Hence, they are linearly independent.
3: Assume $\vc{u}$, $\vc{v}$ and $\vc{w}$ are vectors in the plane. First study $\vc{u}$ and $\vc{v}$. If they are linearly dependent, i.e., parallel, then the entire set of vectors $\vc{u}$, $\vc{v}$, $\vc{w}$ is also linearly dependent and we are finished. However, if $\vc{u}$ and $\vc{v}$ are linearly independent, then according to (1) they constitute a basis in the plane. This means that $\vc{w}$ can be written as a linear combination
\begin{equation}
\vc{w} = \lambda_1\vc{u} + \lambda_2\vc{v},
\end{equation}
which, according to Definition 5.2, means that the three vectors are linearly dependent since $k_1 = \lambda_1$, $k_2 = \lambda_2$, $k_3 = -1$ satisfy Equation (5.67).
In this section, the concept of span will be introduced. It is useful when talking about how much of a vector space
a set of vectors can "occupy" as a linear combination. We start with the definition.
Definition 5.3: Span
The set of vectors $\{\vc{v}_1,\dots,\vc{v}_q\}$ in $\R^n$ is said to span $\R^n$, if the equation
\begin{equation}
k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_q \vc{v}_q = \vc{u},
\end{equation}
has at least one solution, for every vector $\vc{u}$.
The system of equations does not have a solution for every $\vc{u} = (u_1, u_2, u_3)$. In fact there is only a solution if the vector lies in the plane defined by $u_1-2 u_2+u_3 = 0$. So the three vectors do not span $\R^3$.
If we instead change the last vector to $\vc{w}' = (-1, 0, 2)$, the resulting system of equations is solvable for every $\vc{u} = (u_1, u_2, u_3)$. After Gaussian elimination, the last equation gives $k_3 = u_1-2 u_2+u_3$, which inserted into the second equation gives $k_2 = 4u_1-5u_2+2u_3$, and inserting these two values into the top equation gives $k_1=-6 u_1 + 8 u_2 - 3 u_3$. We thus have a unique solution for every $\vc{u}$, so the three vectors do span $\R^3$.
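To make this concrete, here is a minimal numerical sketch, assuming NumPy. The first two vectors are not shown in this excerpt; the values $(1,2,3)$ and $(2,3,4)$ below are an assumption, chosen to be consistent with the plane $u_1-2u_2+u_3=0$ and with the solution formulas above.

```python
import numpy as np

# Assumed first two vectors (not given in the text); w_prime is from the text.
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([2.0, 3.0, 4.0])
w_prime = np.array([-1.0, 0.0, 2.0])

A = np.column_stack([v1, v2, w_prime])   # vectors as columns of A
u = np.array([1.0, 1.0, 1.0])

# Solve A k = u; the formulas above give k1 = -1, k2 = 1, k3 = 0 for this u.
k = np.linalg.solve(A, u)
print(k)                      # [-1.  1.  0.]
print(np.allclose(A @ k, u))  # True: the linear combination reaches u
```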
Theorem 5.6: Requirements of a basis
A set of vectors $\{\vc{v}_1, \vc{v}_2, \ldots, \vc{v}_q\}$ is a basis in $\R^n$ if and only if the vectors are linearly independent and span $\R^n$.
Assume that the vectors $\vc{v}_1, \vc{v}_2, \ldots, \vc{v}_q$ are a basis of $\R^n$. Then, by
Definition 2.10, there is a solution to
$k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_q \vc{v}_q= \vc{u}$ for every $\vc{u}$, so the vectors span $\R^n$
according to Definition 5.3.
To see that the vectors are linearly independent, we note that the equation
$ \sum_{i=1}^q k_i \vc{v}_i = \vc{0}$ has one solution, $k_1=0, k_2=0, \ldots, k_q=0$. According to the definition of a basis
(Definition 2.10), there is only one solution. This proves that the vectors are linearly independent.
To complete the proof in the other direction, we assume that the vectors $\vc{v}_1, \vc{v}_2, \ldots, \vc{v}_q$
span $\R^n$ and are linearly independent. We need to show that the equation
$\sum_{i=1}^q k_i \vc{v}_i = \vc{u}$ has exactly one solution for every $\vc{u}$. Since the vectors span $\R^n$, there is at least one solution for every $\vc{u}$. Assume that there are two solutions, $(k_1, k_2, \ldots, k_q)$ and
$(k^\prime_1, k^\prime_2, \ldots, k^\prime_q)$. If we subtract the equation
$\sum_{i=1}^q k^\prime_i \vc{v}_i = \vc{u}$ from $\sum_{i=1}^q k_i \vc{v}_i = \vc{u}$, we get
$\sum_{i=1}^q (k_i-k^\prime_i) \vc{v}_i = \vc{0}$. Since the vectors are linearly independent (Definition 5.2), we have
$k_i-k^\prime_i = 0$ for all $i$, which gives $k_i=k^\prime_i$, so the two solutions are the same.
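In practice, Theorem 5.6 gives a simple numerical test: put the vectors as columns of a matrix and check whether the rank equals $n$. Here is a minimal sketch, assuming NumPy; the helper name is our own.

```python
import numpy as np

def is_basis(vectors):
    """Return True if the given list of n vectors is a basis of R^n."""
    A = np.column_stack(vectors)     # vectors as columns
    if A.shape[0] != A.shape[1]:     # a basis of R^n needs exactly n vectors
        return False
    # rank n <=> columns linearly independent <=> columns span R^n
    return np.linalg.matrix_rank(A) == A.shape[0]

print(is_basis([np.array([1, 0, 0]),
                np.array([0, 1, 0]),
                np.array([0, 0, 1])]))  # True: the standard basis
print(is_basis([np.array([1, 2, 3]),
                np.array([2, 3, 4]),
                np.array([3, 5, 7])]))  # False: third = first + second
```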
Assume we have a basis in the plane, i.e., two basis vectors, $\vc{e}_1$ and $\vc{e}_2$, that are non-parallel.
In addition, assume that we have another pair of non-parallel vectors $\hat{\vc{e}}_1, \hat{\vc{e}}_2$ that also constitute a basis.
As we have seen above, these two new basis vectors can be described using the first two basis vectors as
\begin{align}
\hat{\vc{e}}_1 &= x_{11}\vc{e}_1 + x_{12}\vc{e}_2,\\
\hat{\vc{e}}_2 &= x_{21}\vc{e}_1 + x_{22}\vc{e}_2.
\end{align}
A vector $\vc{u}$ can be written in both bases, i.e., $\vc{u} = u_1\vc{e}_1 + u_2\vc{e}_2$ and $\vc{u} = \hat{u}_1\hat{\vc{e}}_1 + \hat{u}_2\hat{\vc{e}}_2$. Inserting the expressions for $\hat{\vc{e}}_1$ and $\hat{\vc{e}}_2$ into the latter gives
\begin{equation}
\vc{u} = (x_{11}\hat{u}_1 + x_{21}\hat{u}_2)\vc{e}_1 + (x_{12}\hat{u}_1 + x_{22}\hat{u}_2)\vc{e}_2.
\end{equation}
However, since the coordinates of a vector are uniquely determined for a given basis, the number $(x_{11} \hat{u}_1 + x_{21} \hat{u}_2)$ in
front of $\vc{e}_1$ in Equation (5.106) must be the same as the number $u_1$ in front
of $\vc{e}_1$ in Equation (5.103), and the same goes for $u_2$. This is summarized as
\begin{align}
u_1 &= x_{11}\hat{u}_1 + x_{21}\hat{u}_2,\\
u_2 &= x_{12}\hat{u}_1 + x_{22}\hat{u}_2.
\end{align}
Note thus that the equations for the relation between the basis vectors in Equation (5.102) are similar to
the ones between the coordinates above, with two important distinctions: the hats are on the other side, and the rows have
become columns (for instance, the row $x_{11}\ x_{12}$ in Equation (5.102) has become a column above).
We will return to change of basis in Chapter 6, which covers the topic of matrices.
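As a minimal numerical sketch of this change of basis, assuming NumPy; the matrix name and the particular values below are our own illustration, not from the text.

```python
import numpy as np

# X collects the numbers x_ij, with row i holding the coordinates of the
# new basis vector e-hat_i in the old basis (as in Equation (5.102)).
X = np.array([[1.0, 1.0],    # e-hat_1 =  1*e1 + 1*e2
              [-1.0, 2.0]])  # e-hat_2 = -1*e1 + 2*e2

u_hat = np.array([3.0, 2.0])  # coordinates (u-hat_1, u-hat_2) in the new basis

# The coordinate relation derived above uses the columns of X, i.e.,
# u = X^T u_hat -- the "rows have become columns" transposition.
u = X.T @ u_hat
print(u)  # [1. 7.]: u1 = 1*3 + (-1)*2 = 1, u2 = 1*3 + 2*2 = 7
```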
Popup-help:
A basis (plural: bases) is a set of linearly independent (basis) vectors, such that each vector in the space
can be described as a linear combination of the basis vectors.
If the basis is $\{\vc{e}_1,\vc{e}_2,\vc{e}_3\}$, then any vector $\vc{v}$ in the space can be described by
three numbers $(x,y,z)$ such that $\vc{v} = x\vc{e}_1 +y\vc{e}_2 +z\vc{e}_3$. Often the basis is implicit,
in which case we write $\vc{v} = (x,y,z)$, or to make it possible to distinguish between vector elements from
one vector $\vc{v}$ to another $\vc{u}$, we may use $\vc{v}=(v_x,v_y,v_z)$ and $\vc{u}=(u_x,u_y,u_z)$ as notation.
Popup-help:
An $n$-dimensional column vector, $\vc{v}$, is represented with respect to a basis and consists of a column of $n$ scalar values.
The vector elements are sometimes denoted $v_1$, $v_2$,..., $v_n$. For two- and three-dimensional vectors, we sometimes
also use $v_x$, $v_y$, and $v_z$.
The notation is as in Definition 2.5,
where $\vc{u} = u_x \vc{e}_1$, $\vc{v} = v_x \vc{e}_1 + v_y \vc{e}_2$,
and $\vc{w} = w_x \vc{e}_1 + w_y \vc{e}_2 + w_z \vc{e}_3$.
Note that $\vc{e}_i$ are the basis vectors.
In our text, we also use the shorthand notation $\vc{w} = \bigl(w_1,w_2,w_3\bigr)$, which means the same as above
(notice the commas between the vector elements).
A row vector, though, does not have commas between the elements.
Popup-help:
Gaussian elimination is the process where a system of linear equations is reduced so that the number of free variables
on each row decreases by one, until the bottommost row reads $kx_n=l$, i.e., $x_n=l/k$. After that, the rest of the
free variables can be recovered by substitution.
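As an illustration of the substitution step, here is a minimal sketch of back substitution on a triangular system, assuming NumPy; the function and the example system are our own, not from the text.

```python
import numpy as np

def back_substitute(U, b):
    """Solve U x = b where U is upper triangular with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Row i reads U[i,i]*x[i] + sum(U[i,j]*x[j] for j > i) = b[i].
        x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
b = np.array([3.0, 7.0, 8.0])
print(back_substitute(U, b))  # [2. 1. 2.]: x3 = 2, then x2 = 1, then x1 = 2
```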
Popup-help:
A homogeneous system of equations is one that can be written as $\mx{A}\vc{x}=\vc{0}$, i.e., with a zero right-hand side.
Popup-help:
On parameterized form, a line is defined by a starting point, $S$, a line direction, $\vc{d}$, and a scalar
parameter, $t$, such that any point on the line can be obtained from the line equation, i.e.,
$P(t) = S + t\vc{d}$. This representation works in any dimension.
On implicit form, a line is defined by a starting point, $S$, and a normal vector, $\vc{n}$. For any point, $P$,
on the line, it holds that $\vc{n}\cdot (P-S) = 0$. This representation works only in two dimensions.
Popup-help:
A system of equations is called linear if it only contains polynomial terms of the zero:th and first order,
that is, either constants or first-order terms, such as $9x$, $-2y$, and $0.5z$.
Popup-help:
The vector, $\vc{u}$, is a linear combination of the vectors, $\vc{v}_1,\dots,\vc{v}_n$, if
$\vc{u}$ can be expressed as $\vc{u} = k_1\vc{v}_1 + k_2\vc{v}_2 + \dots + k_n \vc{v}_n$.
Popup-help:
The set of vectors, $\vc{v}_1,\dots,\vc{v}_n$, is said to be linearly independent, if the equation
$k_1\vc{v}_1 + k_2 \vc{v}_2 + \dots + k_n \vc{v}_n = \vc{0}$
only has a single solution, which is $k_1 = k_2 = \dots = k_n =0$.
If there is at least one other solution, then the set of vectors is linearly dependent.
Popup-help:
A mapping $F$ is non-linear if it violates one or both of these rules: $F(\vc{u}+\vc{v}) = F(\vc{u}) + F(\vc{v})$ and $F(k\vc{u}) = kF(\vc{u})$, where $k$ is a scalar.
Popup-help:
A normal, $\vc{n}$, is part of the description of the two-dimensional line and the three-dimensional plane on implicit form.
A plane equation is $\vc{n}\cdot (P-S)=0$, where $\vc{n}$ is a vector that is orthogonal to the plane, $P$ is any point on the plane,
and $S$ is a known point on the plane.
Similarly, a line equation is $\vc{n}\cdot (P-S)=0$, where $\vc{n}$ is a vector orthogonal to the line, $P$ is any point on the line, and
$S$ is a particular point on the line. The plane can also be written as $ax+by+cz+d=0$, and the line can be written
as $ax+by+c=0$.
Popup-help:
On parameterized form (also called explicit form), a plane equation is given by $P(t_1, t_2) = S + t_1 \vc{d}_1 + t_2\vc{d}_2$,
where $S$ is a point in the plane and $\vc{d}_1$ and $\vc{d}_2$ are two non-parallel, non-zero vectors
in the plane. Using the parameters $t_1$ and $t_2$, any point $P(t_1,t_2)$ in the plane can be reached.
On implicit form, a plane equation is given by $\vc{n}\cdot(P-S)=0$, where $\vc{n}$ is the normal of
the plane and $S$ is a point on the plane. Only for points, $P$, that lie in the plane is the
formula equal to $0$.
Popup-help:
The standard basis in $n$ dimensions has basis vectors $\vc{e}_i$, where the vector elements are
all zeroes, except the $i$:th element, which is a one. For example, for $n=2$, we have $\vc{e}_1=(1,0)$ and $\vc{e}_2=(0,1)$,
and for $n=3$, we have $\vc{e}_1=(1,0,0)$, $\vc{e}_2=(0,1,0)$, and $\vc{e}_3=(0,0,1)$. As usual, we sometimes
use $\vc{e}_x$, $\vc{e}_y$, and $\vc{e}_z$ instead of $\vc{e}_1$, $\vc{e}_2$, and $\vc{e}_3$.
Popup-help:
A system of (linear) equations is simply a number of (linear) equations that one wants to solve together so
that the solution holds for each and every one of the equations.
Popup-help:
A system of $n$ equations is called triangular if row $i$, where $i\in\{1,\dots,n\}$,
has $n-i+1$ free variables.
Popup-help:
The zero vector, $\vc{0}$, has length 0, and can be created from one point, $P$, as $\vc{0}=\overrightarrow{PP}.$