Chapter 7: Determinants (Immersive Linear Algebra)

Example 7.1: The Fundamental Matrix
In Interactive Illustration 5.2, we saw a 3D reconstruction of a number of so called feature points from a set of images. The problem of calculating both 3D points and at the same time calculating how the camera has moved between images is an interesting problem. A first step of solving this involves understanding how to solve the problem for just two images. One technique involves the so called fundamental matrix that relates corresponding points in two images. The determinant is used to derive this constraint. Assume that two different images of the same scene are given. A point in the left image with coordinates $(x_1,y_1)$ and a point in the right image with coordinates $(x_2,y_2)$ can only be in correspondence if

\begin{equation} \vc{u}_1^\T \mx{F} \vc{u}_2 = 0, \end{equation}

(7.1)

where

\begin{equation} \vc{u}_1 = \begin{pmatrix} x_1\\ y_1\\ 1 \end{pmatrix} \ \ \mathrm{and}\ \ \vc{u}_2 = \begin{pmatrix} x_2\\ y_2\\ 1 \end{pmatrix} \end{equation}

(7.2)

and $\mx{F}$ is a $3 \times 3$ matrix related to the relative motion between the two camera positions. In the example figure, $\mx{F}= \left(\begin{array}{ccc} 1.055 & 58.10 & -58.96 \\ 58.60 & -0.044 & -336.0 \\ -55.52 & 228.3 & 110.5 \end{array}\right)$.

Interactive Illustration 7.1: A point in the left image with coordinates $(x_1,y_1)$ and a point in the right image with coordinates $(x_2,y_2)$ could be in correspondence only if $\vc{u}_1^\T \mx{F} \vc{u}_2 = 0$. Here $u_1 = (x_1,y_1,1)$ and $u_2 = (x_2,y_2,1)$. Try moving the image points to get them in correspondence. Images courtesy of Carl Olsson at Lund University.

Interactive Illustration 7.1: A point in the left image with coordinates $\hid{(x_1,y_1)}$ and a point in the right image with coordinates $\hid{(x_2,y_2)}$ could be in correspondence only if $\hid{\vc{u}_1^\T \mx{F} \vc{u}_2 = 0}$. Here $\hid{u_1 = (x_1,y_1,1)}$ and $\hid{u_2 = (x_2,y_2,1)}$. Try moving the image points to get them in correspondence. Images courtesy of Carl Olsson at Lund University.

$\vc{u}_1 = \left(\begin{array}{l} \hid{0..9} \\ \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{0..} \\ \hid{1} \\ \hid{1} \end{array}\right)$

$\vc{u}_2 = \left(\begin{array}{l} \hid{0..9} \\ \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{0..} \\ \hid{1} \\ \hid{1} \end{array}\right)$

$\vc{u}_1^\T \mx{F} \vc{u}_2 = $

The determinant is a function from a matrix $\mx{A}$ to a value $\det(\mx{A})$. We will later see that the determinant can be defined using three simple properties. However, before delving into the details we will first look at explicit expressions for the determinant in dimensions one, two and three. For $1 \times 1$ matrices

\begin{equation} \mx{A} = \left( \begin{array}{r} a \end{array} \right) \end{equation}

(7.3)

the determinant is simply $\det(\mx{A}) = a$. This is the length, with sign, of the vector

\begin{equation} \vc{a} = \left( \begin{array}{r} a \end{array} \right) . \end{equation}

(7.4)

Here it is straightforward to see that the equation $ \mx{A} \vc{x} = \vc{b} $ has a unique solution when $\det(\mx{A}) = a \neq 0$. When the determinant is zero, there could be no solutions, if $\vc{b} \neq \vc{0}$. As an example for $1\times 1$ matrices, i.e., where the equation reduces to $ax=b$, assume that $a x = 3$ has no solutions if $a=0$. If $b = 0 $, there are infinitely many solutions $x$ since the equation now reads $0 x = 0$.

For a $2 \times 2$ matrix

\begin{equation} \mx{A} = \left( \begin{array}{rr} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{array} \right) \end{equation}

(7.5)

the determinant is $\det(\mx{A}) = a_{11} a_{22} - a_{12} a_{21}$. The determinant is sometimes written with a single straight line as left and right brackets, thus we would write the above as

\begin{equation} \det(\mx{A}) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21} . \end{equation}

(7.6)

If we think of the two columns as the coordinates of two vectors in the plane, i.e.,

\begin{equation} \mx{A} = \begin{pmatrix} \vc{a}_{,1} & \vc{a}_{,2} \end{pmatrix} \end{equation}

(7.7)

then the determinant is the area (with sign) of the parallelogram spanned by $\vc{a}_{,1}$ and $\vc{a}_{,2}$. Recall the notation of column and row vectors from Definition 6.2. This is visualized in Interactive Illustration 7.2. The sign of the determinant is positive (negative) if the two vectors are positively (negatively) oriented. This is illustrated with a green (red) parallelogram in the illustration. Try changing the sign of the determinant in the illustration.

Interactive Illustration 7.2: In this interactive illustration, we visualize a $2\times 2$ matrix, $\mx{A}$, as the two column vectors, $\vc{a}_{,1}$ and $\vc{a}_{,2}$, that it consists of, i.e., $\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)$. Note that the column vectors can be moved in this illustration.

Interactive Illustration 7.2: In this interactive illustration, we visualize a $\hid{2\times 2}$ matrix, $\hid{\mx{A}}$, as the two column vectors, $\hid{\vc{a}_{,1}}$ and $\hid{\vc{a}_{,2}}$, that it consists of, i.e., $\hid{\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)}$. Note that the column vectors can be moved in this illustration.

$\mx{A} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

$\textcolor{#aa0000}{\vc{a}_{,1}}$

$\textcolor{#009000}{\vc{a}_{,2}}$

$\det(\mx{A})=$

For a $3 \times 3$ matrix

\begin{equation} \mx{A} = \left( \begin{array}{rr} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{array} \right) \end{equation}

(7.8)

the determinant is $\det(\mx{A}) = a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} - a_{13} a_{22} a_{31}$. If we think of the three columns as the coordinates of three vectors in space, i.e.,

\begin{equation} \mx{A} = \left( \begin{array}{rrr} \vc{a}_{,1} & \vc{a}_{,2} & \vc{a}_{,3} \end{array} \right) \end{equation}

(7.9)

then the determinant is the volume of the parallelepiped spanned by $\vc{a}_1$, $\vc{a}_2$, and $\vc{a}_3$. In Interactive Illustration 7.3, the volume of the parallelepiped spanned by $\vc{a}_{,1}$, $\vc{a}_{,2}$, and $\vc{a}_{,3}$ is visualized.

Interactive Illustration 7.3: Here, we visualize the three column vectors, $\textcolor{#aa0000}{\vc{a}_{,1}}$, $\textcolor{#00aa00}{\vc{a}_{,2}}$, and $\textcolor{#0000aa}{\vc{a}_{,3}}$ of a $3\times 3$ matrix, $\mx{A}$. Note that the gray, dashed lines are only in this illustration to help reveal the three-dimensional locations of the vectors. Again, as an exercise, it may be useful to attempt to move the vectors so that they form the identity matrix, $\Bigl( \begin{smallmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{smallmatrix} \Bigr)$, for example. Recall that you can change the viewpoint by right-clicking, pressing and moving the mouse, or slide with two fingers pressed on a tablet. It is difficult to percieve the volume of the parallelepiped from a single viewpoint. Try changing the vectors so that the determinant is zero without making any of the vectors to be of zero length. Then rotate the viewpoint. Does it look like the vectors are in a plane?

Interactive Illustration 7.3: Here, we visualize the three column vectors, $\hid{\textcolor{#aa0000}{\vc{a}_{,1}}}$, $\hid{\textcolor{#00aa00}{\vc{a}_{,2}}}$, and $\hid{\textcolor{#0000aa}{\vc{a}_{,3}}}$ of a $\hid{3\times 3}$ matrix, $\hid{\mx{A}}$. Note that the gray, dashed lines are only in this illustration to help reveal the three-dimensional locations of the vectors. Again, as an exercise, it may be useful to attempt to move the vectors so that they form the identity matrix, $\hid{\Bigl( \begin{smallmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{smallmatrix} \Bigr)}$, for example. Recall that you can change the viewpoint by right-clicking, pressing and moving the mouse, or slide with two fingers pressed on a tablet. It is difficult to percieve the volume of the parallelepiped from a single viewpoint. Try changing the vectors so that the determinant is zero without making any of the vectors to be of zero length. Then rotate the viewpoint. Does it look like the vectors are in a plane?

$\mx{A} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \\ \hid{1} \end{array}\right)$

$\textcolor{#aa0000}{\vc{a}_{,1}}$

$\textcolor{#009000}{\vc{a}_{,2}}$

$\textcolor{#0000aa}{\vc{a}_{,3}}$

$\det(\mx{A})=$

In the following, we relax our notation a little so that a column vector $\vc{a}_{,i}$ of a matrix $\mx{A}$ also can be denoted $\vc{a}_i$. There are several different ways one could define the determinant. What definitions one use is a technicality and the end result is the same. Here we define the determinant for a general $n \times n$ matrix $\mx{A}$ using three basic properties according to the following definition.

Definition 7.1: Determinant
The determinant $\det$ is a scalar function of a square matrix $\mx{A}$ fulfilling the following three properties:

\begin{equation} \begin{array}{llr} (i) & \det( \mx{I}) = 1 & \spc\text{(determinant of unit matrix)} \\ (ii) & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_i & \ldots \end{vmatrix} = 0 & \spc\text{(zero if two columns are equal)} \\ (iii) & \begin{vmatrix} \ldots & \lambda_1 \vc{a}_1 + \lambda_2 \vc{a}_2 & \ldots \end{vmatrix} = \lambda_1 \begin{vmatrix} \ldots & \vc{a}_1 & \ldots \end{vmatrix} + \lambda_2 \begin{vmatrix} \ldots & \vc{a}_2 & \ldots \end{vmatrix} & \spc\text{(linear in each column)} \\ \end{array} \end{equation}

(7.10)

It turns out that there is one and only one function that fulfills these three properties. The proof of the existence and uniqueness of the determinant is a bit technical and is of less importance than the properties of the determinant. In order not to obscure the view we leave these proofs for Section 7.3. From these three basic properties follow many other useful properties which we summarize in the Theorem 7.1.

Theorem 7.1: Properties of the determinant

\begin{equation} \begin{array}{llr} (iv) & \begin{vmatrix} \ldots & \vc{0} & \ldots \end{vmatrix} = 0 & \spc\text{(zero if one column is zero)} \\ (v) & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} = - \begin{vmatrix} \ldots & \vc{a}_j & \ldots & \vc{a}_i & \ldots \end{vmatrix} & \spc\text{(swapping columns)} \\ (vi) & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} = \begin{vmatrix} \ldots & \vc{a}_i+\lambda \vc{a}_j & \ldots & \vc{a}_j & \ldots \end{vmatrix} & \spc\text{(adding to another column)} \\ (vii) & \det( \mx{A}) = \det( \mx{A}^\T) & \spc\text{(transpose)} \\ (viii) & \det( \mx{A} \mx{B}) = \det( \mx{A}) \det( \mx{B}) & \spc\text{(product)} \\ (ix) & \det( \mx{A}^{-1}) = \frac{1}{\det( \mx{A})} & \spc\text{(inverse)} \\ \end{array} \end{equation}

(7.11)

Proof:

Property $(iv)$ follows the linearity (property $(iii)$ in the definition). Setting $\lambda_2 = 0$ in property $(iii)$ gives

\begin{equation} \begin{vmatrix} \ldots & \lambda_1 \vc{a}_1 & \ldots \end{vmatrix} = \lambda_1 \begin{vmatrix} \ldots & \vc{a}_1 & \ldots \end{vmatrix} \end{equation}

(7.12)

Since the zero vector $\vc{0}$ can be written as the scalar $0$ times any nonzero vector $\vc{a_1}$, we get that

\begin{equation} \begin{vmatrix} \ldots & \vc{0} & \ldots \end{vmatrix} = \begin{vmatrix} \ldots & 0\ \vc{a}_1 & \ldots \end{vmatrix} = 0 \ \begin{vmatrix} \ldots & \vc{a}_1 & \ldots \end{vmatrix} = 0. \end{equation}

(7.13)

Property $(v)$ can be proved as follows. Form the determinant

\begin{equation} \begin{vmatrix} \ldots & \vc{a}_i + \vc{a}_j & \ldots & \vc{a}_i + \vc{a}_j & \ldots \end{vmatrix} , \end{equation}

(7.14)

which is zero since it has two identical columns. Using the linearity of each column we get

\begin{equation} \begin{array}{ll} 0 = & \begin{vmatrix} \ldots & \vc{a}_i + \vc{a}_j & \ldots & \vc{a}_i + \vc{a}_j & \ldots \end{vmatrix} = \\ & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_i & \ldots \end{vmatrix} + \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} + \\ + & \begin{vmatrix} \ldots & \vc{a}_j & \ldots & \vc{a}_i & \ldots \end{vmatrix} + \begin{vmatrix} \ldots & \vc{a}_j & \ldots & \vc{a}_j & \ldots \end{vmatrix} . \end{array} \end{equation}

(7.15)

The first and last of these terms are zero, since they have two identical columns. We thus have

\begin{equation} 0 = \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} + \begin{vmatrix} \ldots & \vc{a}_j & \ldots & \vc{a}_i & \ldots \end{vmatrix} , \end{equation}

(7.16)

which proves the statement. Property $(vi)$ can be proved as follows. Using the linearity we get

\begin{equation} \begin{array}{ll} & \begin{vmatrix} \ldots & \vc{a}_i + \lambda \vc{a}_j & \ldots & \vc{a}_j & \ldots \end{vmatrix} = \\ & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} + \lambda \begin{vmatrix} \ldots & \vc{a}_j & \ldots & \vc{a}_j & \ldots \end{vmatrix} = \\ & \begin{vmatrix} \ldots & \vc{a}_i & \ldots & \vc{a}_j & \ldots \end{vmatrix} \end{array} , \end{equation}

(7.17)

which proves the statement. We leave the proofs of statements $(vii)$, $(viii)$ and $(ix)$ for later.

$\square$

On the surface, statement $(vi)$ can seem counterintuitive since the determinant is connected to the area or volume spanned by the column vectors, and adding something to a column vector will make it longer. How come the area stays the same even though one of the vectors become longer? The reason is illustrated in Interactive Illustration 7.4.

Interactive Illustration 7.4: In this interactive illustration, we visualize what happens to the area spanned by the column vectors when we add a multiple of one column vector to the other column vector. In step one we have the matrix $\mx{A} = \bigl(\textcolor{#0000aa}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)$ that consists of the two column vectors $\textcolor{#0000aa}{\vc{a}_{,1}}$ and $\textcolor{#00aa00}{\vc{a}_{,2}}$. The area spanned by the two column vectors is marked with blue.

Interactive Illustration 7.4: In the second matrix $\hid{\mx{A'} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1} + \lambda \vc{a}_{,2}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)}$, the first column is the old column plus a multiple $\hid{\lambda}$ times the second column. The second column is left unchanged. The area spanned by the columns of $\hid{\mx{A'}}$ is marked with pink (and purple where the areas overlap). Note that when increasing $\hid{\lambda}$ (using the slider), the vector $\hid{\textcolor{#aa0000}{\vc{a}_{,1} + \lambda \vc{a}_{,2}}}$ becomes longer, but the parallelogram shears increasingly, couteracting the length increase, and the area stays the same. Note how the pink triangle where there is no overlap matches exactly the blue triangle.

$\mx{A} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

$\textcolor{#0000aa}{\vc{a}_{,1}}$

$\textcolor{#009000}{\vc{a}_{,2}}$

$\det(\mx{A})=$

$\lambda=$

$\lambda \vc{a}_{,2}$

$\textcolor{#aa0000}{\vc{a}_{,1} + \lambda \vc{a}_{,2}}$

$\det(\mx{A'})=$

$\mx{A'} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

The definitions and rules above can be used to calculate determinants. Here follows an example.

Example 7.2:
What is the determinant of

\begin{equation} \begin{vmatrix} 1 & 23 & 2 \\ 5 & 12 & 10 \\ 4 & 72 & 8 \end{vmatrix} ? \end{equation}

(7.18)

Notice that the third column is two times $(1,5,4)$. Using the linearity on column three, we have

\begin{equation} \begin{vmatrix} 1 & 23 & 2 \\ 5 & 12 & 10 \\ 4 & 72 & 8 \end{vmatrix} = 2 \begin{vmatrix} 1 & 23 & 1 \\ 5 & 12 & 5 \\ 4 & 72 & 4 \end{vmatrix} . \end{equation}

(7.19)

Now since the last determinant has two equal columns, we see that

\begin{equation} \begin{vmatrix} 1 & 23 & 2 \\ 5 & 12 & 10 \\ 4 & 72 & 8 \end{vmatrix} = 2 \begin{vmatrix} 1 & 23 & 1 \\ 5 & 12 & 5 \\ 4 & 72 & 4 \end{vmatrix} = 0 . \end{equation}

(7.20)

This subsection contains the proof that there actually is a function that is fulfilling the properties in Definition 7.1, that there is only one such function, and also a formula for calculating the determinant. To complete this proof we need to introduce the notion of a permutation of $n$ numbers. The main idea of permutations is really quite simple and is best exemplified by using properties $(i)-(vi)$ on a general $2 \times 2 $ matrix.

Example 7.3:
We study the determinant of a general $2 \times 2$ matrix

\begin{equation} \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} . \end{equation}

(7.21)

Since the first column can be written as the following linear combination

\begin{equation} \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} = a_{11} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_{21} \begin{bmatrix} 0 \\ 1\end{bmatrix} \end{equation}

(7.22)

it follows from the linearity of the determinant on the first column that

\begin{equation} \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11} \begin{vmatrix} 1 & a_{12}\\ 0 & a_{22} \end{vmatrix} + a_{21} \begin{vmatrix} 0 & a_{12}\\ 1 & a_{22} \end{vmatrix} . \end{equation}

(7.23)

Using the analogous trick on the second column gives

\begin{equation} \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11} \left (a_{12} \underbrace{\begin{vmatrix} 1 & 1\\ 0 & 0 \end{vmatrix}}_{0} + a_{22}\underbrace{\begin{vmatrix} 1 & 0\\ 0 & 1 \end{vmatrix}}_{1} \right )+ a_{21} \left( a_{12}\underbrace{\begin{vmatrix} 0 & 1\\ 1 & 0 \end{vmatrix}}_{-1} + a_{22} \underbrace{\begin{vmatrix} 0 & 0\\ 1& 1 \end{vmatrix}}_{0} \right). \end{equation}

(7.24)

In the above equation, the first and the last determinants are zero because they have two identical columns. The second determinant is $1$ according to property (i) of the definition. The third determinant is $-1$ since

\begin{equation} \begin{vmatrix} 0 & 1\\ 1 & 0 \end{vmatrix} = - \begin{vmatrix} 1 & 0\\ 0 & 1 \end{vmatrix} = -1 , \end{equation}

(7.25)

where we used property $(v)$ to swap two columns and then property $(i)$. Thus we have showed that the definition of the determinant gives that

\begin{equation} \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{21} a_{12} , \end{equation}

(7.26)

which coincides with our earlier definition.

In this example we calculated a $2 \times 2$ determinant by expanding each column. By doing this for a general $n \times n$ matrix, one obtains an explicit expression for the determinant. This expression is best expressed using the concept of a permutation of $n$ numbers.

Definition 7.2:
A permutation $p$ of $n$ elements is a bijection from the set $\{ 1, 2, \ldots, n \}$ to itself.

A permutation $p$ is specified by how it acts on the set, i.e., by the numbers $p(1), p(2), \ldots, p(n)$. Since $p$ is a bijection, each number $\{ 1, 2, \ldots, n \}$ occurs precisely once in the list $p(1), p(2), \ldots, p(n)$. In other words the numbers are re-arranged or permuted. Sometimes the notation $p = [p(1), p(2), \ldots, p(n)] $ is used.

Definition 7.3:
A transposition is a permutation where two elements change places and the other elements are fixed.

Example 7.4:
The permutation $p = [1 2 4 3]$ is a transposition, since the elements $3$ and $4$ change place and the other elements, i.e., $1$ and $2$ are fixed.

Definition 7.4:
The inverse $p^{-1}$ of a permutation is defined as the inverse of the corresponding function.

Example 7.5:
Let $p = [2 3 4 1]$ be a permutation of $4$ numbers. This means that $p = [p(1) p(2) p(3) p(4)] = [2 3 4 1]$. What is the inverse $p^{-1}$. Since $p(1)=2$, then $p^{-1}(2) = 1$. Thus $p^{-1} = [? 1 ? ?]$. Since $p(2)=3$, then $p^{-1}(3)=2$. Thus $p^{-1} = [? 1 2 ?]$. Similarily, $p(3)=4$ and $p(4)=1$ gives $p^{-1}(4)=3$ and $p^{-1}(1)=4$. This gives $p^{-1} = [4 1 2 3]$.

If $p$ is a permutation of $n$ elements, there are ${n \choose 2}$ pairs of numbers $(p(i),p(j))$. Some of these are in the correct order, i.e., $((p(i),p(j))$ are in the same order as $(i,j)$. If on the other hand $(p(i),p(j))$ are in the wrong order as compared to $(i,j)$, then the pair is called an inversion.

Definition 7.5:
Here ${n \choose 2}$ is the binomial coefficient, ${n \choose k} = \frac{n!}{k! (n-k)!}$. One of the many interesting properties of the binomial coefficient is that there are ${n \choose k}$ ways to choose $k$ out of a set of $n$ elements.

Definition 7.6:
A permutation is called even if the number of inversions are even and called odd if the number of inversions are odd.

Definition 7.7:
The signature $\sigma$ of a permutation $p$, here denoted $\sigma(p)$ is

\begin{equation} \sigma(p) = \begin{cases} +1 &\mbox{if $n$ is an even permutation} \\ -1 &\mbox{if $n$ is an odd permutation} . \end{cases} \end{equation}

(7.27)

Theorem 7.2: Properties of the signature
For permutations we have

\begin{equation} \begin{array}{llr} (i) & \sigma(p) = \sigma(p^{-1}) & \\ (ii) & \sigma(p \circ q) = \sigma(p) \sigma(q)& \\ (iii) & \sigma(p) = -1 & \spc\text{(if $p$ is a transposition)} \end{array} \end{equation}

(7.28)

Here $\circ$ is used to denote composition, i.e., $p \circ q$ is the permutation that is the result of first permuting according to $q$ and then permuting according to $p$. (More generally composition is used for combining two functions. The composed function $f = g \circ h$ is such that $f(x) = g(h(x))$, i.e., first using function $h$ and then using $g$ on the result.) A permutation $p$ of $n$ elements can be represented as an $n \times n$ matrix $\mx{E}_p$, so that $\mx{E}_p$ is zeroes except for the elements $(p(1),1), (p(2),2) \ldots, (p(n),n)$. These elements are set to one. The determinant of $\mx{E}_p$ is particularly simple to calculate. We have $\det(\mx{E}_p) = \sigma(p)$. We are now ready to state a theorem, that gives an explicit formula for the determinant in general dimensions.

Theorem 7.3:
The determinant of an $n \times n$ matrix $\mx{A}$ is

\begin{equation} \det(\mx{A}) = \sum_p \sigma(p) a_{p(1),1} a_{p(2),2} \ldots a_{p(n),n} , \end{equation}

(7.29)

where the sum is taken over all permutations $p$ and $\sigma(p)$ denotes the signature of the permutation $p$.

Proof:

If we carried out the same expansion as in Example 7.3 for a general $n \times n$ matrix, then each time we expand a column, we would obtain $n$ times as many terms. After expanding all $n$ columns we would get the determinant as a sum of $n^n$ terms. Some of these terms vanish because the determinant have equal columns. In the example the first and the last of the four terms vanished. After removing these we get

\begin{equation} \det(\mx{A}) = \sum_p a_{p(1),1} a_{p(2),2} \ldots a_{p(n),n} \det(\mx{E}_p), \end{equation}

(7.30)

where the sum is taken over $n!$ permutations $p$ of the $n$ numbers. So if there exists a function fulfilling the three criteria $(i)$, $(ii)$, and $(iii)$, it can be calculated as

\begin{equation} \det(\mx{A}) = \sum_p a_{p(1),1} a_{p(2),2} \ldots a_{p(n),n} \sigma(p). \end{equation}

(7.31)

To complete the proof, we need to show that the expression above fulfills the three criteria $(i)$, $(ii)$, and $(iii)$. The expression above is linear in each column, $(i)$, since it is a sum, where each term is linear in each column. The expression is one for the identity matrix $\mx{I}$, $(ii)$. What remains to be shown is $(iii)$ that the function is zero if two columns are equal. Without loss of generality, we can assume that the two columns in question are columns $1$ and $2$. For every permutation $p = [p(1), p(2), p(3), \ldots, p(n)] $, there is another permutation $q = [p(2), p(1), p(3), \ldots, p(n)] $, where the values of the two first elements are transposed. Here we have $\sigma(p) = -\sigma(q)$. The expression is a sum of double terms of type

\begin{equation} a_{p(1),1} a_{p(2),2} \ldots a_{p(n),n} \sigma(p) + a_{p(2),1} a_{p(1),2} \ldots a_{p(n),n} \sigma(q) . \end{equation}

(7.32)

However, since columns 1 and 2 are equal and since $\sigma(p) = -\sigma(q)$ each such term becomes zero

\begin{equation} a_{p(1),1} a_{p(2),2} \ldots a_{p(n),n} \sigma(p) - a_{p(2),2} a_{p(1),1} \ldots a_{p(n),n} \sigma(p) = 0 . \end{equation}

(7.33)

This concludes the proof that there exists one and only one function fulfilling properties $(i)$, $(ii)$, and $(iii)$ and that this function $\det$ follows Equation (7.29).

$\square$

Thus the determinants of orders $1$, $2$, and $3$ consists of $1$, $2$, and $6$ terms in accordance with the previous example. There are relatively easy patterns that one can use to remember the $2 \times 2$ determinant. Write the matrix and think about a cross. The elements on the diagonal should be multiplied and then subtract the product of the other diagonal according to the following illustration.

\begin{equation} \begin{array}{cccccc} \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! + & \,\,\,\,\,\,\,\,\,\,\,\, \,\,\,\,\, - \\ \!\!\!\!\!\!\!\!\!\! \searrow & \,\,\,\,\,\,\,\, \swarrow \\ a_{11} & a_{12}\\ a_{21} & a_{22} \end{array}, \end{equation}

(7.34)

i.e., $a_{11} a_{22} - a_{21} a_{12}$. For $3 \times 3$ matrices, there is also a pattern that can be used to remember the determinant. Put the columns of the matrix, $\vc{a}_1$, $\vc{a}_2$, and $\vc{a}_3$ twice, according to the following pattern,

\begin{equation} \begin{array}{cccccc} \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!+ &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! + &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! + &\,\,\,\,\,\,\,\,\,\,\,\, - &\,\, - & - \\ \!\!\!\!\!\!\!\!\!\!\searrow &\!\!\!\!\!\!\!\!\! \searrow &\!\!\!\!\!\!\!\!\!\! \searrow & \!\swarrow &\!\!\!\!\!\!\!\!\! \swarrow &\!\!\!\!\!\!\!\!\!\!\! \swarrow \\ a_{11} & a_{12} & a_{13} &\!\!\!\!\!\!\!\!\! a_{11} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{12} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{13} \\ a_{21} & a_{22} & a_{23} &\!\!\!\!\!\!\!\!\! a_{21} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{22} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{23} \\ a_{31} & a_{32} & a_{33} &\!\!\!\!\!\!\!\!\! a_{31} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{32} &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! a_{33} \\ \end{array} \end{equation}

(7.35)

i.e., $+a_{11}a_{22}a_{33} +a_{12}a_{23}a_{31} +a_{13}a_{21}a_{32} -a_{11}a_{23}a_{32} -a_{12}a_{21}a_{33} -a_{13}a_{22}a_{31}.$ Note, however, that there is no similar simple pattern for calculating the determinant of a larger matrix, for example, there is no simple pattern for calculating the determinant of a $4 \times 4$ matrix. Note also that the number of terms grows quickly, e.g., $24$ terms for matrices of size $4 \times 4$ and $120$ terms for a $5 \times 5$ matrix.

In this section, we present theorems about the determinant of a transposed matrix, of a product of matrices, and of the inverse of a matrix.

Theorem 7.4:
The determinant does not change under transposition, i.e.,

\begin{equation} \det(\mx{A}) = \det(\mx{A}^\T) . \end{equation}

(7.36)

Proof:

For a matrix $\mx{A}=[a_{ij}]$, the transposed matrix is $\mx{A}^\T = [b_{ij}]$, where $b_{ij}=a_{ji}$. According to Theorem 7.3, we have

\begin{equation} \det(\mx{A}^T) = \sum_{p} b_{p(1),1} b_{p(2),2} \ldots b_{p(n),n} \sigma(p) = \sum_{p} a_{1,p(1)} a_{2,p(2)} \ldots a_{n,p(n)} \sigma(p) . \end{equation}

(7.37)

Each term can be reordered so that $a_{1,p(1)} a_{2,p(2)} \ldots a_{n,p(n)} = a_{q(1),1} a_{q(2),2} \ldots a_{q(n),n}$, where $q = p^{-1}$. Since $\sigma(p) = \sigma(p^{-1}) = \sigma(q)$ according to Theorem 7.2, we get

\begin{equation} \det(\mx{A}^\T) = \sum_{p} a_{1,p(1)} a_{2,p(2)} \ldots a_{n,p(n)} \sigma(p) = \sum_{q} a_{q(1),1} a_{q(2),2} \ldots a_{q(n),n} \sigma(q) = \det{\mx{A}} , \end{equation}

(7.38)

which concludes the proof.

$\square$

Theorem 7.5:
The determinant of a matrix product is a product of corresponding determinants, i.e.,

\begin{equation} \det(\mx{A} \mx{B}) = \det(\mx{A}) \det(\mx{B}). \end{equation}

(7.39)

Proof:

The matrix product $\mx{A} \mx{B}$ can be written

\begin{equation} \mx{A} \mx{B} = \begin{pmatrix} \mx{A} \vc{b}_1 & \mx{A} \vc{b}_2 & \ldots & \mx{A} \vc{b}_n \end{pmatrix} , \end{equation}

(7.40)

where $\vc{b}_j$ is column $j$ of matrix $\mx{B}$. Now study each column $\mx{A} \vc{b}_j$ of the product $\mx{A} \mx{B}$. Each such column $\mx{A} \vc{b}_j$ is a linear combination of columns $\vc{a}_i$ of $\mx{A}$,

\begin{equation} \mx{A} \vc{b}_j = \sum_i \vc{a}_i b_{ij}, \end{equation}

(7.41)

where $\vc{a}_i$ is column $i$ of matrix $\mx{A}$. Since the determinant is linear in each column, we have

\begin{equation} \det(\mx{A} \mx{B}) = \det( \begin{pmatrix} \sum_i b_{i1} \vc{a}_i & \sum_i b_{i2} \vc{a}_i & \ldots \sum_i b_{in} \vc{a}_i \end{pmatrix} = \sum_p b_{p(1),1} b_{p(2),2} \cdot b_{p(n),n} \det( \begin{pmatrix} \vc{a}_{p(1)} & \vc{a}_{p(2)} & \ldots \vc{a}_{p(n)} \end{pmatrix} ). \end{equation}

(7.42)

The determinant in the sum is zero if two of the columns are equal. Thus the terms are only relevant for permutations $p$. After rearrangements of the columns we get

\begin{equation} \det ( \begin{pmatrix} \vc{a}_{p(1)} & \vc{a}_{p(2)} & \ldots & \vc{a}_{p(n)} \end{pmatrix} ) = \sigma(p) \det{\mx{A}} . \end{equation}

(7.43)

However, this means that

\begin{equation} \det(\mx{A} \mx{B}) = \sum_p b_{p(1),1} b_{p(2),2} \cdot b_{p(n),n} \sigma(p) \det(\mx{A}) = \det(\mx{A}) \det(\mx{B}) , \end{equation}

(7.44)

which proves the theorem.

$\square$

Using the Theorem 7.5 we can prove the following useful result.

Theorem 7.6:
The determinant of the inverse of a matrix is the inverse of the determinant, i.e.,

\begin{equation} \det(\mx{A}^{-1}) = \frac{1}{\det(\mx{A})}. \end{equation}

(7.45)

Proof:

Using Theorem 7.5 on the identity $\mx{A} \mx{A}^{-1} = \mx{I}$ gives

\begin{equation} \det( \mx{A} \mx{A}^{-1} ) = \det( \mx{A} )\det( \mx{A}^{-1} ) = \det( \mx{I} ) = 1 \end{equation}

(7.46)

which proves that

\begin{equation} \det(\mx{A}^{-1}) = \frac{1}{\det(\mx{A})}. \end{equation}

(7.47)

$\square$

In Interactive Illustration 7.5, Theorem 7.6 is shown for $2\times 2$ matrices. For example, if you make the area of the parallelogram of $\mx{A}$ smaller, then the area of the parallelogram of $\mx{A}^{-1}$ becomes larger and vice versa.

Interactive Illustration 7.5: In this interactive illustration, we visualize a $2\times 2$ matrix, $\mx{A}$, as the two column vectors, $\vc{a}_{,1}$ and $\vc{a}_{,2}$, that it consists of, i.e., $\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)$. Both $\textcolor{#aa0000}{\vc{a}_{,1}}$ and $\textcolor{#00aa00}{\vc{a}_{,2}}$ can be moved in this illustration. Recall that the area of the parallelogram is essentially the determinant of the matrix (without sign), and due to Theorem 7.6, the parallelogram of the inverse becomes smaller if the parallelogram of $\mx{A}$ becomes bigger and vice versa.

Interactive Illustration 7.5: In this interactive illustration, we visualize a $\hid{2\times 2}$ matrix, $\hid{\mx{A}}$, as the two column vectors, $\hid{\vc{a}_{,1}}$ and $\hid{\vc{a}_{,2}}$, that it consists of, i.e., $\hid{\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)}$. Both $\hid{\textcolor{#aa0000}{\vc{a}_{,1}}}$ and $\hid{\textcolor{#00aa00}{\vc{a}_{,2}}}$ can be moved in this illustration. Recall that the area of the parallelogram is essentially the determinant of the matrix (without sign), and due to \linkref{Theorem}{theo_determinant_inverse}, the parallelogram of the inverse becomes smaller if the parallelogram of $\hid{\mx{A}}$ becomes bigger and vice versa.

$\mx{A} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

$\textcolor{#aa0000}{\vc{a}_{,1}}$

$\textcolor{#009000}{\vc{a}_{,2}}$

$\mx{A}^{-1} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

We start this section with the definition of minor, which is a scalar value that can be computed from a square matrix.

Definition 7.8: Minor
Assume that $\mx{A}$ is an $n \times n$ matrix. The minor, denoted $D_{ij}$, is the determinant of the $(n-1) \times (n-1)$ matrix that you obtain after removing row $i$ and column $j$ in $\mx{A}.$

These determinants are often called minors or first minors of the matrix.

Example 7.6:
Let the matrix $\mx{A}$ be

\begin{equation} \mx{A} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\\ 7 & 8 & 9 \end{pmatrix} . \end{equation}

(7.48)

What are the matrices $D_{1,2}$ and $D_{3,1}$? The matrix $D_{1,2}$ is obtained after removing row $1$ and column $2$. Thus

\begin{equation} D_{1,2} = \det \begin{pmatrix} 4 & 6\\ 7 & 9 \end{pmatrix} . \end{equation}

(7.49)

The matrix $D_{3,1}$ is obtained after removing row $3$ and column $1$. Thus

\begin{equation} D_{3,1} = \det \begin{pmatrix} 2 & 3 \\ 5 & 6 \end{pmatrix} . \end{equation}

(7.50)

Theorem 7.7: Expansion Along a Row/Column
The determinant of an $n \times n$ matrix $\mx{A}$ can be computed using expansion by row $i$ according to

\begin{equation} \det(\mx{A}) = \sum_j (-1)^{i+j} a_{ij} D_{ij} . \end{equation}

(7.51)

The determinant can also be computed using expansion by column $j$ according to

\begin{equation} \det(\mx{A}) = \sum_i (-1)^{i+j} a_{ij} D_{ij} . \end{equation}

(7.52)

Proof:

Column $j$ of $\mx{A}$ can be written as a linear combination

\begin{equation} \vc{a}_{,j} = a_{1j} \vc{i}_1 + a_{2j} \vc{i}_2 + \ldots + a_{nj} \vc{i}_n \end{equation}

(7.53)

of the canonical basis $\vc{i}_1, \ldots, \vc{i}_n$, where $\vc{i}_j$ is all zeroes except in the $j$:th element which is a one. Using the linearity of the determinant (property $(iii)$ in Theorem 7.1), the determinant can be written as

\begin{equation} \det(\mx{A}) = \sum_k a_{kj} \det(\mx{B}_k) , \end{equation}

(7.54)

where $\mx{B}_i$ is a matrix which is identical to $\mx{A}$ except that column $j$ has been exchanged with $\vc{i}_i$. After $i-1$ swaps of rows and $j-1$ swaps of of columns we get $\det(\mx{B}_i) = (-1)^{i+j} \det(\mx{C}_i)$, where

\begin{equation} \mx{C}_i = \begin{pmatrix} 1 & 0\\ 0 & D_{ij} \end{pmatrix} . \end{equation}

(7.55)

To conclude the proof we need to show that $\det(\mx{C}_i) = \det(D_{ij})$. According to Theorem 7.3, we have

\begin{equation} \det(\mx{C}_i) = \sum_p c_{p(1),1} c_{p(2),2} \ldots c_{p(n),n} \sigma(p). \end{equation}

(7.56)

Since the first column is zero for rows 2 to $n$, it is only the permutations with $p(1)=1$ that will contribute to the sum. Therefore

\begin{equation} \det(\mx{C}_i) = \sum_{p,p(1)=1} c_{p(1),1} c_{p(2),2} \ldots c_{p(n),n} \sigma(p) = \sum_{p,p(1)=1} c_{p(2),2} \ldots c_{p(n),n} \sigma(p) = \sum_{q} d_{q(1),1} \ldots d_{q(n-1),n-1} \sigma(q) = \det{(D_{ij})} . \end{equation}

(7.57)

The proof for expansion along a row can be obtained from the column version using the property $\det(\mx{A}) = \det(\mx{A})^\T$.

$\square$

Example 7.7:
Calculate

\begin{equation} \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\\ 7 & 8 & 9 \end{vmatrix} \end{equation}

(7.58)

using expansion along row $1$.

\begin{equation} \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\\ 7 & 8 & 9 \end{vmatrix} = 1 \begin{vmatrix} 5 & 6\\ 8 & 9 \end{vmatrix} - 2 \begin{vmatrix} 4 & 6\\ 7 & 9 \end{vmatrix} +3 \begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} = 1 (45-48) - 2 (36-42) + 3 (32-35) = -3 + 12 - 9 = 0 \end{equation}

(7.59)

Calculate the same determinant using expansion along column 2.

\begin{equation} \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\\ 7 & 8 & 9 \end{vmatrix} = -2 \begin{vmatrix} 4& 6\\ 7 & 9 \end{vmatrix} +5 \begin{vmatrix} 1 & 3\\ 7 & 9 \end{vmatrix} -8 \begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} = -2 (36-42) +5 (9-21) -8 (6-12) = 12 -60 +48 = 0. \end{equation}

(7.60)

Expansion along a row or column is particularily useful if there are a lot of zeros in that row or column. This is illustrated in the following example

Example 7.8:
Calculate

\begin{equation} \begin{vmatrix} 1 & 2 & 3 & 3 & \pi \\ 4 & 5 & 0 & 4 & -912\\ 0 & 0 & 0 & 0 & 1 \\ 7 & 8 & 0 & 5 & -\pi^2 \\ 0 & 2 & 0 & 0 & 77 \end{vmatrix}. \end{equation}

(7.61)

We start by expanding along row 3. This gives

\begin{equation} \begin{vmatrix} 1 & 2 & 3 & 3 & \pi \\ 4 & 5 & 0 & 4 & -912\\ 0 & 0 & 0 & 0 & 1 \\ 7 & 8 & 0 & 5 & -\pi^2 \\ 0 & 2 & 0 & 0 & 77 \end{vmatrix} = 0 + 0 + 0 + 0 + 1 \begin{vmatrix} 1 & 2 & 3 & 3 \\ 4 & 5 & 0 & 4 \\ 7 & 8 & 0 & 5 \\ 0 & 2 & 0 & 0 \end{vmatrix}. \end{equation}

(7.62)

We continue by expanding along column 3.

\begin{equation} \begin{vmatrix} 1 & 2 & 3 & 3 & \pi \\ 4 & 5 & 0 & 4 & -912\\ 0 & 0 & 0 & 0 & 1 \\ 7 & 8 & 0 & 5 & -\pi^2 \\ 0 & 2 & 0 & 0 & 77 \end{vmatrix} = \begin{vmatrix} 1 & 2 & 3 & 3 \\ 4 & 5 & 0 & 4 \\ 7 & 8 & 0 & 5 \\ 0 & 2 & 0 & 0 \end{vmatrix} = 3 \begin{vmatrix} 4 & 5 & 4 \\ 7 & 8 & 5 \\ 0 & 2 & 0 \end{vmatrix} + 0 + 0 + 0 . \end{equation}

(7.63)

Now use expansion along row 3.

\begin{equation} \begin{vmatrix} 1 & 2 & 3 & 3 & \pi \\ 4 & 5 & 0 & 4 & -912\\ 0 & 0 & 0 & 0 & 1 \\ 7 & 8 & 0 & 5 & -\pi^2 \\ 0 & 2 & 0 & 0 & 77 \end{vmatrix} = 3 \begin{vmatrix} 4 & 5 & 4 \\ 7 & 8 & 5 \\ 0 & 2 & 0 \end{vmatrix} = 3 \cdot (-2) \begin{vmatrix} 4 & 4 \\ 7 & 5 \\ \end{vmatrix} = -6 (20-28) = 48 \end{equation}

(7.64)

Example 7.9: The determinant of a triangular matrix
The determinant of a triangular matrix is the product of the diagonal elements. This is easy to see using expansion along rows or columns. For example, what is the determinant of

\begin{equation} \begin{vmatrix} 11 & 0 & 0 \\ 1 & 17 & 0 \\ 1 & 2 & 42 \end{vmatrix} ? \end{equation}

(7.65)

Using repeated expansions along the first row we get

\begin{equation} \begin{vmatrix} 11 & 0 & 0 \\ 1 & 17 & 0 \\ 1 & 2 & 42 \end{vmatrix} = 11 \begin{vmatrix} 17 & 0 \\ 2 & 42 \end{vmatrix} = 11 \cdot 17 \begin{vmatrix} 42 \end{vmatrix} = 11 \cdot 17 \cdot 42 . \end{equation}

(7.66)

Thus the determinant is the product of the diagonal elements.

Example 7.10: Adding one column to another
Adding (or subtracting) a multiple of one column to another column does not change the determinant according to property (vi). This can sometimes be useful to simplify the calculations. Here is an example

\begin{equation} \begin{vmatrix} 1 & 1 & 1\\ 1 & 2 & 2 \\ 1 & 3 & 9 \end{vmatrix} = \begin{vmatrix} 1 & 1 & 0 \\ 1 & 2 & 0 \\ 1 & 3 & 6 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 6 \end{vmatrix} = 1 \cdot 1 \cdot 6 = 6 . \end{equation}

(7.67)

In the first step we subtracted column 2 from column 3. Then we subtracted column 1 from the new column 2. In the final step we used the fact that the determinant of a triangular matrix is the product of the diagonal elements (see Example 7.9).

In this section, we will introduce the adjoint matrix $\adj( \mx{A} )$ to a square matrix $\mx{A}$.

Definition 7.9: Adjoint matrix
The adjoint matrix $\adj( \mx{A} )$ is a matrix whose elements at position $(i,j)$ is $(-1)^{i+j} D_{ji}$, where similar to the previous section $D_{ij}$ is the determinant of the matrix $A$ after removing row $i$ and column $j$.

Notice that we used $D_{ji}$ and not $D_{ij}$ in the definition.

Example 7.11:
When calculating the adjoint of a $2 \times 2$ matrix

\begin{equation} \begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} , \end{equation}

(7.68)

we first need to calculate the determinants $D_{ij}$. Here $D_{11}$ is the determinant of the matrix obtained when removing row $1$ and column $1$. This gives $D_{11}=a_{22}$. In the same way $D_{12} = a_{21}$, $D_{21} = a_{12}$ and $D_{22} = a_{11}$. Thus the adjoint is

\begin{equation} \adj(\mx{A}) = \begin{pmatrix} a_{22} & -a_{12}\\ -a_{21} & a_{11} \end{pmatrix} . \end{equation}

(7.69)

Here for example $\adj(\mx{A})_{12} = -D_{21} = -a_{12}$. Notice that the two elements on the diagonal changed place and that the two elements on the off diagonal (i.e., not on the main diagonal) changed sign.

The adjoint matrix is closely related to the inverse, through the following theorem.

Theorem 7.8:
If $\mx{A}$ is a square matrix and $\adj(\mx{A})$ is its adjoint matrix, then

\begin{equation} \mx{A} (\adj(\mx{A})) = (\adj(\mx{A})) \mx{A} = \det(\mx{A}) \mx{I} . \end{equation}

(7.70)

If $\det(\mx{A}) \neq 0$ then $\mx{A}$ is invertible and the inverse is given by

\begin{equation} \mx{A}^{-1} = \frac{1}{\det \mx{A}} \adj \mx{A} . \end{equation}

(7.71)

Proof:

Study the product $\mx{B} = (\adj (\mx{A})) \mx{A}$. The diagonal elements $b_{jj}$ of $\mx{B}$ are then

\begin{equation} b_{jj} = \sum_i (\adj (\mx{A}))_{ji} a_{ij} = \sum_i (-1)^{i+j} D_{ij} a_{ij} = \det \mx{A}, \end{equation}

(7.72)

where $(\adj \mx{A})_{ji}$ is the element (scalar value) of the adjoint of $\mx{A}$ at row $j$ and column $i$, and $a_{ij}$ is the element of $\mx{A}$ at row $i$ and column $j$ as usual. Here the last equation comes from Theorem 7.7, concerning expansion of the determinant. For the off diagonal elements $b_{ij}$ with $i\neq j$, we have

\begin{equation} b_{ij} = \sum_k (\adj (\mx{A}))_{ik} a_{kj} = \sum_i (-1)^{i+k} D_{ij} a_{ki} a_{kj} = 0. \end{equation}

(7.73)

Here the last equality comes from the fact that $\sum_i (-1)^{i+k} D_{ij} a_{ki} a_{kj}$ is the expression for calculating the determinant of the matrix $\mx{A}$ where column $j$ is replaced by column $i$. Since $i \neq j$ there are two equal columns in this modified matrix and therefore the determinant is zero.

$\square$

Example 7.12:
The inverse of a $2 \times 2$ matrix is particularly easy to calculate using Theorem 7.8. We saw in Example 7.11 that for a general $2 \times 2$ matrix

\begin{equation} \mx{A} = \begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} , \end{equation}

(7.74)

the adjoint is

\begin{equation} \adj(\mx{A}) = \begin{pmatrix} a_{22} & -a_{12}\\ -a_{21} & a_{11} \end{pmatrix} . \end{equation}

(7.75)

According to Theorem 7.8, we have

\begin{equation} \mx{A}^{-1} = \frac{1}{\det \mx{A}} \adj \mx{A} = \frac{1}{a_{11} a_{22} - a_{12} a_{21}} \begin{pmatrix} a_{22} & -a_{12}\\ -a_{21} & a_{11} \end{pmatrix}, \end{equation}

(7.76)

which is the same as in Theorem 6.3 from the previous chapter.

In this section, Cramer's rule will be introducted. It is useful when you want the solution $\vc{x}$ to $\mx{A}\vc{x}=\vc{y}$ without needing to know the inverse matrix of $\mx{A}$.

Theorem 7.9:
If $\mx{A}$ is a square matrix and $\det \mx{A} \neq 0$ then the solution $\vc{x}$ to the matrix equation $\mx{A}\vc{x} = \vc{y}$ has

\begin{equation} x_i =\frac{ \det \begin{pmatrix} \vc{a}_1 & \ldots & \vc{a}_{i-1} & \vc{y} & \vc{a}_{i+1} & \ldots & \vc{a}_n \end{pmatrix} }{\det \mx{A}} . \end{equation}

(7.77)

In other words by exchanging column $i$ in $\mx{A}$ with the right hand side $\vc{y}$, and then calculating the determinant of this matrix and dividing by the determinant of $\mx{A}$, we directly obtain coordinate $i$ of the solution vector $\vc{x}$. It almost seems too easy.

Proof:

Study the determinant

\begin{equation} \det \begin{pmatrix} \vc{a}_1 & \ldots & \vc{a}_{i-1} & \vc{y} & \vc{a}_{i+1} & \ldots & \vc{a}_n \end{pmatrix} . \end{equation}

(7.78)

According to the matrix equation $\mx{A}\vc{x} = \vc{y}$, the column vector $\vc{y}$ is a linear combination of the columns of $\mx{A}$, i.e.,

\begin{equation} \vc{y} = \sum_j x_j \vc{a}_j . \end{equation}

(7.79)

Since the determinant is linear in each column, we have

\begin{equation} \det \begin{pmatrix} \vc{a}_1 & \ldots & \vc{a}_{i-1} & \vc{y} & \vc{a}_{i+1} & \ldots & \vc{a}_n \end{pmatrix} = \sum_j x_j \det \begin{pmatrix} \vc{a}_1 & \ldots & \vc{a}_{i-1} & \vc{a}_{j} & \vc{a}_{i+1} & \ldots & \vc{a}_n \end{pmatrix} = x_i \det \mx{A} . \end{equation}

(7.80)

In the sum above, whenever $j \neq i$, then the determinant is zero, because there are two identical columns. But for $j=i$ we get the end result $x_i \det \mx{A}$. This concludes the proof

$\square$

Example 7.13:
Study the matrix equation

\begin{equation} \begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} y_1\\ y_2 \end{pmatrix} . \end{equation}

(7.81)

If $\det(\mx{A}) = a_{11} a_{22} - a_{12} a_{21} \neq 0 $, then the solution is unique and

\begin{equation} x_1 = \frac{ \begin{vmatrix} y_1 & a_{12}\\ y_2 & a_{22} \end{vmatrix} }{ \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} } = \frac{y_1 a_{22} - y_2 a_{12}}{a_{11} a_{22} - a_{12} a_{21}} \end{equation}

(7.82)

and

\begin{equation} x_2 = \frac{ \begin{vmatrix} a_{11} & y_1 \\ a_{21} & y_2 \end{vmatrix} }{ \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} } = \frac{y_2 a_{11} - y_1 a_{21}}{a_{11} a_{22} - a_{12} a_{21}}. \end{equation}

(7.83)

In Interactive Illustration 7.6, we solve for $\vc{z}$ in a system of equations $\mx{A}\vc{z}=\vc{b}$ using Cramer's rule.

Interactive Illustration 7.6: In this interactive illustration, we visualize a $2\times 2$ matrix, $\mx{A}$, as the two column vectors, $\vc{a}_{,1}$ and $\vc{a}_{,2}$, that it consists of, i.e., $\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)$. We want to solve $\mx{A}\vc{z} = \vc{b}$ and this is being done as $\vc{z} = \mx{A}^{-1}\vc{b}$. To make it clearer in the figure, we use $\vc{z}=(x,y)$. Note that the column vectors can be moved in this illustration and that $\vc{b}$ is the black arrow. The solution $\vc{z}=(x,y)$ is solved by using Cramer's rule.

Interactive Illustration 7.6: In this interactive illustration, we visualize a $\hid{2\times 2}$ matrix, $\hid{\mx{A}}$, as the two column vectors, $\hid{\vc{a}_{,1}}$ and $\hid{\vc{a}_{,2}}$, that it consists of, i.e., $\hid{\mx{A} = \bigl(\textcolor{#aa0000}{\vc{a}_{,1}}\,\, \textcolor{#00aa00}{\vc{a}_{,2}} \bigr)}$. We want to solve $\hid{\mx{A}\vc{z} = \vc{b}}$ and this is being done as $\hid{\vc{z} = \mx{A}^{-1}\vc{b}}$. To make it clearer in the figure, we use $\hid{\vc{z}=(x,y)}$. Note that the column vectors can be moved in this illustration and that $\hid{\vc{b}}$ is the black arrow. The solution $\hid{\vc{z}=(x,y)}$ is solved by using Cramer's rule.

$\mx{A} = \left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

$\textcolor{#aa0000}{\vc{a}_{,1}}$

$\textcolor{#009000}{\vc{a}_{,2}}$

$\det(\mx{A})=$

$\left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)\begin{pmatrix}x\\y\end{pmatrix}=\left(\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right.$

$\left.\begin{array}{l} \hid{1} \\ \hid{1} \end{array}\right)$

$x=$

$y=$

$\vc{b}$

Theorem 7.10:
The following equivalence holds for all square matrices $\mx{A}$,

\begin{equation} \begin{array}{ll} (i) & \spc\text{The column vectors of}\spc \mx{A} \spc\text{is a basis} \\ (ii) & \spc\text{The row vectors of}\spc \mx{A} \spc\text{is a basis} \\ (iii) & \spc\text{The matrix equation} \spc \mx{A} \vc{x} = 0 \spc\text{has only one solution} \spc \vc{x}=0 \\ (iv) & \spc\text{The matrix equation}\spc \mx{A} \vc{x} = \vc{y} \spc\text{has a solution for every} \spc \vc{y} \\ (v) & \spc\text{The matrix} \spc \mx{A} \spc\text{is invertible} \\ (vi) & \spc\det \mx{A} \neq 0 \\ \end{array} \end{equation}

(7.84)

Proof:

From Theorem 6.9 we know that $(i), (ii), (iii), (iv),(v)$ are equivalent. Here we only need to prove that these are also equivalent to $(vi)$.

If $\det \mx{A} \neq 0$ then the matrix is invertible due to Theorem 7.8. On the other hand if the matrix $\mx{A}$ is invertible then we know from Theorem 7.8 that $\det \mx{A} \det \mx{A}^{-1} = 1$ that both $\det \mx{A} \neq 0$ and $\det \mx{A}^{-1} \neq 0$.

$\square$

Example 7.14: Three points on a line
Use the determinant to derive a constraint to check that three points $(x_1,y_1)$, $(x_2,y_2)$ and $(x_3,y_3)$ are co-linear, i.e. that they lie on a line.

If the three points are co-linear, then there exists a line, say $ax+by+c = 0$ such that the three points all lie on that line. In other words we have

\begin{equation} a x_1+b y_1+c = 0 \\ a x_2+b y_2+c = 0 \\ a x_3+b y_3+c = 0 . \end{equation}

(7.85)

Rewrite this as a matrix-vector equation

\begin{equation} \begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} \begin{pmatrix} a\\ b\\ c \end{pmatrix} = 0 . \end{equation}

(7.86)

This is an equation of type $\mx{A}\vc{s}=0$, where $\vc{s} = (a, b, c)$. According to Theorem 7.10 this has only the solution $a=b=c=0$ if $\det{\mx{A}} \neq 0$, which means that the points do not lie on a line. If on the other hand $\det{\mx{A}} = 0$, then is at least one non-zero solution to the problem. Each solution $\vc{x}$ corresponds to a line. Notice, however, that rescaling a solution $\lambda \vc{s}$ corresponds to the same line as $\vc{s}$.

Answer: Three points are co-linear if and only if

\begin{equation} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = 0 . \end{equation}

(7.87)

Example 7.15: The Fundamental Matrix
In Interactive Illustration 5.2, we saw that the position of a 2D point in the first image is constrained by the position of the corresponding 2D point in the other image. The reason is that these 2D points are both projections of the same 3D point.

There is a relatively simple model for the projection of a 3D point with coordinates $(u_1,u_2,u_3)$ onto the 2D image point $\vc{v} = (v_1,v_2)$. The model becomes simpler by using so called homogeneous coordinates. Points are then represented as column vectors with an additional coordinate set to one. In this representation the 3D point is represented as

\begin{equation} \vc{u} = \begin{pmatrix} u_1\\ u_2\\ u_3\\ 1 \end{pmatrix} \end{equation}

(7.88)

and the image points is represented as

\begin{equation} \vc{v} = \begin{pmatrix} v_1\\ v_2\\ 1 \end{pmatrix} . \end{equation}

(7.89)

The model for the projection is then

\begin{equation} \lambda \begin{pmatrix} v_1\\ v_2\\ 1 \end{pmatrix} = \lambda \begin{pmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{pmatrix} \begin{pmatrix} u_1\\ u_2\\ u_3\\ 1 \end{pmatrix} \end{equation}

(7.90)

or

\begin{equation} \lambda \vc{v} = \mx{P} \vc{u} \end{equation}

(7.91)

using matrix and vector notation. In this model equation, $\mx{P}$ is the so called camera matrix . It contains the parameters of the position and orientation of the camera and $\lambda$ is a scalar. Because of the $\lambda$ in the equation, the model is non-linear. Nevertheless methods from linear algebra are crucial for the analysis and understanding of the problem.

Assume that two images are given, where $\mx{P}_L$ is the camera matrix corresponding to the first image and $\mx{P}_R$ is the camera matrix for the second image. Assume that one 3D point with homogeneous coordinate vector $\vc{u}$ is seen in both images. In the first image, the homogeneous coordinate vector is $\vc{v}_L$ and in the other image the vector is $\vc{v}_R$. The camera matrix equations for these two images are

\begin{equation} \lambda_L \vc{v}_L = \mx{P}_L \vc{u}\ \ \ \ \mathrm{and} \ \ \ \ \lambda_R \vc{v}_R = \mx{P}_R \vc{u}. \end{equation}

(7.92)

These two matrix equations can be written as a joint matrix equation, i.e.,

\begin{equation} \begin{pmatrix} \mx{P}_L & \vc{v}_L & \vc{0} \\ \mx{P}_R & \vc{0} & \vc{v}_R \end{pmatrix} \begin{pmatrix} \vc{u}\\ -\lambda_L \\ -\lambda_R \end{pmatrix} = \begin{pmatrix} \vc{0}\\ \vc{0} \end{pmatrix} . \end{equation}

(7.93)

Notice that the matrices and vectors are made out of blocks of other matrices or vectors. One often talks of block matrices. The matrix equation is of type $\mx{A} \vc{x} = \vc{0}$. Here $\mx{A}$ is a square matrix of size $6 \times 6$. Notice that the matrix equation has a solution $\begin{pmatrix}\vc{u}\\-\lambda_L \\-\lambda_R \end{pmatrix}$ that is not the zero vector. According to Theorem 7.10, this means that $\det \mx{A} = 0$. If we expand the determinant first along the fifth column and then along the sixth column we obtain

\begin{equation} \vc{v}_L^\T \mx{F} \vc{v}_R = 0. \end{equation}

(7.94)

Here the elements of $\mx{F}$ are determinants of $4\times 4$ matrices involving the elements of the two camera matrices.

Chapter 7: Determinants

7.1 Introduction

7.2 Definition

7.3 Permutations and Determinants

7.4 Transposition, Multiplication, and Inverse

7.5 Expansion Along a Column

7.6 Adjoint Matrix

7.7 Cramer's Rule

7.8 Determinants, Independence, and Invertibility