Orthogonal Projections

Section 7.3 Orthogonal Projections

An inner product provides the tool to decompose vectors into useful components. We have already seen this in Lemma 7.1.25, but in this section we will expand our discussion. The process of orthogonal projection opens the door to many applications.

🔗

Subsection 7.3.1 Orthogonal Complements

In an inner product space, we can collect all vectors orthogonal to any given set of vectors. In particular, we can do this with a subspace.

🔗

Definition 7.3.1.

Let \(U\) be a subspace of an inner product space \(V\text{.}\) Then the orthogonal complement of \(U\text{,}\) denoted \(U^{\perp}\text{,}\) is defined as

\begin{equation*} U^{\perp} = \{ \bfv \in V \mid \ip{\bfu,\bfv} = 0 \text{ for all } \bfu \in U \}\text{.} \end{equation*}

🔗

It is relatively easy to verify that \(U^{\perp}\) is itself a subspace of \(V\text{.}\) We will leave that proof as an exercise.

🔗

Proposition 7.3.2.

If \(U\) is a subspace of an inner product space \(V\text{,}\) then \(U^{\perp}\) is also a subspace of \(V\text{.}\)

🔗

The easiest examples of orthogonal complements to visualize are in \(\rr^2\) and \(\rr^3\text{.}\) If \(L\) is a line through the origin in \(\rr^2\text{,}\) then \(L^{\perp}\) is the line perpendicular to \(L\) which passes through the origin. If \(P\) is a plane through the origin in \(\rr^3\text{,}\) then \(P^{\perp}\) is the line through the origin which is perpendicular to \(P\text{.}\)

🔗

In an inner product space, any vector can be uniquely decomposed with reference to a subspace and its orthogonal complement.

🔗

Theorem 7.3.3.

Let \(V\) be an inner product space and let \(U\) be a finite-dimensional subspace of \(V\text{.}\) Then every vector \(\bfv \in V\) can be uniquely written in the form

\begin{equation*} \bfv = \bfu + \bfw\text{,} \end{equation*}

where \(\bfu \in U\) and \(\bfw \in U^{\perp}\text{.}\)

🔗

Proof.

Since \(U\) is finite-dimensional, there is an orthonormal basis for \(U\text{,}\) \(\{ \bfe_1, \ldots, \bfe_m \}\text{.}\) For any \(\bfv \in V\text{,}\) we define \(\bfu\) by

\begin{equation*} \bfu = \sum_{i=1}^m \ip{\bfv, \bfe_i} \bfe_i\text{,} \end{equation*}

and we let \(\bfw = \bfv - \bfu\text{.}\) Then we have \(\bfv = \bfu + \bfw\) and \(\bfu \in U\text{.}\) For each \(k\text{,}\) we have

\begin{equation*} \ip{\bfu, \bfe_k} = \ip{\bfv, \bfe_k}\text{,} \end{equation*}

\begin{equation*} \ip{\bfw, \bfe_k} = \ip{\bfv, \bfe_k} - \ip{\bfu, \bfe_k} = 0\text{.} \end{equation*}

Since \(\bfw\) is orthogonal to each element of the orthonormal basis of \(U\text{,}\) we have \(\bfw \in U^{\perp}\text{.}\)

🔗

We now need to prove that \(\bfu\) and \(\bfw\) are unique. Suppose that \(\bfu_1, \bfu_2 \in U\) and \(\bfw_1, \bfw_2 \in U^{\perp}\) such that

\begin{equation*} \bfu_1 + \bfw_1 = \bfu_2 + \bfw_2\text{.} \end{equation*}

We consider the vector \(\bfx\text{,}\)

\begin{equation*} \bfx = \bfu_1 - \bfu_2 = \bfw_2 - \bfw_1\text{.} \end{equation*}

Since \(U\) and \(U^{\perp}\) are subspaces, we have \(\bfx \in U\) and \(\bfx \in U^{\perp}\text{,}\) which means that \(\ip{\bfx, \bfx} = 0\text{.}\) This means that \(\bfx = \bfo\text{,}\) so that \(\bfu_1 = \bfu_2\) and \(\bfw_1 = \bfw_2\text{.}\)

🔗

Subsection 7.3.2 Orthogonal Projections

Once we have the sort of decomposition that Theorem 7.3.3 provides, we can properly talk about orthogonal projections.

🔗

Definition 7.3.4.

Let \(U\) be a subspace of an inner product space \(V\text{.}\) The orthogonal projection onto \(U\) is the function \(\proj_U:V \to V\) given by \(\proj_U(\bfv) = \bfu\text{,}\) where \(\bfv = \bfu + \bfw\) for \(\bfu \in U\) and \(\bfw \in U^{\perp}\text{.}\)

🔗

Orthogonal projection has some important properties which we now collect in the following theorem.

🔗

Theorem 7.3.5.

Let \(U\) be a finite-dimensional subspace of an inner product space \(V\text{.}\)

The function \(\proj_U\) is a linear transformation.
🔗

🔗
If \(\{\bfe_1, \ldots, \bfe_n \}\) is an orthonormal basis of \(U\text{,}\) then

\begin{equation*} \proj_U(\bfv) = \sum_{i=1}^n \ip{\bfv, \bfe_i}\bfe_i \end{equation*}

for each \(\bfv \in V\text{.}\)

🔗

🔗
For each \(\bfv \in V\text{,}\) \(\bfv - \proj_U(\bfv) \in U^{\perp}\text{.}\)
🔗

🔗
For each \(\bfv, \bfw \in V\text{,}\)

\begin{equation*} \ip{ \proj_U(\bfv), \bfw} = \ip{\proj_U(\bfv), \proj_U(\bfw)} = \ip{\bfv, \proj_U(\bfw)}\text{.} \end{equation*}

🔗

🔗
If \(\mcb = \{ \bfe_1, \ldots, \bfe_m \}\) is an orthonormal basis for \(V\text{,}\) and \(\{ \bfe_1, \ldots, \bfe_n \}\) is an orthonormal basis of \(U\) (with \(n \le m\)), then \([\proj_U]_{\mcb}\) is a diagonal matrix with the first \(n\) diagonal entries being 1 and the remaining diagonal entries being 0.
🔗

🔗
We have \(\img(\proj_U) = U\) and \(\proj_U(\bfu) = \bfu\) for all \(\bfu \in U\text{.}\)
🔗

🔗
We have \(\kerr(\proj_U) = U^{\perp}\text{.}\)
🔗

🔗
If \(V\) is finite-dimensional, then \(\proj_{U^{\perp}} = I - \proj_U\text{.}\)
🔗

🔗
We have \((\proj_U)^2 = \proj_U\text{.}\)
🔗

🔗

🔗

Proof.

We will prove property 1. Let \(\bfv_1, \bfv_2 \in V\text{,}\) so we can write \(\bfv_1 = \bfu_1 + \bfw_1\) and \(\bfv_2 = \bfu_2 + \bfw_2\text{,}\) for \(\bfu_1, \bfu_2 \in U\) and \(\bfw_1,\bfw_2 \in U^{\perp}\text{.}\) Then

\begin{equation*} \bfv_1 + \bfv_2 = (\bfu_1 + \bfu_2) + (\bfw_1 + \bfw_2)\text{,} \end{equation*}

which tells us that

\begin{equation*} \proj_U(\bfv_1 + \bfv_2) = \bfu_1 + \bfu_2 = \proj_U(\bfv_1) + \proj_U(\bfv_2)\text{.} \end{equation*}

We now let \(c \in \ff\) and \(\bfv \in V\text{.}\) We write \(\bfv = \bfu + \bfw\text{,}\) with \(\bfu \in U\) and \(\bfw \in U^{\perp}\text{.}\) We note that \(c\bfu \in U\) and \(c \bfw \in U^{\perp}\) since \(U\) and \(U^{\perp}\) are subspaces. Then

\begin{equation*} \proj(c\bfv) = \proj(c\bfu + c\bfw) = c\bfu = c\proj_U(\bfv)\text{.} \end{equation*}

This proves that \(\proj_U\) is a linear transformation.

🔗

We will also prove property 3. Let \(\bfv \in V\) and write \(\bfv = \bfu + \bfw\text{,}\) with \(\bfu \in U\) and \(\bfw \in U^{\perp}\text{.}\) Then

\begin{equation*} \bfv - \proj_U(\bfv) = \bfv - \bfu = \bfw \in U^{\perp}\text{.} \end{equation*}

🔗

We leave the proof of the other properties to the exercises.

🔗

We can use part of this theorem to describe the matrix of \(\proj_U\) explicitly.

🔗

Proposition 7.3.6.

Let \(U\) be a subspace of \(\rr^n\) or \(\cc^n\) with orthonormal basis \(\{ \bfv_1, \ldots, \bfv_m \}\text{.}\) Then the matrix of \(\proj_U\) with respect to the standard basis \(\mce\) is

\begin{equation*} [\proj_U]_{\mce} = \sum_{i=1}^m \bfv_i \bfv_i^*\text{,} \end{equation*}

where \(A^*\) denotes the conjugate transpose of a matrix.

🔗

Proof.

This fact follows from part 2 of Theorem 7.3.5 and the fact that the standard inner product in \(\cc^n\) can be written as

\begin{equation*} \ip{\bfu, \bfv} = \bfv^* \bfu\text{,} \end{equation*}

where matrix multiplication is in view on the right side of the equals sign.

🔗

Lest this endeavor become purely speculative, we now carry out an example.

🔗

Example 7.3.7.

We consider the plane through the origin in \(\rr^3\) defined by \(x + 2y - z = 0\text{.}\) This is a subspace of \(\rr^3\text{,}\) let’s call it \(U\text{,}\) and we can identify the following basis:

\begin{equation*} \mcb = \left\{ \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right\}\text{.} \end{equation*}

We use the Gram-Schmidt process on this basis to produce this orthonormal basis of \(U\text{:}\)

\begin{equation*} \left\{ \begin{bmatrix} -\frac{2}{\sqrt{5}} \\[6pt] \frac{1}{\sqrt{5}} \\[6pt] 0 \end{bmatrix}, \begin{bmatrix} \frac{1}{\sqrt{30}} \\[6pt] \frac{2}{\sqrt{30}} \\[6pt] \frac{5}{\sqrt{30}} \end{bmatrix} \right\}\text{.} \end{equation*}

Using this orthonormal basis of \(U\text{,}\) we can write the matrix of \(\proj_U\) with respect to the standard basis. We have

\begin{align*} [\proj_U]_{\mce} \amp = \bfe_1\bfe_1^* + \bfe_2\bfe_2^*\\ \amp = \begin{bmatrix} \frac{4}{5} \amp -\frac{2}{5} \amp 0 \\[6pt] -\frac{2}{5} \amp \frac{1}{5} \amp 0 \\[6pt] 0 \amp 0 \amp 0 \end{bmatrix} + \begin{bmatrix} \frac{1}{30} \amp \frac{1}{15} \amp \frac{1}{6} \\[6pt] \frac{1}{15} \amp \frac{2}{15} \amp \frac{1}{3} \\[6pt] \frac{1}{6} \amp \frac{1}{3} \amp \frac{5}{6} \end{bmatrix}\\ \amp = \begin{bmatrix} \frac{5}{6} \amp -\frac{1}{3} \amp \frac{1}{6} \\[6pt] -\frac{1}{3} \amp \frac{1}{3} \amp \frac{1}{3} \\[6pt] \frac{1}{6} \amp \frac{1}{3} \amp \frac{5}{6} \end{bmatrix}\text{.} \end{align*}

🔗

To finish this example, we will decompose a specific vector \(\bfv\) into the pieces specified by Theorem 7.3.3. Let

\begin{equation*} \bfv = \begin{bmatrix} 3 \\ -6 \\ 2 \end{bmatrix}\text{.} \end{equation*}

We note that \(\bfv \not \in U\text{.}\) Now we calculate \(\proj_U(\bfv)\text{:}\)

\begin{equation*} \proj_U(\bfv) = \begin{bmatrix} \frac{5}{6} \amp -\frac{1}{3} \amp \frac{1}{6} \\[6pt] -\frac{1}{3} \amp \frac{1}{3} \amp \frac{1}{3} \\[6pt] \frac{1}{6} \amp \frac{1}{3} \amp \frac{5}{6} \end{bmatrix} \begin{bmatrix} 3 \\ -6 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{29}{6} \\[6pt] -\frac{7}{3} \\[6pt] \frac{1}{6} \end{bmatrix}\text{.} \end{equation*}

Then \(\bfw = \bfv - \proj_U(\bfv)\text{,}\) so

\begin{equation*} \bfw = \begin{bmatrix} -\frac{11}{6} \\[6pt] -\frac{11}{3} \\[6pt] \frac{11}{6} \end{bmatrix}\text{.} \end{equation*}

This completes the decomposition \(\bfv = \bfu + \bfw\text{.}\)

🔗

Proposition 7.3.6 depended on having an orthonormal basis for the subspace \(U\text{.}\) We can always find such a basis through the Gram-Schmidt process, but there is an alternative way to produce the matrix for orthogonal projection.

🔗

Proposition 7.3.8.

Let \(U\) be a subspace of \(\rr^n\) or \(\cc^n\) with basis \(\{ \bfv_1, \ldots, \bfv_k \}\text{.}\) Let \(A\) be the \(n \times k\) matrix with columns \(\bfv_1, \ldots, \bfv_k\text{.}\) Then

\begin{equation*} [\proj_U]_{\mce} = A(A^*A)^{-1}A^*\text{.} \end{equation*}

🔗

Proof.

We note that \(\proj_U(\bfx)\) is an element of \(U\text{,}\) so it can be written as a linear combination of the columns of \(A\text{.}\) In other words, there is a vector \(\bfx'\) which satisfies \(\proj_U(\bfx) = A\bfx'\text{.}\) By part 3 of Theorem 7.3.5,

\begin{equation*} \bfx - \proj_U(\bfx) = \bfx - A\bfx' \in U^{\perp}\text{.} \end{equation*}

Specifically, we have

\begin{equation*} \ip{\bfx - A\bfx', \bfv_j} = \bfv_j^*(\bfx - A\bfx') = 0 \end{equation*}

for each \(\bfv_j\text{.}\) If we rewrite these \(k\) equations in matrix form, we have

\begin{equation*} A^*(\bfx - A\bfx') = \bfo \end{equation*}

or \(A^*A \bfx' = A^*\bfx\text{.}\) If \(A^*A\) is invertible, then we can multiply both sides of this equation by \(A(A^*A)^{-1}\text{,}\) and we get

\begin{equation*} A\bfx' = A(A^*A)^{-1}A^* \bfx\text{.} \end{equation*}

This completes the proof, since \(A\bfx' = \proj_U(\bfx)\text{.}\)

🔗

In the last paragraph we assumed that \(A^*A\) was invertible, so we now prove that fact. We can do this by proving that the null space of \(A^*A\) is trivial. Suppose that \(A^*A\bfx = \bfo\text{,}\) so we have

\begin{equation*} 0 = \ip{A^*A\bfx, \bfx} = \bfx^*(A^*A\bfx) = (A\bfx)^*(A\bfx) = \vnorm{A\bfx}^2\text{.} \end{equation*}

But since the columns of \(A\) are linearly independent (they are basis vectors for \(U\)), \(A\) must have rank \(k\text{.}\) By the Rank-Nullity Theorem (Theorem 5.4.10), this means that \(\dim(\nll(A)) = 0\text{,}\) and since \(\vnorm{A\bfx}^2 = 0\) means \(A\bfx = \bfo\text{,}\) we must have \(\bfx = \bfo\text{.}\) This proves that \(A^*A\) is invertible.

🔗

In the following theorem we capture two important geometric properties of orthogonal projections.

🔗

Theorem 7.3.9.

Let \(U\) be a finite-dimensional subspace of an inner product space \(V\text{.}\)

For each \(\bfv \in V\text{,}\) \(\vnorm{\proj_U(\bfv)} \le \vnorm{\bfv}\text{,}\) with equality if and only if \(\bfv \in U\text{.}\)
🔗

🔗
For each \(\bfv \in V\) and each \(\bfu \in U\text{,}\) we have

\begin{equation*} \vnorm{\bfv - \proj_U(\bfv)} \le \vnorm{\bfv - \bfu}\text{,} \end{equation*}

with equality if and only if \(\bfu = \proj_U(\bfv)\text{.}\)

🔗

🔗

🔗

Proof.

Since \(\bfv - \proj_U(\bfv) \in U^{\perp}\) by part 3 of Theorem 7.3.5, \(\bfv - \proj_U(\bfv)\) is orthogonal to \(\proj_U(\bfv)\text{,}\) so, using Theorem 7.1.24 we have

\begin{align*} \vnorm{\bfv}^2 \amp = \vnorm{\proj_U(\bfv) + (\bfv - \proj_U(\bfv))}^2\\ \amp = \vnorm{\proj_U(\bfv)}^2 + \vnorm{\bfv - \proj_U(\bfv)}^2\\ \amp \ge \vnorm{\proj_U(\bfv)}^2\text{.} \end{align*}

Equality holds here if and only if \(\vnorm{\bfv - \proj_U(\bfv)}^2 = 0\text{,}\) which is true if and only if \(\bfv = \proj_U(\bfv)\text{.}\) This only happens if \(\bfv \in U\text{.}\)

🔗

We now move on to the second part of the theorem. We know that \({\bfv - \proj_U(\bfv) \in U^{\perp}}\) and that \(\proj_U(\bfv) - \bfu \in U\text{,}\) so

\begin{align*} \vnorm{\bfv - \bfu}^2 \amp = \vnorm{(\bfv - \proj_U(\bfv)) + (\proj_U(\bfv) - \bfu)}^2\\ \amp = \vnorm{(\bfv - \proj_U(\bfv))}^2 + \vnorm{(\proj_U(\bfv) - \bfu)}^2\\ \amp \ge \vnorm{(\bfv - \proj_U(\bfv))}^2 \text{.} \end{align*}

We have equality here if and only if \(\vnorm{(\proj_U(\bfv) - \bfu)}^2 = 0\text{,}\) which happens if and only if \(\proj_U(\bfv) = \bfu\text{;}\) that is, if and only if \(\bfv \in U\text{.}\)

🔗

Note 7.3.10.

This theorem says that, first, orthogonal projections result in a shorter vector. That is, orthogonal projection is a type of contraction. Secondly, \(\proj_U(\bfv)\) is the closest vector in \(U\) to the vector \(\bfv\text{.}\)

🔗

Finding the closest vector to \(\bfv\) in a subspace \(U\) can be thought of as giving the best approximation of \(\bfv\) by elements of \(U\text{.}\) This leads to our application of least squares approximation.

🔗

Subsection 7.3.3 Least Squares Approximation

We consider a set of points \(\{(x_i, y_i)\}\) in \(\rr^2\text{;}\) in practice, these are usually the result of data collection, perhaps a sample of two numeric variables from a population. A graph of such points is called a scatterplot, and we often want to find the “line of best fit” for these data. There are many ways to measure “best fit,” and our method here will be the least squares linear regression technique.

🔗

Define a subspace \(U\) of \(\rr^n\) in the following way, where the \(x_i\) in the definition are the \(x\)-coordinates of the data:

\begin{equation*} U = \left\{ \begin{bmatrix} mx_i + b \end{bmatrix} \in \rr^n \mid m, b \in \rr \right\}\text{.} \end{equation*}

If we let \(\mathbf{1}\) denote the \(n\times 1\) vector where each entry is 1, then

\begin{equation*} U = \{ m\bfx + b\mathbf{1} \mid m, b \in \rr \}\text{,} \end{equation*}

where \(\bfx\) is the vector of all of the first coordinates in our data set.

🔗

The points \((x_i,y_i)\) all lie on a single line \(y = mx+b\) if and only if the vector \(\bfy\) of second coordinates of our data lies in \(U\text{.}\) This does not happen often, as \(U\) is only a two-dimensional subspace of \(\rr^n\text{.}\) So, we want to find the closest point in \(U\) to \(\bfy\)—by Theorem 7.3.9, we can find this through orthogonal projection. When we find \(\proj_U(\bfy)\text{,}\) the \(m\) and \(b\) will give us our equation of the regression line.

🔗

This is called a “least squares” regression, because minimizing the distance from \(\bfy\) to \(U\) involves minimizing a distance. This distance in \(\rr^n\text{,}\) when using the dot product, looks like a sum of squares.

🔗

Example 7.3.11.

Consider the following set of five points in \(\rr^2\text{:}\)

\begin{equation*} \{(2,1), (1,0), (4,4), (4,5), (3,2) \}\text{.} \end{equation*}

Our subspace \(U \subseteq \rr^5\) is spanned by \(\bfx\) and \(\mathbf{1}\text{,}\) where \(\bfx\) is the vector of first coordinates

\begin{equation*} \bfx = \begin{bmatrix} 2 \\ 1 \\ 4 \\ 4 \\ 3 \end{bmatrix}\text{.} \end{equation*}

We form the \(5 \times 2\) matrix \(A\) with columns \(\bfx\) and \(\mathbf{1}\text{.}\) Then, by Proposition 7.3.8, we have

\begin{equation*} [\proj_U]_{\mce} \bfy = A(A^*A)^{-1}A^* \bfy\text{,} \end{equation*}

where \(\bfy\) is the \(5\times 1\) vector of the \(y\)-coordinates of our data.

🔗

We calculate the following:

\begin{equation*} A^*A = \begin{bmatrix} 46 \amp 14 \\ 14 \amp 5 \end{bmatrix}, \hspace{12pt} (A^*A)^{-1} = \begin{bmatrix} \frac{5}{34} \amp -\frac{7}{17} \\[6pt] -\frac{7}{17} \amp \frac{23}{17} \end{bmatrix}\text{.} \end{equation*}

Now, we don’t actually want \(\proj_U(\bfy)\text{,}\) because that is a vector in \(\rr^5\text{.}\) We want to know the coefficients \(m\) and \(b\) in the linear combination of the column vectors of \(A\) which produce \(\proj_U(\bfy)\text{.}\) In other words, we want the vector

\begin{equation*} \bfw = (A^*A)^{-1}A^* \bfy = \begin{bmatrix} \frac{26}{17} \\[6pt] -\frac{32}{17} \end{bmatrix}\text{.} \end{equation*}

Since \(\proj_U(\bfy) = A\bfw\text{,}\) this means that \(m = \frac{26}{17}\) and \(b = -\frac{32}{17}\text{.}\) We can see that this is a believable solution by looking at the graph below which contains the five points as well as the line \(y = \frac{26}{17}x - \frac{32}{17}\text{.}\)

🔗

The graph of five data points and the least-squares regression line. See long description. — Figure 7.3.12. Graph of least-squares regression line with data
🔗

🔗

Reading Questions 7.3.4 Reading Questions

1.

Let \(L\) be the line \(y = \frac{1}{2}x\) in \(\rr^2\text{.}\) (This is a subspace of \(\rr^2\text{.}\))

Calculate an orthonormal basis for \(L\text{.}\) (We are considering \(\rr^2\) with the usual dot product.)
🔗

🔗
Let \(\bfv = \begin{bmatrix} 2 \\ 3 \end{bmatrix}\text{.}\) Using part 2 of Theorem 7.3.5, calculate \(\proj_L(\bfv)\text{.}\)
🔗

🔗

🔗

2.

Consider the same situation as in the first reading question. Using Proposition 7.3.6, find the matrix of \(\proj_L\) with respect to the standard basis \(\mce\) of \(\rr^2\text{.}\)

🔗

Exercises 7.3.5 Exercises

1.

Let \(L\) be the line \(y = \frac{3}{5}x\) in \(\rr^2\text{.}\) Write the vector \(\bfv = \begin{bmatrix} -1 \\ 4 \end{bmatrix}\) as the sum of a vector in \(L\) and a vector in \(L^{\perp}\text{.}\) (Use the standard dot product as the inner product in \(\rr^2\text{.}\))

🔗

Answer.

If we let \(\bfw\) and \(\bfw'\) be the following vectors,

\begin{equation*} \bfw = \begin{bmatrix} 35/34\\ 21/34 \end{bmatrix}, \hspace{6pt} \text{and} \hspace{6pt} \bfw' = \begin{bmatrix} -69/34 \\ 115/34 \end{bmatrix}\text{,} \end{equation*}

then \(\bfv = \bfw + \bfw'\) where \(\bfw \in L\) and \(\bfw' \in L^{\perp}\text{.}\)

🔗

2.

Let \(U = \spn\{\bfv_1, \bfv_2 \}\text{,}\) where

\begin{equation*} \bfv_1 = \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}, \hspace{6pt} \bfv_2 = \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}\text{.} \end{equation*}

Find the matrix \([\proj_U]_{\mce}\text{.}\) (Use the standard dot product as the inner product in \(\rr^3\text{.}\))
🔗

🔗
Using your work from part (a), find the vector in \(U\) which is closest to \(\bfv\text{,}\) if

\begin{equation*} \bfv = \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix}\text{.} \end{equation*}

🔗

🔗

🔗

Answer.

Our matrix is

\begin{equation*} [\proj_U]_{\mce} = \begin{bmatrix} \frac{1}{11} \amp -\frac{3}{11} \amp \frac{1}{11} \\[2pt] -\frac{3}{11} \amp \frac{101}{110} \amp \frac{3}{110} \\[2pt] \frac{1}{11} \amp \frac{3}{110} \amp \frac{109}{110} \end{bmatrix}\text{.} \end{equation*}

🔗

🔗
The vector in \(U\) which is closest to \(\bfv\) is

\begin{equation*} \begin{bmatrix} \frac{3}{11} \\[2pt] \frac{15}{22} \\[2pt] \frac{105}{22} \end{bmatrix}\text{.} \end{equation*}

🔗

🔗

🔗

3.

Consider the following inner product on \(P_2\text{:}\)

\begin{equation*} \ip{p, q} = p(-1)q(-1) + p(0)q(0) + p(1)q(1)\text{.} \end{equation*}

Let \(U = \spn\{t - t^2, 1 + 2t\}\text{.}\) If \(p = 2 - t + 2t^2\text{,}\) write \(p\) as the sum of a vector in \(U\) and a vector in \(U^{\perp}\text{.}\)

🔗

Answer.

Let \(q\) and \(q'\) be the following polynomials:

\begin{align*} q \amp = \tfrac{11}{10} - \tfrac{17}{20}t + \tfrac{61}{20}t^2, \hspace{3pt} \text{and}\\ q' \amp = \tfrac{9}{10} - \tfrac{3}{20}t - \tfrac{21}{20}t^2\text{.} \end{align*}

Then \(p = q + q'\text{,}\) where \(q \in U\) and \(q' \in U^{\perp}\text{.}\)

🔗

4.

Consider the following four points in \(\rr^2\text{:}\)

\begin{equation*} (-1, -1), (1,2), (2, 0.5), (-0.75, 1)\text{.} \end{equation*}

Find the least-squares regression line for these points.

🔗

Writing Exercises

5.

Let \(U\) be a subspace of an inner product space \(V\text{.}\) Prove that \(U^{\perp}\) is a subspace of \(V\text{.}\)

🔗

Solution.

First, we know that \(\ip{\bfo, \bfu} = 0\) for all \(\bfu \in U\) by Proposition 7.1.17. This proves that \(\bfo \in U^{\perp}\text{.}\)

🔗

Next, we let \(\bfw_1, \bfw_2 \in U^{\perp}\) and \(\bfu \in U\text{.}\) Then, by the properties of the inner product, we have

\begin{equation*} \ip{\bfw_1 + \bfw_2, \bfu} = \ip{\bfw_1, \bfu} + \ip{\bfw_2, \bfu} = 0 + 0 = 0\text{.} \end{equation*}

This shows that \(\bfw_1 + \bfw_2 \in U^{\perp}\text{,}\) meaning that \(U^{\perp}\) is closed under addition.

🔗

Finally, we let \(\bfw \in U^{\perp}\) and \(c \in \ff\text{.}\) (Here, \(\ff\) is either \(\rr\) or \(\cc\text{.}\)) We also let \(\bfu \in U\text{.}\) Then, by the properties of the inner product, we have

\begin{equation*} \ip{c\bfw, \bfu} = c\ip{\bfw, \bfu} = c \cdot 0 = 0\text{.} \end{equation*}

This proves that \(c\bfw \in U^{\perp}\) so that \(U^{\perp}\) is closed under scalar multiplication.

🔗

Since \(U^{\perp}\) contains the zero vector and is closed under addition and scalar multiplication, \(U^{\perp}\) is a subspace of \(V\text{.}\)

🔗

6.

Let \(A \in M_{m,n}(\rr)\text{.}\) Prove that \((\row(A))^{\perp} = \nll(A)\text{.}\)

🔗

Solution.

We will prove that each of these sets is a subset of the other. First, we let \(\bfx \in \nll(A)\text{,}\) so that \(A\bfx = \bfo\text{.}\) The fact that \(A\bfx = \bfo\) means that \(\mathbf{r} \cdot \bfx = 0\) for all rows \(\mathbf{r}\) of \(A\text{.}\) (Entry \(i\) in \(A\bfx\) is the dot product of the \(i\)th row of \(A\) with \(\bfx\text{.}\)) Since the rows of \(A\) span \(\row(A)\text{,}\) the fact that \(\mathbf{r} \cdot \bfx = 0\) for each row \(\mathbf{r}\) of \(A\) means that \(\bfx \cdot \bfv = 0\) for all \(\bfv \in \row(A)\text{.}\) This proves that \(\bfx \in (\row(A))^{\perp}\text{,}\) so \(\nll(A) \subseteq (\row(A))^{\perp}\text{.}\)

🔗

We now let \(\bfx \in (\row(A))^{\perp}\text{.}\) We want to show that \(\bfx \in \nll(A)\text{.}\) Since \(\bfx \in (\row(A))^{\perp}\text{,}\) we know that \(\bfx \cdot \mathbf{r}=0\) for each row \(\mathbf{r}\) of \(A\text{.}\) This shows that \(A\bfx = \bfo\text{,}\) which proves that \(\bfx \in \nll(A)\text{.}\) Therefore, \((\row(A))^{\perp} \subseteq \nll(A)\text{.}\)

🔗

Since we have shown that \(\nll(A) \subseteq (\row(A))^{\perp}\) and \((\row(A))^{\perp} \subseteq \nll(A)\text{,}\) we can conclude that \((\row(A))^{\perp} = \nll(A)\text{,}\) as desired.

🔗

Prev Top Next