
Section 3.1 Linear Transformations

Speaking broadly, mathematicians are often concerned with (mathematical) objects and the right sort of functions between those objects. The structure of specific objects can be illuminated by studying the functions to and from those objects. In linear algebra, the objects in view are vector spaces (see Definition 2.3.1), and the functions between these objects are called linear transformations.

Subsection 3.1.1 Introduction to Linear Transformations

Definition 3.1.1.

If \(V\) and \(W\) are vector spaces over a field \(\ff\text{,}\) then a function \(T:V \to W\) is called a linear transformation if both of the following properties hold.
  • For all \(\bfu, \bfv \in V\text{,}\) we have \(T(\bfu+\bfv) = T(\bfu) + T(\bfv)\text{.}\)
  • For all \(\bfv \in V\) and all \(c \in \ff\text{,}\) we have \(T(c\bfv) = cT(\bfv)\text{.}\)
These functions are sometimes referred to as linear maps or linear operators.
If \(T:V \to W\) is a linear transformation, then \(V\) is the domain of \(T\) and \(W\) is the codomain of \(T\text{.}\)

Note 3.1.2.

Many readers will be more familiar with the idea of the range of a function than the codomain of a function. The range of a linear transformation \(T:V \to W\) is the set \(\{T(\bfv) \in W \mid \bfv \in V\}\text{.}\) In words, the range is the subset of the codomain consisting of the "outputs" of the function for all elements of the domain. We will often use the term image when discussing the range of a linear transformation.
Linear transformations are the "right" types of functions to study between vector spaces because they preserve the primary vector space operations. The first property of linear transformations means that such a function respects vector addition, and the second property means that such a function respects scalar multiplication.

Example 3.1.3.

We consider the real vector spaces \(P_5\) and \(P_4\text{,}\) along with the function \(D:P_5 \to P_4\) which takes the derivative. That is, \(D(p) = p'\) for all \(p \in P_5\text{.}\) So if \(p = 3t^5-2t^3+10t\text{,}\) then \(D(p) = 15t^4 - 6t^2+10\text{.}\) We note that \(p \in P_5\) and \(D(p) \in P_4\text{.}\)
The fact that our function \(D\) is a linear transformation between these vector spaces is a consequence of calculus. For all differentiable functions \(f\) and \(g\text{,}\) and all real numbers \(c\text{,}\) it is true that
\begin{align*} [f+g]' \amp = f'+g' \\ [cf]' \amp = cf' \text{.} \end{align*}
(If the reader doubts or has forgotten these facts, the closest textbook on single-variable calculus should be consulted posthaste.)
These calculus facts confirm that \(D(p+q) = D(p) + D(q)\) and \(D(cp) = cD(p)\) for all \(p,q \in P_5\) and all \(c \in \rr\text{.}\) This proves that \(D:P_5 \to P_4\) is a linear transformation.
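For readers who like to experiment, the linearity of \(D\) can also be checked numerically. The short Python sketch below (an informal aid, not part of the formal development) encodes an element of \(P_5\) by its vector of coefficients, a convention of our own choosing, and tests both defining properties on sample inputs.

```python
import numpy as np

# Informal check of Example 3.1.3.  A polynomial a_0 + a_1 t + ... + a_5 t^5
# in P_5 is stored as the coefficient vector (a_0, ..., a_5).

def D(coeffs):
    # The derivative of a_k t^k is k * a_k t^(k-1), so entry k-1 of the
    # output is k * coeffs[k].  The result has one fewer entry (it lies in P_4).
    k = np.arange(1, len(coeffs))
    return k * coeffs[1:]

# p = 3t^5 - 2t^3 + 10t, as in the text; D(p) should be 15t^4 - 6t^2 + 10.
p = np.array([0.0, 10.0, 0.0, -2.0, 0.0, 3.0])
print(D(p))  # [10.  0. -6.  0. 15.]

q = np.array([1.0, -1.0, 2.0, 0.0, 4.0, 7.0])  # an arbitrary second polynomial
c = 2.5
assert np.allclose(D(p + q), D(p) + D(q))  # additivity
assert np.allclose(D(c * p), c * D(p))     # homogeneity
```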

Example 3.1.4.

Let \(T:\rr^2 \to \rr^2\) be the function which reflects a vector in the Cartesian plane across the \(x\)-axis. So \(T(x,y)=(x,-y)\text{.}\) Additionally, let \(S:\rr^2\to\rr^2\) be the function which rotates a vector counter-clockwise around the origin by \(\frac{\pi}{2}\) radians. So \(S(x,y)=(-y,x)\text{.}\) Then both \(T\) and \(S\) are linear transformations.
We will supply two calculations here to give the sense of these functions. The reader should note that \(T\) takes the vector \((-3,2)\) in the second quadrant and reflects it across the \(x\)-axis to the vector \((-3,-2)\) in the third quadrant. Also, \(S\) rotates the vector \((-3,2)\) counter-clockwise around the origin by \(\frac{\pi}{2}\) radians to the vector \((-2,-3)\text{.}\) (It is fairly obvious that the lengths of the vectors \((-3,2)\) and \((-2,-3)\) are the same. To check the claim about the angles, one would calculate the angles between the positive \(x\)-axis and both the vectors \((-3,2)\) and \((-2,-3)\text{.}\) The first angle is roughly \(2.55\) radians and the second is \(4.12\text{,}\) giving a difference of \(1.57\) radians, or roughly \(\frac{\pi}{2}\text{.}\))
We first check the additivity condition. Let \((x_1,y_1), (x_2,y_2) \in \rr^2\text{.}\) Then we have
\begin{align*} T((x_1,y_1)+(x_2,y_2)) \amp = T(x_1+x_2,y_1+y_2) = (x_1+x_2,-(y_1+y_2)) \\ T(x_1,y_1) + T(x_2,y_2) \amp = (x_1,-y_1) + (x_2,-y_2) = (x_1+x_2,-y_1-y_2) \text{.} \end{align*}
From the distributive property of the real numbers (in the second coordinate of these calculations), we can see that the additive property holds for \(T\text{.}\) (The calculation for \(S\) is similar.)
We now check the scalar multiplication property. (Again, the calculations for \(T\) and \(S\) are similar, so we will only show one of them.) Let \(c \in \rr\) and let \((x,y) \in \rr^2\text{.}\) Then we have
\begin{align*} S(c(x,y)) \amp = S(cx,cy) = (-cy,cx) \\ cS(x,y) \amp = c(-y,x) = (-cy,cx)\text{.} \end{align*}
Note that we used the commutativity of multiplication in \(\rr\) in this calculation.
These brief calculations show that both \(T\) and \(S\) are linear transformations.
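As an informal supplement to these calculations, the following Python sketch tests both defining properties for \(T\) and \(S\) on sample vectors; the test vectors and scalar are arbitrary choices of ours.

```python
import numpy as np

# Informal check of Example 3.1.4.

def T(v):  # reflection across the x-axis: T(x, y) = (x, -y)
    return np.array([v[0], -v[1]])

def S(v):  # counter-clockwise rotation by pi/2: S(x, y) = (-y, x)
    return np.array([-v[1], v[0]])

print(T(np.array([-3.0, 2.0])))  # [-3. -2.], as in the text
print(S(np.array([-3.0, 2.0])))  # [-2. -3.], as in the text

u, v, c = np.array([-3.0, 2.0]), np.array([1.0, 5.0]), -4.0
for f in (T, S):
    assert np.allclose(f(u + v), f(u) + f(v))  # additivity
    assert np.allclose(f(c * u), c * f(u))     # homogeneity
```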

Subsection 3.1.2 Linear Transformations and Matrices

While linear algebra is not only about matrices, matrices are valuable tools and provide a rich source of examples in this subject. In fact, matrices are so central to the notion of linear transformations that we will devote this subsection to their discussion.

Example 3.1.5.

Let \(\ff\) be a field and let \(A\) be an \(m\times n\) matrix with entries from \(\ff\text{.}\) (We will refer to this in what follows as "a matrix over \(\ff\text{.}\)") Then multiplication by \(A\) is a linear transformation from \(\ff^n\) to \(\ff^m\text{.}\) (We will denote the function which is multiplication by \(A\) by \(T_A:\ff^n \to \ff^m\text{.}\))
To justify this claim we must first explain what we mean by "multiplication by \(A\text{.}\)" We will let \(\bfv \in \ff^n\) and denote entry \((i,j)\) in \(A\) by \(a_{ij}\text{.}\) We will further denote the entries of \(\bfv\) by
\begin{equation*} \bfv = \left[\begin{array}{@{}c@{}} v_1 \\ \vdots \\ v_n \end{array}\right]\text{.} \end{equation*}
Then the matrix-vector product \(A\bfv\) is defined to be the following vector in \(\ff^m\text{:}\)
\begin{equation} A\bfv = \left[\begin{array}{@{}c@{}} a_{11}v_1 + \cdots + a_{1n}v_n \\ \vdots \\ a_{m1}v_1 + \cdots + a_{mn}v_n \end{array}\right]\text{.}\tag{3.1} \end{equation}
One way to state this is that entry \(i\) of \(A\bfv\) is the sum of the entry-wise products of row \(i\) of \(A\) with \(\bfv\text{.}\) Since \(A\bfv\) is an element of \(\ff^m\text{,}\) the domain and codomain of \(T_A\) are correct.
What we have defined is the product of a matrix and a vector. However, an alternate description of this product will be more useful in proving that \(T_A\) is a linear transformation.
If the columns of \(A\) are thought of as vectors \(\mathbf{a}_1, \ldots, \mathbf{a}_n\text{,}\) then the product \(A\bfv\) is also
\begin{equation} A\bfv = v_1\mathbf{a}_1 + \cdots + v_n\mathbf{a}_n = \sum_{i=1}^n v_i\mathbf{a}_i\text{.}\tag{3.2} \end{equation}
In words, \(A\bfv\) is a linear combination of the columns of \(A\) with weights coming from the entries of \(\bfv\text{.}\) (We have reserved the proof that these two formulations are equivalent for Exercise 3.1.7.16.)
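Although the exercise asks for a proof, a quick numerical comparison of (3.1) and (3.2) can be reassuring. The Python sketch below computes \(A\bfv\) both ways for an arbitrary matrix and vector of our own choosing.

```python
import numpy as np

# Comparing the two formulations of the matrix-vector product.

rng = np.random.default_rng(1)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)  # an arbitrary 3x4 matrix
v = rng.integers(-5, 5, size=4).astype(float)       # an arbitrary vector in R^4

# (3.1): entry i of Av is the entry-wise product of row i of A with v, summed.
rowwise = np.array([A[i, :] @ v for i in range(A.shape[0])])

# (3.2): Av is the linear combination of the columns of A with weights v_i.
colwise = sum(v[i] * A[:, i] for i in range(A.shape[1]))

assert np.allclose(rowwise, colwise)
assert np.allclose(rowwise, A @ v)  # NumPy's built-in product agrees with both
```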
With this equivalent definition, proving that \(T_A\) is a linear transformation is a snap. Let \(\bfu\) and \(\bfv\) be vectors in \(\ff^n\text{.}\) We will denote the entries of \(\bfu\) and \(\bfv\) by
\begin{equation*} \bfu = \left[\begin{array}{@{}c@{}} u_1 \\ \vdots \\ u_n \end{array}\right] \hspace{.3in} \text{and} \hspace{.3in} \bfv = \left[\begin{array}{@{}c@{}} v_1 \\ \vdots \\ v_n \end{array}\right]\text{.} \end{equation*}
Then we have the following:
\begin{align*} T_A(\bfu + \bfv) \amp = A(\bfu+\bfv) = \sum_{i=1}^n(u_i+v_i)\mathbf{a}_i\\ T_A(\bfu) + T_A(\bfv) \amp = A\bfu + A\bfv = \sum_{i=1}^n u_i\mathbf{a}_i + \sum_{i=1}^n v_i\mathbf{a}_i\text{.} \end{align*}
These two expressions are equal due to the fact that \(\ff^m\) is a vector space.
We have one final calculation to prove that \(T_A\) is a linear transformation. Let \(\bfv\) be a vector in \(\ff^n\) and let \(c\) be in \(\ff\text{.}\) Then we have
\begin{align*} T_A(c\bfv) \amp = A(c\bfv) = \sum_{i=1}^n (cv_i)\mathbf{a}_i\\ cT_A(\bfv) \amp = c A(\bfv) = c \sum_{i=1}^n v_i\mathbf{a}_i. \end{align*}
Once again, these expressions are equal because \(\ff^m\) is a vector space.
These calculations prove that \(T_A\) is a linear transformation.

Note 3.1.6.

To summarize, when \(\ff\) is a field, multiplication by an \(m\times n\) matrix \(A\) is a linear transformation \(T_A:\ff^n \to \ff^m\text{.}\)
General matrices are rectangular, not necessarily square. When a matrix is square, however, we have additional properties to discuss.

Definition 3.1.7.

Let \(A\) be an \(n\times n\) matrix. (So \(A\) is square.) We say that \(A\) is a diagonal matrix if \(a_{ij} = 0\) for all \((i,j)\) such that \(i \neq j\text{.}\) If \(A\) is diagonal and \(a_{ii}=1\) for all \(i = 1,\ldots,n\text{,}\) then \(A\) is called an identity matrix.
The next example shows what the linear transformation \(T_A\) is like when \(A\) is a diagonal or identity matrix.

Example 3.1.8.

Let \(A\) and \(B\) be the following matrices over \(\rr\text{:}\)
\begin{equation*} A = \begin{bmatrix} 2 \amp 0 \amp 0 \\ 0 \amp -1 \amp 0 \\ 0 \amp 0 \amp -3 \end{bmatrix} \hspace{.3in} B = \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{bmatrix}\text{.} \end{equation*}
We note that both \(A\) and \(B\) are diagonal matrices and that \(B\) is an identity matrix.
Building on Example 3.1.5, both \(T_A\) and \(T_B\) are linear transformations from \(\rr^3\) to \(\rr^3\text{.}\) Let \(\bfx\) be an element of \(\rr^3\) where
\begin{equation*} \bfx = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\text{.} \end{equation*}
Then we have
\begin{equation*} T_A(\bfx) = \begin{bmatrix} 2 \amp 0 \amp 0 \\ 0 \amp -1 \amp 0 \\ 0 \amp 0 \amp -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2x_1 \\ -x_2 \\ -3x_3 \end{bmatrix}\text{.} \end{equation*}
We also have
\begin{equation*} T_B(\bfx) = \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \bfx\text{.} \end{equation*}
Since \(T_B(\bfx) = \bfx\) for all \(\bfx \in \rr^3\text{,}\) the linear transformation \(T_B\) acts as the identity function on \(\rr^3\text{.}\) The action of \(T_A\) on \(\rr^3\) is only slightly more complex. Because of all the zeros in \(A\text{,}\) the effect of the transformation \(T_A\) is just multiplication of each coordinate of a vector in \(\rr^3\) by the corresponding diagonal entry in \(A\text{.}\) What we observe for multiplication by a diagonal or identity matrix in \(M_3(\rr)\) can easily be extended to \(M_n(\ff)\) for any \(n \in \nn\) and any field \(\ff\text{.}\)
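A short Python computation along the lines of this example (the test vector is an arbitrary choice of ours):

```python
import numpy as np

# Multiplication by a diagonal matrix scales each coordinate by the
# corresponding diagonal entry; the identity matrix leaves vectors unchanged.

A = np.diag([2.0, -1.0, -3.0])  # the matrix A from Example 3.1.8
B = np.eye(3)                   # the 3x3 identity matrix
x = np.array([1.0, 4.0, -2.0])  # an arbitrary vector in R^3

assert np.allclose(A @ x, np.array([2.0, -1.0, -3.0]) * x)  # entrywise scaling
assert np.allclose(B @ x, x)                                # T_B is the identity
```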

Note 3.1.9.

We often use the notation \(\mathbf{e}_1,\ldots, \mathbf{e}_n\) to refer to the columns of the \(n\times n\) identity matrix. In other words, \(\mathbf{e}_j\) is the vector with a \(1\) in entry \(j\) and zeros elsewhere.
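One consequence of the column description (3.2), easy to check by hand or by machine, is that \(A\mathbf{e}_j\) is exactly column \(j\) of \(A\text{.}\) A small Python sketch (the matrix is an arbitrary choice of ours):

```python
import numpy as np

# Multiplying a matrix by e_j picks out column j of the matrix.

n = 4
I = np.eye(n)                                  # columns of I are e_1, ..., e_n
A = np.arange(1.0, 1.0 + 3 * n).reshape(3, n)  # an arbitrary 3x4 matrix

for j in range(n):
    e_j = I[:, j]
    assert np.allclose(A @ e_j, A[:, j])
```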

Subsection 3.1.3 Properties of Linear Transformations

Recall that while linear transformations must have special properties, they are first of all functions. And, as functions, we can discuss the composition of linear transformations as well as properties like injectivity and surjectivity.

Definition 3.1.10.

If \(T:U \to V\) and \(S:V \to W\) are linear transformations between vector spaces, then the function \(S \circ T:U \to W\) defined by \({(S \circ T)(\bfu) = S(T(\bfu))}\) for each \(\bfu \in U\) is the composition of the transformations \(T\) and \(S\text{.}\)

Definition 3.1.12.

Let \(T:V \to W\) be a linear transformation between vector spaces. We say that \(T\) is injective if \(T(\bfv_1)=T(\bfv_2)\) implies \(\bfv_1=\bfv_2\) for all \(\bfv_1, \bfv_2 \in V\text{.}\) Injective linear transformations are also referred to as one-to-one since no two distinct elements of the domain may correspond to the same element of the image. (Recall that by "image" we mean "range".)
A linear transformation \(T\) is called surjective if for every \(\bfw \in W\) there exists a vector \(\bfv \in V\) such that \(T(\bfv) = \bfw\text{.}\) For surjective functions, the image is the same as the codomain. (The image is a subset of the codomain for every function, but these sets are equal if and only if the function is surjective.) Sometimes surjective functions are referred to as onto functions.
If a linear transformation is both injective and surjective, we say that it is bijective.

Example 3.1.13.

Let's reconsider the linear transformation \(D:P_5 \to P_4\) which appeared in Example 3.1.3. We observe that \(D\) is surjective but not injective.
The transformation is surjective because we know about the antiderivative. Let \(q \in P_4\) have the form
\begin{equation*} q(t) = a_4t^4 + a_3t^3 + a_2t^2 + a_1t + a_0\text{.} \end{equation*}
This is a generic element of \(P_4\text{,}\) so we only need to supply an element \(p \in P_5\) such that \(D(p)=q\text{,}\) and this will prove that \(D\) is surjective. Consider the element \(p\) defined as
\begin{equation*} p(t) = \tfrac{1}{5}a_4t^5 + \tfrac{1}{4}a_3t^4 + \tfrac{1}{3}a_2t^3 + \tfrac{1}{2}a_1t^2 + a_0t\text{.} \end{equation*}
It is but the work of a Calculus I student to verify that \(D(p)=q\text{,}\) thus showing that \(D\) is surjective. (We note that we could have chosen \(p\) to have any constant term at all; we used the constant term of \(0\text{.}\))
Finally, we will show that \(D\) is not injective by looking at an example of two elements of \(P_5\) which have the same image under \(D\) in \(P_4\text{.}\) Let \({p_1 = t^2 + 10}\) and \(p_2 = t^2 + 20\text{.}\) Then we see that even though \(p_1 \neq p_2\text{,}\) we have \({D(p_1)=D(p_2)=2t}\text{,}\) and this proves that \(D\) is not injective.
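The following Python sketch mirrors this argument computationally. Here P is our own name for the antiderivative-with-zero-constant-term map used above, and \(p_1, p_2\) are the polynomials from the example.

```python
import numpy as np

# Informal check of Example 3.1.13, with polynomials stored as coefficient
# vectors as in the earlier sketches.

def D(coeffs):  # derivative: P_5 -> P_4
    k = np.arange(1, len(coeffs))
    return k * coeffs[1:]

def P(coeffs):  # antiderivative with constant term 0: P_4 -> P_5
    k = np.arange(1, len(coeffs) + 1)
    return np.concatenate([[0.0], coeffs / k])

q = np.array([7.0, 0.0, -1.0, 2.0, 4.0])  # a sample element of P_4
assert np.allclose(D(P(q)), q)            # every q has a preimage: surjective

p1 = np.array([10.0, 0.0, 1.0, 0.0, 0.0, 0.0])  # t^2 + 10
p2 = np.array([20.0, 0.0, 1.0, 0.0, 0.0, 0.0])  # t^2 + 20
assert np.allclose(D(p1), D(p2))  # distinct inputs, same output: not injective
```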
We will define one more property of linear transformations here that will resurface in Section 3.2.

Definition 3.1.14.

Let \(T:V \to W\) be a linear transformation between vector spaces. The identity transformation on \(V\) is the linear transformation \({I_V:V \to V}\) which is \(I_V(\bfv)=\bfv\) for each \(\bfv \in V\text{.}\) (If the vector space we have in mind is clear, we will drop the subscript and use the notation \(I\text{.}\))
We say that the linear transformation \(T\) is invertible if there exists a linear transformation \(S:W \to V\) such that \(S \circ T = I_V\) and \(T \circ S = I_W\text{.}\) In this case we call \(S\) the inverse of \(T\) and write \(S = T^{-1}\text{.}\)
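For the matrix transformations of Example 3.1.5, invertibility in this sense corresponds to matrix invertibility. A minimal Python sketch, using an invertible \(2\times 2\) matrix of our own choosing:

```python
import numpy as np

# If T_A is multiplication by an invertible matrix A, then multiplication by
# the inverse matrix plays the role of S in Definition 3.1.14.

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # an arbitrary invertible matrix (det = 1)
S = np.linalg.inv(A)         # the candidate inverse transformation

assert np.allclose(S @ A, np.eye(2))  # S o T_A = I_V
assert np.allclose(A @ S, np.eye(2))  # T_A o S = I_W
```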

Subsection 3.1.4 Isomorphisms

Bijective functions are important in almost all settings, and the linear algebra setting is no exception. We have a specific name for bijective linear transformations.

Definition 3.1.15.

A bijective linear transformation \(T\) between vector spaces \(V\) and \(W\) is called an isomorphism. If there exists an isomorphism between vector spaces \(V\) and \(W\text{,}\) then these spaces are said to be isomorphic.
The reader should think of isomorphic vector spaces as essentially the same. Such spaces will not be exactly the same, of course, in the same way that two finite sets of the same size are not necessarily identical. But the presence of an isomorphism means that the vector space operations are compatible in such a way that such spaces share many of the same properties.

Example 3.1.16.

The vector spaces \(\rr^2\) and \(P_1\) are isomorphic. To prove this, we consider the function \(T:\rr^2 \to P_1\) defined by
\begin{equation*} T \left( \begin{bmatrix} a \\ b \end{bmatrix}\right) = a + bt\text{.} \end{equation*}
To justify the claim that \(\rr^2\) and \(P_1\) are isomorphic, we must prove that \(T\) is a bijective linear transformation. We will prove that \(T\) is a linear transformation and ask the reader to prove bijectivity in the exercises.
We let \(\bfx_1\) and \(\bfx_2\) be elements of \(\rr^2\) where
\begin{equation*} \bfx_1 = \begin{bmatrix} a_1 \\ b_1 \end{bmatrix}, \hspace{.3in} \bfx_2 = \begin{bmatrix} a_2 \\ b_2 \end{bmatrix}\text{.} \end{equation*}
Then we have
\begin{equation*} T(\bfx_1 + \bfx_2) = T \left(\begin{bmatrix} a_1 + a_2 \\ b_1 + b_2 \end{bmatrix}\right) = (a_1 + a_2) + (b_1 + b_2)t\text{,} \end{equation*}
and
\begin{equation*} T(\bfx_1) + T(\bfx_2) = (a_1 + b_1t) + (a_2 + b_2t)\text{.} \end{equation*}
The properties \(P_1\) possesses as a vector space show that
\begin{equation*} T(\bfx_1 + \bfx_2) = T(\bfx_1) + T(\bfx_2)\text{.} \end{equation*}
We now suppose that \(\bfx \in \rr^2\) and \(c \in \rr\text{,}\) where
\begin{equation*} \bfx = \begin{bmatrix} a \\ b \end{bmatrix}\text{.} \end{equation*}
Then
\begin{align*} T(c\bfx) \amp = T\left( \begin{bmatrix} ca \\ cb \end{bmatrix}\right) = ca + cbt\\ cT(\bfx) \amp = c(a + bt) = ca + cbt\text{.} \end{align*}
Again, we see that \(T(c\bfx) = cT(\bfx)\) using the properties of \(P_1\) as a vector space.
We have shown that \(T\) is a linear transformation. It is also bijective (see Exercise 3.1.7.18), so it is an isomorphism, meaning that \(\rr^2\) and \(P_1\) are isomorphic.
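An informal computational check of the linearity argument, this time in Python with SymPy (the sample pairs and scalar are arbitrary choices of ours):

```python
import sympy as sp

# Informal check of Example 3.1.16: T sends (a, b) to the polynomial a + b*t.

t = sp.symbols('t')

def T(x):
    a, b = x
    return a + b * t

x1, x2, c = (3, -2), (1, 5), 7
# Additivity: T(x1 + x2) and T(x1) + T(x2) agree as polynomials.
assert sp.expand(T((x1[0] + x2[0], x1[1] + x2[1])) - (T(x1) + T(x2))) == 0
# Homogeneity: T(c * x1) and c * T(x1) agree as polynomials.
assert sp.expand(T((c * x1[0], c * x1[1])) - c * T(x1)) == 0
```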

Note 3.1.17.

If \(V\) and \(W\) are vector spaces, then the set of all linear transformations from \(V\) to \(W\) is denoted \(L(V,W)\text{.}\) When \(W=V\text{,}\) we will write \(L(V)\) instead of \(L(V,V)\text{.}\)
We can now prove that two concepts we have defined in this section, invertibility and bijectivity, are one and the same for linear transformations.

Theorem 3.1.18.

A linear transformation \(T:V \to W\) between vector spaces is invertible if and only if it is bijective.

Proof.

This fact is true for functions in general; none of the linear transformation properties are involved. (A function is bijective if and only if it has an inverse function.) The only point requiring care is that Definition 3.1.14 asks for the inverse to be a linear transformation, and the next theorem shows that the inverse of a bijective linear transformation is automatically linear.

Theorem 3.1.19.

If \(T:V \to W\) is an invertible linear transformation, then its inverse \(T^{-1}:W \to V\) is also a linear transformation; that is, \(T^{-1} \in L(W,V)\text{.}\)

Proof.

We will check the two properties of a linear transformation. (See Definition 3.1.1.) Suppose that \(\bfw_1, \bfw_2 \in W\text{.}\) Since \(T \circ T^{-1} = I_W\text{,}\) we have
\begin{equation*} \bfw_1 + \bfw_2 = T(T^{-1}(\bfw_1)) + T(T^{-1}(\bfw_2)) = T(T^{-1}(\bfw_1) + T^{-1}(\bfw_2))\text{.} \end{equation*}
When we apply \(T^{-1}\) to the beginning and end of this equality, using \({T^{-1}\circ T = I_V}\text{,}\) we get
\begin{equation*} T^{-1}(\bfw_1 + \bfw_2) = T^{-1}(\bfw_1) + T^{-1}(\bfw_2)\text{.} \end{equation*}
We will now check the scalar multiple property in a similar fashion. Let \(\bfw \in W\) and let \(c \in \ff\text{.}\) Then we have
\begin{equation*} c\bfw = cT(T^{-1}(\bfw)) = T(cT^{-1}(\bfw))\text{.} \end{equation*}
Applying \(T^{-1}\) to both sides again we get
\begin{equation*} T^{-1}(c\bfw) = cT^{-1}(\bfw)\text{.} \end{equation*}
This proves that \(T^{-1} \in L(W,V)\text{.}\)
Before we leave this subsection, it is worth pointing out that when \(V\) and \(W\) are vector spaces, the set \(L(V,W)\) itself has some important structure.
When \(V\) and \(W\) are vector spaces over \(\ff\text{,}\) we can define the sum and scalar multiple of linear transformations since both of these operations happen on the level of elements. If \(S, T \in L(V,W)\) and \(c \in \ff\text{,}\) then we define \(S+T\) and \(cT\) in the following way. For all \(\bfv \in V\text{,}\)
\begin{align*} (S+T)(\bfv) \amp = S(\bfv) + T(\bfv)\\ (cT)(\bfv) \amp = c(T(\bfv))\text{.} \end{align*}
These operations make \(L(V,W)\) into a vector space in its own right. We will leave the proof of this fact for the exercises.
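For transformations given by matrices, these operations amount to matrix addition and the scaling of matrices. A brief Python sketch of this special case (the matrices, vector, and scalar are arbitrary choices of ours):

```python
import numpy as np

# If S and T are multiplication by B and A respectively, then S + T is
# multiplication by B + A, and cT is multiplication by cA.

rng = np.random.default_rng(2)
A, B = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
v, c = rng.standard_normal(2), 3.0

assert np.allclose((A + B) @ v, A @ v + B @ v)  # (S + T)(v) = S(v) + T(v)
assert np.allclose((c * A) @ v, c * (A @ v))    # (cT)(v) = c(T(v))
```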

Subsection 3.1.5 The Matrix-Vector Form of a Linear System

Having defined the product of a matrix and a vector in Example 3.1.5, we can reformulate one of the foundational (and introductory) matters of this book. We will now put the notion of a linear system, in particular the solutions to linear systems, in a different context.
Let's consider the following system of linear equations over a field \(\ff\text{,}\) as we saw in Section 2.2:
\begin{align} a_{11}x_1 + \cdots + a_{1n}x_n \amp = b_1\notag\\ a_{21}x_1 + \cdots + a_{2n}x_n \amp = b_2\notag\\ \vdots \hspace{10pt} \amp\phantom{ = } \hspace{6pt} \vdots \notag\\ a_{m1}x_1 + \cdots + a_{mn}x_n \amp = b_m\text{.}\notag \end{align}
If we let \(A\) be the matrix \(A = [a_{ij}]\text{,}\) \(\bfx\) be the vector of variables \(\bfx=[x_j]\text{,}\) and \(\mathbf{b}\) be the vector of constants \(\mathbf{b} = [b_i]\text{,}\) then this linear system can be written efficiently as \(A\bfx = \bfb\text{.}\)
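This is exactly the form in which linear systems are handled computationally. A minimal Python sketch (the system shown is an arbitrary invertible example of ours, not one drawn from the text):

```python
import numpy as np

# The system  2x_1 + x_2 = 5,  x_1 + 3x_2 = 10  in matrix-vector form.

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)     # solve Ax = b
print(x)                      # [1. 3.]
assert np.allclose(A @ x, b)  # so T_A(x) = b: the system is consistent
```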
With this reformulation, the questions of the existence and uniqueness of solutions to a system of equations (see the end of Section 1.1) can now be stated in the language of the injectivity and surjectivity of linear transformations.

Example 3.1.21.

Consider the linear transformation \(T_A:\rr^3 \to \rr^3\) which is multiplication by this matrix:
\begin{equation*} A = \begin{bmatrix} 2 \amp 3 \amp 2 \\ 1 \amp -2 \amp 8 \\ -1 \amp 4 \amp -12 \end{bmatrix}\text{.} \end{equation*}
We will show that \(T_A\) is neither injective nor surjective.
Let \(\bfu\) and \(\bfv\) be the following vectors in \(\rr^3\text{:}\)
\begin{equation*} \bfu = \begin{bmatrix} -1 \\ -2 \\ 3 \end{bmatrix}, \hspace{12pt} \bfv = \begin{bmatrix} 7 \\ 0 \\ 2 \end{bmatrix}\text{.} \end{equation*}
By forming and row-reducing the augmented matrices \([A \mid \bfu]\) and \([A \mid \bfv]\text{,}\) we can determine how many solutions there are to the equations \(T_A(\bfx) = \bfu\) and \(T_A(\bfx)=\bfv\text{,}\) respectively. Here are the calculations:
\begin{equation} \left[\begin{array}{@{}c|c@{}} A \amp \bfu \end{array}\right] \sim \left[\begin{array}{@{}ccc|c@{}} 1 \amp 0 \amp 4 \amp 0 \\ 0 \amp 1 \amp -2 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \end{array}\right]\text{,}\tag{3.3} \end{equation}
\begin{equation} \left[\begin{array}{@{}c|c@{}} A \amp \bfv \end{array}\right] \sim \left[\begin{array}{@{}ccc|c@{}} 1 \amp 0 \amp 4 \amp 2 \\ 0 \amp 1 \amp -2 \amp 1 \\ 0 \amp 0 \amp 0 \amp 0 \end{array}\right]\text{.}\tag{3.4} \end{equation}
From (3.3), since there is a pivot in the final column of the RREF of \([A \mid \bfu]\text{,}\) we see that \(\bfu\) is not in the image of \(T_A\text{.}\) This means that the matrix equation \(A\bfx = \bfu\) has no solution, so \(T_A\) is not surjective; equivalently, the linear system which corresponds to the augmented matrix \([A \mid \bfu]\) is inconsistent.
From (3.4), we see that \(\bfv\) is in the image of \(T_A\text{.}\) Since there is no pivot in the final column of the RREF of \([A \mid \bfv]\text{,}\) and since there is a free variable in that same RREF, this means that the matrix equation \(A\bfx = \bfv\) has multiple solutions, so \(T_A\) is not injective. Specifically, if
\begin{equation*} \bfx_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \hspace{12pt} \text{and} \hspace{12pt} \bfx_2 = \begin{bmatrix} -2 \\ 3 \\ 1 \end{bmatrix}\text{,} \end{equation*}
then we have both \(T_A(\bfx_1)=\bfv\) and \(T_A(\bfx_2)=\bfv\text{.}\) (The vector \(\bfx_1\) results from setting the free variable equal to \(0\text{,}\) and we obtain \(\bfx_2\) by setting the free variable equal to \(1\text{.}\)) Finally, we note that the linear system which corresponds to the augmented matrix \([A \mid \bfv]\) is consistent with many solutions; that is, a solution is not unique.
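The row reductions (3.3) and (3.4) can be reproduced by machine. The following Python/SymPy sketch does so and also confirms that \(\bfx_1\) and \(\bfx_2\) are both sent to \(\bfv\text{:}\)

```python
import sympy as sp

# Reproducing the computations of Example 3.1.21.

A = sp.Matrix([[2, 3, 2], [1, -2, 8], [-1, 4, -12]])
u = sp.Matrix([-1, -2, 3])
v = sp.Matrix([7, 0, 2])

print(A.row_join(u).rref())  # pivot in the last column: Ax = u is inconsistent
print(A.row_join(v).rref())  # free variable: Ax = v has many solutions

x1 = sp.Matrix([2, 1, 0])
x2 = sp.Matrix([-2, 3, 1])
assert A * x1 == v and A * x2 == v  # two distinct preimages of v
```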

Reading Questions 3.1.6

1.

For each of the following, determine the number of rows and columns that a matrix would have if multiplication by that matrix is a linear transformation with the given domain and codomain.
  1. domain: \(\rr^2\text{,}\) codomain: \(\rr^3\)
  2. domain: \(\qq^4\text{,}\) codomain: \(\qq^2\)

2.

Let \(A\text{,}\) \(\bfu\text{,}\) \(\mathbf{b}\text{,}\) and \(\mathbf{c}\) be defined as follows:
\begin{equation*} A = \begin{bmatrix} 2 \amp 0 \amp 6 \\ 1 \amp 3 \amp 6 \\ -1 \amp 5 \amp 2 \end{bmatrix}, \hspace{6pt} \bfu = \begin{bmatrix} 3 \\ -2 \\ -1 \end{bmatrix}, \hspace{6pt} \mathbf{b} = \begin{bmatrix} 3 \\ 6 \\ 6 \end{bmatrix}, \hspace{6pt} \mathbf{c} = \begin{bmatrix} -1 \\ -4 \\ 3 \end{bmatrix}\text{.} \end{equation*}
Define a linear transformation \(T_A:\rr^3 \to \rr^3\) to be multiplication by \(A\text{.}\)
  1. Find \(T_A(\bfu)\text{.}\)
  2. Find an \(\bfx\) in \(\rr^3\) such that \(T_A(\bfx)=\mathbf{b}\text{.}\)
  3. Is there more than one \(\bfx\) whose image under \(T_A\) is \(\mathbf{b}\text{?}\) How do you know?
  4. Determine whether or not \(\mathbf{c}\) is in the image of \(T_A\text{.}\) (Reminder: by "image" here we mean "range".)

Exercises 3.1.7

1.

Consider the function \(T:P_2 \to P_3\) defined by \(T(p) = tp\text{.}\) (So, for example, \(T(2+t)=2t+t^2\text{.}\)) Is \(T\) a linear transformation? Justify your answer.

2.

Consider the function \(T:P_2 \to P_2\) defined by \(T(p) = p(0)+p(1)t+p(2)t^2\text{.}\) Is \(T\) a linear transformation? Justify your answer.
Answer.
Yes, \(T\) is a linear transformation.

3.

Consider the function \(T:P_2 \to P_1\) defined by \(T(p) = p(0) + p'(0)t\text{.}\) Is \(T\) a linear transformation? Justify your answer.

4.

Let \(T:\ff_5^3 \to \ff_5^2\) be the function defined by \(T(x,y,z)=(3x+y-2z, -xy)\text{.}\) Is \(T\) a linear transformation? Justify your answer.
Answer.
No, \(T\) is not a linear transformation. Consider the following example:
\begin{align*} T(2(1,1,0)) \amp = T(2,2,0) = (6+2-0,-4) \equiv (3,1)\\ 2T(1,1,0) \amp = 2(3+1-0,-1) \equiv 2(4,4) \equiv (3,3)\text{.} \end{align*}
This shows that the scalar multiplication property does not hold, as \(2T(1,1,0) \neq T(2(1,1,0))\text{.}\)
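A quick Python check of this arithmetic modulo \(5\) (the function name is our own):

```python
# Working in F_5: reduce every result modulo 5.

def T(x, y, z):
    return ((3 * x + y - 2 * z) % 5, (-x * y) % 5)

print(T(2, 2, 0))                  # T(2(1,1,0)) = (3, 1)
a, b = T(1, 1, 0)                  # T(1,1,0) = (4, 4)
print(((2 * a) % 5, (2 * b) % 5))  # 2T(1,1,0) = (3, 3) != (3, 1)
```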

5.

Let \(T:\rr^2 \to \rr^3\) be the function defined by \(T(x,y) = (2x, x-3y, 0)\text{.}\) Is \(T\) a linear transformation? Justify your answer.

6.

Consider the following matrix over \(\ff_3\text{:}\)
\begin{equation*} A = \begin{bmatrix} 2 \amp 1 \amp 1 \\ 0 \amp 1 \amp 2 \\ 1 \amp 2 \amp 2 \end{bmatrix}\text{.} \end{equation*}
For each of the following vectors \(\bfv\text{,}\) calculate the matrix-vector product \(A\bfv\text{.}\)
  1. \(\displaystyle \bfv = (1,1,0)\)
  2. \(\displaystyle \bfv = (2,1,2)\)
  3. \(\displaystyle \bfv = (0,2,1)\)
Answer.
  1. \(\displaystyle A\bfv = (0,1,0)\)
  2. \(\displaystyle A\bfv = (1,2,2)\)
  3. \(\displaystyle A\bfv = (0,1,0)\)

7.

Consider the following matrix over \(\rr\text{:}\)
\begin{equation*} A = \begin{bmatrix} -1 \amp 2 \amp 3 \\ -2 \amp 5 \amp 0 \end{bmatrix}\text{.} \end{equation*}
  1. If \(T\) is the linear transformation which is multiplication by \(A\text{,}\) what are the domain and codomain of \(T\text{?}\)
  2. Calculate the image of the vector \(\bfv = (-3, 1, 4)\) under the linear transformation \(T\text{.}\)
  3. Is the vector \(\bfw = (-2,-1)\) in the image of \(T\text{?}\) Explain your answer.

8.

Let \(A\) be the following matrix over \(\rr\text{:}\)
\begin{equation*} A = \begin{bmatrix} 3 \amp -2 \\ 1 \amp 4 \\ -1 \amp 0 \end{bmatrix}\text{.} \end{equation*}
Let \(T\) be the linear transformation which is multiplication by \(A\text{.}\)
  1. Is the vector \((1,1,1)\) in the image of \(T\text{?}\) Explain your answer.
  2. Is \(T\) surjective? Explain your answer.
Answer.
  1. No, since
    \begin{equation*} \left[\begin{array}{@{}cc|c@{}} 3 \amp -2 \amp 1 \\ 1 \amp 4 \amp 1 \\ -1 \amp 0 \amp 1 \end{array}\right] \sim \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{bmatrix}\text{.} \end{equation*}
    The pivot in the final column shows that there is no solution to the matrix-vector equation \(A\bfx = \bfv\text{,}\) where \(\bfv = (1,1,1)\text{.}\)
  2. No. Since in part a we identified a vector which is not in the image of \(T\text{,}\) this means that \(T\) cannot be surjective.

9.

Let \(A\) be the following matrix over \(\ff_7\text{:}\)
\begin{equation*} A = \begin{bmatrix} 2 \amp 0 \amp 4 \\ 4 \amp 3 \amp 5 \\ 5 \amp 1 \amp 2 \end{bmatrix}\text{.} \end{equation*}
Let \(T\) be the linear transformation which is multiplication by \(A\text{.}\)
  1. Is the vector \((3,1,1)\) in the image of \(T\text{?}\) Explain your answer.
  2. The vector \(\bfw=(5,4,0)\) is in the image of \(T\text{.}\) Find one \(\bfx \in \ff_7^3\) such that \(T(\bfx) = \bfw\text{.}\)
  3. Is there more than one \(\bfx \in \ff_7^3\) such that \(T(\bfx) = \bfw\text{?}\) How do you know?
  4. Is \(T\) injective? Is \(T\) surjective? Explain your answers.

10.

Let \(T:\rr^3 \to \rr^2\) be the linear transformation which is multiplication by the following matrix:
\begin{equation*} A = \begin{bmatrix} 4 \amp -2 \amp 0 \\ 3 \amp 2 \amp 3 \end{bmatrix}\text{.} \end{equation*}
Give a description of all vectors \(\bfx \in \rr^3\) such that \(T(\bfx) = \mathbf{0}\text{.}\)

Writing Exercises

11.
Define the function \(T:C[0,\infty) \to C[0,\infty)\) to be the following:
\begin{equation*} (T(f))(x) = \int_0^x f(y)\;dy\text{.} \end{equation*}
Prove that \(T\) is a linear transformation. (See Exercise 2.4.5 for a reminder of the definition of \(C[0,\infty)\text{.}\))
Answer.
Let \(V = C[0,\infty)\) and let \(f, g \in V\text{.}\) Then we have
\begin{align*} T(f+g)(x) \amp = \int_0^x (f+g)(y)\;dy = \int_0^x (f(y) + g(y))\;dy\\ \amp = \int_0^x f(y)\;dy + \int_0^x g(y)\;dy = T(f)(x) + T(g)(x)\text{.} \end{align*}
We now let \(c \in \rr\) and \(f \in V\text{.}\) Then
\begin{align*} T(cf)(x) \amp = \int_0^x (cf)(y)\;dy = \int_0^x cf(y)\;dy\\ \amp = c\int_0^x f(y)\;dy = cT(f)(x)\text{.} \end{align*}
Both of these calculations rely on the linear properties of the definite integral.
12.
Let \(T:U \to V\) and \(S:V \to W\) be linear transformations between vector spaces over a field \(\ff\text{.}\) Prove that \(S \circ T\) is also a linear transformation.
Solution.
Let \(\bfu\) and \(\bfv\) be vectors in \(U\text{.}\) Then, since \(S\) and \(T\) are both linear transformations, we have
\begin{align*} (S\circ T)(\bfu + \bfv) \amp = S(T(\bfu + \bfv)) = S(T(\bfu) + T(\bfv))\\ \amp = S(T(\bfu)) + S(T(\bfv)) = (S\circ T)(\bfu) + (S\circ T)(\bfv)\text{.} \end{align*}
This proves that \(S\circ T\) has the first property of a linear transformation.
We now let \(c \in \ff\) and \(\bfu \in U\text{.}\) Then, since \(S\) and \(T\) are both linear transformations, we have
\begin{align*} (S\circ T)(c\bfu) \amp = S(T(c\bfu)) = S(cT(\bfu))\\ \amp = cS(T(\bfu)) = c(S\circ T)(\bfu)\text{.} \end{align*}
This shows that \(S\circ T\) is a linear transformation.
13.
Let \(T:U \to V\) and \(S:V \to W\) be linear transformations between vector spaces over a field \(\ff\text{.}\)
  1. Prove that if \(S\circ T\) is injective, then \(T\) must be injective.
  2. Prove that if \(S\circ T\) is surjective, then \(S\) must be surjective.
Solution.
  1. We assume that \(S\circ T\) is injective, and we let \(\bfu_1\) and \(\bfu_2\) be vectors in \(U\) with \(T(\bfu_1) = T(\bfu_2)\text{.}\) Then, since \(S \circ T\) is injective, we have
    \begin{align*} S(T(\bfu_1)) \amp = S(T(\bfu_2))\\ \bfu_1 \amp = \bfu_2\text{.} \end{align*}
    This proves that \(T\) is injective.
  2. We assume that \(S\circ T\) is surjective, and we let \(\bfw \in W\text{.}\) Since \(S\circ T\) is surjective, there is a vector \(\bfu \in U\) such that \((S\circ T)(\bfu) = \bfw\text{.}\) But this means that \(S(T(\bfu)) = \bfw\text{;}\) in other words, \(S\) sends \(T(\bfu)\) to \(\bfw\text{.}\) This proves that \(S\) is surjective, since we have found a vector \(\bfv=T(\bfu) \in V\) such that \(S(\bfv) = \bfw\text{.}\)
14.
Let \(T:V \to W\) be a function between vector spaces over \(\ff\text{.}\)
  1. If \(T\) is a linear transformation, must it be true that \(T(\mathbf{0}_V) = \mathbf{0}_W\text{?}\) Either prove this is true or produce a counterexample.
  2. If \(T(\mathbf{0}_V) = \mathbf{0}_W\text{,}\) must \(T\) be a linear transformation? Either prove this is true or produce a counterexample.
15.
Let \(\bfv_1, \ldots, \bfv_m\) be vectors which span a vector space \(V\text{.}\) If \(T:V \to V\) is a linear transformation for which \(T(\bfv_i)=\mathbf{0}\) for all \(i=1,\ldots,m\text{,}\) prove that \(T\) is the zero transformation. (In other words, prove that \(T(\bfx)=\mathbf{0}\) for all \(\bfx \in V\text{.}\))
16.
Let \(A\) be an \(m\times n\) matrix over a field \(\ff\text{,}\) and let \(\bfv\) be a vector in \(\ff^n\text{.}\) Prove that the formulations of the matrix-vector product given in (3.1) and (3.2) are equivalent.
18.
Finish the proof in Example 3.1.16 that \(\rr^2\) and \(P_1\) are isomorphic. Specifically, prove that the linear transformation given in this example is bijective.